Reconstruction of itineraries from annotated text with an informed spanning tree algorithm

Authors: Ludovic Moncla, Mauro Gaio, Javier Nogueras-Iso, Sébastien Mustière
Year: 2016
Venue: International Journal of Geographical Information Science (IJGIS), Volume 30, Issue 6, 2016
Considerable amounts of geographical data are still collected not in the form of GIS data but as natural language text. This paper proposes an approach for the automatic geocoding of itineraries described in natural language. The approach takes as input a text annotated with part-of-speech and geo-semantic tags. The proposed method is divided into three main steps. First, we build a complete graph in which vertices represent locations and every pair of vertices is connected by an undirected edge; each edge is assigned a weight using a multi-criteria analysis approach. Second, we compute a minimum spanning tree to obtain an undirected acyclic graph connecting all vertices. Finally, we transform this tree into a partially directed acyclic graph in order to identify the sequence of waypoints and build an approximation of a plausible footprint of the described itinerary. The rationale of the proposed approach has been verified with a set of experiments on a corpus of hiking descriptions.
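
The three steps can be illustrated with a minimal Python sketch. It is not the authors' implementation: the multi-criteria edge weighting is replaced here by plain Euclidean distance, and the partially directed graph is simplified to a full orientation away from an assumed known starting location. The function names, the sample coordinates, and the choice of Kruskal's algorithm are all assumptions made for illustration.

```python
import math
from collections import deque
from itertools import combinations


def complete_graph(points):
    """Step 1: weighted undirected edges (weight, u, v) between every pair of locations.

    Assumption: the weight is Euclidean distance, standing in for the
    multi-criteria weighting described in the abstract.
    """
    return [
        (math.dist(points[u], points[v]), u, v)
        for u, v in combinations(sorted(points), 2)
    ]


def minimum_spanning_tree(nodes, edges):
    """Step 2: Kruskal's algorithm with a simple union-find structure."""
    parent = {n: n for n in nodes}

    def find(n):
        # Follow parent links to the root, halving the path along the way.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    tree = []
    for _, u, v in sorted(edges):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two components, so no cycle
            parent[root_u] = root_v
            tree.append((u, v))
    return tree


def orient_from(start, tree):
    """Step 3 (simplified): direct every tree edge away from a known start vertex."""
    adjacency = {}
    for u, v in tree:
        adjacency.setdefault(u, []).append(v)
        adjacency.setdefault(v, []).append(u)

    directed, seen, queue = [], {start}, deque([start])
    while queue:
        u = queue.popleft()
        for v in sorted(adjacency.get(u, [])):
            if v not in seen:
                seen.add(v)
                directed.append((u, v))
                queue.append(v)
    return directed


# Hypothetical locations with planar coordinates.
points = {"A": (0, 0), "B": (1, 0), "C": (2, 0), "D": (2, 1)}
edges = complete_graph(points)
tree = minimum_spanning_tree(points, edges)
directed = orient_from("A", tree)
print(directed)
```

In this toy example the tree is a simple path, so the orientation yields a single sequence of waypoints. When the tree branches, an orientation alone does not determine the itinerary, which is one reason a partially directed graph is a more cautious representation than the full orientation used in this sketch.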