Authors: Davide Buscaldi, Paolo Rosso
URL: http://users.dsic.upv.es/grupos/nle/resources/geo-wn/download.html
Contact: Davide Buscaldi <davide.buscaldi@univ-orleans.fr>, Paolo Rosso <prosso@dsic.upv.es>
Description
Geo-WordNet 3.0 links WordNet synsets to their geographical coordinates (latitude and longitude). In version 3.0, the geographical data come from Geonames (http://www.geonames.org), which made it possible to assign each synset a Geonames ID together with its coordinates. Geo-WordNet consists of a plain text file in which every line contains the following fields, separated by a tab character: <synset offset> <geonames ID> <latitude> <longitude>. The synset offsets used in Geo-WordNet 3.0 correspond to those in WordNet 3.0.
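The four-field, tab-separated line format described above can be read with a few lines of Python. This is a minimal sketch, not code shipped with the resource: the names `GeoSynset`, `parse_line` and `parse_lines` are hypothetical, and the sample values in the comment are made up for illustration. It assumes the offset is best kept as a string so that any leading zeros survive.

```python
from typing import Iterable, Iterator, NamedTuple


class GeoSynset(NamedTuple):
    """One Geo-WordNet record (hypothetical container, not part of the resource)."""
    synset_offset: str   # kept as text so leading zeros are preserved
    geonames_id: int
    latitude: float
    longitude: float


def parse_line(line: str) -> GeoSynset:
    """Split one line into <synset offset> <geonames ID> <latitude> <longitude>."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) != 4:
        raise ValueError(f"expected 4 tab-separated fields, got {len(fields)}")
    offset, geonames_id, latitude, longitude = fields
    return GeoSynset(offset, int(geonames_id), float(latitude), float(longitude))


def parse_lines(lines: Iterable[str]) -> Iterator[GeoSynset]:
    """Parse every non-blank line of an iterable of lines (for example an open file)."""
    for line in lines:
        if line.strip():
            yield parse_line(line)


# Made-up sample line, for illustration only:
# parse_line("00000001\t1234567\t39.5\t-0.4")
```

Passing an open file handle to `parse_lines` yields one record per line; a dictionary keyed by `synset_offset` is then a convenient lookup structure.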
Functionality
Geo-WordNet extends WordNet with geographical data, allowing WordNet-based applications to associate spatial information with unstructured text.
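As one illustration of what a WordNet-based application might do with the coordinates, the sketch below computes the great-circle distance between two georeferenced synsets using the haversine formula. Geo-WordNet does not prescribe this use. The function names, the `coords` dictionary layout (synset offset mapped to a latitude/longitude pair) and the spherical-Earth radius are assumptions made for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot before asin.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def synset_distance_km(coords: dict, offset_a: str, offset_b: str) -> float:
    """Distance between two synsets; `coords` maps offset -> (latitude, longitude)."""
    lat_a, lon_a = coords[offset_a]
    lat_b, lon_b = coords[offset_b]
    return haversine_km(lat_a, lon_a, lat_b, lon_b)
```

A toponym disambiguation heuristic could, for instance, prefer the candidate sense whose coordinates lie closest to other places mentioned in the same text; that strategy is given here only as an example of how such a distance might be used.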
Technology
Geo-WordNet 3.0 has been developed in Java using the MIT Java WordNet Interface (MIT JWI), with data from WordNet 3.0 and Geonames. Geographical data were loaded into a PostgreSQL database.
Technical requirements
The only requirement is WordNet 3.0 as a reference for synset offsets.
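To show how a synset offset can be resolved against WordNet 3.0, the sketch below relies on the WordNet database file format, in which a synset offset is the byte position of the synset's line within the corresponding `data.*` file. It assumes the synsets of interest are nouns (so the file would be `data.noun`) and that the caller knows where the WordNet `dict` directory is installed; both are assumptions, and the function names are hypothetical.

```python
def read_synset_line(data_file_path: str, synset_offset) -> str:
    """Return the raw WordNet data line stored at the given synset offset."""
    offset = int(synset_offset)
    with open(data_file_path, "rb") as data_file:
        data_file.seek(offset)  # the offset is a byte position in the file
        line = data_file.readline().decode("ascii", errors="replace").rstrip("\r\n")
    # Each data line starts with its own offset as an 8-digit decimal number.
    if not line.startswith(f"{offset:08d} "):
        raise LookupError(f"no synset line found at offset {offset}")
    return line


def synset_words(data_line: str) -> list:
    """Extract the words of a synset from a WordNet data line."""
    # Layout: offset lex_filenum ss_type w_cnt (word lex_id)... p_cnt ... | gloss
    fields = data_line.split("|", 1)[0].split()
    word_count = int(fields[3], 16)  # w_cnt is a two-digit hexadecimal number
    return [fields[4 + 2 * i] for i in range(word_count)]
```

With a WordNet 3.0 installation, `synset_words(read_synset_line(path_to_data_noun, offset))` would list the lemmas for an offset taken from the Geo-WordNet file, where `path_to_data_noun` is whatever location the local installation uses.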
Modules
The distribution is a tar.gz file containing a folder named “GeoWN3.0”, which holds the following files: 00README.txt, LICENSE, mapping.dat, mapping.txt, not_mapped.txt. The main data are in the “mapping.dat” file, while “mapping.txt” and “not_mapped.txt” contain explanations of the associations between WordNet synsets and Geonames IDs.
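The main data file can be read straight from the archive without unpacking it. The following is a minimal sketch using Python's standard `tarfile` module; the member path `GeoWN3.0/mapping.dat` follows the folder and file names listed above, while the function name, the archive file name passed by the caller, and the UTF-8 encoding are assumptions.

```python
import io
import tarfile


def read_mapping_lines(archive_path: str, member: str = "GeoWN3.0/mapping.dat") -> list:
    """Return the non-blank lines of the mapping file stored inside the tar.gz archive."""
    with tarfile.open(archive_path, "r:gz") as archive:
        # extractfile raises KeyError for a missing member
        # and returns None for entries that are not regular files.
        member_file = archive.extractfile(member)
        if member_file is None:
            raise FileNotFoundError(f"{member} is not a regular file in the archive")
        with io.TextIOWrapper(member_file, encoding="utf-8") as text:
            return [line.rstrip("\r\n") for line in text if line.strip()]
```

If the archive stores the folder under a different prefix (for example with a leading `./`), the `member` argument needs to be adjusted accordingly.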
Innovation
Geo-WordNet is the only resource that associates geographical information with WordNet synsets.
Development
Carried out within the MICINN research project TEXT-ENTERPRISE 2.0, TIN2009-13391-C04-03 (Plan I+D+i). Developed as part of the Ph.D. thesis of Davide Buscaldi, “Toponym Disambiguation in Information Retrieval”, Universidad Politécnica de Valencia, 2010.
Publications
- Buscaldi, D., Rosso, P. Geo-WordNet: Automatic Georeferencing of WordNet. In: Proc. 5th Int. Conf. on Language Resources and Evaluation, LREC-2008, Marrakech, Morocco, May 2008.
- Buscaldi, D., Rosso, P. Using GeoWordNet for Geographical Information Retrieval. In: Revised Selected Papers CLEF-2008, Springer-Verlag, LNCS 5706, pp. 863-866.
- Buscaldi, D. Toponym Disambiguation in Information Retrieval. Ph.D. thesis, Universidad Politécnica de Valencia, 2010.