Authors: | Laura Alonso |
URL: | http://russell.famaf.unc.edu.ar/~laura/shallowdisc4summ/discmar |
Contact: | Laura Alonso Alemany. Facultad de Matemática, Astronomía y Física. Universidad Nacional de Córdoba, Argentina. |
Description
This is the initial discourse marker lexicon used in the thesis “Representing discourse for automatic text summarization via shallow NLP techniques”. The discourse markers listed here were the primary source of evidence for drawing the semantic maps from which an inventory of basic discursive meanings was obtained. The lexicon is also the basis for the implementation of a discourse segmenter and for the discourse analysis exploited by the e-mail summarizer Carpanta. The lexicon is parallel in three languages: Catalan, Spanish and English. For this reason, this initial version includes only those discourse markers that have a near-synonym in one of the other languages. The lexicon comprises 84 discourse markers, representing different discursive meanings. Some discourse markers have been assigned more than one meaning per dimension because they are ambiguous, and others have been assigned none because they are underspecified. In this lexicon, discourse markers are characterized by their structural meanings (continuation or elaboration) and semantic meanings (revision, cause, equality, context), and each is also associated with a morphosyntactic class (part of speech, PoS): adverbial (A), phrasal (P) or conjunctive (C).
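The characterization above can be pictured as a small record type. The following is a minimal sketch only: the class name `DiscourseMarker`, its field names, the language codes and the sample entry are assumptions made for illustration, and they do not reproduce the actual file layout or any actual entry of the lexicon.

```python
from dataclasses import dataclass
from enum import Enum


class PoS(Enum):
    """Morphosyntactic classes named in the description."""
    ADVERBIAL = "A"
    PHRASAL = "P"
    CONJUNCTIVE = "C"


# The two dimensions and their values, as listed in the description.
DIMENSIONS = {
    "structural": frozenset({"continuation", "elaboration"}),
    "semantic": frozenset({"revision", "cause", "equality", "context"}),
}


@dataclass(frozen=True)
class DiscourseMarker:
    """Hypothetical record for one lexicon entry (field names are assumed)."""
    form: str
    language: str  # assumed codes: "ca", "es", "en"
    pos: PoS
    structural: frozenset = frozenset()
    semantic: frozenset = frozenset()

    def __post_init__(self):
        # Reject values that are not part of the stated inventory.
        for dimension, allowed in DIMENSIONS.items():
            unknown = getattr(self, dimension) - allowed
            if unknown:
                raise ValueError(f"unknown {dimension} meaning(s): {sorted(unknown)}")

    def is_ambiguous(self, dimension: str) -> bool:
        """More than one meaning in the given dimension."""
        return len(getattr(self, dimension)) > 1

    def is_underspecified(self, dimension: str) -> bool:
        """No meaning assigned in the given dimension."""
        return len(getattr(self, dimension)) == 0


# Illustrative values only, not copied from the lexicon.
example = DiscourseMarker(
    form="because",
    language="en",
    pos=PoS.CONJUNCTIVE,
    semantic=frozenset({"cause"}),
)
```

Representing each dimension as a set lets a single field cover the ambiguous case (several values) and the underspecified case (an empty set).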
Functionality
The semantic information associated with discourse markers can be integrated into any tool that exploits these lexical items as a source of evidence of discursive structure in texts. It has been integrated into a segmenter that splits text into discourse units and into some automatic summarizers.
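As a rough illustration of how a tool might use discourse markers as evidence of structure, the sketch below starts a new segment at every marker occurrence. It is a deliberately naive stand-in and is not the segmenter or the summarizer mentioned above; the function name `segment` and the marker list passed to it are assumptions for illustration.

```python
import re


def segment(text, markers):
    """Split text so that each discourse marker opens a new segment.

    Naive sketch: matches whole words case-insensitively and ignores
    punctuation, part of speech and marker meaning.
    """
    if not markers:
        return [text.strip()] if text.strip() else []

    # Try longer markers first so multiword markers win over their parts.
    ordered = sorted(markers, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(m) for m in ordered) + r")\b",
        re.IGNORECASE,
    )

    segments = []
    start = 0
    for match in pattern.finditer(text):
        if match.start() > start:
            chunk = text[start:match.start()].strip()
            if chunk:
                segments.append(chunk)
            start = match.start()

    tail = text[start:].strip()
    if tail:
        segments.append(tail)
    return segments


# Illustrative marker list, not taken from the lexicon.
parts = segment(
    "We stayed home because it rained, however the match went ahead.",
    ["because", "however"],
)
```

A real segmenter would also need the structural and semantic meanings attached to each marker; this sketch uses only the surface forms.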
Technology
The lexicon is distributed as plain text.
Technical requirements
Modules
Innovation
It constitutes a lexico-semantic resource that, to our knowledge, did not previously exist for Spanish and Catalan.
Development
This lexicon represents part of the work carried out for the doctoral dissertation “Representing discourse for automatic text summarization via shallow NLP techniques” by Laura Alonso.
Publications
- Laura Alonso i Alemany, Ezequiel Andújar Hinojosa and Robert Sola Salvatierra (2004), A framework for feature-based description of low level discourse, in Discourse Annotation, workshop at the ACL’04.
- Laura Alonso, Jennafer Shih, Irene Castellón and Lluís Padró (2003), An Analytic Account of Discourse Markers for Shallow NLP, in The Meaning and Implementation of Discourse Particles, workshop held as part of the Fifteenth European Summer School in Logic, Language and Information, ESSLLI’03, 18-19 August, Vienna, Austria.