Authors: | M. Antònia Martí, Mariona Taulé, Lluís Màrquez and Manuel Bertran (CLiC-UB) |
URL: | http://clic.ub.edu/ancora |
Contact: | M. Antònia Martí <amarti |
Description
AnCora-DEP-Ca is the dependency-based representation of AnCora-Ca, a multilevel annotated corpus of Catalan. It contains approximately 500,000 words.
Functionality
AnCora-DEP-Ca can be used as a source of information for inducing grammars and for developing, improving and/or evaluating dependency-based syntactic parsers and semantic role labelling algorithms. The corpus is used in the CoNLL Shared Task 2009: Syntactic and Semantic Dependencies in Multiple Languages, where the core of the task is to predict syntactic and semantic dependencies and their labels.
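As a rough illustration of what evaluating a dependency parser against such a corpus involves, the sketch below computes unlabelled and labelled attachment scores (UAS and LAS) for one sentence. It is a minimal example under stated assumptions, not the official CoNLL 2009 scorer: the function name `attachment_scores`, the `Arc` type, and the toy heads and labels are hypothetical and are not taken from AnCora-DEP-Ca.

```python
from typing import List, Tuple

# One token's syntactic annotation: (head index, dependency label).
# Head index 0 is assumed to denote the artificial root.
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], pred: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) for one sequence of tokens.

    UAS counts tokens whose predicted head is correct;
    LAS counts tokens whose predicted head and label are both correct.
    """
    if len(gold) != len(pred):
        raise ValueError("gold and predicted analyses must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty analysis")
    head_hits = sum(g[0] == p[0] for g, p in zip(gold, pred))
    full_hits = sum(g == p for g, p in zip(gold, pred))
    return head_hits / len(gold), full_hits / len(gold)


# Toy three-token sentence ("El gat dorm"); heads and labels are
# illustrative only and are not claimed to match the corpus tagset.
gold = [(2, "spec"), (3, "suj"), (0, "sentence")]
pred = [(2, "spec"), (3, "cd"), (0, "sentence")]
uas, las = attachment_scores(gold, pred)
print(f"UAS={uas:.2f} LAS={las:.2f}")
```

In this toy case every head is predicted correctly but one label is wrong, so the labelled score is lower than the unlabelled one. A real evaluation would aggregate over all sentences and, for the shared task, also score the semantic dependencies.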
Technology
The data are stored in XML format.
Technical requirements
Modules
Innovation
At present, AnCora-DEP-Ca is the largest multilevel annotated corpus in dependency format that is freely available for download.
Development
The development of AnCora-DEP-Ca has been funded by the projects CESS-ECE (HUM2004-21127) and Lang2World (TIN2006-15265-C06-06), and by the Catalan Secretary of Linguistic Policy.
Publications
Civit, M., M.A. Martí & N. Bufí (2006) ‘Cat3LB and Cast3LB: from Constituents to Dependencies’, Advances in Natural Language Processing (LNAI, 4139), pp. 141-153. Berlin: Springer Verlag. ISSN: 0302-9743.