Authors: | Alberto Barrón-Cedeño |
URL: | http://users.dsic.upv.es/grupos/nle/resources/abc/download-panpc10.html |
Contact: | Paolo Rosso <prosso@dsic.upv.es> |
Description
This corpus contains documents into which artificial plagiarism has been inserted automatically. It totals 8.4 GB and 162,000 plagiarism cases.
Functionality
The corpus supports experiments on plagiarism detection.
Technology
-
Technical requirements
-
Modules
-
Innovation
The only publicly available corpus for plagiarism detection.
Development
- This is the corpus developed for the 2nd competition on plagiarism detection.
- MICINN research project TEXT-ENTERPRISE 2.0 TIN2009-13391-C04-03 (Plan I+D+i).
- Developed as part of the Ph.D. thesis of Alberto Barrón-Cedeño (writing-up phase).
Publications
Potthast M., Barrón-Cedeño A., Stein B., Rosso P. An Evaluation Framework for Plagiarism Detection. In: Proc. of the 23rd International Conference on Computational Linguistics, COLING-2010, Beijing, China, August 23-27, 2010