SQUASH is a modular question answering system for the Spanish language. It enhances traditional search engine functionality by providing precise answers in real time to natural-language questions such as “¿Cuándo se firmó el Tratado de Maastricht?” (When was the Maastricht Treaty signed?). It significantly reduces the time a user must spend searching for precise information in textual databases.
The system is composed of rules that determine the type of information a question asks for and generate a suitable query for an information retrieval system. It also includes information extraction components that select and rank appropriate sentences and answers from the retrieved documents.
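As an illustration of what such rules could look like, the following is a minimal sketch in Java, not the actual SQUASH rule set. The class name `QuestionRules`, the `AnswerType` values, the stopword list and the accent-stripping step are all assumptions made for this example; a real system would rely on proper language analysis rather than the first word of the question.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical rule-based question typing and query generation (illustrative only).
final class QuestionRules {
    // Assumed answer types; the source does not list the real taxonomy.
    enum AnswerType { DATE, PERSON, LOCATION, QUANTITY, OTHER }

    // Small assumed stopword list, written without accents.
    private static final Set<String> STOPWORDS = new HashSet<>(Arrays.asList(
            "el", "la", "los", "las", "un", "una", "de", "del", "en", "se", "que", "y", "a"));

    // Lowercase, strip diacritics, and split on anything that is not a letter or digit.
    static List<String> tokenize(String text) {
        String plain = Normalizer.normalize(text.toLowerCase(Locale.ROOT), Normalizer.Form.NFD)
                .replaceAll("\\p{M}+", "");
        List<String> tokens = new ArrayList<>();
        for (String t : plain.split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Rule: the leading interrogative word decides the expected answer type.
    static AnswerType classify(String question) {
        List<String> tokens = tokenize(question);
        if (tokens.isEmpty()) {
            return AnswerType.OTHER;
        }
        switch (tokens.get(0)) {
            case "cuando": return AnswerType.DATE;
            case "quien": case "quienes": return AnswerType.PERSON;
            case "donde": return AnswerType.LOCATION;
            case "cuanto": case "cuanta": case "cuantos": case "cuantas":
                return AnswerType.QUANTITY;
            default: return AnswerType.OTHER;
        }
    }

    // Query generation: drop a recognised interrogative and the stopwords, keep the rest.
    static List<String> buildQuery(String question) {
        List<String> tokens = tokenize(question);
        boolean skipFirst = classify(question) != AnswerType.OTHER;
        List<String> terms = new ArrayList<>();
        for (int i = 0; i < tokens.size(); i++) {
            if (i == 0 && skipFirst) {
                continue;
            }
            if (!STOPWORDS.contains(tokens.get(i))) {
                terms.add(tokens.get(i));
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        String q = "¿Cuándo se firmó el Tratado de Maastricht?";
        System.out.println(classify(q) + " " + buildQuery(q));
    }
}
```

For the example question, this sketch assigns the type `DATE` and keeps the terms `firmo`, `tratado` and `maastricht`, which could then be passed to whichever retrieval engine is in use.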
The system integrates technology for question analysis, information extraction and information retrieval.
The system is implemented in Java and requires modules for Information Retrieval (IR) and Language Analysis. Several IR systems have been integrated (Lucene, Xapian and Google API). Currently Daedalus STILUS is used for Language Analysis.
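One way to support several interchangeable IR systems is to hide each behind a common interface. The sketch below is only an assumption about how that might be organised; `RetrievalBackend`, `InMemoryBackend` and `BackendDemo` are hypothetical names, and the in-memory substring search merely stands in for a real engine such as Lucene or Xapian, whose APIs are not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Small demo wiring the hypothetical interface to a stand-in backend.
final class BackendDemo {
    public static void main(String[] args) {
        RetrievalBackend backend = new InMemoryBackend(List.of(
                "El Tratado de Maastricht se firmó en 1992.",
                "Madrid es la capital de España."));
        System.out.println(backend.search(List.of("maastricht"), 5));
    }
}

// Hypothetical abstraction: the rest of the system would depend only on this.
interface RetrievalBackend {
    // Returns at most 'limit' documents that match at least one query term.
    List<String> search(List<String> terms, int limit);
}

// Stand-in backend over an in-memory list, used here in place of a real IR engine.
final class InMemoryBackend implements RetrievalBackend {
    private final List<String> documents;

    InMemoryBackend(List<String> documents) {
        this.documents = new ArrayList<>(documents);
    }

    @Override
    public List<String> search(List<String> terms, int limit) {
        List<String> hits = new ArrayList<>();
        for (String doc : documents) {
            if (hits.size() >= limit) {
                break;
            }
            String lower = doc.toLowerCase(Locale.ROOT);
            for (String term : terms) {
                // Case-insensitive substring match; a real engine would use an index.
                if (lower.contains(term.toLowerCase(Locale.ROOT))) {
                    hits.add(doc);
                    break;
                }
            }
        }
        return hits;
    }
}
```

With this arrangement, swapping one engine for another would mean adding another implementation of the interface, leaving the question-answering modules unchanged.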
Modules for preprocessing information include language analysis and indexing libraries. Modules for online querying perform question classification, question analysis, query generation, sentence retrieval, answer extraction and answer ranking.
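The online stages listed above can be pictured as a chain in which each step feeds the next. The following toy sketch, under heavy simplifying assumptions, runs that chain for date questions only: the class `OnlinePipelineSketch`, the stopword list, the term-overlap sentence score and the four-digit-year pattern are all invented for illustration and do not describe how SQUASH actually implements these modules.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Toy end-to-end chain of the online modules (illustrative assumptions throughout).
final class OnlinePipelineSketch {
    private static final Set<String> STOPWORDS = new HashSet<>(Arrays.asList(
            "el", "la", "los", "las", "un", "una", "de", "del", "en", "se", "que", "y", "a"));
    // Assumed answer pattern for dates: a four-digit year from 1000 to 2099.
    private static final Pattern YEAR = Pattern.compile("\\b(1\\d{3}|20\\d{2})\\b");

    // Lowercase, strip diacritics, and split on non-alphanumeric characters.
    static List<String> tokenize(String text) {
        String plain = Normalizer.normalize(text.toLowerCase(Locale.ROOT), Normalizer.Form.NFD)
                .replaceAll("\\p{M}+", "");
        List<String> tokens = new ArrayList<>();
        for (String t : plain.split("[^\\p{L}\\p{N}]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    static List<String> answer(String question, List<String> sentences) {
        List<String> tokens = tokenize(question);
        // Question classification: this sketch only handles "cuándo" (date) questions.
        if (tokens.isEmpty() || !tokens.get(0).equals("cuando")) {
            return new ArrayList<>();
        }
        // Question analysis and query generation: keep the content terms.
        Set<String> queryTerms = new HashSet<>(tokens.subList(1, tokens.size()));
        queryTerms.removeAll(STOPWORDS);

        Map<String, Double> scores = new LinkedHashMap<>();
        for (String sentence : sentences) {
            // Sentence retrieval: score by the number of query terms in the sentence.
            Set<String> shared = new HashSet<>(tokenize(sentence));
            shared.retainAll(queryTerms);
            if (shared.isEmpty()) {
                continue;
            }
            // Answer extraction: every year in a retrieved sentence is a candidate.
            Matcher m = YEAR.matcher(sentence);
            while (m.find()) {
                scores.merge(m.group(1), (double) shared.size(), Double::sum);
            }
        }
        // Answer ranking: candidates with the highest accumulated score come first.
        List<Map.Entry<String, Double>> ranked = new ArrayList<>(scores.entrySet());
        ranked.sort((a, b) -> Double.compare(b.getValue(), a.getValue()));
        List<String> answers = new ArrayList<>();
        for (Map.Entry<String, Double> e : ranked) {
            answers.add(e.getKey());
        }
        return answers;
    }

    public static void main(String[] args) {
        List<String> sentences = Arrays.asList(
                "El Tratado de Maastricht se firmó el 7 de febrero de 1992.",
                "El Tratado de Roma se firmó en 1957.",
                "Maastricht es una ciudad de los Países Bajos.");
        System.out.println(answer("¿Cuándo se firmó el Tratado de Maastricht?", sentences));
    }
}
```

In this toy setting the sentence about Maastricht shares three terms with the question and the sentence about Rome shares two, so `1992` is ranked ahead of `1957`. A real system would replace each step with the corresponding module and use richer evidence than term overlap.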
SQUASH is the result of advances in natural language processing (technology push) and the need for fast semantic search engines to alleviate information overload (market pull). The system provides precise answers from Spanish text in real time.
SQUASH is the result of a collaboration among a multidisciplinary team of researchers from the Universidad Carlos III de Madrid, the Universidad Politécnica de Madrid and the Universidad Autónoma de Madrid. It has been developed over more than four years, building on experience gained in several funded research projects. It has been independently evaluated in CLEF (Cross-Language Evaluation Forum).
- de Pablo-Sánchez, C., González-Ledesma, A., Martínez-Fernández, J., Guirao, J., Martínez, P. and Moreno, A. “MIRACLE’s Cross-Lingual Question Answering Experiments with Spanish as a Target Language,” Accessing Multilingual Information Repositories (), 2006, pp. 488–491.
- de Pablo-Sánchez, C., González-Ledesma, A., Moreno, A. and Vicente, M. T. “MIRACLE experiments in QA@CLEF 2006 in Spanish: main task, real-time QA and exploratory QA using Wikipedia (wiQA),” Evaluation of Multilingual and Multi-modal Information Retrieval (4730/2007), 2007, pp. 463–472.