INformation, FILtering, Evaluation

INFILE: INformation, FILtering, Evaluation

LREC 2010 Workshop on Evaluation of Information Filtering Systems in a Competitive Intelligence Framework, 22 May 2010, Malta

CALL FOR PAPERS

Evaluation of Information Filtering Systems in a Competitive Intelligence Framework Workshop, 22 May 2010. Held in conjunction with LREC 2010, Malta.

Aims

The constant growth of publicly available information has driven a similar growth in the research areas devoted to the automated organization of this information. In particular, Information Filtering systems have been developed to address many business intelligence applications, from mail categorization to news routing and technology watch. However, the theoretically challenging models behind these systems have seldom been evaluated in a real usage context. Indeed, standard evaluation benchmarks usually introduce artefacts that simplify the task. For instance, in the context of competitive intelligence, systems must filter documents without any global information on a "collection", but can use feedback from the user, and must be efficient enough to deal with a real-time document stream. Furthermore, with globalized access to information, sources may be found in many languages, and the multilinguality issue must also be considered. The TREC Adaptive Filtering Track and the INFILE track at CLEF for multilingual information filtering have tried to propose evaluation frameworks closer to real usage.

The goal of this workshop is to study different aspects of Information Filtering evaluation and to bring together researchers from the Information Filtering community to develop new evaluation frameworks and confront current models with these new evaluation models.

Submissions are expected to propose new insights on evaluation methodologies and resources for Information Filtering, or to present Information Filtering models that meet the requirements of a real usage evaluation (in particular, researchers who submit papers presenting Information Filtering models are encouraged to evaluate their model using an existing benchmark such as the INFILE benchmark, available at http://www.infile.org).

Topics

Both theoretical and practical research papers are welcome from both research and industrial communities addressing the main workshop theme, in any aspect including:

  • Resources and methodologies for Evaluation of Information Filtering Systems in real usage context
  • New models for Information Filtering that tackle one or several issues of this kind of evaluation: efficiency, adaptivity, multilinguality, etc.
  • Results of the Evaluation of Information Filtering Systems
  • User studies of the use of Information Filtering Systems in Competitive Intelligence

Submission information

Papers will be submitted to the workshop via the START LREC Conference Manager (https://www.softconf.com/lrec2010/InFile2010/). Authors should submit a PDF file of no more than 10 pages, following the LREC conference formatting guidelines. Papers will be reviewed by three members of the Program Committee. Accepted papers will be published in the workshop Proceedings. When submitting a paper from the START page, authors will be asked to provide essential information about resources (in a broad sense, i.e. also technologies, standards, evaluation kits, etc.) that have been used for the work described in the paper or are a new result of their research. For further information on this new initiative, please refer to http://www.lrec-conf.org/lrec2010/?LREC2010-Map-of-Language-Resources.

Important dates

  • Paper submission deadline (extended): 12 March 2010
  • Notification of acceptance: 25 March 2010
  • Final version of accepted papers: 4 April 2010
  • Workshop (half-day): 22 May 2010

Workshop Organizers

  • Romaric Besançon, CEA LIST, email: romaric.besancon_AT_cea.fr
  • Djamel Mostefa, ELDA, email: mostefa_AT_elda.org

Program Committee

  • Romaric Besançon, CEA LIST
  • Stéphane Chaudiron, Université Lille 3
  • Khalid Choukri, ELDA
  • Christian Fluhr, HossurTech
  • Meriama Laïb, CEA LIST
  • Djamel Mostefa, ELDA
  • Ismail Timimi, Université Lille 3

End of Call for Papers

2009 evaluation campaign

Participation

June 2, 2009: The Interactive Filtering task is about to start. Please download the new client software, infileClient-1.2.tgz, and install it on your side. Please also check the Interactive Filtering evaluation protocol.

April 2, 2009: The Batch Filtering evaluation task has just started. If you have not received the evaluation data, please contact the organizers.

If you want to join the Batch Filtering or Adaptive Filtering evaluation tasks, please register through the CLEF website.

REGISTRATION is now open. If you are interested in participating in INFILE 2009, please visit the CLEF Registration page for instructions and the registration form. Please remember to fill in the registration form and send 2 copies of the End-User agreement to the CLEF coordinator.

The 2009 call for participation is available.

INFILE welcomes participation from any institution, academic or industrial.

Participation is free of charge, and after the evaluations participants can keep and use the development and evaluation data for free for research and development purposes.

Background

INFILE (INformation, FILtering, Evaluation), sponsored by the French National Research Agency, is a cross-language adaptive filtering evaluation campaign organized by the CEA LIST (M. Laïb, R. Besançon), ELDA (D. Mostefa, K. Choukri), and the University of Lille 3 (S. Chaudiron, I. Timimi).

The goal of the INFILE project is to organize evaluation campaigns for monolingual and multilingual information filtering systems based on close-to-real-usage conditions for intelligence applications. Both methodology and metrics are discussed within a group of experts, set up at the beginning of the project.

The campaign is directed to R&D laboratories and software publishers that would like to evaluate their technology, in a real-use context and according to the needs of information routing for technology watch. This project is not limited to French participants.

The languages under consideration in the evaluation campaign are English, Arabic and French, either in monolingual (or multilingual) filtering, or in interlingual filtering.

The documents (input information) come from two different information domains: scientific and technical on the one hand, and journalistic on the other. These two information types correspond to a context of technology watch for the first, and to a context of general intelligence for the second (watching for political information, for image information, or following up operations such as mergers and acquisitions, etc.). The evaluation campaign takes into account the results of the discussions between the organizers, the community of researchers and the software publishers, as well as the previous evaluation campaign. During the first phase, a dry run will be carried out to ensure that the evaluation system works properly.

Two corpora are developed with profiles and the associated field results.

Current evaluation campaign

INFILE 2009 is a pilot track in CLEF (Cross-Language Evaluation Forum) and is scientifically endorsed by NIST TREC (Text REtrieval Conference).

In 2009 two evaluation tasks are considered: interactive filtering and batch filtering.

Tasks

Interactive filtering task

For this task, a set of documents is automatically sent to each of the systems being tested by the evaluating system. The assignment of each document to 0, 1 or several profiles must be returned automatically. The evaluating system will return the errors, thus allowing the system to improve its performance.

This process will be repeated a number of times in order for any improvement in the system's performance to be visualized and assessed.

A client-server architecture has been developed to support this protocol.
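
The feedback loop described above can be sketched as follows. This is only an illustrative sketch, not the INFILE client or server: the class `AdaptiveKeywordFilter`, the function `run_interactive_session`, the keyword-weight model, the threshold and the update step of 0.5 are all hypothetical, and the feedback is assumed to arrive as the set of truly relevant profiles for each document.

```python
from collections import Counter


class AdaptiveKeywordFilter:
    """Toy adaptive filter (hypothetical): one bag of weighted terms per profile."""

    def __init__(self, profiles, threshold=1.0):
        # profiles: {profile_id: iterable of seed keywords}
        self.weights = {
            pid: Counter({term.lower(): 1.0 for term in terms})
            for pid, terms in profiles.items()
        }
        self.threshold = threshold

    def assign(self, text):
        """Return the set of profiles (possibly empty) the document is assigned to."""
        tokens = set(text.lower().split())
        return {
            pid
            for pid, w in self.weights.items()
            if sum(w[t] for t in tokens) >= self.threshold
        }

    def feedback(self, text, predicted, relevant):
        """Adjust term weights from the errors reported by the evaluator."""
        tokens = set(text.lower().split())
        for pid in relevant - predicted:  # missed profile: reinforce its terms
            for t in tokens:
                self.weights[pid][t] += 0.5
        for pid in predicted - relevant:  # false alarm: penalise its terms
            for t in tokens:
                self.weights[pid][t] -= 0.5


def run_interactive_session(system, stream):
    """Simulate the evaluator: send each document, compare, return feedback.

    stream: iterable of (text, set of relevant profile ids).
    Returns the number of documents with at least one assignment error.
    """
    errors = 0
    for text, relevant in stream:
        predicted = system.assign(text)
        if predicted != relevant:
            errors += 1
        system.feedback(text, predicted, relevant)
    return errors
```

Running the same stream several times through `run_interactive_session` and comparing the error counts gives a rough picture of whether the system improves with feedback, which is the behaviour this task is meant to assess.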

Batch filtering task

For the batch task, the whole corpus of documents and the set of profiles are provided to the participants and the systems are expected to give back the results of the filtering system.
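
As a rough illustration of the batch setting, the sketch below filters a whole corpus against all profiles in one pass and scores the output against reference assignments. The function names, the naive keyword matching and the choice of precision and recall as measures are assumptions made for this example; they are not the official INFILE protocol or metrics.

```python
def batch_filter(documents, profiles):
    """Assign every document to 0, 1 or several profiles in a single pass.

    documents: {doc_id: text}; profiles: {profile_id: list of keywords}.
    Hypothetical baseline: a profile matches if any keyword occurs in the text.
    """
    keyword_sets = {pid: {k.lower() for k in kws} for pid, kws in profiles.items()}
    results = {}
    for doc_id, text in documents.items():
        tokens = set(text.lower().split())
        results[doc_id] = {pid for pid, kws in keyword_sets.items() if tokens & kws}
    return results


def precision_recall(results, reference):
    """Compare (document, profile) pairs in the results with the reference."""
    predicted = {(d, p) for d, ps in results.items() for p in ps}
    expected = {(d, p) for d, ps in reference.items() for p in ps}
    true_positives = len(predicted & expected)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(expected) if expected else 0.0
    return precision, recall
```

Unlike the interactive task, nothing is learned between documents here: the system sees the full corpus and all profiles up front and returns one set of assignments.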

Dissemination

The results will be computed and communicated to the participants for discussion. The results of the campaign and the new evaluation methods will be presented in the framework of a workshop at the end of the project and will be published anonymously.

At the end of the project, an evaluation kit will be made available to the community. With this evaluation kit, new teams will be able to assess and compare their system's results with those of the participants, in the same conditions as during the evaluation campaign.

Resources

Corpus

The InFile corpus is made of newswires provided by the Agence France Presse (AFP) for research purposes.

For InFile, we selected three languages (Arabic, English and French) and a three-year period (2004-2006), which represents a collection of about one and a half million newswires, around 10 GB in total. The newswires in the three languages are not translations of one another. News articles are encoded in XML and follow the News Markup Language (NewsML) specifications.

Here are some examples for Arabic, English and French.

Only 100 000 documents per language are used for the filtering test, in order to cope with the time constraints of an interactive filtering process. These documents consist of the set of documents relevant to the topics, supplemented with a set of non-relevant documents, as shown in the figure below.

Test collection construction
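
One possible way to assemble such a test collection is sketched below. How the non-relevant documents were actually chosen for INFILE is not stated here, so the uniform random sampling, the function name `build_test_collection` and the fixed seed are assumptions made purely for illustration.

```python
import random


def build_test_collection(all_doc_ids, relevant_ids, target_size, seed=0):
    """Keep every relevant document and pad with sampled non-relevant ones.

    Assumption: non-relevant documents are drawn uniformly at random from
    the rest of the corpus until the collection reaches target_size.
    """
    relevant = set(relevant_ids)
    if len(relevant) > target_size:
        raise ValueError("more relevant documents than the target size")
    pool = sorted(set(all_doc_ids) - relevant)
    needed = target_size - len(relevant)
    if needed > len(pool):
        raise ValueError("not enough non-relevant documents to reach the target size")
    rng = random.Random(seed)  # fixed seed so the collection is reproducible
    collection = sorted(relevant) + rng.sample(pool, needed)
    rng.shuffle(collection)  # avoid presenting all relevant documents first
    return collection
```

With `target_size=100_000` and the relevant documents of one language, this would yield a collection of the size used for the filtering test.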

Development data

Schedule

  • January 2009: Registration opens.
  • April 1st to May 30th, 2009: Batch Filtering session.
  • June 1st to June 30th, 2009: Adaptive Filtering session.
  • July 15th, 2009: Communication of individual results.
  • August 30th, 2009: Submission of papers for CLEF.
  • September 30th to October 2nd, 2009: CLEF workshop in Corfu, Greece.

Contact

If you wish to participate in the INFILE Campaign, or if you are interested in joining the group of experts of this project, please contact us.

Publications

  • Besançon R., Mostefa D., Timimi I., Chaudiron S., Laib M., Choukri K. (2009), « Arabic, English and French: three Languages in a Filtering Systems Evaluation Project », MEDAR, Cairo (Egypt), April 2009.
  • Chaudiron S., Timimi I., Besançon R., Mostefa D., Laib M., Choukri K. (2009), « L'évaluation d'un système de filtrage automatique de l'information en contexte de veille : cadre réflexif et premiers retours », (to appear) In Systèmes d'Information et Intelligence Economique (SII'2009), Hammamet, Tunisia, February 12-14 2009.
  • Besançon R., Chaudiron S., Mostefa D., Hamon O., Timimi I., Choukri K. (2008), « Overview of the CLEF 2008 INFILE Pilot Track », In Working Notes of the Cross Language Evaluation Forum (CLEF 2008), Aarhus, Sept. 2008.
  • Chaudiron S., Timimi I. (2008), « Information Filtering as a Knowledge Organization process: techniques and evaluation », In the International Society for Knowledge Organization (ISKO'08), Montreal, August 5-8 2008.
  • Chaudiron S., Besançon R., Mostefa D., Timimi I., Laib M., Choukri K. (2008), « InFile : une campagne d'évaluation des logiciels de filtrage d'information textuelle », In Actes du Colloque International en traductologie et TAL, Oran, Algeria, June 2008.
  • Besançon R., Chaudiron S., Mostefa D., Timimi I., Choukri K. (2008), « The InFile project: a crosslingual filtering systems evaluation campaign », In Proceedings of LREC 2008, Marrakech, May 2008.
 
start.txt · Last modified: 2010/03/03 11:38 by djamel
 