Website for CLEF-CLSR 2006 launched! Click here for last year's website.

About the Track

The goal of the CLEF Cross-Language Speech Retrieval (CL-SR) track is to develop and evaluate systems for ranked retrieval of spontaneous conversational speech.

In 2005, the track built a reusable test collection for searching spontaneous conversational English speech. The collection provides queries in five languages (Czech, English, French, German and Spanish), speech recognition output for the spoken words, manually and automatically assigned controlled vocabulary descriptors for concepts, dates and locations, manually assigned person names, and hand-written segment summaries (2005 CLEF CL-SR Track Report).

The 2006 CL-SR track may extend that collection to include additional English speech, and perhaps additional resources (e.g., word lattices and more accurate speech recognition). A second test collection of Czech speech (with a no-boundary evaluation condition) will also be created. Multilingual topic sets with 25 topics will be created for each language in 2006. The track is coordinated by the University of Maryland and Dublin City University.

Participation in the track is very easy: at a minimum, teams can treat it as a simple CLIR (or even monolingual IR) task. One of our central goals is to create a community that has both interest in and experience with IR in collections of spontaneous conversational speech.


Doug Oard  oard at umd.edu
Gareth Jones  Gareth.Jones at computing.dcu.ie

Mailing List

A mailing list is available at clef-clsr (at) umiacs.umd.edu to facilitate the discussion of ideas and the communication of information about the track to all interested participants.