A manually written three-sentence segment summary.
Topics
25 topics, written in the usual CLEF title, description, and narrative format. Topics will be available in Czech, English, French, German, Russian, and Spanish. Other topic languages can be created upon request (if translators with the needed language skills are available).
Additional Resources
In order to facilitate broad participation, the basic test collection is formatted in the same way as a typical CLEF ad hoc test collection. The following additional resources will also be available to support system development:
Around 40 representative training topics, with relevance judgments for the same collection of interviews that will be used in the evaluation.
Scripts for generating alternative relevance judgments for the training topics that can be used to support detailed failure analysis.
Scripts for generating richer metadata for each segment using synonymy, part-whole, and is-a thesaurus relationships. This capability can be used with the automatically assigned thesaurus categories or (for contrastive runs) with the manually assigned thesaurus categories.
For each interview, a collection of metadata that describes that interview. This includes basic biographical details (e.g., interviewee name and birthdate) and a half-page free-text summary of the interview. Since this metadata is created manually, it may be used only for contrastive runs (i.e., not for the one required run).
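As an illustration of the thesaurus-based metadata enrichment mentioned above, the following is a minimal Python sketch of how a segment's assigned categories might be expanded through synonymy, part-whole, and is-a relationships. It is not the distributed script: the thesaurus layout, the relation names (`synonym`, `is_a`, `part_of`), the function name `expand_categories`, and the example terms are all assumptions made for this example.

```python
from collections import deque

# Hypothetical toy thesaurus (invented terms, assumed layout).
# Each relation maps a term to the terms it points at.
THESAURUS = {
    "synonym": {"violin": ["fiddle"]},
    "is_a": {
        "violin": ["string instrument"],
        "string instrument": ["musical instrument"],
    },
    "part_of": {"fingerboard": ["violin"]},
}


def expand_categories(assigned, thesaurus):
    """Return the assigned categories plus related thesaurus terms.

    Follows is-a and part-whole links transitively toward broader terms,
    then adds the direct synonyms of every term collected.
    """
    expanded = set(assigned)
    queue = deque(assigned)
    while queue:
        term = queue.popleft()
        for relation in ("is_a", "part_of"):
            for broader in thesaurus.get(relation, {}).get(term, []):
                if broader not in expanded:
                    expanded.add(broader)
                    queue.append(broader)
    # Synonyms are added once at the end, without further traversal.
    for term in list(expanded):
        expanded.update(thesaurus.get("synonym", {}).get(term, []))
    return expanded


if __name__ == "__main__":
    print(sorted(expand_categories(["fingerboard"], THESAURUS)))
    # prints ['fiddle', 'fingerboard', 'musical instrument', 'string instrument', 'violin']
```

Under these assumptions, the same function could be applied to either the automatically assigned or the manually assigned categories, with the latter reserved for contrastive runs.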