DCU at WikipediaMM 2009: Document expansion from Wikipedia abstracts

Min, Jinming and Wilkins, Peter and Leveling, Johannes and Jones, Gareth J.F. (2009) DCU at WikipediaMM 2009: Document expansion from Wikipedia abstracts. In: CLEF 2009: Workshop on Cross-Language Information Retrieval and Evaluation, 30 Sept. - 2 Oct. 2009, Corfu, Greece.

Abstract

In this paper, we describe our participation in the WikipediaMM task at CLEF 2009. Our main effort concerns expanding the image metadata using DBpedia, a collection of Wikipedia abstracts. Because the metadata is too short for effective retrieval by query words, we expand it using a typical query expansion method: in our experiments, we use the Rocchio algorithm for document expansion. Our best run ranked 26th out of 57 runs, which is below our expectations. We think the main reason is that our document expansion method uses all the words from the metadata documents, including words that are unrelated to the content of the images. Compared with our text retrieval baseline, our best document expansion run improves mean average precision (MAP) by 11.17%. We conclude that document expansion can be an effective factor in image metadata retrieval. Our content-based image retrieval uses the same approach as in our participation in ImageCLEF 2008.
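
As a rough illustration of the idea in the abstract, the sketch below treats a short metadata document as a query against a small collection of abstracts, takes the top-ranked abstracts as pseudo-relevant, and applies a Rocchio-style update to pick expansion terms. It is not the authors' implementation: the function names, the TF-IDF weighting, the cosine ranking, the parameter values (`top_k`, `n_terms`, `alpha`, `beta`), and the toy texts are all assumptions made for this example, and the negative (non-relevant) Rocchio component is left out.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for real tokenization."""
    return text.lower().split()


def tf_idf(tokens, doc_freq, n_docs):
    """TF-IDF vector for a token list, using a smoothed IDF."""
    counts = Counter(tokens)
    return {
        term: tf * (math.log((1 + n_docs) / (1 + doc_freq.get(term, 0))) + 1.0)
        for term, tf in counts.items()
    }


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def expand_document(metadata, abstracts, top_k=1, n_terms=3, alpha=1.0, beta=0.75):
    """Return up to n_terms new terms for a metadata document.

    All words of the metadata document are used as the query, and the
    top_k most similar abstracts serve as pseudo-relevant feedback.
    Parameter defaults are illustrative assumptions, not values from the paper.
    """
    corpus = [tokenize(a) for a in abstracts]
    n_docs = len(corpus)
    doc_freq = Counter(term for tokens in corpus for term in set(tokens))
    vectors = [tf_idf(tokens, doc_freq, n_docs) for tokens in corpus]
    doc_vec = tf_idf(tokenize(metadata), doc_freq, n_docs)

    # Rank abstracts by similarity to the metadata document.
    ranked = sorted(range(n_docs), key=lambda i: cosine(doc_vec, vectors[i]), reverse=True)
    feedback = [vectors[i] for i in ranked[:top_k]]

    # Rocchio update without the negative term:
    # expanded = alpha * doc + beta * centroid(feedback)
    expanded = {t: alpha * w for t, w in doc_vec.items()}
    for vec in feedback:
        for t, w in vec.items():
            expanded[t] = expanded.get(t, 0.0) + beta * w / len(feedback)

    # Keep only terms the metadata lacks; break weight ties alphabetically.
    candidates = [t for t in expanded if t not in doc_vec]
    return sorted(candidates, key=lambda t: (-expanded[t], t))[:n_terms]


if __name__ == "__main__":
    toy_abstracts = [
        "the eiffel tower is a wrought iron lattice tower in paris france",
        "paris is the capital of france",
        "the python language is used for programming",
    ]
    print(expand_document("eiffel tower paris", toy_abstracts, n_terms=10))
```

Because every word of the metadata document contributes to the ranking, and every word of the feedback abstracts is a candidate, this kind of expansion can pull in words unrelated to the image, such as function words in the toy data. This is the weakness the abstract identifies; a stop list or a cut-off on term weight would be one possible mitigation.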

Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Refereed: Yes
Uncontrolled Keywords: Query formulation; Relevance feedback; Document Expansion
Subjects: Computer Science > Information retrieval
DCU Faculties and Centres: Research Initiatives and Centres > Centre for Next Generation Localisation (CNGL); DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
ID Code: 16042
Deposited On: 17 Jun 2011 15:02 by Shane Harper. Last Modified: 27 Oct 2011 11:50
