DORAS
DCU Online Research Access Service
Exploring sentence level query expansion in language modeling based information retrieval

Ganguly, Debasis ORCID: 0000-0003-0050-7138, Leveling, Johannes ORCID: 0000-0003-0603-4191 and Jones, Gareth J.F. ORCID: 0000-0003-2923-8365 (2010) Exploring sentence level query expansion in language modeling based information retrieval. In: the 8th International Conference on Natural Language Processing (ICON 2010), 8-11 Dec. 2010, Kharagpur, India.

Abstract

We introduce two novel methods for query expansion in information retrieval (IR). The basis of these methods is to add the most similar sentences extracted from pseudo-relevant documents to the original query. The first method adds a fixed number of sentences to the original query; the second adds a progressively decreasing number of sentences. We evaluate these methods on the English and Bengali test collections from the FIRE workshops. The major findings of this study are that: i) performance is similar for both English and Bengali; ii) employing a smaller context (similar sentences) yields a considerably higher mean average precision (MAP) than extracting terms from full documents (up to 5.9% improvement in MAP for English and 10.7% for Bengali compared to standard Blind Relevance Feedback (BRF)); iii) using a variable number of sentences for query expansion performs better and shows less variance in the best MAP across different parameter settings; iv) query expansion based on sentences can improve performance even for topics with low initial retrieval precision, where standard BRF fails.
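
The two expansion strategies described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the sentence splitter, or the schedule by which the number of sentences decreases, so everything below is an assumption made for illustration: similarity is taken to be cosine similarity between raw term-frequency vectors of the query and each sentence, sentences are split on English end punctuation, and the decreasing variant drops one sentence per rank position. The function names (`expand_fixed`, `expand_decreasing`, and the helpers) are hypothetical and not taken from the paper.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercased word tokens; a stand-in for a real IR tokenizer.
    return re.findall(r"\w+", text.lower())


def cosine(a, b):
    # Cosine similarity between two term-frequency Counters (assumed measure).
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def split_sentences(doc):
    # Naive splitter on ., ! and ? followed by whitespace (assumption).
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", doc) if s.strip()]


def top_sentences(query, doc, n):
    # Return the n sentences of doc most similar to the query.
    q = Counter(tokenize(query))
    ranked = sorted(
        split_sentences(doc),
        key=lambda s: cosine(q, Counter(tokenize(s))),
        reverse=True,
    )
    return ranked[:n]


def expand_fixed(query, ranked_docs, n):
    # Variant 1: the same number of sentences from every pseudo-relevant document.
    added = []
    for doc in ranked_docs:
        added.extend(top_sentences(query, doc, n))
    return " ".join([query] + added)


def expand_decreasing(query, ranked_docs, n_start):
    # Variant 2: fewer sentences from lower-ranked documents.
    # The linear schedule (one sentence fewer per rank) is hypothetical.
    added = []
    for rank, doc in enumerate(ranked_docs):
        n = max(n_start - rank, 0)
        added.extend(top_sentences(query, doc, n))
    return " ".join([query] + added)
```

Here `ranked_docs` stands for the top documents of an initial retrieval run, ordered by score. The decreasing variant reflects the intuition that lower-ranked pseudo-relevant documents are less likely to be relevant and should therefore contribute less expansion text; the actual schedule used by the authors may differ.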

Item Type: Conference or Workshop Item (Paper)
Event Type: Conference
Refereed: Yes
Uncontrolled Keywords: Blind Relevance Feedback; BRF; query expansion
Subjects: Computer Science > Information retrieval
DCU Faculties and Centres: Research Initiatives and Centres > Centre for Next Generation Localisation (CNGL)
Research Initiatives and Centres > National Centre for Language Technology (NCLT)
DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Use License: This item is licensed under a Creative Commons Attribution-NonCommercial-Share Alike 3.0 License.
ID Code: 16038
Deposited On: 17 Jun 2011 13:20 by Shane Harper. Last Modified 25 Oct 2018 10:44
