Exploring sentence level query expansion in language modeling based information retrieval
Ganguly, Debasis (ORCID: 0000-0003-0050-7138), Leveling, Johannes (ORCID: 0000-0003-0603-4191) and Jones, Gareth J.F. (ORCID: 0000-0003-2923-8365)
(2010)
Exploring sentence level query expansion in language modeling based information retrieval.
In: the 8th International Conference on Natural Language Processing (ICON 2010), 8-11 Dec. 2010, Kharagpur, India.
We introduce two novel methods for query expansion in information retrieval (IR). The basis of these methods is to add the most similar sentences extracted from
pseudo-relevant documents to the original query. The first method adds a fixed number of sentences to the original query; the second adds a progressively decreasing number of sentences. We evaluate these methods on the English and Bengali test collections from the FIRE workshops. The major
findings of this study are that: i) performance is similar for both English and Bengali; ii) employing a smaller context (similar sentences) yields a considerably higher
mean average precision (MAP) compared to extracting terms from full documents (up to 5.9% improvement in MAP for
English and 10.7% for Bengali compared to standard Blind Relevance Feedback (BRF)); iii) using a variable number of sentences for query expansion performs better and shows less variance in the best MAP for different parameter settings; iv) query expansion based on sentences can
improve performance even for topics with low initial retrieval precision where standard BRF fails.
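
The two expansion variants described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the similarity measure or how the number of sentences decreases, so the sketch assumes cosine similarity over raw term counts and a schedule that drops by one sentence per document rank. All function names (`top_sentences`, `expand_fixed`, `expand_decreasing`) and the regex-based tokenizer and sentence splitter are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase alphanumeric tokens, no stemming or stopword removal.
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    # Cosine similarity between two term-count vectors (Counters).
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def split_sentences(doc):
    # Assumption: naive splitting on whitespace that follows ., ! or ?.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", doc) if s.strip()]


def top_sentences(query, doc, n):
    # Return the n sentences of doc most similar to the query.
    q = Counter(tokenize(query))
    ranked = sorted(
        split_sentences(doc),
        key=lambda s: cosine(q, Counter(tokenize(s))),
        reverse=True,
    )
    return ranked[:n]


def expand_fixed(query, ranked_docs, n):
    # Variant 1: take the same number of sentences from every
    # pseudo-relevant document.
    extra = []
    for doc in ranked_docs:
        extra.extend(top_sentences(query, doc, n))
    return extra


def expand_decreasing(query, ranked_docs, n_start):
    # Variant 2: take fewer sentences from lower-ranked documents.
    # Assumption: the count drops by one per rank and never goes below zero.
    extra = []
    for rank, doc in enumerate(ranked_docs):
        extra.extend(top_sentences(query, doc, max(n_start - rank, 0)))
    return extra
```

Here `ranked_docs` stands for the top documents returned by an initial retrieval run, in rank order. The returned sentences would be appended to the original query before a second retrieval pass.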