Tempo-lexical context driven word embedding for cross-session search
task extraction
Sen, Procheta, Ganguly, Debasis (ORCID: 0000-0003-0050-7138) and Jones, Gareth J.F. (ORCID: 0000-0003-2923-8365)
(2018)
Tempo-lexical context driven word embedding for cross-session search
task extraction.
In: 16th Annual Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 1-6 June 2018, New Orleans, LA, USA.
Search task extraction in information retrieval
is the process of identifying search intents over
a set of queries relating to the same topical information need. Search tasks may
span multiple search sessions. Most existing research on search task extraction has
focused on identifying tasks within a single
session, where the notion of a session is defined by a fixed-length time window. By contrast, in this work we seek to identify tasks
that span across multiple sessions. To identify tasks, we conduct a global analysis of
a query log in its entirety without restricting
analysis to individual temporal windows. To
capture inherent task semantics, we represent
queries as vectors in an abstract space. We
learn the embedding of query words in this
space by leveraging the temporal and lexical
contexts of queries. To evaluate the effectiveness of the proposed query embedding, we
cluster queries into tasks, with a particular
interest in measuring cross-session search
task recall. Results of
our experiments demonstrate that task extraction effectiveness, including cross-session recall, improves significantly when query
terms are embedded using our proposed method, which leverages
the temporal and lexical contexts of queries.
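
The general idea of representing queries as vectors and clustering them into tasks can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the two-dimensional word vectors are hand-written toy values rather than embeddings learned from temporal and lexical context, a query vector is taken to be the average of its word vectors, and the greedy threshold clustering, the `threshold` value and the function names are hypothetical stand-ins for whatever clustering procedure the authors actually use.

```python
import math

def query_vector(query, word_vecs):
    """Average the vectors of the query's known words (assumed composition rule)."""
    vecs = [word_vecs[w] for w in query.lower().split() if w in word_vecs]
    if not vecs:
        return None
    dim = len(vecs[0])
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(dim)]

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def cluster_queries(queries, word_vecs, threshold=0.8):
    """Greedy single-pass clustering over the whole log, ignoring session boundaries.

    Each query joins the first cluster whose seed vector is at least
    `threshold` similar to it; otherwise it starts a new cluster.
    """
    clusters = []  # list of (seed_vector, member_queries)
    for q in queries:
        v = query_vector(q, word_vecs)
        if v is None:
            clusters.append((None, [q]))  # no known words: keep as its own task
            continue
        for seed, members in clusters:
            if seed is not None and cosine(seed, v) >= threshold:
                members.append(q)
                break
        else:
            clusters.append((v, [q]))
    return [members for _, members in clusters]

# Toy word vectors (hypothetical values, not learned embeddings).
word_vecs = {
    "cheap": [0.9, 0.1], "flights": [1.0, 0.0], "paris": [0.8, 0.2],
    "hotel": [0.7, 0.3], "python": [0.0, 1.0], "tutorial": [0.1, 0.9],
}
# Queries that could come from different sessions of the same user.
queries = ["cheap flights paris", "python tutorial", "paris hotel"]
tasks = cluster_queries(queries, word_vecs)
```

Because the clustering runs over the whole list of queries rather than per time window, the two travel queries end up in the same task even though an unrelated query separates them, which is the cross-session behaviour the abstract describes.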