Properties of optimally weighted data fusion in CBMIR
Wilkins, Peter, Smeaton, Alan F. (ORCID: 0000-0003-1028-8389) and Ferguson, Paul
(2010)
Properties of optimally weighted data fusion in CBMIR.
In: SIGIR 2010 - 33rd international ACM SIGIR conference on Research and development in information retrieval, 19-23 July 2010, Geneva, Switzerland.
ISBN 978-1-4503-0153-4
Content-Based Multimedia Information Retrieval (CBMIR)
systems which leverage multiple retrieval experts (E_n) often
employ a weighting scheme when combining expert results
through data fusion. Typically, however, a query will
comprise multiple query images (I_m), leading to potentially
N × M weights to be assigned. Because of the large number
of potential weights, existing approaches impose a hierarchy
for data fusion, such as uniformly combining query image
results from a single retrieval expert into a single list and
then weighting the results of each expert. In this paper we
will demonstrate that this approach is sub-optimal and leads
to the poor state of CBMIR performance in benchmarking
evaluations. We utilize an optimization method known as
Coordinate Ascent to discover the optimal set of weights
(|E_n| · |I_m|), which demonstrates a dramatic difference between
known results and the theoretical maximum. We find
that imposing common combinatorial hierarchies for data fusion
will halve the optimal performance that can be achieved.
By examining the optimal weight sets at the topic level, we
observe that approximately 15% of the weights (from the set
|E_n| · |I_m|) for any given query are assigned 70%-82% of the
total weight mass for that topic. Furthermore, we discover
that the ideal distribution of weights follows a log-normal
distribution. We find that we can achieve up to 88% of the
performance of a fully optimized query using just these 15% of
the weights. Our investigation was conducted on TRECVID
evaluations 2003 to 2007 inclusive and ImageCLEFPhoto
2007, totalling 181 search topics optimized over a combined
collection size of 661,213 images and 1,594 topic images.
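
As a rough illustration of the idea described in the abstract, the following minimal sketch assigns one weight to every (expert, query image) pair and tunes each weight in turn with a simple coordinate ascent loop. The abstract does not specify the fusion function, the evaluation measure, or the search grid, so the weighted sum of normalized scores, the use of average precision, the candidate grid, and all names and toy data below are assumptions made for illustration only, not the authors' implementation.

```python
def fuse(score_lists, weights):
    """Weighted linear fusion (assumed): sum of weight * score per document."""
    fused = {}
    for key, scores in score_lists.items():
        for doc, score in scores.items():
            fused[doc] = fused.get(doc, 0.0) + weights[key] * score
    # Highest fused score first; ties broken by document id for determinism.
    return [d for d, _ in sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))]


def average_precision(ranking, relevant):
    """Average precision of a ranked list against a set of relevant documents."""
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant) if relevant else 0.0


def coordinate_ascent(score_lists, relevant,
                      grid=(0.0, 0.25, 0.5, 0.75, 1.0), sweeps=5):
    """Tune one weight at a time, holding all the others fixed."""
    weights = {key: 1.0 for key in score_lists}  # start from uniform weights
    best = average_precision(fuse(score_lists, weights), relevant)
    for _ in range(sweeps):
        improved = False
        for key in weights:
            keep = weights[key]  # best value found so far for this coordinate
            for candidate in grid:
                weights[key] = candidate
                ap = average_precision(fuse(score_lists, weights), relevant)
                if ap > best:
                    best, keep, improved = ap, candidate, True
            weights[key] = keep
        if not improved:  # a full sweep changed nothing: local optimum
            break
    return weights, best


# Hypothetical toy topic: 2 experts x 2 query images = 4 weights.
score_lists = {
    ("colour", "img1"): {"d1": 0.9, "d2": 0.1, "d3": 0.2},
    ("colour", "img2"): {"d4": 0.8, "d5": 0.7},
    ("edge", "img1"): {"d2": 0.9, "d4": 0.3},
    ("edge", "img2"): {"d5": 1.0, "d3": 0.6},
}
relevant = {"d1", "d2"}

uniform = {key: 1.0 for key in score_lists}
baseline = average_precision(fuse(score_lists, uniform), relevant)
weights, best = coordinate_ascent(score_lists, relevant)
print("uniform AP:", baseline)
print("optimized AP:", best)
print("weights:", weights)
```

Because the weights are tuned against known relevance judgments, a procedure of this kind yields an upper bound on what a weighting scheme could achieve for a topic rather than a deployable retrieval method, which matches the abstract's framing of a theoretical maximum.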
Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval.
Association for Computing Machinery. ISBN 978-1-4503-0153-4