Hoang-Bao, Le; Allie, Tran; Binh, Nguyen Thanh; Liting, Zhou (ORCID: 0000-0002-7778-8743) and Cathal, Gurrin (ORCID: 0000-0003-4395-7702)
(2025)
Vision Projector: Improving Zero-Shot Composed Image Retrieval at Inference.
In: CBMI Conference, 22-24 Oct. 2025, Dublin, Ireland.
Abstract
Composed Image Retrieval (CIR) involves retrieving a target image based on a query composed of a reference image and a textual modification. Zero-Shot CIR extends this task by removing the need for labeled triplets during training. Most state-of-the-art (SOTA) methods share a common structure: a vision-language encoder followed by a matching module using Transformers or contrastive learning. Instead of increasing data or model complexity, we ask: can retrieval performance be improved at inference time? To answer this, we propose the Vision Projector (VP), a lightweight, plug-and-play module that enhances visual representations without retraining. Integrated directly into MagicLens, VP consistently improves performance across CIRR, FashionIQ, and CIRCO. Notably, it boosts MagicLens by 18% on CIRCO, despite not using its strongest variant. Code is available at: https://github.com/baohl00/VisionProjector_ZSCIR.
Metadata
| Item Type: | Conference or Workshop Item (Paper) |
|---|---|
| Event Type: | Conference |
| Refereed: | Yes |
| Uncontrolled Keywords: | Composed image retrieval, zero-shot, vision projector |
| Subjects: | Computer Science > Information retrieval; Computer Science > Multimedia systems; Computer Science > Visualization |
| DCU Faculties and Centres: | Research Institutes and Centres > ADAPT |
| Published in: | Proceedings of the 2025 IEEE International Conference on Content-Based Multimedia Indexing (IEEE CBMI). IEEE. |
| Publisher: | IEEE |
| Official URL: | https://www.cbmi2025.org/ |
| Copyright Information: | Authors |
| ID Code: | 32450 |
| Deposited On: | 31 Mar 2026 10:22 by Hoang Bao Le. Last Modified 31 Mar 2026 10:22 |
Documents
Full text available as:
PDF (1MB)
Creative Commons: Attribution-Noncommercial-No Derivative Works 4.0