Nguyen, Thao-Nhu ORCID: 0000-0003-1356-9434, Gurrin, Cathal ORCID: 0000-0003-2903-3968, Li, Zongyao, Satoshi, Yamazaki and Liu, Jianquan (2024) A Parallel Transformer Framework for Video Moment Retrieval. ICMR '24: Proceedings of the 2024 International Conference on Multimedia Retrieval. pp. 460-468.
Abstract
In the realm of video understanding, Video Moment Retrieval (VMR) is an important yet challenging task that aims to locate the boundaries of a moment of interest within a long untrimmed video. Existing VMR methods often rely only on the visual content extracted from the video (or its frame sequences); the rich object-level semantic information that describes the image content has not yet been explored. To overcome this limitation, we propose PaTF, an attention-based Parallel Transformer Framework that enriches the feature representations by exploring both low-level visual cues and high-level relational contexts of video-query pairs. Our framework consists of two parallel transformers: one for the visual-textual stream and the other for the semantic-textual stream. The visual-textual stream extracts the links between global visual features and textual information, while the semantic-textual stream emphasises the relations between objects via scene graph representations. A comprehensive experiment conducted on the Charades-STA dataset demonstrates that the proposed framework outperforms state-of-the-art methods by a large margin: 5% and 7% at Recall@1 with IoU = 0.5 and IoU = 0.7, respectively.
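The Recall@1 figures quoted above depend on a temporal IoU threshold, so a small illustration of how such a metric is commonly computed may help. The sketch below is not the authors' evaluation code: the function names `temporal_iou` and `recall_at_1`, the `(start, end)` representation in seconds, and the convention that a prediction counts as a hit when its IoU is greater than or equal to the threshold are all assumptions made for illustration.

```python
def temporal_iou(pred, gt):
    """Intersection over union of two (start, end) intervals, in seconds."""
    inter = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - inter
    return inter / union if union > 0 else 0.0


def recall_at_1(top1_preds, gts, iou_threshold):
    """Fraction of queries whose top-ranked moment reaches the IoU threshold.

    Assumption: a hit is counted when IoU >= iou_threshold.
    """
    if len(top1_preds) != len(gts) or not gts:
        raise ValueError("need one top-1 prediction per ground-truth moment")
    hits = sum(
        temporal_iou(p, g) >= iou_threshold for p, g in zip(top1_preds, gts)
    )
    return hits / len(gts)


# Hypothetical toy data: two queries, one top-1 prediction each.
preds = [(2.0, 8.0), (0.0, 5.0)]
gts = [(3.0, 9.0), (4.0, 10.0)]
r_at_05 = recall_at_1(preds, gts, 0.5)
r_at_07 = recall_at_1(preds, gts, 0.7)
```

In this toy data the first prediction overlaps its ground truth with IoU 5/7 and the second with IoU 1/10, so only the first counts as a hit at either threshold.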
Metadata
Item Type: | Article (Published) |
---|---|
Refereed: | Yes |
Uncontrolled Keywords: | Video Moment Retrieval; Scene graph; Vision-language Transformer; Video Retrieval; Video Temporal Localisation |
Subjects: | Computer Science > Information retrieval; Computer Science > Digital video |
DCU Faculties and Centres: | DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing |
Official URL: | https://dl.acm.org/doi/10.1145/3652583.3658096 |
ID Code: | 30558 |
Deposited On: | 06 Dec 2024 15:40 by Thao-Nhu Nguyen. Last Modified 06 Dec 2024 15:40 |
Documents
Full text available as: PDF (15MB), Creative Commons: Attribution 4.0