DORAS | DCU Research Repository

A Parallel Transformer Framework for Video Moment Retrieval

Nguyen, Thao-Nhu (ORCID: 0000-0003-1356-9434), Gurrin, Cathal (ORCID: 0000-0003-2903-3968), Li, Zongyao, Satoshi, Yamazaki and Liu, Jianquan (2024) A Parallel Transformer Framework for Video Moment Retrieval. ICMR '24: Proceedings of the 2024 International Conference on Multimedia Retrieval. pp. 460-468.

Abstract
In the realm of video understanding, Video Moment Retrieval (VMR) is an important yet challenging task that aims to locate the boundaries of a moment of interest within a long untrimmed video. Existing VMR methods often focus only on the visual content extracted from the video (or from frame sequences); however, the rich object-level semantic information that describes the content of each frame has not yet been explored. To overcome this limitation, we propose PaTF, an attention-based Parallel Transformer Framework that enriches the feature representations by exploring both low-level visual cues and high-level relational contexts of video-query pairs. Our framework consists of two parallel transformers: one for the visual-textual stream and the other for the semantic-textual stream. The visual-textual stream extracts the links between global visual features and textual information, while the semantic-textual stream emphasises the relations between objects via scene graph representations. Furthermore, our comprehensive experiment conducted on the Charades-STA dataset demonstrates that the proposed framework outperforms the state-of-the-art methods by a large margin: 5% and 7% at Recall@1 with IoU = 0.5 and IoU = 0.7, respectively.
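The reported metric, Recall@1 at a given IoU threshold, can be illustrated with a minimal Python sketch. It assumes the common convention that a query counts as a hit when the temporal IoU between the top-ranked predicted span and the ground-truth span is at least the threshold; the abstract does not spell out the exact convention or the evaluation code. The function names and the toy spans below are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# A moment is a (start_seconds, end_seconds) pair; this representation is an assumption.
Span = Tuple[float, float]


def temporal_iou(pred: Span, gt: Span) -> float:
    """Intersection over union of two 1-D time intervals."""
    intersection = max(0.0, min(pred[1], gt[1]) - max(pred[0], gt[0]))
    union = (pred[1] - pred[0]) + (gt[1] - gt[0]) - intersection
    return intersection / union if union > 0 else 0.0


def recall_at_1(top1_preds: List[Span], gts: List[Span], iou_threshold: float) -> float:
    """Fraction of queries whose top-1 prediction reaches the IoU threshold."""
    if len(top1_preds) != len(gts):
        raise ValueError("need exactly one top-1 prediction per ground-truth moment")
    if not gts:
        return 0.0
    hits = sum(
        temporal_iou(pred, gt) >= iou_threshold
        for pred, gt in zip(top1_preds, gts)
    )
    return hits / len(gts)


# Toy spans for illustration only (not results from the paper).
preds = [(0.0, 10.0), (4.0, 10.0), (0.0, 4.0)]
gts = [(0.0, 10.0), (0.0, 10.0), (6.0, 10.0)]
r1_at_05 = recall_at_1(preds, gts, iou_threshold=0.5)
r1_at_07 = recall_at_1(preds, gts, iou_threshold=0.7)
```

In this toy case the three predictions have IoUs of 1.0, 0.6 and 0.0, so the stricter threshold of 0.7 admits fewer hits than 0.5. This is why results are usually reported at both thresholds.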
Metadata
Item Type: Article (Published)
Refereed: Yes
Uncontrolled Keywords: Video Moment Retrieval; Scene graph; Vision-language Transformer; Video Retrieval; Video Temporal Localisation
Subjects: Computer Science > Information retrieval; Computer Science > Digital video
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing
Official URL: https://dl.acm.org/doi/10.1145/3652583.3658096
ID Code: 30558
Deposited On: 06 Dec 2024 15:40 by Thao-Nhu Nguyen. Last Modified: 06 Dec 2024 15:40
Documents

Full text available as:

PDF - Requires a PDF viewer such as GSview, Xpdf or Adobe Acrobat Reader
Creative Commons: Attribution 4.0
15MB