DORAS | DCU Research Repository

Incorporating spatio-temporal information in Frustum-ConvNet for improved 3D object detection in instrumented vehicles

Munirathnam G., Venkatesh (ORCID: 0000-0002-4393-9267), O'Connor, Noel E. (ORCID: 0000-0002-4033-9135) and Little, Suzanne (ORCID: 0000-0003-3281-3471) (2022) Incorporating spatio-temporal information in Frustum-ConvNet for improved 3D object detection in instrumented vehicles. In: 10th European Workshop on Visual Information Processing (EUVIP), 11-14 Sept 2022, Lisbon, Portugal. ISBN 978-1-6654-6623-3

Environmental perception is a key task for autonomous vehicles to ensure intelligent planning and safe decision-making. Most current state-of-the-art perceptual methods in vehicles, and in particular those for 3D object detection, are based on a single-frame reference. However, these methods do not effectively utilise temporal information associated with the objects or the scene from the input data sequences. The work presented in this paper corroborates the use of spatial and temporal information through multi-frame lidar point cloud data to leverage spatio-temporal contextual information and improve the accuracy of 3D object detection. The study also gathers more insights into the effect of inducing temporal information into a network and the overall performance of the deep learning model. We consider the Frustum-ConvNet architecture as the baseline model and propose methods to incorporate spatio-temporal information using convolutional LSTMs for 3D object detection using lidar data. We also propose to employ an attention mechanism with temporal encoding to stimulate the model to focus on salient feature points within the region proposals. The results from this study show that the inclusion of temporal information considerably improves the true positive metric, specifically the orientation error of the 3D bounding box, from 0.819 to 0.784 and from 0.294 to 0.111 for the car and pedestrian classes respectively, on the customized subset of the nuScenes training dataset. The overall nuScenes detection score (NDS) is improved from 0.822 to 0.837 compared to the baseline.
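
To put the reported figures in proportion, the short sketch below converts each baseline/improved pair quoted in the abstract into a relative change. The numbers are taken directly from the abstract. The dictionary labels and the helper name `relative_change_percent` are illustrative choices, not anything defined by the paper or by the nuScenes toolkit.

```python
# Values quoted in the abstract: (baseline, with temporal information).
# The labels below are illustrative names, not identifiers from the paper.
reported = {
    "car orientation error": (0.819, 0.784),
    "pedestrian orientation error": (0.294, 0.111),
    "NDS": (0.822, 0.837),
}


def relative_change_percent(baseline: float, updated: float) -> float:
    """Signed change relative to the baseline, in percent."""
    return (updated - baseline) / baseline * 100.0


# Negative values mean the quantity decreased (good for an error metric);
# positive values mean it increased (good for a score such as NDS).
changes = {
    name: round(relative_change_percent(baseline, updated), 1)
    for name, (baseline, updated) in reported.items()
}

for name, pct in changes.items():
    print(f"{name}: {pct:+.1f}%")
```

Read this way, the orientation error drops by roughly 4% for cars and roughly 62% for pedestrians, while NDS rises by just under 2% relative to the baseline.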
Item Type: Conference or Workshop Item (Paper)
Event Type: Workshop
Uncontrolled Keywords: Computer Vision; Autonomous Driving; Perception System; Instrumented Vehicles; lidar; radar; KITTI; nuScenes; Object Detection; Spatio-Temporal Information; Convolutional LSTMs
Subjects: Computer Science > Artificial intelligence; Computer Science > Image processing; Computer Science > Machine learning
DCU Faculties and Centres: DCU Faculties and Schools > Faculty of Engineering and Computing > School of Computing; Research Institutes and Centres > INSIGHT Centre for Data Analytics
Published in: 2022 10th European Workshop on Visual Information Processing (EUVIP). IEEE. ISBN 978-1-6654-6623-3
Official URL: https://dx.doi.org/10.1109/EUVIP53989.2022.9922815
Copyright Information: © 2022 IEEE
ID Code: 27612
Deposited On: 07 Sep 2022 11:09 by Venkatesh Gurram Munirathnam. Last Modified 21 Jun 2023 12:52

Full text available as:

EUVIP_2022_paperID_61.pdf (PDF)
Creative Commons: Attribution-Noncommercial 4.0

