Unified Deep Architectures for Real-Time Object Detection and Semantic Reasoning in Autonomous Vehicles

Vishal Aher; Satish Jondhale; Balasaheb Agarkar; Sachin Chaudhari

doi:10.35882/jeeemi.v7i4.813

Vishal Aher, Department of Electronics and Telecommunication, Sanjivani College of Engineering, Kopargaon, Maharashtra, India. https://orcid.org/0009-0000-7042-3920
Satish Jondhale, Department of Electronics and Telecommunication, Sanjivani College of Engineering, Kopargaon, Maharashtra, India. https://orcid.org/0000-0003-2908-5610
Balasaheb Agarkar, Department of Electronics and Telecommunication, Sanjivani College of Engineering, Kopargaon, Maharashtra, India. https://orcid.org/0000-0002-2775-8095
Sachin Chaudhari, Department of Electronics and Telecommunication, Sanjivani College of Engineering, Kopargaon, Maharashtra, India. https://orcid.org/0009-0005-8856-8905

Keywords: YOLOv8, PointPillars, Autonomous vehicles, Computer vision, Semantic segmentation, DeepSORT, mAP

Abstract

The development of autonomous vehicles (AVs) has revolutionized the transportation industry, promising to improve mobility, reduce traffic, and increase road safety. However, the complexity of the driving environment and the need to process large volumes of sensor data in real time pose serious challenges for AV systems. To address these challenges, researchers have investigated computer vision approaches such as object detection, lane detection, and traffic sign recognition. This research presents an integrated approach to autonomous vehicle perception that combines real-time object detection, semantic segmentation, and classification within a unified deep learning architecture. The approach draws on the strengths of existing frameworks: MultiNet’s real-time semantic reasoning, the fast encoders of PointPillars for detecting objects in point clouds, and a reliable one-stage monocular 3D object detection system. The proposed model aims to improve computational efficiency and accuracy by using a shared encoder and task-specific decoders that perform classification, detection, and segmentation concurrently. The architecture is evaluated on challenging datasets and achieves speed and accuracy suitable for real-time applications in autonomous driving. This integration promises significant advances in the perception systems of autonomous vehicles by providing a detailed understanding of the vehicle’s environment through efficient deep learning techniques. The model uses YOLOv8 and MultiNet; during training it achieved an accuracy of 93.5%, a precision of 92.7%, a recall of 82.1%, and an mAP of 72.9%.
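
The shared-encoder idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class `SharedEncoderModel`, the helper functions, and the thresholds are hypothetical, and simple list operations stand in for the neural network encoder and decoders. The point it shows is structural: the encoder runs once per frame, and the classification, detection, and segmentation heads all reuse the same features.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Features = List[float]


@dataclass
class SharedEncoderModel:
    """Toy multi-task model: one shared encoder feeding several task heads."""
    encoder: Callable[[List[float]], Features]
    heads: Dict[str, Callable[[Features], object]]
    encoder_calls: int = 0  # counts how often the (expensive) encoder runs

    def forward(self, frame: List[float]) -> Dict[str, object]:
        # Encode once, then let every task-specific head reuse the features.
        self.encoder_calls += 1
        feats = self.encoder(frame)
        return {name: head(feats) for name, head in self.heads.items()}


def toy_encoder(frame: List[float]) -> Features:
    # Stand-in for a CNN backbone (assumption): clamp intensities to [0, 1].
    return [min(max(x, 0.0), 1.0) for x in frame]


def classify(feats: Features) -> str:
    # Stand-in classification head: one label for the whole frame.
    return "occupied" if max(feats) > 0.8 else "empty"


def detect(feats: Features) -> List[int]:
    # Stand-in detection head: positions with a strong response.
    return [i for i, v in enumerate(feats) if v > 0.8]


def segment(feats: Features) -> List[int]:
    # Stand-in segmentation head: a binary mask over all positions.
    return [1 if v > 0.5 else 0 for v in feats]


model = SharedEncoderModel(
    encoder=toy_encoder,
    heads={"classification": classify, "detection": detect, "segmentation": segment},
)
frame = [0.2, 0.9, 0.7, 0.1]  # hypothetical 1-D "image"
outputs = model.forward(frame)
print(outputs, "encoder calls:", model.encoder_calls)
```

Running the three tasks as separate models would encode each frame three times; sharing the encoder keeps that cost to a single pass, which is the efficiency argument the abstract makes.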
