Deep Visible and Thermal Camera-Based Optimal Semantic Segmentation Using Semantic Forecasting
Source: Journal of Autonomous Vehicles and Systems, 2021, Vol. 1, Issue 2, page 021006-1
DOI: 10.1115/1.4052529
Publisher: The American Society of Mechanical Engineers (ASME)
Abstract: Visible camera-based semantic segmentation and semantic forecasting are important perception tasks in autonomous driving. In semantic segmentation, the current frame’s pixel-level labels are estimated from the current visible frame. In semantic forecasting, the future frame’s pixel-level labels are predicted from the current and past visible frames and their pixel-level labels. While reporting state-of-the-art accuracy, both tasks are limited by the visible camera’s susceptibility to varying illumination, adverse weather, and sunlight and headlight glare. In this work, we propose to address these limitations through deep sensor fusion of the visible and thermal cameras. The proposed sensor fusion framework performs both semantic forecasting and optimal semantic segmentation within a multistep iterative framework. In the first, or forecasting, step, the framework predicts the semantic map for the next frame. In the second step, the predicted semantic map is updated when the next visible and thermal frames are observed. The updated semantic map is considered the optimal semantic map for the given visible-thermal frame pair. The semantic map forecasting and updating are performed iteratively over time. The estimated semantic maps contain pedestrian behavior, free space, and pedestrian crossing labels. Pedestrian behavior is categorized using spatial, motion, and dynamic orientation information. The proposed framework is validated on the public KAIST dataset. A detailed comparative analysis and ablation study are performed using pixel-level classification and intersection-over-union (IoU) error metrics. The results show that the proposed framework can not only accurately forecast the semantic segmentation map but also accurately update it.
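The abstract reports results in terms of the intersection-over-union (IoU) error metric on pixel-level semantic maps. As background for readers unfamiliar with the metric, the following is a minimal sketch of per-class IoU between a predicted and a ground-truth label map; the function name `per_class_iou` and the toy label maps are illustrative, not taken from the paper.

```python
import numpy as np

def per_class_iou(pred, gt, num_classes):
    """Compute intersection-over-union for each class label.

    pred, gt: integer label maps of identical shape (e.g. H x W).
    Returns one IoU value per class; NaN marks a class absent
    from both maps (undefined union).
    """
    ious = []
    for c in range(num_classes):
        pred_c = pred == c          # pixels predicted as class c
        gt_c = gt == c              # pixels labeled as class c
        inter = np.logical_and(pred_c, gt_c).sum()
        union = np.logical_or(pred_c, gt_c).sum()
        ious.append(inter / union if union > 0 else float("nan"))
    return ious

# Toy 2x2 example with two classes.
pred = np.array([[0, 1], [1, 1]])
gt = np.array([[0, 1], [0, 1]])
print(per_class_iou(pred, gt, 2))  # class 0: 1/2, class 1: 2/3
```

In segmentation benchmarks these per-class values are typically averaged into a mean IoU (mIoU), which is how frameworks like the one described here are usually compared.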
contributor author | John, Vijay
contributor author | Mita, Seiichi
contributor author | Lakshmanan, Annamalai
contributor author | Boyali, Ali
contributor author | Thompson, Simon
date accessioned | 2022-02-06T05:37:05Z
date available | 2022-02-06T05:37:05Z
date copyright | 2021-10-13
date issued | 2021
identifier issn | 2690-702X
identifier other | javs_1_2_021006.pdf
identifier uri | http://yetl.yabesh.ir/yetl1/handle/yetl/4278404
publisher | The American Society of Mechanical Engineers (ASME)
title | Deep Visible and Thermal Camera-Based Optimal Semantic Segmentation Using Semantic Forecasting
type | Journal Paper
journal volume | 1
journal issue | 2
journal title | Journal of Autonomous Vehicles and Systems
identifier doi | 10.1115/1.4052529
journal firstpage | 021006-1
journal lastpage | 021006-9
page | 9
tree | Journal of Autonomous Vehicles and Systems; 2021; Volume 1; Issue 2
contenttype | Fulltext