Deep Visible and Thermal Camera-Based Optimal Semantic Segmentation Using Semantic Forecasting
Source: Journal of Autonomous Vehicles and Systems, 2021, Vol. 1, Issue 2, page 021006-1
DOI: 10.1115/1.4052529
Publisher: The American Society of Mechanical Engineers (ASME)
Abstract: Visible camera-based semantic segmentation and semantic forecasting are important perception tasks in autonomous driving. In semantic segmentation, the current frame’s pixel-level labels are estimated from the current visible frame. In semantic forecasting, the future frame’s pixel-level labels are predicted from the current and past visible frames and their pixel-level labels. While reporting state-of-the-art accuracy, both tasks are limited by the visible camera’s susceptibility to varying illumination, adverse weather, and sunlight and headlight glare. In this work, we propose to address these limitations through deep sensor fusion of the visible and thermal cameras. The proposed sensor fusion framework performs both semantic forecasting and optimal semantic segmentation within a multistep iterative framework. In the first, or forecasting, step, the framework predicts the semantic map for the next frame. In the second step, the predicted semantic map is updated when the next visible and thermal frames are observed. The updated semantic map is considered the optimal semantic map for the given visible-thermal frame pair. The semantic map forecasting and updating are performed iteratively over time. The estimated semantic maps contain pedestrian behavior, free space, and pedestrian crossing labels. Pedestrian behavior is categorized using spatial, motion, and dynamic orientation information. The proposed framework is validated on the public KAIST dataset. A detailed comparative analysis and ablation study are performed using pixel-level classification and intersection-over-union (IoU) error metrics. The results show that the proposed framework can not only accurately forecast the semantic segmentation map but also accurately update it.
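The abstract reports results in terms of the intersection-over-union (IoU) error metric on pixel-level semantic maps. As background for readers unfamiliar with the metric, the following is a minimal sketch of per-class IoU between a predicted and a ground-truth label map; the function name `per_class_iou` and the toy label maps are illustrative, not taken from the paper.

```python
import numpy as np

def per_class_iou(pred, gt, num_classes):
    """Compute intersection-over-union for each class label.

    pred, gt: integer label maps of identical shape (e.g. H x W).
    Returns one IoU value per class; NaN marks a class absent
    from both maps (undefined union).
    """
    ious = []
    for c in range(num_classes):
        pred_c = pred == c          # pixels predicted as class c
        gt_c = gt == c              # pixels labeled as class c
        inter = np.logical_and(pred_c, gt_c).sum()
        union = np.logical_or(pred_c, gt_c).sum()
        ious.append(inter / union if union > 0 else float("nan"))
    return ious

# Toy 2x2 example with two classes.
pred = np.array([[0, 1], [1, 1]])
gt = np.array([[0, 1], [0, 1]])
print(per_class_iou(pred, gt, 2))  # class 0: 1/2, class 1: 2/3
```

In segmentation benchmarks these per-class values are typically averaged into a mean IoU (mIoU), which is how frameworks like the one described here are usually compared.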
contributor author | John, Vijay
contributor author | Mita, Seiichi
contributor author | Lakshmanan, Annamalai
contributor author | Boyali, Ali
contributor author | Thompson, Simon
date accessioned | 2022-02-06T05:37:05Z
date available | 2022-02-06T05:37:05Z
date copyright | 2021-10-13
date issued | 2021
identifier issn | 2690-702X
identifier other | javs_1_2_021006.pdf
identifier uri | http://yetl.yabesh.ir/yetl1/handle/yetl/4278404
publisher | The American Society of Mechanical Engineers (ASME)
title | Deep Visible and Thermal Camera-Based Optimal Semantic Segmentation Using Semantic Forecasting
type | Journal Paper
journal volume | 1
journal issue | 2
journal title | Journal of Autonomous Vehicles and Systems
identifier doi | 10.1115/1.4052529
journal firstpage | 021006-1
journal lastpage | 021006-9
page | 9
tree | Journal of Autonomous Vehicles and Systems; 2021; Volume 1; Issue 2
contenttype | Fulltext