Temporal- and Appearance-Guided Object Detection in Construction Machines Considering Out-of-Distribution Data
Source: Journal of Computing in Civil Engineering, 2025, Volume 039, Issue 002, page 04024057-1
DOI: 10.1061/JCCEE5.CPENG-5590
Publisher: American Society of Civil Engineers
Abstract: Automated construction machines require a robust understanding of their surroundings and must be able to localize and classify surrounding objects reliably. State-of-the-art object detection algorithms are usually deep learning–based approaches that take red–green–blue (RGB) images as their only input. However, recent findings highlight that these deep learning–based approaches may not perform robustly because of bias introduced by the training set, a problem that is also common in existing construction machine data sets. As a result, object detection performance on out-of-distribution (OOD) data is significantly worse than on the training set, which may cause severe accidents and unexpected economic losses on smart construction sites. To address this issue, this study proposes a novel object detection algorithm, called “Temporal- and Appearance-Guided Object Detection” (TAG), which optimally extracts information from both temporal cues (optical flow) and appearance cues (RGB) to improve object detection accuracy and robustness despite OOD data. To evaluate performance on various OOD data sets, a custom construction machine data set generation system is created that enables nonoverlapping training and testing distributions. Tests with the simulated data set and the real-world data set are performed, considering the diversity of the working conditions and the challenge of the OOD data. Compared with typical existing alternatives, the results provide strong empirical evidence that the proposed construction machine object detection algorithm significantly increases robustness and generalization capability in dynamic cases without compromising performance in static cases. The TAG algorithm presented in this research has immediate practical implications for improving the safety and efficiency of autonomous construction machines.
The choice of a 10-Hz sampling frequency in our custom data set aligns with the capabilities of common cameras in the field. This frequency strikes a balance that prevents overly small motion amplitudes between frames, which could hinder motion-based detection. The integration of temporal and appearance information by TAG proves instrumental in overcoming the challenges associated with OOD data. The annotation of an additional color label for each construction machine object, a task that requires minimal supplemental effort due to color constancy, facilitates the proposed biased RGB analysis in real-world video data sets. In real-world applications, TAG can significantly enhance construction site safety by improving object detection accuracy, especially in dynamic scenarios. Its adaptability to diverse environments, demonstrated with nonoverlapping training and testing distributions, makes it an important tool for autonomous construction machines. With improved reliability and performance, TAG supports the seamless operation of construction machines in real-world scenarios, contributing to safer and more efficient smart construction sites.
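The remark about sampling frequency and small motion amplitudes can be illustrated with simple arithmetic: for a fixed apparent speed in the image, the displacement between two consecutive frames shrinks as the frame rate rises. The sketch below is only an illustration under stated assumptions. The function name `displacement_per_frame` and the speed of 20 pixels per second are hypothetical and do not come from the article, and the article's own reasoning for choosing 10 Hz may involve further factors.

```python
def displacement_per_frame(speed_px_per_s: float, rate_hz: float) -> float:
    """Return how many pixels an image point moves between two consecutive frames.

    Assumes constant apparent speed in the image plane (a simplification).
    """
    if rate_hz <= 0:
        raise ValueError("rate_hz must be positive")
    return speed_px_per_s / rate_hz


# Hypothetical apparent speed of a slowly moving machine in the image.
# This value is an assumption for illustration, not a figure from the article.
SPEED_PX_PER_S = 20.0

for rate_hz in (10.0, 30.0, 60.0):
    shift = displacement_per_frame(SPEED_PX_PER_S, rate_hz)
    print(f"{rate_hz:.0f} Hz: {shift:.2f} px per frame")
```

Under these assumed numbers, the per-frame motion at 10 Hz is three times larger than at 30 Hz, which is the kind of effect the passage above refers to when it mentions avoiding overly small amplitudes for motion-based detection.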
contributor author | Kaiwen Wang | |
contributor author | Bobo Helian | |
contributor author | Volker Fischer | |
contributor author | Marcus Geimer | |
date accessioned | 2025-04-20T10:07:23Z | |
date available | 2025-04-20T10:07:23Z | |
date copyright | 11/23/2024 12:00:00 AM | |
date issued | 2025 | |
identifier other | JCCEE5.CPENG-5590.pdf | |
identifier uri | http://yetl.yabesh.ir/yetl1/handle/yetl/4304032 | |
publisher | American Society of Civil Engineers | |
title | Temporal- and Appearance-Guided Object Detection in Construction Machines Considering Out-of-Distribution Data | |
type | Journal Article | |
journal volume | 39 | |
journal issue | 2 | |
journal title | Journal of Computing in Civil Engineering | |
identifier doi | 10.1061/JCCEE5.CPENG-5590 | |
journal firstpage | 04024057-1 | |
journal lastpage | 04024057-10 | |
page | 10 | |
tree | Journal of Computing in Civil Engineering; 2025; Volume 039; Issue 002 | |
contenttype | Fulltext |