contributor author | Wilfredo Torres Calderon | |
contributor author | Dominic Roberts | |
contributor author | Mani Golparvar-Fard | |
date accessioned | 2022-01-30T22:50:08Z | |
date available | 2022-01-30T22:50:08Z | |
date issued | 2021-01-01 | |
identifier other | (ASCE)CP.1943-5487.0000937.pdf | |
identifier uri | http://yetl.yabesh.ir/yetl1/handle/yetl/4269712 | |
description abstract | In recent years, computer vision algorithms have been shown to effectively leverage visual data from jobsites for video-based activity analysis of construction equipment. However, earthmoving operations are restricted to site work and the surrounding terrain, and the presence of other structures, particularly in urban areas, limits the number of viewpoints from which operations can be recorded. These constraints lower the degree of intra-activity and inter-activity category variability to which such algorithms are exposed, hindering their ability to generalize to new jobsites. Additionally, training computer vision algorithms typically relies on large quantities of hand-annotated ground truth. These annotations are burdensome to obtain and can offset the cost savings gained from automating activity analysis. The main contribution of this paper is a means of inexpensively generating synthetic data to improve the capabilities of vision-based activity analysis methods, based on virtual, kinematically articulated three-dimensional (3D) models of construction equipment. The authors introduce an automated synthetic data generation method that outputs two-dimensional (2D) pose sequences corresponding to simulated excavator operations, varying the camera position with respect to the excavator as well as activity length and behavior. The presented method is validated by training a deep learning–based method on the synthesized 2D pose sequences and testing on pose sequences corresponding to real-world excavator operations, achieving 75% precision and 71% recall. This exceeds the 66% precision and 65% recall obtained when training and testing the deep learning–based method on the real-world data via cross-validation. Limited access to reliable amounts of real-world data incentivizes the use of synthetically generated data for training vision-based activity analysis algorithms. | |
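
The abstract describes projecting kinematically articulated 3D equipment models to 2D pose sequences under varied camera viewpoints. The following is a minimal Python sketch of that idea only: it assumes a planar three-link excavator arm, made-up link lengths, a pinhole camera, and sinusoidal joint trajectories as a stand-in for simulated operations. It is not the authors' pipeline, and every parameter value is an illustrative assumption.

import numpy as np

def excavator_joints_3d(boom, stick, bucket):
    # Forward kinematics for a planar three-link arm (boom pivot ->
    # boom tip -> stick tip -> bucket tip). Link lengths in meters
    # are assumed values, not taken from the paper.
    lengths = [5.7, 2.9, 1.5]
    angles = np.cumsum([boom, stick, bucket])
    joints = [np.array([0.0, 0.0, 2.0])]   # boom pivot above ground plane
    for length, a in zip(lengths, angles):
        joints.append(joints[-1] + np.array([np.cos(a), 0.0, np.sin(a)]) * length)
    return np.stack(joints)                 # (4, 3) world coordinates

def look_at(cam_pos, target=np.zeros(3)):
    # Rotation matrix whose rows are the camera's right/up/forward axes.
    fwd = (target - cam_pos) / np.linalg.norm(target - cam_pos)
    right = np.cross(fwd, [0.0, 0.0, 1.0])
    right /= np.linalg.norm(right)
    return np.stack([right, np.cross(right, fwd), fwd])

def project(joints, cam_pos, focal=1000.0, center=(640.0, 360.0)):
    # Pinhole projection of (N, 3) world joints to (N, 2) pixel coords.
    cam = (joints - cam_pos) @ look_at(cam_pos).T   # world -> camera frame
    return focal * cam[:, :2] / cam[:, 2:3] + np.asarray(center)

rng = np.random.default_rng(0)
# Sample one camera position on a ring around the machine per sequence,
# then render a smoothly varying (sinusoidal, stand-in) dig cycle.
theta = rng.uniform(0.0, 2.0 * np.pi)
cam_pos = np.array([15.0 * np.cos(theta), 15.0 * np.sin(theta), 3.0])
frames = []
for t in range(100):
    joints = excavator_joints_3d(
        boom=0.4 + 0.3 * np.sin(0.05 * t),
        stick=-0.8 + 0.2 * np.sin(0.07 * t),
        bucket=-0.5 + 0.3 * np.sin(0.09 * t),
    )
    frames.append(project(joints, cam_pos))
pose_sequence = np.stack(frames)   # (T, 4, 2): one synthetic 2D pose sequence

Repeating this over many sampled viewpoints and cycle parameters yields labeled 2D pose sequences at negligible annotation cost, which is the core economic argument the abstract makes.
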
publisher | ASCE | |
title | Synthesizing Pose Sequences from 3D Assets for Vision-Based Activity Analysis | |
type | Journal Paper | |
journal volume | 35 | |
journal issue | 1 | |
journal title | Journal of Computing in Civil Engineering | |
identifier doi | 10.1061/(ASCE)CP.1943-5487.0000937 | |
journal firstpage | 04020052 | |
journal lastpage | 04020052-17 | |
page | 17 | |
tree | Journal of Computing in Civil Engineering; 2021; Volume 035; Issue 001 | |
contenttype | Fulltext | |