Vision-Based Body Pose Estimation of Excavator Using a Transformer-Based Deep-Learning Model
Source: Journal of Computing in Civil Engineering, 2025, Volume 39, Issue 2, page 04024064-1
DOI: 10.1061/JCCEE5.CPENG-6079
Publisher: American Society of Civil Engineers
Abstract: To support safety, efficiency, and productivity management on construction sites, this research proposes a deep-learning method, the transformer-based mechanical equipment pose network (TransMPNet), for effective and efficient body pose estimation of excavators from images. TransMPNet comprises data processing, an ensemble model coupled with DenseNet201, an improved transformer module, a loss function, and evaluation metrics to perform feature processing and learning for accurate results. To verify the method's effectiveness and efficiency, a publicly available image database of excavator body poses is adopted for experimental testing and validation. The results indicate that TransMPNet delivers excellent performance, with a mean-square error (MSE) of 218.626, a root-MSE (RMSE) of 14.786, an average normalized error (NE) of 26.289×10⁻³, and an average area under the curve (AUC) of 74.487×10⁻³, significantly outperforming other state-of-the-art methods such as the cascaded pyramid network (CPN) and the stacked hourglass network (SHG) on these evaluation metrics. Accordingly, TransMPNet contributes more effective and accurate excavator body pose estimation, with great potential for practical application in on-site construction management.
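The abstract reports MSE, RMSE, and average normalized error (NE) over predicted keypoints. The paper's exact definitions are not reproduced on this record page, so the sketch below uses common pose-estimation formulations as an assumption: MSE/RMSE over per-coordinate errors, and NE as the mean Euclidean keypoint distance divided by a reference length (e.g., the target's bounding-box diagonal). The function name `pose_metrics` and the `norm_ref` parameter are illustrative, not from the paper.

```python
import math


def pose_metrics(pred, gt, norm_ref):
    """Sketch of common pose-evaluation metrics (assumed formulations).

    pred, gt: equal-length lists of (x, y) keypoints for one pose.
    norm_ref: reference length (e.g., bounding-box diagonal) used to
    normalize per-keypoint Euclidean distances for NE.
    Returns (mse, rmse, ne).
    """
    sq_errors = []  # squared per-coordinate errors for MSE/RMSE
    dists = []      # per-keypoint Euclidean distances for NE
    for (px, py), (gx, gy) in zip(pred, gt):
        sq_errors.extend([(px - gx) ** 2, (py - gy) ** 2])
        dists.append(math.hypot(px - gx, py - gy))
    mse = sum(sq_errors) / len(sq_errors)
    rmse = math.sqrt(mse)
    ne = sum(d / norm_ref for d in dists) / len(dists)
    return mse, rmse, ne
```

For a single keypoint predicted at (3, 4) against ground truth (0, 0) with a 100-pixel reference length, this yields MSE = 12.5, RMSE ≈ 3.536, and NE = 0.05; the paper's reported NE values (on the order of 10⁻³) would correspond to much smaller relative errors under such a normalization.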
| contributor author | Ankang Ji | |
| contributor author | Hongqin Fan | |
| contributor author | Xiaolong Xue | |
| date accessioned | 2025-04-20T10:10:04Z | |
| date available | 2025-04-20T10:10:04Z | |
| date copyright | 12/31/2024 12:00:00 AM | |
| date issued | 2025 | |
| identifier other | JCCEE5.CPENG-6079.pdf | |
| identifier uri | http://yetl.yabesh.ir/yetl1/handle/yetl/4304124 | |
| publisher | American Society of Civil Engineers | |
| title | Vision-Based Body Pose Estimation of Excavator Using a Transformer-Based Deep-Learning Model | |
| type | Journal Article | |
| journal volume | 39 | |
| journal issue | 2 | |
| journal title | Journal of Computing in Civil Engineering | |
| identifier doi | 10.1061/JCCEE5.CPENG-6079 | |
| journal firstpage | 04024064-1 | |
| journal lastpage | 04024064-20 | |
| page | 20 | |
| tree | Journal of Computing in Civil Engineering; 2025; Volume 39; Issue 2 | |
| contenttype | Fulltext |