Automatically Categorizing Construction Accident Narratives Using the Deep-Learning Model with a Class-Imbalance Treatment Technique
Source: Journal of Construction Engineering and Management, 2024, Volume 150, Issue 9, page 04024107-1
DOI: 10.1061/JCEMD4.COENG-14515
Publisher: American Society of Civil Engineers
Abstract: Learning from prior incidents is crucial for improving safety, particularly in the construction industry, where fatalities and injuries are frequent. High-precision classification of construction accident narratives is a laborious, time-consuming process that requires substantial domain expertise. However, automatic text classification has fallen short of expectations due to a lack of high-quality data sets, inadequate semantic interpretation, and primitive model architecture. To address these issues, this study developed a state-of-the-art text classification (TC) model to extract construction knowledge and classify construction accident narratives into predefined categories. The TC deep-learning model was built on the pretrained instruction-based omnifarious representations (INSTRUCTOR). A class-imbalance treatment (CIT) technique incorporating focal loss and weighted random sampling was embedded to make the model concentrate on hard samples and minority classes. The retrained and fine-tuned INSTRUCTOR-CIT model achieved an F1 score of 82.22% on a benchmark data set containing 1,000 accident narratives from the Occupational Safety and Health Administration (OSHA). On a larger benchmark data set of 4,770 OSHA accident narratives labeled by another official system, the model achieved an F1 score of 94.84%, highlighting its generality. Furthermore, the experimental results demonstrated that the model outperformed existing methods, requiring less preprocessing and achieving higher accuracy. Finally, the contribution to construction project management was discussed with a view to enhancing unstructured data management in the construction industry. The findings of this study contribute to effective management practices and help construction professionals focus on value-added tasks such as decision making and corrective action planning.
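The two ingredients of the class-imbalance treatment named in the abstract, focal loss and weighted random sampling, can be illustrated with a minimal sketch. This is not the authors' implementation: the focusing parameter `gamma=2.0`, the inverse-frequency weighting scheme, the function names, and the toy label set are all assumptions chosen for illustration, and the record does not state the settings used in the paper.

```python
import math
import random
from collections import Counter


def focal_loss(probs, target, gamma=2.0, alpha=None):
    """Focal loss for one sample: -alpha_t * (1 - p_t)**gamma * log(p_t).

    probs  : predicted class probabilities (should sum to 1)
    target : index of the true class
    gamma  : focusing parameter (assumed value; gamma=0 gives cross-entropy)
    alpha  : optional per-class weights (assumed; None means no weighting)
    """
    p_t = probs[target]
    weight = 1.0 if alpha is None else alpha[target]
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


def inverse_frequency_weights(labels):
    """Per-sample sampling weights proportional to 1 / (class count)."""
    counts = Counter(labels)
    return [1.0 / counts[y] for y in labels]


def weighted_sample_indices(labels, num_samples, seed=0):
    """Draw indices with replacement so minority classes appear more often."""
    rng = random.Random(seed)
    weights = inverse_frequency_weights(labels)
    return rng.choices(range(len(labels)), weights=weights, k=num_samples)


# Hypothetical, imbalanced toy labels (not taken from the OSHA data sets).
labels = ["falls"] * 8 + ["electrocution"] * 2

easy = focal_loss([0.9, 0.1], target=0)  # confidently correct prediction
hard = focal_loss([0.3, 0.7], target=0)  # misclassified, "hard" sample
batch = weighted_sample_indices(labels, num_samples=10)
```

In this sketch the hard sample contributes a far larger loss than the easy one, and each class receives the same total sampling weight regardless of its size, which is the general intent of making a model concentrate on hard samples and minority classes.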
contributor author | Qing Shuang | |
contributor author | Xishan Liu | |
contributor author | Zhaojing Wang | |
contributor author | Xinxin Xu | |
date accessioned | 2024-12-24T10:21:49Z | |
date available | 2024-12-24T10:21:49Z | |
date copyright | 9/1/2024 12:00:00 AM | |
date issued | 2024 | |
identifier other | JCEMD4.COENG-14515.pdf | |
identifier uri | http://yetl.yabesh.ir/yetl1/handle/yetl/4298781 | |
publisher | American Society of Civil Engineers | |
title | Automatically Categorizing Construction Accident Narratives Using the Deep-Learning Model with a Class-Imbalance Treatment Technique | |
type | Journal Article | |
journal volume | 150 | |
journal issue | 9 | |
journal title | Journal of Construction Engineering and Management | |
identifier doi | 10.1061/JCEMD4.COENG-14515 | |
journal firstpage | 04024107-1 | |
journal lastpage | 04024107-15 | |
page | 15 | |
tree | Journal of Construction Engineering and Management:;2024:;Volume ( 150 ):;issue: 009 | |
contenttype | Fulltext |