Chinese Named Entity Recognition for Bridge Damage and Defects Based on Text Mining and Natural Language Pretraining Models

Jiaqi Liu; Weijie Li; Fangchang Li; Xuefeng Zhao

Source: Journal of Construction Engineering and Management, 2025, Volume 151, Issue 6, page 04025060-1

DOI: 10.1061/JCEMD4.COENG-16019

Publisher: American Society of Civil Engineers

Abstract: Bridge inspection reports are a vital source of data for bridge management and maintenance, containing structural information essential for damage evaluation and decision-making. However, general named entity recognition (NER) methods have limited effectiveness when automatically extracting damage entities from this unstructured text, because the same type of bridge damage entity often corresponds to multiple structural components, and the entities are strongly correlated and prominently nested. To address these issues, this study introduces a method for NER of damage and defects in bridge inspection that leverages text mining and pretrained natural language models. First, the study constructs a specialized corpus of bridge damage and defects from a large number of bridge inspection reports and performs fine-grained entity annotation on sentences describing damage and defects. Next, the study proposes a bridge damage entity recognition model that integrates a pretrained natural language model with deep learning models. The model uses the Bidirectional Encoder Representations from Transformers (BERT) pretrained model to extract vector features from the Chinese characters in damage-related sentences. It then uses a bidirectional long short-term memory (BiLSTM) network to capture sequential patterns of multitype entity labels. Finally, it applies a conditional random field (CRF) to enforce label constraints and generate the optimal label sequence. The model is validated through experiments on the constructed Chinese bridge inspection damage and defect named entity corpus. Experimental results show that the proposed model surpasses other mainstream NER models, achieving an F1 score of 98.31% and successfully identifying seven categories of fine-grained bridge damage entities. This study not only enhances the automation of information extraction from damage-related bridge inspection sentences but also lays a foundation for building knowledge graphs in the bridge domain, advancing the development of intelligent bridge management.
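
The final CRF step described in the abstract, enforcing label constraints while choosing the best label sequence, can be illustrated with a minimal Viterbi sketch. This is not the authors' implementation. The label set (COMP for structural component, DMG for damage), the example phrase, and the emission scores are hypothetical. The transition scores are simplified to 0 for allowed transitions and negative infinity for forbidden ones, whereas a trained CRF would learn them; in the described model the per-character scores would come from the BERT and BiLSTM layers.

```python
# Hypothetical BIO label set; the paper's seven entity categories are not
# listed in the abstract, so two illustrative types are used here.
LABELS = ["O", "B-COMP", "I-COMP", "B-DMG", "I-DMG"]
NEG_INF = float("-inf")


def allowed(prev, curr):
    """BIO constraint: I-X may only follow B-X or I-X of the same type."""
    if curr.startswith("I-"):
        entity = curr[2:]
        return prev in ("B-" + entity, "I-" + entity)
    return True


def greedy(emissions, labels=LABELS):
    """Pick the best label per character independently (no constraints)."""
    return [labels[max(range(len(labels)), key=lambda k: row[k])]
            for row in emissions]


def viterbi(emissions, labels=LABELS):
    """Return the highest-scoring label sequence that respects BIO rules.

    emissions[t][j] is the score of label j at character t. Transitions
    score 0 when allowed and -inf when forbidden (a simplifying assumption).
    """
    if not emissions:
        return []
    # A sequence cannot start with an I- tag.
    score = [NEG_INF if lab.startswith("I-") else emissions[0][j]
             for j, lab in enumerate(labels)]
    backpointers = []
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for j, curr in enumerate(labels):
            best_i, best = 0, NEG_INF
            for i, prev in enumerate(labels):
                if allowed(prev, curr) and score[i] > best:
                    best_i, best = i, score[i]
            new_score.append(best + emissions[t][j])
            pointers.append(best_i)
        score = new_score
        backpointers.append(pointers)
    # Trace back from the best final label.
    j = max(range(len(labels)), key=lambda k: score[k])
    path = [j]
    for pointers in reversed(backpointers):
        j = pointers[j]
        path.append(j)
    path.reverse()
    return [labels[k] for k in path]


# Made-up scores for the four characters of "桥面裂缝" (bridge deck crack).
# Column order follows LABELS.
EMISSIONS = [
    [1.0, 0.9, 0.0, 0.0, 0.0],  # 桥
    [0.1, 0.0, 2.0, 0.0, 0.0],  # 面
    [0.0, 0.0, 0.0, 1.5, 0.2],  # 裂
    [0.0, 0.0, 0.0, 0.1, 1.8],  # 缝
]

print(greedy(EMISSIONS))   # starts with O followed by I-COMP, which is invalid
print(viterbi(EMISSIONS))  # a valid BIO sequence
```

With these scores, per-character argmax produces an I-COMP tag directly after O, which violates the BIO scheme, while the constrained decoder returns B-COMP, I-COMP, B-DMG, I-DMG. This kind of sequence-level correction is what the CRF layer contributes on top of the per-character predictions.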

URI

http://yetl.yabesh.ir/yetl1/handle/yetl/4307293

Collections

Journal of Construction Engineering and Management

contributor author	Jiaqi Liu
contributor author	Weijie Li
contributor author	Fangchang Li
contributor author	Xuefeng Zhao
date accessioned	2025-08-17T22:41:04Z
date available	2025-08-17T22:41:04Z
date copyright	6/1/2025 12:00:00 AM
date issued	2025
identifier other	JCEMD4.COENG-16019.pdf
identifier uri	http://yetl.yabesh.ir/yetl1/handle/yetl/4307293
publisher	American Society of Civil Engineers
title	Chinese Named Entity Recognition for Bridge Damage and Defects Based on Text Mining and Natural Language Pretraining Models
type	Journal Article
journal volume	151
journal issue	6
journal title	Journal of Construction Engineering and Management
identifier doi	10.1061/JCEMD4.COENG-16019
journal firstpage	04025060-1
journal lastpage	04025060-12
page	12
tree	Journal of Construction Engineering and Management, 2025, Volume 151, Issue 6
contenttype	Fulltext

YaBeSH Engineering and Technology Library