Fusion of Convolution Neural Network and Visual Transformer for Lithology Identification Using Tunnel Face Images

Jianjun Tong; Lulu Xiang; Allen A. Zhang; Xingwang Miao; Mingnian Wang; Pei Ye

Source: Journal of Computing in Civil Engineering, 2025, Volume 39, Issue 2, page 04024056-1

DOI: 10.1061/JCCEE5.CPENG-5997

Publisher: American Society of Civil Engineers

Abstract: This study proposes an intelligent method for recognizing the lithology of a tunnel working face by combining a convolutional neural network and a visual transformer. First, an efficient method for collecting high-resolution images of the tunnel working face after construction blasting is developed. Based on relevant geological data, the lithology labels of the tunnel face images are manually prepared. A data augmentation technique is then applied to expand the number of original image samples. Given the established sets of tunnel face images and corresponding lithology labels, the performances of ResNet18 and VIT-4 (which contains four transformer encoding layers) developed in this paper in identifying lithology are compared and analyzed. Subsequently, the efficiencies of using ResNet18 and VIT-4 in both parallel and serial manners are evaluated. The experimental results show that the accuracies of ResNet18 and VIT-4 are 95.7% and 95.4%, respectively. However, stacking ResNet18 and VIT-4 in a parallel manner achieves significantly improved performance in lithology recognition, with an accuracy of 98.3%. In contrast, the performance achieved by combining ResNet18 and VIT-4 in a serial manner depends on their structures. Achieving optimal classification performance hinges on minimizing the number of convolution blocks in ResNet18 and concatenating appropriate transformer blocks. The highest accuracy achieved by deploying ResNet18 and VIT-4 in a serial manner with the optimal network structure is 98.5%.
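
The difference between the parallel and serial arrangements described in the abstract can be illustrated with simple shape bookkeeping. The sketch below is not the authors' implementation: the 224-pixel input size, the 256-dimensional transformer embedding, and the function names are assumptions chosen for illustration. Only the channel and stride layout of the standard ResNet18 is taken as given.

```python
# Shape bookkeeping for two ways of fusing a CNN with a transformer.
# All sizes are illustrative assumptions, not values from the paper.

# Standard ResNet18 layout: the stem (stride-2 conv + stride-2 max pool)
# downsamples by 4; the four residual stages use these (channels, stride).
RESNET18_STAGES = [(64, 1), (128, 2), (256, 2), (512, 2)]


def resnet18_feature_shape(image_size, num_stages):
    """Return (channels, height, width) after the stem and `num_stages` stages."""
    if not 1 <= num_stages <= len(RESNET18_STAGES):
        raise ValueError("num_stages must be between 1 and 4")
    side = image_size // 4  # stem downsampling
    channels = 0
    for channels, stride in RESNET18_STAGES[:num_stages]:
        side //= stride
    return channels, side, side


def serial_tokens(image_size, num_stages):
    """Serial fusion: each spatial cell of the CNN feature map becomes one token."""
    channels, height, width = resnet18_feature_shape(image_size, num_stages)
    return height * width, channels  # (token count, token dimension)


def parallel_feature_dim(vit_embed_dim):
    """Parallel fusion: concatenate the pooled CNN vector with the ViT embedding."""
    cnn_dim = RESNET18_STAGES[-1][0]  # 512 after global average pooling
    return cnn_dim + vit_embed_dim


if __name__ == "__main__":
    for stages in range(1, 5):
        print(stages, serial_tokens(224, stages))
    print(parallel_feature_dim(vit_embed_dim=256))
```

Under these assumptions, keeping one ResNet18 stage hands the transformer 3136 tokens of dimension 64, while keeping all four hands it 49 tokens of dimension 512. The input to the transformer blocks therefore changes substantially with the number of convolution blocks retained, which is at least consistent with the abstract's remark that serial performance depends on the network structure.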


URI

http://yetl.yabesh.ir/yetl1/handle/yetl/4304685

Collections

Journal of Computing in Civil Engineering

contributor author	Jianjun Tong
contributor author	Lulu Xiang
contributor author	Allen A. Zhang
contributor author	Xingwang Miao
contributor author	Mingnian Wang
contributor author	Pei Ye
date accessioned	2025-04-20T10:25:11Z
date available	2025-04-20T10:25:11Z
date copyright	11/22/2024 12:00:00 AM
date issued	2025
identifier other	JCCEE5.CPENG-5997.pdf
identifier uri	http://yetl.yabesh.ir/yetl1/handle/yetl/4304685
publisher	American Society of Civil Engineers
title	Fusion of Convolution Neural Network and Visual Transformer for Lithology Identification Using Tunnel Face Images
type	Journal Article
journal volume	39
journal issue	2
journal title	Journal of Computing in Civil Engineering
identifier doi	10.1061/JCCEE5.CPENG-5997
journal firstpage	04024056-1
journal lastpage	04024056-17
page	17
tree	Journal of Computing in Civil Engineering; 2025; Volume 39; Issue 2
contenttype	Fulltext
