Vector-Quantized Variational Teacher and Multimodal Collaborative Student for Crack Segmentation via Knowledge Distillation
Source: Journal of Computing in Civil Engineering, 2025, Volume 039, Issue 003, page 04025030-1
DOI: 10.1061/JCCEE5.CPENG-6339
Publisher: American Society of Civil Engineers
Abstract: This paper proposes a novel method for real-time crack segmentation in infrastructure inspection that achieves state-of-the-art performance. The approach leverages knowledge distillation: a vector-quantized variational autoencoder (VQ-VAE) acts as the “teacher,” extracting informative representations and learning a codebook, and a multimodal collaborative student (MCS) uses the learned codebook for improved crack segmentation. This framework, incorporating the Teacher’s Codebook Cheating (TCC), achieves high accuracy and efficiency. With minimal parameters (0.59 million), the model demonstrates significant improvements in crack segmentation speed and precision, achieving a Dice score of 93.19, Intersection over Union (IoU) of 0.8723, and mean pixel accuracy of 97.52. Notably, the model processes 89.3 frames per second (FPS), outperforming all other state-of-the-art methods despite using a smaller input size of 128×128×3. Its efficiency stems from its simplicity, which makes it well-suited for resource-constrained environments. These results highlight the effectiveness of the method for real-time crack segmentation, paving the way for more automated and accessible infrastructure inspection.
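For readers unfamiliar with the overlap metrics quoted in the abstract, the following is a minimal sketch of how a Dice score and an IoU are commonly computed for one pair of binary crack masks. It is not taken from the paper: the function names (`dice_score`, `iou_score`), the list-of-lists mask layout, and the convention of returning 1.0 when both masks are empty are all assumptions made for illustration, and the paper's own evaluation protocol (averaging, thresholds, scaling to percentages) may differ.

```python
from typing import Sequence, Tuple

# Assumed layout: a binary mask as rows of 0/1 ints (1 = crack pixel).
Mask = Sequence[Sequence[int]]


def _overlap_counts(pred: Mask, target: Mask) -> Tuple[int, int, int]:
    """Count true positives, false positives and false negatives."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1
            elif p:
                fp += 1
            elif t:
                fn += 1
    return tp, fp, fn


def dice_score(pred: Mask, target: Mask) -> float:
    """Dice = 2*TP / (2*TP + FP + FN); 1.0 if both masks are empty (assumed convention)."""
    tp, fp, fn = _overlap_counts(pred, target)
    denom = 2 * tp + fp + fn
    return 1.0 if denom == 0 else 2 * tp / denom


def iou_score(pred: Mask, target: Mask) -> float:
    """IoU = TP / (TP + FP + FN); 1.0 if both masks are empty (assumed convention)."""
    tp, fp, fn = _overlap_counts(pred, target)
    denom = tp + fp + fn
    return 1.0 if denom == 0 else tp / denom


# Toy 2x3 masks, invented for this example.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]

print(dice_score(pred, target))  # 2 TP, 1 FP, 1 FN -> 4/6
print(iou_score(pred, target))   # 2 TP, 1 FP, 1 FN -> 2/4
```

For a single pair of masks the two metrics are tied by Dice = 2·IoU / (1 + IoU), so they rank predictions identically; values averaged over a whole data set need not satisfy this identity exactly.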
contributor author | Shi Qiu | |
contributor author | Qasim Zaheer | |
contributor author | S. Muhammad Ahmed Hassan Shah | |
contributor author | Chengbo Ai | |
contributor author | Jin Wang | |
contributor author | You Zhan | |
date accessioned | 2025-08-17T22:36:09Z | |
date available | 2025-08-17T22:36:09Z | |
date copyright | 5/1/2025 12:00:00 AM | |
date issued | 2025 | |
identifier other | JCCEE5.CPENG-6339.pdf | |
identifier uri | http://yetl.yabesh.ir/yetl1/handle/yetl/4307172 | |
publisher | American Society of Civil Engineers | |
title | Vector-Quantized Variational Teacher and Multimodal Collaborative Student for Crack Segmentation via Knowledge Distillation | |
type | Journal Article | |
journal volume | 39 | |
journal issue | 3 | |
journal title | Journal of Computing in Civil Engineering | |
identifier doi | 10.1061/JCCEE5.CPENG-6339 | |
journal firstpage | 04025030-1 | |
journal lastpage | 04025030-22 | |
page | 22 | |
tree | Journal of Computing in Civil Engineering; 2025; Volume 039; Issue 003 | |
contenttype | Fulltext |