YaBeSH Engineering and Technology Library



    Deep Learning Image Captioning in Construction Management: A Feasibility Study

Source: Journal of Construction Engineering and Management, 2022, Volume 148, Issue 7, Page 04022049
Author: Bo Xiao, Yiheng Wang, Shih-Chung Kang
    DOI: 10.1061/(ASCE)CO.1943-7862.0002297
    Publisher: ASCE
Abstract: Deep learning image captioning methods can generate one or several natural-language sentences describing the contents of construction images. By deconstructing these sentences, construction object and activity information can be retrieved integrally for automated scene analysis. However, the feasibility of deep learning image captioning in construction remains unclear. To fill this gap, this research investigates the feasibility of deep learning image captioning methods in construction management. First, a linguistic schema for annotating construction machine images was established, and a captioning data set was developed. Then, six deep learning image captioning methods from the computer vision community were selected and tested on the construction captioning data set. In the sentence-level evaluation, the transformer-self-critical sequence training (Tsfm-SCST) method obtained the best performance among the six methods, with a bilingual evaluation understudy (BLEU)-1 score of 0.606, BLEU-2 of 0.506, BLEU-3 of 0.427, BLEU-4 of 0.349, metric for evaluation of translation with explicit ordering (METEOR) of 0.287, recall-oriented understudy for gisting evaluation (ROUGE) of 0.585, consensus-based image description evaluation (CIDEr) of 1.715, and semantic propositional image caption evaluation (SPICE) score of 0.422. In the element-level evaluation, the Tsfm-SCST method achieved an average precision of 91.1%, recall of 83.3%, and an F1 score of 86.6% for recognition of construction machine objects by deconstructing the generated sentences. This research indicates that deep learning image captioning is feasible as a method of generating accurate and precise text descriptions from construction images, with potential applications in construction scene analysis and image documentation.
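The element-level evaluation described in the abstract — deconstructing generated captions into construction-machine mentions and scoring them against reference annotations with precision, recall, and F1 — can be sketched as below. This is an illustrative reimplementation, not the paper's code: the machine vocabulary, the substring-based extraction, and the example captions are all invented assumptions for demonstration.

```python
# Illustrative sketch of element-level caption evaluation (not the paper's code).
# A caption is "deconstructed" by extracting construction-machine terms, and the
# extracted sets are scored against reference captions with micro-averaged
# precision, recall, and F1.

# Hypothetical vocabulary of machine objects; the paper's actual schema differs.
MACHINE_VOCAB = {"excavator", "dump truck", "loader", "crane", "bulldozer"}

def extract_objects(caption: str) -> set:
    """Return the set of machine terms mentioned in a caption."""
    text = caption.lower()
    return {term for term in MACHINE_VOCAB if term in text}

def element_scores(generated: list, references: list) -> tuple:
    """Micro-averaged precision, recall, and F1 over paired captions."""
    tp = fp = fn = 0
    for gen, ref in zip(generated, references):
        g, r = extract_objects(gen), extract_objects(ref)
        tp += len(g & r)   # objects correctly mentioned
        fp += len(g - r)   # objects hallucinated by the caption
        fn += len(r - g)   # objects the caption missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

For example, a generated caption that names the crane but misses the loader counts one true positive and one false negative, lowering recall but not precision — the same asymmetry visible in the paper's reported 91.1% precision versus 83.3% recall.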


URI: http://yetl.yabesh.ir/yetl1/handle/yetl/4286109
    Collections
    • Journal of Construction Engineering and Management


contributor author: Bo Xiao
contributor author: Yiheng Wang
contributor author: Shih-Chung Kang
date accessioned: 2022-08-18T12:09:41Z
date available: 2022-08-18T12:09:41Z
date issued: 2022/04/22
identifier other: (ASCE)CO.1943-7862.0002297.pdf
identifier uri: http://yetl.yabesh.ir/yetl1/handle/yetl/4286109
publisher: ASCE
title: Deep Learning Image Captioning in Construction Management: A Feasibility Study
type: Journal Article
journal volume: 148
journal issue: 7
journal title: Journal of Construction Engineering and Management
identifier doi: 10.1061/(ASCE)CO.1943-7862.0002297
journal first page: 04022049
journal last page: 04022049-14
pages: 14
tree: Journal of Construction Engineering and Management, 2022, Volume 148, Issue 7
content type: Fulltext
DSpace software copyright © 2002-2015 DuraSpace
The "DSpace" digital library software was localized into Persian by Yabesh for Iranian libraries | Contact Yabesh