YaBeSH Engineering and Technology Library

    Comparing Natural Language Processing Methods to Cluster Construction Schedules

    Source: Journal of Construction Engineering and Management, 2021, Volume 147, Issue 10, page 04021136-1
    Author: Ying Hong, Haiyan Xie, Gary Bhumbra, Ioannis Brilakis
    DOI: 10.1061/(ASCE)CO.1943-7862.0002165
    Publisher: ASCE
    Abstract: The names of construction activities are the only unstructured data attribute in construction schedules, and they often guide construction execution. Activity names are devised to communicate between stakeholders, and therefore often are written using inconsistent terminologies across repetitive activities with omitted contextual information. This presents a challenge for machine learning systems when learning patterns from construction schedules. This paper compared the performance of state-of-the-art text-related clustering methods in identifying repetitive activities. This was achieved by creating a ground truth data set on the basis of the standard construction work classification, and then comparing the precision, recall, and F1 score of latent semantic analysis (LSA), latent Dirichlet allocation (LDA), word2vec, and fastText algorithms to group activity names in 27 construction schedules. Results indicated that the F1 score of LSA outperformed LDA (0.84% versus 0.88%), whereas the results of language models–based clustering depended on the quality of word embedding and the paired clustering method. This study provides insight into how to preprocess activity names of construction schedules for further artificial intelligence (AI)-based quantitative analysis. Methodologies described in this study will help researchers who work on natural language–related research in construction (e.g., safety and contract management) to better capture the feature of words, rather than only counting the word frequencies.
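The abstract reports precision, recall, and F1 for each clustering method but does not say how those scores are computed over clusters. As a minimal sketch, assuming a pairwise formulation (two activity names count as a correct pair when they share both a predicted cluster and a ground-truth class), the metrics could be computed as follows. The function name `pairwise_scores`, the activity names, and the labels are all hypothetical illustrations, not the paper's method or data.

```python
from itertools import combinations


def pairwise_scores(truth, predicted):
    """Pairwise precision, recall, and F1 of a clustering against ground-truth labels.

    Assumption: a pair of items is a true positive when both labelings put
    the two items together. This is one common convention, not necessarily
    the one used in the paper.
    """
    tp = fp = fn = 0
    for i, j in combinations(range(len(truth)), 2):
        same_truth = truth[i] == truth[j]
        same_pred = predicted[i] == predicted[j]
        if same_pred and same_truth:
            tp += 1  # grouped together, and correctly so
        elif same_pred:
            fp += 1  # grouped together, but belong to different classes
        elif same_truth:
            fn += 1  # same class, but split across clusters
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical activity names with inconsistent terminology (illustrative only).
activities = [
    "Pour concrete slab L1",
    "Cast slab level 2",
    "Install rebar L1",
    "Fix reinforcement level 2",
]
truth = ["concrete", "concrete", "rebar", "rebar"]  # hypothetical ground truth
predicted = [0, 0, 0, 1]                            # hypothetical cluster ids

precision, recall, f1 = pairwise_scores(truth, predicted)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy case the third activity is wrongly merged into the concrete cluster, which lowers precision, and the two reinforcement activities are split, which lowers recall.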
    URI: http://yetl.yabesh.ir/yetl1/handle/yetl/4272010
    Collections: Journal of Construction Engineering and Management


    contributor author: Ying Hong
    contributor author: Haiyan Xie
    contributor author: Gary Bhumbra
    contributor author: Ioannis Brilakis
    date accessioned: 2022-02-01T21:46:33Z
    date available: 2022-02-01T21:46:33Z
    date issued: 10/1/2021
    identifier other: (ASCE)CO.1943-7862.0002165.pdf
    identifier uri: http://yetl.yabesh.ir/yetl1/handle/yetl/4272010
    publisher: ASCE
    title: Comparing Natural Language Processing Methods to Cluster Construction Schedules
    type: Journal Paper
    journal volume: 147
    journal issue: 10
    journal title: Journal of Construction Engineering and Management
    identifier doi: 10.1061/(ASCE)CO.1943-7862.0002165
    journal firstpage: 04021136-1
    journal lastpage: 04021136-11
    page: 11
    tree: Journal of Construction Engineering and Management; 2021; Volume 147; Issue 10
    contenttype: Fulltext
    DSpace software copyright © 2002-2015  DuraSpace
    Digital library software "DSpace", localized into Persian by Yabesh for Iranian libraries | Contact Yabesh