YaBeSH Engineering and Technology Library

    Cross-Lingual Information Retrieval from Multilingual Construction Documents Using Pretrained Language Models

    Source: Journal of Construction Engineering and Management, 2024, Volume 150, Issue 6, page 04024041-1
    Author: Jungyeon Kim, Sehwan Chung, Seokho Chi
    DOI: 10.1061/JCEMD4.COENG-14273
    Publisher: ASCE
    Abstract: The growth of the global construction market has attracted international companies to participate in overseas projects. Overseas projects are extremely dynamic with numerous uncertainties, raising the need to collect information about construction in host countries. Due to the vast amounts of text data in the construction industry, an automated method, specifically information retrieval, is required to find the necessary information. Previous studies have suggested automated methods to review various construction documents. However, these studies required substantial manual effort and mainly focused on only one language, resulting in loss of vital information because it is buried in documents written in the host country’s language. To address these limitations, this study proposes a cross-lingual information retrieval (CLIR) framework using pretrained Bidirectional Encoder Representations from Transformers (BERT) models to retrieve information from multilingual construction documents. The proposed framework employs language models (i.e., monolingual, multilingual, and cross-lingual) and trains these models on a construction data set to enhance their ability in construction-specific text. The framework achieved reliable performance of retrieval, even with minimal additional training using domain-specific data. The results indicate that training on the domain data set raises the level of retrieval, increasing the mean reciprocal rank of a specific task by up to 0.2128. With the employment of a monolingual model with machine translation, CLIR in a specific domain could be performed effectively without the need for a labeled data set. The suggested CLIR framework offers a practical alternative for dealing with construction documents in overseas projects, reducing time and cost while improving risk identification and mitigation.
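    The abstract reports retrieval quality as mean reciprocal rank (MRR): for each query, take the reciprocal of the rank at which the first relevant document appears, then average over all queries. The following is a minimal sketch of that standard metric, not the authors' evaluation code. The function names, document IDs, and toy rankings are hypothetical and chosen only for illustration.

```python
from typing import Hashable, Iterable, Sequence, Set, Tuple


def reciprocal_rank(ranked: Sequence[Hashable], relevant: Set[Hashable]) -> float:
    """Return 1/rank of the first relevant item, or 0.0 if none is retrieved."""
    for position, doc_id in enumerate(ranked, start=1):
        if doc_id in relevant:
            return 1.0 / position
    return 0.0


def mean_reciprocal_rank(
    runs: Iterable[Tuple[Sequence[Hashable], Set[Hashable]]]
) -> float:
    """Average the reciprocal rank over (ranked_list, relevant_set) pairs."""
    runs = list(runs)
    if not runs:
        raise ValueError("at least one query is required")
    return sum(reciprocal_rank(ranked, rel) for ranked, rel in runs) / len(runs)


# Hypothetical toy data: three queries with made-up document IDs.
toy_runs = [
    (["d3", "d1", "d7"], {"d1"}),  # first relevant hit at rank 2 -> 0.5
    (["d4", "d2", "d9"], {"d4"}),  # first relevant hit at rank 1 -> 1.0
    (["d5", "d6", "d8"], {"d0"}),  # no relevant hit retrieved   -> 0.0
]
mrr = mean_reciprocal_rank(toy_runs)
print(round(mrr, 4))  # 0.5
```

    On this scale, MRR ranges from 0 to 1, so the reported gain of up to 0.2128 is an absolute increase in the averaged reciprocal rank.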

    URI
    http://yetl.yabesh.ir/yetl1/handle/yetl/4297465
    Collections
    • Journal of Construction Engineering and Management

    contributor author: Jungyeon Kim
    contributor author: Sehwan Chung
    contributor author: Seokho Chi
    date accessioned: 2024-04-27T22:46:30Z
    date available: 2024-04-27T22:46:30Z
    date issued: 2024/06/01
    identifier other: 10.1061-JCEMD4.COENG-14273.pdf
    identifier uri: http://yetl.yabesh.ir/yetl1/handle/yetl/4297465
    publisher: ASCE
    title: Cross-Lingual Information Retrieval from Multilingual Construction Documents Using Pretrained Language Models
    type: Journal Article
    journal volume: 150
    journal issue: 6
    journal title: Journal of Construction Engineering and Management
    identifier doi: 10.1061/JCEMD4.COENG-14273
    journal firstpage: 04024041-1
    journal lastpage: 04024041-15
    page: 15
    tree: Journal of Construction Engineering and Management, 2024, Volume 150, Issue 6
    contenttype: Fulltext
    DSpace software copyright © 2002-2015  DuraSpace
    DSpace digital library software, localized into Persian by Yabesh for Iranian libraries | Contact Yabesh