YaBeSH Engineering and Technology Library

    NLP-Based Approach to Semantic Classification of Heterogeneous Transportation Asset Data Terminology

    Source: Journal of Computing in Civil Engineering, 2017, Volume 31, Issue 6
    Authors: Tuyen Le, H. David Jeong
    DOI: 10.1061/(ASCE)CP.1943-5487.0000701
    Publisher: American Society of Civil Engineers
    Abstract: Inconsistent data terminology poses major challenges to integrating transportation project data from distinct sources. Differences in the meaning of data elements may lead to miscommunication between data senders and receivers. Semantic relations between terms in digital dictionaries, such as ontologies, can make the semantics of a data element transparent and unambiguous to computer systems. However, because effective automated methods are lacking, identifying these relations is labor intensive and time consuming. This paper presents a novel integrated methodology that leverages multiple computational techniques to extract, from design manuals and other technical specifications, the heterogeneous American-English data terms used by different highway agencies together with their semantic relations. The proposed method applies natural language processing (NLP) to detect data elements in text documents and uses machine learning to determine the semantic relatedness among terms from their occurrence statistics in a corpus. The study also develops an algorithm that classifies semantically related terms into three lexical groups: synonymy, hyponymy, and meronymy. The key merit of this technique is that the detection of semantic relations uses only linguistic information in the texts and does not depend on existing hand-coded semantic resources. In a case study, the proposed method was applied to a 16-million-word corpus of roadway design manuals to extract and classify roadway data items. The developed classifier was evaluated against a human-encoded test set, and the results show an overall performance of 92.76% precision and 81.02% recall.
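    The abstract describes scoring term relatedness from occurrence statistics and evaluating the result with precision and recall. The sketch below illustrates only that general idea and is not the authors' method: it swaps in plain co-occurrence counts with cosine similarity where the paper uses machine learning, and the toy corpus, the window size, and every function name are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """For each token, count the tokens seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for tokens in sentences:
        for i, term in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[term][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def precision_recall(predicted, gold):
    """Precision and recall of predicted term pairs against a labeled gold set."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Hypothetical toy corpus; the paper's corpus is 16 million words of design manuals.
toy_corpus = [
    "the shoulder width is measured in feet".split(),
    "the lane width is measured in feet".split(),
    "the contractor submits the report monthly".split(),
]

vectors = cooccurrence_vectors(toy_corpus)
sim_related = cosine(vectors["shoulder"], vectors["lane"])
sim_unrelated = cosine(vectors["shoulder"], vectors["report"])
```

    In this toy corpus "shoulder" and "lane" share the same neighbors, so they score higher than "shoulder" and "report". Thresholding such scores yields candidate pairs that precision_recall can compare against a hand-labeled set, which mirrors the kind of evaluation the abstract reports.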

    URI
    http://yetl.yabesh.ir/yetl1/handle/yetl/4241012
    Collections
    • Journal of Computing in Civil Engineering

    contributor author: Tuyen Le
    contributor author: H. David Jeong
    date accessioned: 2017-12-16T09:17:23Z
    date available: 2017-12-16T09:17:23Z
    date issued: 2017
    identifier other: (ASCE)CP.1943-5487.0000701.pdf
    identifier uri: http://138.201.223.254:8080/yetl1/handle/yetl/4241012
    publisher: American Society of Civil Engineers
    title: NLP-Based Approach to Semantic Classification of Heterogeneous Transportation Asset Data Terminology
    type: Journal Paper
    journal volume: 31
    journal issue: 6
    journal title: Journal of Computing in Civil Engineering
    identifier doi: 10.1061/(ASCE)CP.1943-5487.0000701
    tree: Journal of Computing in Civil Engineering, 2017, Volume 31, Issue 6
    contenttype: Fulltext
    DSpace software copyright © 2002-2015  DuraSpace
    DSpace digital library software, localized into Persian by YaBeSH for Iranian libraries