YaBeSH Engineering and Technology Library

    Handling Imbalanced Data for Real-Time Crash Prediction: Application of Boosting and Sampling Techniques

    Source: Journal of Transportation Engineering, Part A: Systems, 2021, Volume 147, Issue 3, page 04020165-1
    Author: Amin Ariannezhad, Abolfazl Karimpour, Xiao Qin, Yao-Jan Wu, Yasamin Salmani
    DOI: 10.1061/JTEPBS.0000499
    Publisher: ASCE
    Abstract: With a growing number of intelligent transportation system sensors and the networkwide deployment of those across the nation’s roadway facilities, current research and practices should concentrate on more proactive safety strategies. In recent years, real-time traffic data collected from ITS sensors have been utilized to develop crash prediction models. Real-time crash prediction models can be used to identify hazardous traffic conditions that might cause a crash. This study aims to examine how employing data mining techniques that account for imbalanced data could improve the predictive capability of real-time crash prediction models. The term imbalanced data refers to a condition where the number of observations in each class is not equally distributed among the data set (noncrash cases outnumber crash cases). To decrease the within-class variation of imbalanced data, the data were split into two traffic-state data sets: free-flow speed (FFS) and congestion. Three models, including logistic regression as the baseline, random forest (RF) with random undersampling, and Adaptive Boosting (AdaBoost), were estimated with each data set. The results were compared with the models that were estimated using the complete set of data. Model comparisons indicated that all three models achieved significantly better predictive results with the congested and FFS data sets as opposed to the data set containing all crashes and that, while in some cases the results of the undersampled RF model were slightly better than those of AdaBoost, both models outperformed the logistic regression model. The results of this study demonstrated that using models to deal with imbalanced data and lowering the variation of imbalanced data could substantially improve crash prediction accuracy. The findings could help traffic agencies to practically implement and deploy crash prediction models for real-time applications and develop crash prevention strategies accordingly.
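The random undersampling step mentioned in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the abstract does not state the balancing ratio, so a 1:1 ratio is assumed, and the function name `random_undersample`, the toy feature tuples (speed, occupancy), and the 95/5 class split are all hypothetical.

```python
import random
from collections import Counter


def random_undersample(rows, labels, seed=0):
    """Randomly drop rows from larger classes until every class
    has as many rows as the smallest class (assumed 1:1 ratio)."""
    rng = random.Random(seed)

    # Group observations by class label.
    by_class = {}
    for row, label in zip(rows, labels):
        by_class.setdefault(label, []).append(row)

    # Size of the smallest (minority) class.
    n_min = min(len(group) for group in by_class.values())

    # Sample n_min rows without replacement from each class.
    out_rows, out_labels = [], []
    for label, group in by_class.items():
        for row in rng.sample(group, n_min):
            out_rows.append(row)
            out_labels.append(label)

    # Shuffle so the classes are interleaved.
    order = list(range(len(out_rows)))
    rng.shuffle(order)
    return [out_rows[i] for i in order], [out_labels[i] for i in order]


# Hypothetical toy data: 95 noncrash (0) and 5 crash (1) observations,
# each described by an illustrative (speed, occupancy) pair.
rows = [(60.0 + i % 7, 0.10) for i in range(95)] + [(35.0 + i, 0.40) for i in range(5)]
labels = [0] * 95 + [1] * 5

balanced_rows, balanced_labels = random_undersample(rows, labels)
print(Counter(balanced_labels))
```

The balanced sample keeps every crash case and a random subset of noncrash cases of the same size; a classifier such as a random forest would then be trained on `balanced_rows` and `balanced_labels` rather than on the full imbalanced set.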

    URI
    http://yetl.yabesh.ir/yetl1/handle/yetl/4270816
    Collections
    • Journal of Transportation Engineering, Part A: Systems

    contributor author: Amin Ariannezhad
    contributor author: Abolfazl Karimpour
    contributor author: Xiao Qin
    contributor author: Yao-Jan Wu
    contributor author: Yasamin Salmani
    date accessioned: 2022-02-01T00:02:57Z
    date available: 2022-02-01T00:02:57Z
    date issued: 3/1/2021
    identifier other: JTEPBS.0000499.pdf
    identifier uri: http://yetl.yabesh.ir/yetl1/handle/yetl/4270816
    publisher: ASCE
    title: Handling Imbalanced Data for Real-Time Crash Prediction: Application of Boosting and Sampling Techniques
    type: Journal Paper
    journal volume: 147
    journal issue: 3
    journal title: Journal of Transportation Engineering, Part A: Systems
    identifier doi: 10.1061/JTEPBS.0000499
    journal firstpage: 04020165-1
    journal lastpage: 04020165-10
    page: 10
    tree: Journal of Transportation Engineering, Part A: Systems; 2021; Volume 147; Issue 003
    contenttype: Fulltext
    DSpace software copyright © 2002-2015  DuraSpace
    "DSpace" digital library software, localized into Persian by Yabesh for Iranian libraries | Contact Yabesh