YaBeSH Engineering and Technology Library


    On the Application of Cluster Analysis to Growing Season Precipitation Data in North America East of the Rockies

    Source: Journal of Climate, 1995, volume 8, issue 4, page 897
    Author: Gong, Xiaofeng; Richman, Michael B.
    DOI: 10.1175/1520-0442(1995)008<0897:OTAOCA>2.0.CO;2
    Publisher: American Meteorological Society
    Abstract: Cluster analysis (CA) has been applied to geophysical research for over two decades, although its popularity has increased dramatically over the past few years. To date, systematic methodological reviews have not appeared in the geophysical literature. In this paper, after a review of a large number of applications of cluster analysis, an intercomparison of various cluster techniques was carried out on a well-studied dataset (7-day precipitation data from 1949 to 1987 in central and eastern North America). The cluster methods tested were single linkage, complete linkage, average linkage between groups, average linkage within a new group, Ward's method, k-means, the nucleated agglomerative method, and rotated principal component analysis. Three different dissimilarity measures (Euclidean distance, inverse correlation, and theta angle) and three initial partition methods were also tested on the hierarchical and nonhierarchical methods, respectively. Twenty-two of the 23 cluster algorithms yielded natural grouping solutions. Monte Carlo simulations were undertaken to examine the reliability of the cluster solutions. This was done by bootstrap resampling from the full dataset with four different sample sizes, then testing significance by the t test and the minimum significant difference test. Results showed that nonhierarchical methods outperformed hierarchical methods. The rotated principal component methods were found to be the most accurate methods, the nucleated agglomerative method was found to be superior to all other hard cluster methods, and Ward's method performed best among the hierarchical methods. Single linkage always yielded "chaining" solutions and, therefore, had poor matches to the input data. Of the three distance measures tested, Euclidean distance appeared to generate slightly more accurate solutions compared with the inverse correlation. The theta angle was quite variable in its accuracy. Tests of the initial partition method revealed a sensitivity of k-means CA to the selection of the seed points. The spatial patterns of cluster analysis applied to the full dataset were found to differ for various CA methods, thereby creating some questions on how to interpret the resulting spatial regionalizations. Several methods were shown to incorrectly place geographically separated portions of the domain into a single cluster. The authors termed this type of result "aggregation error." It was found to be most problematic at small sample sizes and more severe for specific distance measures. The choice of clustering technique and dissimilarity measure/initial partition may indeed significantly affect the results of cluster analysis. Cluster analysis accuracy was also found to be linearly to logarithmically related to the sample size. This relationship was statistically significant. Several methods, such as Ward's, k-means, and the nucleated agglomerative method, were found to reach a higher level of accuracy at a lower sample size compared with other CA methods tested. The level of accuracy reached by the rotated principal component clustering compared with the other methods tested suggests that application of a hard and nonoverlapping clustering methodology to fuzzy and overlapping geophysical data results in a substantial degradation in the regionalizations presented.
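
    The three dissimilarity measures named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give their formulas, so the definitions below are assumptions: "inverse correlation" is taken here as one minus the Pearson correlation, and "theta angle" as the angle between the two series treated as vectors. The function names and the sample station series are hypothetical and are not taken from the paper.

```python
import math

def euclidean(x, y):
    """Euclidean distance between two equal-length series."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def inverse_correlation(x, y):
    """Assumed definition: 1 - r, so identical shapes give 0."""
    return 1.0 - pearson(x, y)

def theta_angle(x, y):
    """Assumed definition: angle (radians) between x and y as vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    nx = math.sqrt(sum(a * a for a in x))
    ny = math.sqrt(sum(b * b for b in y))
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    cos_theta = max(-1.0, min(1.0, dot / (nx * ny)))
    return math.acos(cos_theta)

# Hypothetical precipitation series for two stations (arbitrary units).
station_a = [1.0, 2.0, 3.0]
station_b = [2.0, 4.0, 6.0]

d_euclid = euclidean(station_a, station_b)
d_corr = inverse_correlation(station_a, station_b)
d_theta = theta_angle(station_a, station_b)
```

    In this example the second series is twice the first, so the two correlation-style measures treat the stations as essentially identical while the Euclidean distance does not. This is one way in which the choice of dissimilarity measure could change which stations are grouped together.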

    URI
    http://yetl.yabesh.ir/yetl1/handle/yetl/4182090
    Collections
    • Journal of Climate


    contributor author: Gong, Xiaofeng
    contributor author: Richman, Michael B.
    date accessioned: 2017-06-09T15:25:28Z
    date available: 2017-06-09T15:25:28Z
    date copyright: 1995/04/01
    date issued: 1995
    identifier issn: 0894-8755
    identifier other: ams-4332.pdf
    identifier uri: http://onlinelibrary.yabesh.ir/handle/yetl/4182090
    publisher: American Meteorological Society
    title: On the Application of Cluster Analysis to Growing Season Precipitation Data in North America East of the Rockies
    type: Journal Paper
    journal volume: 8
    journal issue: 4
    journal title: Journal of Climate
    identifier doi: 10.1175/1520-0442(1995)008<0897:OTAOCA>2.0.CO;2
    journal first page: 897
    journal last page: 931
    tree: Journal of Climate, 1995, volume 8, issue 4
    content type: Fulltext
    DSpace software copyright © 2002-2015  DuraSpace
    "DSpace" digital library software, localized into Persian by Yabesh for Iranian libraries | Contact Yabesh