The Pairwise Similarity Partitioning Algorithm: A Method for Unsupervised Partitioning of Geoscientific and Other Datasets Using Arbitrary Similarity Metrics

Grant W. Petty

Source: Artificial Intelligence for the Earth Systems, 2022, Volume 1, Issue 4

DOI: 10.1175/AIES-D-22-0005.1

Publisher: American Meteorological Society

Abstract: A simple yet flexible and robust algorithm is described for fully partitioning an arbitrary dataset into compact, nonoverlapping groups or classes, sorted by size, based entirely on a pairwise similarity matrix and a user-specified similarity threshold. Unlike many clustering algorithms, there is no assumption that natural clusters exist in the dataset, although clusters, when present, may be preferentially assigned to one or more classes. The method also does not require data objects to be compared within any coordinate system but rather permits the user to define pairwise similarity using almost any conceivable criterion. The method therefore lends itself to certain geoscientific applications for which conventional clustering methods are unsuited, including two nontrivial and distinctly different datasets presented as examples. In addition to identifying large classes containing numerous similar dataset members, it is also well suited for isolating rare or anomalous members of a dataset. The method is inductive in that prototypes identified in a representative subset of a larger dataset can be used to classify the remainder.
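
The abstract does not spell out the procedure, so the following is only a minimal sketch of one plausible greedy reading of it, not the paper's published implementation. It assumes that each class is built around a "prototype" chosen as the unassigned member with the most unassigned members meeting the similarity threshold, which yields nonoverlapping classes in nonincreasing order of size and leaves anomalous members as singleton classes. The function names (`similarity_matrix`, `greedy_partition`, `classify`), the tie-breaking rule, and the toy similarity metric are all illustrative assumptions.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def similarity_matrix(items: Sequence, similarity: Callable, threshold: float) -> List[List[bool]]:
    """Boolean matrix: True where a pair meets the user-specified threshold."""
    return [[similarity(a, b) >= threshold for b in items] for a in items]


def greedy_partition(sim: Sequence[Sequence[bool]]) -> List[Tuple[int, List[int]]]:
    """Partition indices 0..n-1 into (prototype, members) classes.

    Assumed rule: repeatedly pick the unassigned member with the most
    unassigned similar members (ties go to the lowest index), assign that
    whole group to a new class, and continue until nothing is left.
    """
    unassigned = set(range(len(sim)))
    classes: List[Tuple[int, List[int]]] = []
    while unassigned:
        best, best_members = -1, []
        for i in sorted(unassigned):
            members = [j for j in sorted(unassigned) if j == i or sim[i][j]]
            if len(members) > len(best_members):
                best, best_members = i, members
        classes.append((best, best_members))
        unassigned -= set(best_members)
    return classes


def classify(item, prototypes: Sequence, similarity: Callable, threshold: float) -> Optional[int]:
    """Inductive step: index of the first class whose prototype is similar
    enough to `item`, or None if no prototype meets the threshold."""
    for k, proto in enumerate(prototypes):
        if similarity(item, proto) >= threshold:
            return k
    return None


if __name__ == "__main__":
    # Toy data: similarity is negative absolute difference (an arbitrary choice).
    data = [0, 1, 2, 50, 51, 90]
    sim_fn = lambda a, b: -abs(a - b)
    classes = greedy_partition(similarity_matrix(data, sim_fn, threshold=-1))
    for proto, members in classes:
        print("prototype", data[proto], "members", [data[j] for j in members])
    protos = [data[p] for p, _ in classes]
    print(classify(49, protos, sim_fn, threshold=-1))
```

Because the similarity function is passed in as an arbitrary callable, nothing in the sketch requires the data objects to live in a coordinate system. Only the boolean outcome of each pairwise comparison is used.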


URI

http://yetl.yabesh.ir/yetl1/handle/yetl/4290392

Collections

Artificial Intelligence for the Earth Systems

contributor author	Grant W. Petty
date accessioned	2023-04-12T18:52:23Z
date available	2023-04-12T18:52:23Z
date copyright	2022/10/27
date issued	2022
identifier other	AIES-D-22-0005.1.pdf
identifier uri	http://yetl.yabesh.ir/yetl1/handle/yetl/4290392
publisher	American Meteorological Society
title	The Pairwise Similarity Partitioning Algorithm: A Method for Unsupervised Partitioning of Geoscientific and Other Datasets Using Arbitrary Similarity Metrics
type	Journal Paper
journal volume	1
journal issue	4
journal title	Artificial Intelligence for the Earth Systems
identifier doi	10.1175/AIES-D-22-0005.1
contenttype	Fulltext
