Statistical Estimators as an Alternative to Standard Deviation in Weighted Euclidean Distance Cluster Analysis

Paul Inuwa Dalatu and Habshah Midi

Pertanika Journal of Tropical Agricultural Science, Volume 26, Issue 4, October 2018

Keywords: Clustering, estimators, K-Means, simulation, weighted

Published on: 24 Oct 2018

Clustering is one of the principal tools of data mining. It helps researchers understand the natural grouping of attributes in datasets. Clustering is an unsupervised classification method whose main aim is partitioning: objects in the same cluster are similar, while objects in different clusters differ significantly with respect to their attributes. However, the classical standardized Euclidean distance, which uses the standard deviation to down-weight the maximum points of the ith feature in distance-based clustering, has been criticized by many scholars on the grounds that it is sensitive to outliers, lacks robustness, and has a 0% breakdown point. It also has low efficiency under the normal distribution. Therefore, to remedy the problem, we suggest two statistical estimators with a 50% breakdown point, namely the Sn and Qn estimators, which have 58% and 82% efficiency, respectively. The proposed methods evidently outperformed the existing methods in down-weighting the maximum points of the ith feature in distance-based cluster analysis.
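The idea in the abstract can be illustrated with a minimal sketch: each feature difference is divided by a scale estimate before the Euclidean distance is taken, and the standard deviation is swapped for Sn or Qn. This is not the authors' implementation. The function names `sn_scale`, `qn_scale`, and `weighted_euclidean` are hypothetical, the estimators follow the usual Rousseeuw and Croux definitions with consistency constants 1.1926 and 2.2219, the naive O(n^2) computation is used, and finite-sample correction factors are omitted.

```python
import numpy as np

def sn_scale(x):
    """Naive O(n^2) Sn scale estimate (no finite-sample correction)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    diffs = np.abs(x[:, None] - x[None, :])        # all |x_i - x_j|, incl. i == j
    inner = np.sort(diffs, axis=1)[:, n // 2]      # high median over j for each i
    outer = np.sort(inner)[(n + 1) // 2 - 1]       # low median over i
    return 1.1926 * outer

def qn_scale(x):
    """Naive O(n^2) Qn scale estimate (no finite-sample correction)."""
    x = np.asarray(x, dtype=float)
    n = x.size
    h = n // 2 + 1
    k = h * (h - 1) // 2                           # k = C(h, 2)
    i, j = np.triu_indices(n, k=1)                 # index pairs with i < j
    diffs = np.sort(np.abs(x[i] - x[j]))
    return 2.2219 * diffs[k - 1]                   # k-th smallest pairwise difference

def weighted_euclidean(a, b, scale):
    """Euclidean distance with each feature divided by its (nonzero) scale."""
    a, b, scale = (np.asarray(v, dtype=float) for v in (a, b, scale))
    return float(np.sqrt(np.sum(((a - b) / scale) ** 2)))

# Illustrative data (made up): one feature with a single extreme value.
feature = [1.0, 2.0, 3.0, 4.0, 100.0]
sd = float(np.std(feature, ddof=1))                # inflated by the extreme value
sn = sn_scale(feature)
qn = qn_scale(feature)
```

On this toy feature the standard deviation is pulled far upward by the single extreme value, while Sn and Qn stay close to the spread of the bulk of the data. That contrast is what a 50% versus 0% breakdown point means in practice. In a clustering run, one scale value per feature would be computed from the data and passed as `scale`.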

ISSN 1511-3701

e-ISSN 2231-8542

Article ID

JST-1003-2018
