Implementation of the Rule-of-Thumb Approach for Optimizing the K-Means Clustering Algorithm

M Nishom, M Yoka Fathoni

Abstract


In the big data era, data clustering has attracted great interest from researchers, and many clustering algorithms have been proposed in recent years. However, as technology evolves, data volumes continue to grow and data formats become increasingly varied, which makes clustering massive data a large and challenging task. Various methods have been studied to address this problem, among them K-Means. This method still has shortcomings, one of which is its sensitivity to the choice of the number of clusters (K). In this paper we discuss the implementation of the rule-of-thumb approach and data normalization in the K-Means method to determine the number of clusters (K) dynamically. The results show that the approach has a significant impact on the clustering in terms of running time, number of iterations, and the absence of outliers.
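As a rough illustration of the idea described in the abstract, the sketch below derives K from the dataset size and normalizes the features before clustering. The abstract does not state the exact formula or normalization scheme, so this assumes the commonly cited rule of thumb K ≈ sqrt(n / 2) and min-max scaling to [0, 1]; the function names `rule_of_thumb_k` and `min_max_normalize` and the sample data are hypothetical and not taken from the paper.

```python
import math


def rule_of_thumb_k(n_samples):
    """Estimate K as sqrt(n / 2), rounded, with a floor of 1.

    Assumption: this is the widely quoted rule of thumb, not
    necessarily the exact variant used by the authors.
    """
    if n_samples < 1:
        raise ValueError("need at least one sample")
    return max(1, round(math.sqrt(n_samples / 2)))


def min_max_normalize(rows):
    """Rescale every column to [0, 1]; constant columns map to 0.0.

    Assumption: min-max scaling stands in for the paper's
    unspecified normalization step.
    """
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


# Hypothetical toy data: two features on very different scales.
data = [[1.0, 200.0], [2.0, 400.0], [3.0, 600.0], [4.0, 800.0],
        [5.0, 1000.0], [6.0, 1200.0], [7.0, 1400.0], [8.0, 1600.0]]

k = rule_of_thumb_k(len(data))    # n = 8, so sqrt(8 / 2) = 2
scaled = min_max_normalize(data)  # both columns now span 0.0 to 1.0
```

The values `k` and `scaled` would then be passed to any K-Means implementation in place of a hand-picked K and the raw data. Normalizing first keeps the larger-scale feature from dominating the Euclidean distances that K-Means relies on.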





DOI: http://dx.doi.org/10.30591/jpit.v3i2.909



Creative Commons License
This work is licensed under a Creative Commons Attribution 4.0 International License.


Editorial Team, JURNAL INFORMATIKA: JURNAL PENGEMBANGAN IT

D4 Informatics Engineering Study Program
Politeknik Harapan Bersama Tegal
Jl. Mataram No.09 Pesurungan Lor Kota Tegal

Tel. +62283 - 352000

Email:
informatika.ejournal@poltektegal.ac.id

   

Copyright: JPIT (Jurnal Informatika: Jurnal Pengembangan IT) p-ISSN: 2477-5126 (print), e-ISSN 2548-9356 (online) 
