K-Prototype Algorithm for Clustering Regencies/Cities in South Sulawesi Province Based on People's Welfare Indicators in 2020

Zulkifli Rais; Suwardi Annas; Muhammad Refaldy

doi:10.35580/variansiunm20

Zulkifli Rais, Department of Statistics, Universitas Negeri Makassar
Suwardi Annas, Department of Statistics, Universitas Negeri Makassar
Muhammad Refaldy, Statistics Study Program, Faculty of Mathematics and Natural Sciences (FMIPA), Universitas Negeri Makassar (UNM)

Keywords: K-Prototype Algorithm, Cluster, Elbow Method, People's Welfare Indicators

Abstract

Clustering is widely used to analyze data in machine learning, data mining, pattern recognition, image analysis, and bioinformatics. It is needed to extract information when the data are large and highly varied. This study uses the K-Prototype method, an efficient and effective algorithm for processing mixed-type (numerical and categorical) data. A known difficulty of the K-Prototype algorithm is finding the best number of clusters, so this paper determines that number as part of the analysis. Of the several ways to do so, the Elbow method is used here: the number of clusters is read from a plot of the SSE (Sum of Squared Errors) across several candidate numbers of clusters. The clustering produced 2 clusters, considered optimal because k = 2 is the value at which the SSE showed the greatest decrease. The results show that cluster 1 has better people's welfare characteristics than cluster 2.
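The procedure summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a basic K-Prototype variant in which the dissimilarity is the squared Euclidean distance on numerical variables plus a weight gamma times the number of categorical mismatches, and it uses the total within-cluster dissimilarity as the SSE-like cost for the Elbow rule. The function names, the toy data, and gamma = 1.0 are hypothetical choices made only for illustration.

```python
import numpy as np

def mixed_distance(X_num, X_cat, cen_num, cen_cat, gamma):
    """Squared Euclidean distance on numeric columns plus
    gamma * (number of mismatches) on categorical columns."""
    d_num = ((X_num[:, None, :] - cen_num[None, :, :]) ** 2).sum(axis=2)
    d_cat = (X_cat[:, None, :] != cen_cat[None, :, :]).sum(axis=2)
    return d_num + gamma * d_cat

def k_prototypes(X_num, X_cat, k, gamma=1.0, n_iter=50, seed=0):
    """Basic K-Prototype loop: assign to the nearest prototype, then update
    prototypes with the mean (numeric) and the mode (categorical)."""
    rng = np.random.default_rng(seed)
    n = X_num.shape[0]
    start = rng.choice(n, size=k, replace=False)  # random initial prototypes
    cen_num, cen_cat = X_num[start].copy(), X_cat[start].copy()
    labels = np.full(n, -1)
    for _ in range(n_iter):
        dist = mixed_distance(X_num, X_cat, cen_num, cen_cat, gamma)
        new_labels = dist.argmin(axis=1)
        if np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for j in range(k):
            members = labels == j
            if not members.any():
                continue  # keep the old prototype for an empty cluster
            cen_num[j] = X_num[members].mean(axis=0)
            for c in range(X_cat.shape[1]):
                values, counts = np.unique(X_cat[members, c], return_counts=True)
                cen_cat[j, c] = values[counts.argmax()]
    dist = mixed_distance(X_num, X_cat, cen_num, cen_cat, gamma)
    cost = float(dist[np.arange(n), labels].sum())  # SSE-like total cost
    return labels, cost

# Toy mixed-type data (hypothetical, NOT the paper's welfare indicators):
# two numeric columns and one integer-coded categorical column.
X_num = np.array([[1.0, 2.0], [1.5, 1.8], [0.8, 2.2], [1.2, 2.1],
                  [9.0, 10.0], [9.5, 9.8], [8.8, 10.2], [9.2, 10.1]])
X_cat = np.array([[0], [0], [0], [0], [1], [1], [1], [1]])

ks = [1, 2, 3, 4]
labels_by_k, costs = {}, []
for k in ks:
    labels_by_k[k], cost = k_prototypes(X_num, X_cat, k)
    costs.append(cost)

# Elbow rule as described in the abstract: choose the k at which the cost
# shows the greatest decrease relative to k - 1.
drops = np.array(costs[:-1]) - np.array(costs[1:])
best_k = ks[1 + int(drops.argmax())]
print("costs:", costs)
print("best k:", best_k)
```

On real data, the numerical variables would normally be standardized first, and gamma would need to be tuned, since it controls how much the categorical variables influence the grouping.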
