Comparative Analysis of Classification of K-Nearest Neighbor (KNN) Algorithm and Decision Tree in Breast Cancer Using Rapidminer

Authors

  • Ema Rosida, Universitas Pelita Bangsa
  • Andri Firmansyah, Universitas Pelita Bangsa
  • Suherman, Universitas Pelita Bangsa

Keywords:

Breast Cancer, K-Nearest Neighbor (KNN), Decision Tree, Data Mining, ROC Curve, RapidMiner

Abstract

Breast cancer is the leading cause of cancer-related deaths among women in Indonesia and worldwide. Early detection is critical for improving survival rates, yet many cases are diagnosed at a late stage because of limited awareness and inadequate diagnostic tools. This study compares the performance of the K-Nearest Neighbor (KNN) and Decision Tree algorithms for breast cancer classification on the Wisconsin Breast Cancer dataset, using RapidMiner. The Cross-Industry Standard Process for Data Mining (CRISP-DM) framework was applied, consisting of the Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, and Deployment phases. The results indicate that KNN achieved the higher accuracy (97.14%) and Area Under the Curve (AUC) value (0.976), outperforming the Decision Tree algorithm (accuracy: 96.49%, AUC: 0.965). These findings highlight the potential of data mining techniques for enhancing early breast cancer detection and supporting clinical decision-making.
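The two evaluation measures reported above, accuracy and AUC, can be illustrated with a minimal sketch. This is not the authors' RapidMiner workflow: it is a plain-Python stand-in that assumes a basic KNN with Euclidean distance and majority voting, and it covers only the KNN side of the comparison. The function names (`knn_predict`, `accuracy`, `auc`), the choice of k = 3, and the tiny two-feature data are all hypothetical and unrelated to the Wisconsin Breast Cancer dataset.

```python
from collections import Counter
from math import dist


def knn_predict(train_X, train_y, query, k=3):
    """Return (predicted label, fraction of the k nearest neighbours labelled 1)."""
    # Sort training points by Euclidean distance to the query and keep the k closest.
    neighbours = sorted(zip(train_X, train_y), key=lambda p: dist(p[0], query))[:k]
    labels = [label for _, label in neighbours]
    predicted = Counter(labels).most_common(1)[0][0]  # majority vote
    score = sum(labels) / k  # share of class-1 neighbours, used as a ranking score
    return predicted, score


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc(y_true, scores):
    """AUC as the probability that a random positive case outranks a random
    negative case, with ties counted as one half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up illustration data (1 = malignant, 0 = benign); NOT the Wisconsin dataset.
train_X = [(1.0, 1.2), (1.5, 0.8), (0.9, 1.0), (4.0, 4.2), (4.5, 3.9), (3.8, 4.4)]
train_y = [0, 0, 0, 1, 1, 1]
test_X = [(1.1, 0.9), (4.2, 4.0), (1.3, 1.1), (3.9, 4.1)]
test_y = [0, 1, 0, 1]

results = [knn_predict(train_X, train_y, x, k=3) for x in test_X]
y_pred = [label for label, _ in results]
scores = [score for _, score in results]

print("accuracy:", accuracy(test_y, y_pred))
print("AUC:", auc(test_y, scores))
```

On this toy data both measures evaluate to 1.0 because the two clusters are cleanly separated; the figures reported in the abstract come from the authors' own experiment and cannot be reproduced from this sketch.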

Published

2025-06-19

How to Cite

Rosida, E., Firmansyah, A., & Suherman. (2025). Comparative Analysis of Classification of K-Nearest Neighbor (KNN) Algorithm and Decision Tree in Breast Cancer Using Rapidminer. International Journal of Applied Research and Sustainable Sciences, 2(12). Retrieved from http://journal.multitechpublisher.com/index.php/ijarss/article/view/3073