Performa Comparison of the K-Means Method for Classification in Diabetes Patients Using Two Normalization Methods

VOLUME 04 ISSUE 01 JANUARY 2021

¹Dwianti Westari, ²Dr. Abdul Halim

¹Universitas Indonesia

²M. Eng, Universitas Indonesia

DOI : https://doi.org/10.47191/ijmra/v4-i1-03

ABSTRACT:

Diabetes classification systems are useful in the health sector. This paper discusses a classification system for diabetes based on the K-Means algorithm. The Pima Indian Diabetes (PID) dataset is used to train and evaluate the algorithm. The attributes have very different value ranges, which affects the quality of the classification result, so the data is preprocessed with the aim of improving classification accuracy on the PID dataset. Two preprocessing methods are applied, min-max normalization and z-score normalization, and the resulting classification accuracies are compared. Before classification, the data is divided into training data and test data. The classification test using the K-Means algorithm shows that the best accuracy, 79%, is obtained on the PID dataset normalized with the min-max method; accuracy with z-score normalization is lower.
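
The pipeline summarized above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the standard definitions of the two normalizations, x' = (x - min) / (max - min) for min-max and z = (x - mean) / std for z-score, and it assumes that each K-Means cluster is mapped to the majority class label of its training members, a rule the abstract does not specify. The attribute values, the choice of two attributes, and the seeding of centroids with the first k points are all hypothetical choices made for illustration.

```python
from collections import Counter
from statistics import mean, pstdev

def min_max_scaler(column):
    """Return a function mapping x to (x - min) / (max - min), fitted on `column`."""
    lo, hi = min(column), max(column)
    span = (hi - lo) or 1.0  # guard against a constant column
    return lambda x: (x - lo) / span

def z_score_scaler(column):
    """Return a function mapping x to (x - mean) / std, fitted on `column`."""
    mu, sigma = mean(column), pstdev(column)
    sigma = sigma or 1.0  # guard against a constant column
    return lambda x: (x - mu) / sigma

def scale_rows(rows, scalers):
    """Apply one fitted scaler per attribute to every row."""
    return [[s(v) for s, v in zip(scalers, row)] for row in rows]

def nearest(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])))

def kmeans(points, k, max_iter=100):
    """Plain Lloyd iterations; the first k points seed the centroids (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    assignment = None
    for _ in range(max_iter):
        new_assignment = [nearest(p, centroids) for p in points]
        if new_assignment == assignment:
            break  # assignments are stable, so the centroids are too
        assignment = new_assignment
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [mean(col) for col in zip(*members)]
    return centroids, assignment

# Hypothetical rows: [glucose, pedigree]; label 1 = diabetic, 0 = non-diabetic.
train_x = [[85.0, 0.2], [90.0, 0.3], [180.0, 0.7], [190.0, 0.8]]
train_y = [0, 0, 1, 1]
test_x = [[100.0, 0.25], [175.0, 0.75]]

# Fit the scalers on the training columns only, then reuse them on the test rows.
scalers = [min_max_scaler(col) for col in zip(*train_x)]  # or z_score_scaler
train_s = scale_rows(train_x, scalers)
test_s = scale_rows(test_x, scalers)

centroids, assignment = kmeans(train_s, k=2)

# Assumed rule: each cluster predicts the majority label of its training members.
cluster_label = {
    c: Counter(y for y, a in zip(train_y, assignment) if a == c).most_common(1)[0][0]
    for c in set(assignment)
}
predictions = [cluster_label[nearest(p, centroids)] for p in test_s]
print(predictions)
```

Swapping `min_max_scaler` for `z_score_scaler` in the `scalers` line is enough to run the same comparison with the other normalization; accuracy would then be the share of test predictions that match the true test labels.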

KEYWORDS

diabetes, k-means, min-max normalization, z-score normalization, Pima Indian Diabetes (PID)
