Nonlinear Ordinal Logistic Regression and Multivariate Adaptive Regression Splines (NORL-MARS) for Prediction of Diabetes Mellitus Risk
Abstract
Diabetes Mellitus (DM) is a high-risk metabolic disease with increasing prevalence in Indonesia, requiring an effective classification model based on significant risk factors. This study uses Nonparametric Ordinal Logistic Regression based on the Multivariate Adaptive Regression Splines estimator (NOLR-MARS). Unlike conventional parametric ordinal regression, this model does not assume a fixed functional form; instead, it determines the shape of the relationship from the data through basis functions, making it more flexible in handling complex interactions among predictor variables. Using 664 records from the Non-Alcoholic Fatty Liver Disease (NAFLD) cohort, we explore the relationship between DM risk and metabolic factors, including age, sex, Body Mass Index (BMI), LDL cholesterol, and hypertension. This NOLR-MARS integration addresses the nonlinear relationship while maintaining the ordinal nature of DM stages, a combination often overlooked in traditional models. Based on Generalized Cross Validation (GCV) selection, the best model achieved 74.92% accuracy on in-sample data and 80.30% on out-of-sample data. Furthermore, a sensitivity of 70% and a specificity of 92.86% were obtained for stage 2 DM. Age, BMI, LDL cholesterol, and hypertension significantly influenced DM status. The results showed that the NOLR-MARS model had good predictive performance. The novelty of this study lies in the integration of the MARS estimator into an ordinal logistic regression framework for more granular DM risk assessment. Although this model shows potential as a screening tool in high-risk metabolic cohorts, further clinical application requires external validation to ensure broader generalizability.
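To make the idea of an ordinal logistic model built on MARS basis functions concrete, the following is a minimal sketch, not the authors' fitted model. It assumes three ordered DM categories, a proportional-odds (cumulative logit) link, and entirely hypothetical knots, coefficients, and thresholds; the function names `hinge`, `linear_predictor`, and `cumulative_logit_probs` are illustrative only.

```python
import math


def hinge(x, knot, sign=1):
    """MARS hinge basis function: max(0, sign * (x - knot))."""
    return max(0.0, sign * (x - knot))


def linear_predictor(age, bmi):
    """Sum of weighted hinge functions.

    Knots (45 years, BMI 25) and coefficients are hypothetical values
    chosen for illustration; they are NOT estimates from the study.
    """
    return (
        0.04 * hinge(age, 45.0)
        + 0.10 * hinge(bmi, 25.0)
        - 0.05 * hinge(bmi, 25.0, sign=-1)
    )


def cumulative_logit_probs(eta, thresholds):
    """Category probabilities under a proportional-odds model.

    P(Y <= k) = 1 / (1 + exp(-(theta_k - eta))), thresholds ascending.
    Returns one probability per ordered category.
    """
    cum = [1.0 / (1.0 + math.exp(-(t - eta))) for t in thresholds] + [1.0]
    return [cum[0]] + [cum[k] - cum[k - 1] for k in range(1, len(cum))]


# Hypothetical subject: age 60, BMI 30; thresholds are assumed values.
eta = linear_predictor(60.0, 30.0)
probs = cumulative_logit_probs(eta, thresholds=[0.0, 2.0])

# Predicted category is the one with the highest probability (0-based index).
predicted_stage = max(range(len(probs)), key=probs.__getitem__)
print(probs, predicted_stage)
```

Each hinge function contributes only on one side of its knot, which is how the model can follow a nonlinear pattern, while the shared thresholds keep the response categories ordered.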
DOI: http://dx.doi.org/10.30829/zero.v10i1.28733

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.