Comparison of OLS Regression and Robust Regression in Overcoming Outlier Problems (Case Study: Cost of Living Data for Urban Areas in Indonesia)

Susiana Susiana, Chairunisah Chairunisah, Nice Rejoice Refisis

Abstract


Multiple regression analysis describes the relationship between a dependent variable and several independent variables in quantitative statistical studies. Outliers in a data set, however, can distort the analysis: they produce large residuals, inflated variances, and biased estimates, and can ultimately lead to wrong decisions. There are several ways to handle outliers in multiple linear regression, including robust regression and Ordinary Least Squares (OLS) regression fitted after the observations flagged as outliers have been removed. The OLS method forms a regression model by minimizing the sum of squared residuals. Robust regression instead relies on estimators of the mean and variance-covariance parameters that are standardized so that they remain consistent for those parameters. This research compares OLS regression and robust regression as alternatives for dealing with outliers in data. The data are secondary cost of living data from the Cost of Living Survey conducted by the Central Statistics Agency of the Republic of Indonesia in 2018. The research stages are: literature study; data collection; descriptive analysis of the data characteristics; fitting a regression model by OLS; testing the classical assumptions; fitting a new OLS regression model; fitting a robust regression model; computing the Mean Squared Error (MSE) of each model; and determining the best model. The results show that, for the cost of living data, the best regression model is obtained by OLS regression on the data without outliers.
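The comparison described in the abstract can be illustrated with a small sketch. It is not the authors' procedure: the data are synthetic, there is a single predictor rather than several, outliers are flagged by an assumed rule (OLS residual larger than two residual standard deviations), and the robust fit is assumed to be a Huber M-estimator computed by iteratively reweighted least squares. All function and variable names are hypothetical.

```python
import random
import statistics


def weighted_line_fit(x, y, w):
    """Weighted least squares for y = b0 + b1 * x; returns (b0, b1)."""
    sw = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / sw
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def ols_fit(x, y):
    """OLS is the equal-weight case: it minimizes the sum of squared residuals."""
    return weighted_line_fit(x, y, [1.0] * len(x))


def residuals(x, y, fit):
    b0, b1 = fit
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


def mse(x, y, fit):
    """Mean squared error of a fitted line on the given observations."""
    r = residuals(x, y, fit)
    return sum(ri * ri for ri in r) / len(r)


def huber_fit(x, y, c=1.345, iters=50):
    """Huber M-estimate via iteratively reweighted least squares (assumed method)."""
    fit = ols_fit(x, y)
    for _ in range(iters):
        r = residuals(x, y, fit)
        med = statistics.median(r)
        # Robust residual scale: median absolute deviation / 0.6745.
        scale = statistics.median([abs(ri - med) for ri in r]) / 0.6745
        if scale == 0:
            break
        # Residuals within c * scale keep full weight; larger ones are down-weighted.
        w = [1.0 if abs(ri) <= c * scale else c * scale / abs(ri) for ri in r]
        fit = weighted_line_fit(x, y, w)
    return fit


# Synthetic data (assumption): true line y = 2 + 3x, unit-variance noise,
# and three observations shifted upward to act as outliers.
random.seed(0)
x = [i / 4 for i in range(40)]
y = [2.0 + 3.0 * xi + random.gauss(0.0, 1.0) for xi in x]
for i in (5, 20, 35):
    y[i] += 30.0

# OLS on all data, then flag outliers with the assumed two-standard-deviation rule.
full_fit = ols_fit(x, y)
r_full = residuals(x, y, full_fit)
cutoff = 2.0 * statistics.pstdev(r_full)
keep = [abs(ri) <= cutoff for ri in r_full]
x_clean = [xi for xi, k in zip(x, keep) if k]
y_clean = [yi for yi, k in zip(y, keep) if k]

# The two alternatives: OLS refitted without outliers, and a robust fit on all data.
clean_fit = ols_fit(x_clean, y_clean)
robust_fit = huber_fit(x, y)

# Each MSE is computed on the data its model was fitted to.
mse_full = mse(x, y, full_fit)
mse_clean = mse(x_clean, y_clean, clean_fit)
mse_robust = mse(x, y, robust_fit)

print("OLS, all data:      ", full_fit, mse_full)
print("OLS, outliers removed:", clean_fit, mse_clean)
print("Huber, all data:    ", robust_fit, mse_robust)
```

In this sketch the MSE of the refitted OLS model is computed on fewer observations than the MSE of the robust model, because the flagged points are excluded from the former and kept in the latter; the two figures are therefore not measured on the same observations.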

 


Keywords


Robust Geographically Weighted Regression, Stunting, Modeling

DOI: http://dx.doi.org/10.30829/zero.v8i2.21345

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Publisher:
Department of Mathematics
Faculty of Science and Technology
Universitas Islam Negeri Sumatera Utara Medan