Dengue Risk Stratification in Semarang City Using a Gaussian Mixture Model Based on Multi-Dimensional Urban Indicators

  • Nabila Izzatil Ismah, Computer Science Program, Faculty of Computer Science, Universitas Dian Nuswantoro, Semarang, Indonesia
  • Amiq Fahmi, Information Systems Program, Faculty of Computer Science, Universitas Dian Nuswantoro, Semarang, Indonesia, https://orcid.org/0000-0002-6684-9694
Keywords: Dengue risk stratification, Gaussian mixture model, Bayesian information criterion, Urban built-environment, Spatial epidemiology

Abstract

Dengue fever remains a pressing public health challenge in major Indonesian cities, including Semarang. The complex interplay of heterogeneous demographic structures and built-environment characteristics generates spatially uneven transmission risks, while conventional risk-mapping approaches often fail to capture the probabilistic nature of these risks at fine-scale administrative levels, limiting their utility for targeted interventions. This study aims to develop a robust, replicable framework for dengue risk stratification that more accurately identifies localized high-risk areas and supports evidence-based public health decision-making. The research introduces a probabilistic clustering approach using Gaussian Mixture Models (GMM) to move beyond rigid partitioning methods, while simultaneously integrating multi-year incidence data (2021–2024) with eighteen multidimensional urban indicators across 177 sub-districts (kelurahan). This combined contribution advances methodological rigor by accommodating overlapping data distributions and probabilistic cluster memberships, and provides a nuanced, evidence-driven tool for stratifying dengue risk and guiding hyper-local interventions. Several GMM configurations were evaluated using the Bayesian Information Criterion (BIC) to determine the optimal number of clusters. The BIC value declined markedly when the number of clusters increased from two to three, indicating a substantial improvement in model fit. Further increases yielded only marginal gains, and the lowest BIC was achieved at three clusters, representing the most parsimonious and effective solution. Internal validation confirmed that the cluster structure robustly captured epidemiological variance despite the inherent heterogeneity of urban spatial data. Cluster 2 emerged as a critical high-risk epicenter, geographically limited yet characterized by consistently elevated incidence, pronounced temporal variability, and extreme values. 
The proposed GMM-based framework demonstrates that dengue risk in Semarang is concentrated within localized foci of heightened vulnerability rather than uniformly distributed. Ultimately, the methodology is replicable in other complex tropical urban environments, thereby strengthening both academic rigor and practical public health decision-making.
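
The model-selection step described in the abstract (fit several GMM configurations, compare them by BIC, then read off probabilistic cluster memberships) can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes scikit-learn's GaussianMixture, a full covariance structure, and standardized inputs, and it uses synthetic data with three made-up indicator columns in place of the study's eighteen indicators. The group sizes, centers, candidate range, and the 0.9 membership threshold are hypothetical choices for illustration only.

```python
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the study data: 177 rows (one per sub-district)
# and 3 invented indicator columns. These values are NOT from the paper.
rng = np.random.default_rng(42)
centers = np.array([[0.0, 0.0, 0.0],
                    [6.0, 6.0, 0.0],
                    [0.0, 6.0, 6.0]])
sizes = [80, 70, 27]  # hypothetical group sizes summing to 177
X = np.vstack([rng.normal(c, 1.0, size=(n, 3)) for c, n in zip(centers, sizes)])

# Put all indicators on a common scale before fitting (an assumption here).
X = StandardScaler().fit_transform(X)

# Fit one GMM per candidate number of clusters and record its BIC.
# Lower BIC means a better trade-off between fit and model complexity.
candidates = range(1, 7)
models, bic = {}, {}
for k in candidates:
    gmm = GaussianMixture(n_components=k, covariance_type="full",
                          n_init=5, random_state=0).fit(X)
    models[k] = gmm
    bic[k] = gmm.bic(X)

# Select the configuration with the lowest BIC.
best_k = min(bic, key=bic.get)
best = models[best_k]

# Soft memberships: one row per sub-district, one column per cluster.
proba = best.predict_proba(X)
labels = proba.argmax(axis=1)

# Rows whose strongest membership is below 0.9 sit between clusters.
n_uncertain = int((proba.max(axis=1) < 0.9).sum())

for k in candidates:
    print(f"k={k}: BIC={bic[k]:.1f}")
print("selected k:", best_k, "| ambiguous rows:", n_uncertain)
```

Unlike a hard partition, the `proba` matrix keeps the membership probability of every sub-district for every cluster, so units that fall between risk strata can be flagged rather than forced into one group.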

Published
2026-01-27
How to Cite
N. Izzatil Ismah and A. Fahmi, “Dengue Risk Stratification in Semarang City Using a Gaussian Mixture Model Based on Multi-Dimensional Urban Indicators”, j.electron.electromedical.eng.med.inform, vol. 8, no. 1, pp. 395-408, Jan. 2026.
Section
Medical Informatics