Dengue Risk Stratification in Semarang City Using a Gaussian Mixture Model Based on Multi-Dimensional Urban Indicators
Abstract
Dengue fever remains a pressing public health challenge in major Indonesian cities, including Semarang. Heterogeneous demographic structures and built-environment characteristics interact to produce spatially uneven transmission risk, yet conventional risk-mapping approaches often fail to capture the probabilistic nature of that risk at fine administrative scales, which limits their utility for targeted interventions. This study aims to develop a robust, replicable framework for dengue risk stratification that identifies localized high-risk areas more accurately and supports evidence-based public health decision-making. It applies probabilistic clustering with Gaussian Mixture Models (GMM), rather than rigid partitioning methods, to multi-year incidence data (2021–2024) combined with eighteen multidimensional urban indicators across 177 sub-districts (kelurahan). Because GMM accommodates overlapping data distributions and assigns probabilistic cluster memberships, the approach provides a nuanced, evidence-driven basis for stratifying dengue risk and guiding hyper-local interventions. Several GMM configurations were compared using the Bayesian Information Criterion (BIC) to determine the optimal number of clusters. The BIC declined markedly when the number of clusters increased from two to three, indicating a substantial improvement in model fit. Adding further clusters brought only marginal gains in fit, and the lowest BIC was achieved at three clusters, making this the most parsimonious solution. Internal validation indicated that the cluster structure captured epidemiological variance despite the inherent heterogeneity of urban spatial data. Cluster 2 emerged as a critical high-risk epicenter: geographically limited, yet characterized by consistently elevated incidence, pronounced temporal variability, and extreme values.
The proposed GMM-based framework demonstrates that dengue risk in Semarang is concentrated within localized foci of heightened vulnerability rather than uniformly distributed. The methodology is replicable in other complex tropical urban environments, thereby strengthening both academic rigor and practical public health decision-making.
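The BIC-based choice of cluster count and the probabilistic memberships described above can be illustrated with a minimal sketch. It is not the authors' implementation: it fits a one-dimensional Gaussian mixture by EM to synthetic incidence rates, whereas the study uses eighteen indicators. The data, the helper names `e_step` and `fit_gmm_1d`, the quantile-based initialisation, and the variance floor are all assumptions made for illustration.

```python
import math
import random


def e_step(x, weights, means, variances):
    """Return posterior memberships per point and the total log-likelihood."""
    resp, loglik = [], 0.0
    for v in x:
        dens = [
            w * math.exp(-0.5 * (v - m) ** 2 / s2) / math.sqrt(2 * math.pi * s2)
            for w, m, s2 in zip(weights, means, variances)
        ]
        total = sum(dens) + 1e-300  # guard against log(0)
        resp.append([d / total for d in dens])
        loglik += math.log(total)
    return resp, loglik


def fit_gmm_1d(x, k, n_iter=200, var_floor=1e-3):
    """Fit a k-component 1-D Gaussian mixture by EM and report its BIC."""
    n = len(x)
    xs = sorted(x)
    # Assumed initialisation: means at evenly spaced quantiles of the data.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(x) / n
    var_all = sum((v - grand_mean) ** 2 for v in x) / n
    variances = [var_all / (k * k)] * k
    weights = [1.0 / k] * k
    for _ in range(n_iter):
        resp = e_step(x, weights, means, variances)[0]
        for j in range(k):
            nj = max(sum(r[j] for r in resp), 1e-12)
            means[j] = sum(r[j] * v for r, v in zip(resp, x)) / nj
            var_j = sum(r[j] * (v - means[j]) ** 2 for r, v in zip(resp, x)) / nj
            variances[j] = max(var_j, var_floor)  # floor avoids collapse
            weights[j] = nj / n
    resp, loglik = e_step(x, weights, means, variances)
    n_params = 3 * k - 1  # k means, k variances, k - 1 free weights
    bic = -2.0 * loglik + n_params * math.log(n)
    return {"bic": bic, "means": means, "resp": resp}


random.seed(0)
# Synthetic stand-in for incidence rates in 177 spatial units (not study data).
rates = (
    [random.gauss(20, 3) for _ in range(59)]
    + [random.gauss(60, 5) for _ in range(59)]
    + [random.gauss(120, 8) for _ in range(59)]
)

fits = {k: fit_gmm_1d(rates, k) for k in range(1, 6)}
bic = {k: fits[k]["bic"] for k in fits}
best_k = min(bic, key=bic.get)  # lower BIC is better

best = fits[best_k]
memberships = best["resp"]  # one probability vector per spatial unit
high_risk = max(range(best_k), key=lambda j: best["means"][j])

for k in sorted(bic):
    print(f"k={k}: BIC={bic[k]:.1f}")
print("selected k:", best_k)
print("P(high-risk) for unit 0:", round(memberships[0][high_risk], 4))
```

Each row of `memberships` sums to one, so a unit near a cluster boundary keeps a graded probability of belonging to the high-risk component instead of a single hard label. A multivariate analysis would replace the scalar variances with covariance matrices and would normally rely on an established mixture-modelling library.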
Copyright (c) 2026 Nabila Izzatil Ismah, Amiq Fahmi

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Authors who publish with this journal agree to the following terms:
- Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution-ShareAlike 4.0 International License (CC BY-SA 4.0) that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
- Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
- Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work (See The Effect of Open Access).

