Comparison of the performance of linear discriminant analysis and binary logistic regression applied to risk factors for mortality in Ebola virus disease patients

  • Leader Lawanga Ontshick Department of Mathematics, Statistics and Computer Science, Faculty of Science and Technology, University of Kinshasa, Kinshasa, D.R.Congo and National Institute for Biomedical Research, Kinshasa, D.R.Congo.
  • Jean-Christophe Mulangu Sabue Department of Clinical Biology, Faculty of Medicine, University of Kinshasa, Kinshasa, D.R.Congo and National Institute for Biomedical Research, Kinshasa, D.R.Congo. https://orcid.org/0000-0002-2722-2637
  • Placide Mbala Kiangebeni Department of Clinical Biology, Faculty of Medicine, University of Kinshasa, Kinshasa, D.R.Congo and National Institute for Biomedical Research, Kinshasa, D.R.Congo.
  • Olivier Tshiani Mbayi Department of Clinical Biology, Faculty of Medicine, University of Kinshasa, Kinshasa, D.R.Congo and National Institute for Biomedical Research, Kinshasa, D.R.Congo.
  • Jean-Michel Nsengi Ntamabyaliro Department of Basic Science, Faculty of Medicine, University of Kinshasa, Kinshasa, D.R.Congo https://orcid.org/0000-0003-2929-9301
  • Jean-jacque Muyembe Tamfumu Department of Clinical Biology, Faculty of Medicine, University of Kinshasa, Kinshasa, D.R.Congo and National Institute for Biomedical Research, Kinshasa, D.R.Congo.
  • Rostin Mabela Makengo Matendo Department of Mathematics, Statistics and Computer Science, Faculty of Science and Technology, University of Kinshasa, Kinshasa, D.R.Congo
Keywords: Ebola Virus, Ebola Virus Mortality, Linear Discriminant Analysis, Logistic Regression, Risk Factor

Abstract

This study aimed to identify risk factors associated with mortality in Ebola virus disease patients using binary logistic regression and linear discriminant analysis, and to assess the predictive power of the two methods. The study was a randomized, double-blind, controlled (observational) clinical trial conducted in 2018 during the 10th Ebola outbreak in eastern DRC. It included 363 patients divided into two treatment arms: 182 patients treated with MAB114 (Ebanga) and 181 patients treated with REGENERON (REGN-EB3). Both statistical methods selected the same set of variables (risk factors). For the binary logistic regression we obtained viral load 0.58 (0.5-0.67), creatinine 1.98 (1.58-0.67) and aspartate aminotransferase 0.99 (0.9-1); for the linear discriminant analysis we obtained viral load (0.88), creatinine (0.94) and aspartate aminotransferase (0.78). The two methods also gave almost the same predicted probabilities: logistic regression predicted a mortality rate of 36.5% and linear discriminant analysis a mortality rate of 38.8%. Evaluated by the area under the ROC curve (AUC), binary logistic regression scored 0.935 and linear discriminant analysis scored 0.932. Both methods therefore identify the same risk factors (viral load (Ctnp), creatinine and alanine aminotransferase (ALT)), with an AUC of about 0.93 (93%).
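
As an illustration of how an AUC score such as those reported above can be computed, the following is a minimal sketch in plain Python. It is not the authors' code: the function name `auc`, the outcome vector and both sets of predicted probabilities are hypothetical and do not come from the trial data. The sketch uses the rank interpretation of the AUC, namely the proportion of (deceased, survivor) pairs in which the deceased patient receives the higher predicted risk, with ties counted as one half.

```python
def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    Equals the probability that a randomly chosen positive case (label 1)
    receives a higher score than a randomly chosen negative case (label 0),
    counting ties as one half.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, NOT from the study: 1 = died, 0 = survived.
outcome = [1, 1, 1, 0, 0, 0, 0, 1]
# Hypothetical predicted probabilities of death from two fitted models.
p_logit = [0.91, 0.80, 0.65, 0.10, 0.30, 0.70, 0.20, 0.55]
p_lda = [0.88, 0.75, 0.60, 0.15, 0.35, 0.50, 0.25, 0.72]

auc_logit = auc(outcome, p_logit)
auc_lda = auc(outcome, p_lda)
print(f"logistic regression AUC: {auc_logit:.3f}")
print(f"linear discriminant analysis AUC: {auc_lda:.3f}")
```

Because the AUC depends only on how the two outcome groups are ranked, two models whose predicted probabilities differ slightly can still obtain nearly identical scores, which is consistent with the close values of 0.935 and 0.932 reported for the two methods.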

Published
2023-07-29
How to Cite
[1] L. Lawanga Ontshick, “Comparison of the performance of linear discriminant analysis and binary logistic regression applied to risk factors for mortality in Ebola virus disease patients”, j.electron.electromedical.eng.med.inform, vol. 5, no. 3, pp. 205-210, Jul. 2023.
Section
Electronics