An Explainable Machine Learning Pipeline for Stroke Prediction on Imbalanced Data

Kokkotis C., Giarmatzis G., Giannakou E., Moustakidis S., Tsatalas T., Tsiptsios D., Vadikolias K., Aggelousis N.

dc.creator	Kokkotis C., Giarmatzis G., Giannakou E., Moustakidis S., Tsatalas T., Tsiptsios D., Vadikolias K., Aggelousis N.	en
dc.date.accessioned	2023-01-31T08:43:33Z
dc.date.available	2023-01-31T08:43:33Z
dc.date.issued	2022
dc.identifier	10.3390/diagnostics12102392
dc.identifier.issn	20754418
dc.identifier.uri	http://hdl.handle.net/11615/74961
dc.description.abstract	Stroke is an acute neurological dysfunction attributed to a focal injury of the central nervous system due to reduced blood flow to the brain. Nowadays, stroke is a global threat associated with premature death and huge economic consequences. Hence, there is an urgency to model the effect of several risk factors on stroke occurrence, and artificial intelligence (AI) seems to be the appropriate tool. In the present study, we aimed to (i) develop reliable machine learning (ML) prediction models for stroke disease; (ii) cope with a typical severe class imbalance problem, which is posed due to the stroke patients’ class being significantly smaller than the healthy class; and (iii) interpret the model output for understanding the decision-making mechanism. The effectiveness of the proposed ML approach was investigated in a comparative analysis with six well-known classifiers with respect to metrics that are related to both generalization capability and prediction accuracy. The best overall false-negative rate was achieved by the Multi-Layer Perceptron (MLP) classifier (18.60%). Shapley Additive Explanations (SHAP) were employed to investigate the impact of the risk factors on the prediction output. The proposed AI method could lead to the creation of advanced and effective risk stratification strategies for each stroke patient, which would allow for timely diagnosis and the right treatments. © 2022 by the authors.	en
dc.language.iso	en	en
dc.source	Diagnostics	en
dc.source.uri	https://www.scopus.com/inward/record.uri?eid=2-s2.0-85140908695&doi=10.3390%2fdiagnostics12102392&partnerID=40&md5=8501575116060cf1f15b4cbe8099682b
dc.subject	glucose	en
dc.subject	age	en
dc.subject	Article	en
dc.subject	artificial intelligence	en
dc.subject	body mass	en
dc.subject	cerebrovascular accident	en
dc.subject	classifier	en
dc.subject	comparative study	en
dc.subject	cross validation	en
dc.subject	diagnostic accuracy	en
dc.subject	diagnostic test accuracy study	en
dc.subject	false negative result	en
dc.subject	false positive result	en
dc.subject	female	en
dc.subject	glucose blood level	en
dc.subject	human	en
dc.subject	hypertension	en
dc.subject	k nearest neighbor	en
dc.subject	logistic regression analysis	en
dc.subject	machine learning	en
dc.subject	male	en
dc.subject	multilayer perceptron	en
dc.subject	predictive model	en
dc.subject	prognosis	en
dc.subject	random forest	en
dc.subject	receiver operating characteristic	en
dc.subject	risk factor	en
dc.subject	sensitivity and specificity	en
dc.subject	stroke patient	en
dc.subject	support vector machine	en
dc.subject	XGBoost	en
dc.subject	MDPI	en
dc.title	An Explainable Machine Learning Pipeline for Stroke Prediction on Imbalanced Data	en
dc.type	journalArticle	en

Files in this item

Files	Size	Format	View
There are no files associated with this item.

This item appears in the following collection(s):

Publications in journals, conferences, book chapters, etc. [19735]
