Comprehensive Evaluation of Multi-Class Classifiers in the Early Detection of Neurodegenerative Diseases through Decision Support System

shamaila iram; hafiz muhammad athar farid; farideh javid; rakesh mishra

Open Access

Research article

Shamaila Iram¹*, Hafiz Muhammad Athar Farid¹*, Farideh Javid², Rakesh Mishra³

¹ Department of Computer Science, University of Huddersfield, HD1 3DH Huddersfield, United Kingdom
² Department of Pharmacy, University of Huddersfield, HD1 3DH Huddersfield, United Kingdom
³ School of Computing and Engineering, University of Huddersfield, HD1 3DH Huddersfield, United Kingdom

International Journal of Computational Methods and Experimental Measurements | Volume 14, Issue 2, 2026 | Pages 188-206

https://doi.org/10.56578/ijcmem140202

Received: 03-23-2026, Revised: 05-07-2026, Accepted: 05-22-2026, Available online: 05-29-2026


Abstract:

The early and accurate diagnosis of neurodegenerative diseases presents a significant clinical challenge, particularly in distinguishing between conditions with overlapping symptoms. Much of the existing research has focused on binary classification, which inadequately addresses the multi-class nature of real-world differential diagnosis. This study’s objective is to conduct a comprehensive evaluation of multi-class machine learning classifiers for the early detection of neurodegenerative diseases using gait signal data. Furthermore, we propose and implement a novel decision support system to automate the selection of the most effective classifier based on defined clinical priorities. We utilised a public gait dynamics dataset from Physionet, comprising data from healthy individuals and patients diagnosed with Parkinson’s Disease, Huntington’s Disease, and Amyotrophic Lateral Sclerosis (ALS), forming a four-class classification problem. A feature set including gait signals and demographic variables such as age and body mass index was established. Eleven classifiers, categorised as density-based, linear, and non-linear, were trained and evaluated. To automate the selection of the optimal model, a decision-making framework was employed to assign weights to evaluation metrics and rank the classifiers. The classifiers demonstrated varied performance across multiple evaluation metrics. The Bayes Normal-U (UDC) classifier achieved the highest accuracy at 65.0%, with a precision of 86.4%, sensitivity of 63.0%, and specificity of 70.0%. The Bayes Normal-L (LDC) classifier yielded an accuracy of 62.5%, with 85.7% precision, 60.0% sensitivity, and 70.0% specificity. The implemented decision support system ranked the UDC classifier as the optimal choice. 
Notably, the system ranked Fisher’s classifier third, ahead of others with higher accuracy, by prioritising its superior sensitivity (57.5%) and lower Type II error rate, which are critical for reducing missed diagnoses in a clinical setting. Simple accuracy is an insufficient metric for evaluating classifiers in complex, multi-class medical diagnostic scenarios. Our proposed decision support framework provides a robust and automated methodology for selecting the most clinically relevant classifier by systematically balancing multiple performance indicators. This approach enhances the transparency and reliability of machine learning in clinical decision-making and contributes to the development of more effective, deployable diagnostic tools for neurodegenerative diseases.

Keywords: Classification, Feature extraction, Gait, Automated evaluation, Decision support tool, Neurodegenerative diseases

1. Introduction

Cognitive impairment is one of the most important symptoms of dementia and is often linked with agitated behaviour in people living with the disease. Agitation can take different forms: physical aggression, agitated movements, or verbally agitated behaviour. It can be influenced by the surrounding environment and other circumstances, and it may go undetected in its early phases. Without timely intervention from the caregiver (CG), symptoms can escalate rapidly [1]. The significant effect of agitation on daily routines is one of the primary reasons these patients are admitted to nursing homes [2]. Given the growing proportion of elderly people in the population, and the fact that advancing age is the most significant risk factor for dementia, it is critical that people living with dementia can continue to participate in communal life for as long as possible. It is therefore vital to lessen the stress of CG management at home while raising the level of CG self-efficacy.

Dementia and cognitive decline are associated with considerable gait and postural deficits [3], which are in turn connected with poor health and mortality. These impairments include slow gait speed, greater gait variability, difficulty with dual-tasking, and increased postural sway. People with Mild Cognitive Impairment (MCI) may exhibit motor dysfunction, including abnormalities in walking and balance, in addition to cognitive problems. Motor function assessment may therefore make it possible to detect the motor abnormalities associated with MCI, which could in turn enable earlier diagnosis of, and intervention in, dementia [4].

Although studies suggest a strong relationship between cognitive impairment and gait disorder, the chronological development of gait problems induced by cognitive dysfunction, in the context of the progression of dementia and its therapeutic application, has been inadequately investigated [5].

Cerebrospinal fluid analysis (CFA), biomarker detection, and neuroimaging are considered essential tools in traditional healthcare for a thorough diagnosis [6], [7], [8]. However, obtaining such information frequently depends on costly equipment, and the analysis is often laborious [9]. Furthermore, the final diagnosis depends heavily on the proficiency and knowledge of physicians. Hence, acquiring information about patients' medical conditions without undue financial strain, and formulating an appropriate analytical approach, have become prominent areas of research [10]. The literature shows a relationship between human gait and cognitive capacity, and gait could be used as a clinical marker for the transition from MCI to dementia. Gait signals can therefore serve as a reliable indicator for distinguishing movement problems that result from dysfunction in particular areas of the brain.

Precise and timely identification of neurological diseases is essential for prescribing suitable treatment, and non-invasive techniques would be highly beneficial in attaining this objective. Streamlining disease identification helps physicians minimise diagnostic cost and time, which is vital for swiftly treating progressive illnesses. Recognising movement problems from their effect on patient motion can be extremely beneficial, especially when more sophisticated testing is necessary [11]. This work evaluates several classifiers using variables derived from the gait signals of healthy individuals and of patients diagnosed with one of three movement disorders. These movement disorders are linked with three neurological diseases: Parkinson's disease, Amyotrophic Lateral Sclerosis (ALS), and Huntington's disease. These conditions were selected because of their common neurodegenerative aetiology, which can make differential diagnosis more challenging. More detail about these diseases is provided in the next section.

The process of decision-making is crucial to our daily lives and holds significance in all areas. The decision-making process is a systematic approach that allows specialists to address difficulties by carefully assessing information, exploring alternative options, and eventually drawing well-informed conclusions. Moreover, this clearly defined approach offers an opportunity to evaluate the effectiveness of the final decision, thereby ensuring its alignment with the anticipated outcome [12].

Every day, healthcare professionals must make carefully thought-out decisions that have substantial consequences, impacting not only patients but also communities, nations, and the global population. Healthcare professionals may sometimes face the task of making decisions with limited information, resources, and knowledge. However, it is expected that they approach this task with careful attention to detail and accuracy [13].

This study also addresses the challenge of selecting the optimal classifier by evaluating performance metrics and errors together. These criteria pull in opposite directions: performance metrics should be maximised, whereas errors should be minimised. The selection problem is therefore addressed using multi-criteria decision-making (MCDM) models, which rate the available alternatives against preset criteria to determine the best decision in complex real-world situations.

We employed two methodologies for this purpose. The weights for the criteria (performance metrics and errors) were determined using the “Logarithmic Percentage Change-driven Objective Weighting (LOPCOW) method”. The “Evaluation based on Relative Utility and Nonlinear Standardisation (ERUNS)” method was employed to identify the best classifier. The LOPCOW approach, developed by Ecer and Pamucar [14], calculates the mean square value of a series as a percentage of its standard deviation. This calculation eliminates the difference (gap) caused by variations in the size of the data. LOPCOW is highly beneficial and favoured by numerous authors across diverse fields such as sociology, business, engineering, and scientific research [15], [16], [17]. It offers several advantages, including the ability to address the highly imbalanced distribution of criteria weights that impact the accurate ranking positions, handle negative and zero values in the initial performance or decision matrix, and accommodate a large number of criteria and alternatives. Biswas et al. [18] introduced the ERUNS approach, which relies on the standardisation of the initial decision matrix. The choice of normalisation strategy often significantly impacts the final outcome. Normalisation often proves inadequate in accurately capturing the full range of performance values when dealing with a substantial number of criteria and alternatives. The ERUNS approach differentiates between alternatives by carefully choosing the standardisation interval and considering many parameters. In addition, it provides a unique interpretation of utility degree, which is applied using a softmax function and user-centric non-linear interval standardisation. The solution provided is both stable and reliable and does not exhibit the rank reversal problem [18]. 
Figure 1 presents a flowchart of the step-by-step process for the application of the decision support tool to select the most suitable classifiers for the early detection of neurodegenerative diseases.

Figure 1. Flow chart of research methodology
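The LOPCOW weighting step described above can be illustrated with a minimal sketch. It follows the method as it is commonly stated (min-max normalisation, then the logarithm of the root-mean-square of each criterion expressed relative to its standard deviation); the function name `lopcow_weights`, the use of the population standard deviation, and the toy decision matrix are assumptions for illustration and are not taken from this study.

```python
import numpy as np


def lopcow_weights(decision_matrix, is_benefit):
    """Return LOPCOW-style objective weights, one per criterion (column).

    decision_matrix: rows are alternatives (classifiers), columns are criteria.
    is_benefit: True where larger is better, False for error-type criteria.
    """
    x = np.asarray(decision_matrix, dtype=float)
    is_benefit = np.asarray(is_benefit, dtype=bool)

    col_min = x.min(axis=0)
    col_max = x.max(axis=0)
    span = col_max - col_min
    if np.any(span == 0):
        raise ValueError("every criterion must vary across alternatives")

    # Min-max normalisation; cost criteria are flipped so that 1 is always best.
    r = np.where(is_benefit, (x - col_min) / span, (col_max - x) / span)

    # Root-mean-square of each column relative to its standard deviation.
    rms = np.sqrt(np.mean(r ** 2, axis=0))
    sigma = r.std(axis=0)  # population standard deviation (assumption)

    percentage_values = np.abs(np.log(rms / sigma) * 100.0)
    return percentage_values / percentage_values.sum()


# Hypothetical scores for four classifiers: accuracy, sensitivity, Type I error.
toy_matrix = [
    [0.70, 0.60, 0.30],
    [0.65, 0.66, 0.35],
    [0.60, 0.55, 0.28],
    [0.58, 0.50, 0.40],
]
toy_weights = lopcow_weights(toy_matrix, is_benefit=[True, True, False])
```

The weights sum to one by construction, so they can be passed directly to a ranking step such as ERUNS, which is not sketched here.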

1.1 Research Challenges

Most literature focuses on applying pattern recognition techniques to binary (2-class) datasets, where it is relatively straightforward to classify data and evaluate performance using standard metrics like accuracy. However, the complexity increases significantly when dealing with multi-class datasets. In such cases, it becomes challenging to apply pattern recognition or machine learning algorithms effectively, and a comprehensive set of evaluation metrics is required to accurately assess the performance of the proposed techniques.

In this research, we aim to utilise a multi-class dataset comprising four distinct classes. We identify the following critical challenges in the evaluation of the performance of our proposed techniques for the detection of different kinds of neurodegenerative diseases using gait signals.

1. Which set of evaluation techniques provides a balanced consideration of all relevant metrics, such as accuracy, precision, and sensitivity, for understanding the performance of the selected classifiers?

2. What criterion determines which classifier outperforms the others when different classifiers excel on different evaluation metrics?

3. What criterion allows the selection of the most suitable evaluation technique to be automated, without visually exploring and comparing the results, so that treatment can be provided in time?
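To make the first challenge concrete, the sketch below shows how accuracy, precision, sensitivity, specificity, and the Type I and Type II error rates can be derived per class from a four-class confusion matrix using a one-vs-rest view. The function name `per_class_metrics`, the row/column convention, and the toy matrix are assumptions; the text does not state how this study aggregates per-class values into a single figure.

```python
import numpy as np


def _safe_divide(numerator, denominator):
    """Element-wise division that returns 0 where the denominator is 0."""
    out = np.zeros_like(numerator, dtype=float)
    return np.divide(numerator, denominator, out=out, where=denominator > 0)


def per_class_metrics(confusion):
    """One-vs-rest metrics from a square confusion matrix.

    Convention (assumption): rows are true classes, columns are predictions.
    """
    cm = np.asarray(confusion, dtype=float)
    total = cm.sum()
    tp = np.diag(cm)
    fn = cm.sum(axis=1) - tp  # true class i, predicted as something else
    fp = cm.sum(axis=0) - tp  # predicted class i, actually something else
    tn = total - tp - fn - fp

    sensitivity = _safe_divide(tp, tp + fn)
    specificity = _safe_divide(tn, tn + fp)
    return {
        "accuracy": tp.sum() / total,
        "precision": _safe_divide(tp, tp + fp),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "type_1_error": 1.0 - specificity,  # false positive rate
        "type_2_error": 1.0 - sensitivity,  # false negative rate
    }


# Hypothetical confusion matrix for four classes with ten subjects each.
toy_confusion = [
    [8, 1, 1, 0],
    [2, 6, 1, 1],
    [0, 2, 7, 1],
    [1, 1, 0, 8],
]
toy_metrics = per_class_metrics(toy_confusion)
```

A single accuracy value hides the per-class differences that this dictionary exposes, which is why several metrics are needed to compare classifiers on a multi-class problem.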

1.2 Contributions of the Paper

This research contributes a novel decision-making tool that helps to choose the most effective ML algorithm for pattern recognition in dementia. Its notable contributions are as follows:

1. This study identifies all significant features that influence the progression of neurodegenerative diseases (NDDs). Key factors include age, gender, weight, BMI, and the precise severity level of the disease. These critical aspects are frequently overlooked in the existing literature, and our research aims to fill this gap.

2. Due to the multiclass nature of the datasets, this research utilised a comprehensive set of evaluation metrics to compare the performance of eleven different classifiers.

3. The difficulty of determining the most effective classifier is examined through an assessment of performance metrics and errors, recognising that different classifiers may excel on different metrics.

1.3 Organization of the Paper

The subsequent sections of this article are organised as follows: Section 2 explores various NDDs, their progression stages, and associated symptoms. Section 3 discusses the application of machine learning algorithms for early detection of NDDs and highlights existing gaps. Section 4 delves into feature selection and classification algorithms used in this study. Section 5 presents the results of the evaluation metrics used to assess classifier performance. Section 6 details the application of decision-making algorithms for selecting the best-performing classifier. Finally, Section 7 offers concluding remarks and future recommendations.

2. Neurodegenerative Diseases

NDDs are a range of medical conditions that predominantly affect the neurones in the brain. Parkinson’s, Alzheimer’s, Huntington’s, and ALS are classified within this category. People affected by these disorders usually experience a gradual deterioration in cognitive function, which leads to symptoms such as problems with walking, difficulties in speaking, and loss of memory. As life expectancy increases, the prevalence of neurological illnesses in developed countries is growing, which places a substantial financial burden on healthcare systems. In 2005, the cost of treating the 29.3 million people affected by dementia was around USD 315 billion [19]. The global cost of treating 34 million individuals with dementia was estimated at USD 422 billion in 2009 [20]. As the disease advances, patients may exhibit behaviour resembling that of a child due to the degenerative effects on neurone development. This reversal of neuronal processes leads to functional impairments, changes in behaviour, disabilities, and cognitive or neurological disorders.

The subsequent sections provide a concise overview of each of these diseases.

Alzheimer’s Disease: Alzheimer’s disease is the most significant and rapidly increasing neurological condition affecting older individuals [21]. According to a recent survey, Alzheimer’s disease has gained equal importance in healthcare alongside leukaemia and cardio-related diseases, which have traditionally been considered of high priority [22]. This condition has been demonstrated to be both critical and intractable. The precise aetiology is uncertain, with no substantiated proof to indicate if the sickness or the accumulation of proteins is the primary causative factor.

The neurological manifestations linked to these symptoms involve the development of “tangles” and “plaque” composed of a detrimental protein known as Amyloid Beta (A$\beta$). The entorhinal cortex and hippocampus are where pathological neurofibrillary tangles gradually build up. These are the regions of the brain responsible for both immediate and enduring memory in individuals [23]. Neuroscientists have shown that maintaining communication between these two brain regions is crucial for preserving memory. Any disruption in this connection can interrupt the circuit and result in memory disturbance, ultimately leading to memory loss [24].

Parkinson’s Disease: First documented by Parkinson in 1817 [25], Parkinson’s disease is a common neurological disorder. Its prevalence is more than 2% in adults over 65 years old, with an incidence of approximately 5–20 cases per 100,000 individuals per year, which indicates a strong association between the disease and the ageing process [26]. According to a health economic assessment from the UK, treating one person costs £5993, a significant burden that is expected to increase in the future [27].

Parkinson’s disease is defined by the degeneration of dopaminergic nerve cells in the substantia nigra, a region of the brain that produces dopamine. Dopamine is a neurotransmitter that regulates movement in various parts of the body. The degenerative process begins at the base of the brain with the deterioration of the olfactory bulbs, then spreads to the lower brain stem, and later affects the substantia nigra and midbrain [28]. Ultimately, the limbic system and frontal neocortex are affected, leading to the emergence of cognitive and psychiatric disorders.

Huntington’s Disease: First described in 1872 by George Huntington, Huntington’s disease is a severe degenerative neuropsychiatric illness [29]. According to one report, the overall occurrence rate of Huntington’s disease in Caucasian populations is 8 cases per 10,000 individuals [30]. Thus far, no prophylactic interventions have been identified for this lethal disease.

The Huntington’s gene contains a polyglutamine (PolyQ) region, which normally consists of 11–34 repeating glutamine units. This gene encodes a cytoplasmic protein known as huntingtin. Huntington’s disease is caused by the production of a mutant huntingtin protein when the PolyQ region contains additional glutamine repeats. The condition is an untreatable hyperkinetic motor disorder whose main manifestations are involuntary, unsteady movements known as chorea [31].

ALS: Sometimes referred to as Lou Gehrig’s disease, ALS is a neurological disorder that causes degeneration of both the lower motor neurones (LMN) and upper motor neurones (UMN). The disease is more prevalent in males and is primarily observed in adults aged 40–70 years [32]. Whether it occurs sporadically or in familial forms, it has an estimated incidence of 0.4 to 1.8 per 100,000 people, which is very consistent worldwide [33].

This concise introduction highlights that the various neurological disorders lead to atrophy in distinct regions of the brain. In particular, Alzheimer’s disease leads to deterioration of the cortex and hippocampus, Huntington’s disease damages the caudate, Parkinson’s disease affects the substantia nigra, and ALS damages the lower motor and pyramidal neurones, resulting in a substantial impairment of bodily movement.

2.1 Gait Abnormality Relation with Neurodegenerative Diseases

“Gait” refers to the cyclic movement of the feet, in which one foot strikes the ground alternately with the other. The measurements collected as the feet move from stride to stride are known as gait signals. Hausdorff et al. [34] proposed that studying the connection between the decline of motor neurons and the disruption of stride-to-stride dynamics can aid in monitoring the progression of neurodegenerative illnesses and evaluating prospective treatment strategies. The gait cycle duration, commonly known as stride time, varies in a complex manner from one stride to the next. In control individuals, however, the amplitude of these stride-to-stride fluctuations is quite low (2%) because neural regulation is intact.
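One common way to express the amplitude of stride-to-stride fluctuations as a percentage is the coefficient of variation of stride time. The sketch below assumes that measure; whether it is the exact quantity behind the 2% figure quoted above is an assumption, and the function name `stride_time_cv` and the sample values are hypothetical.

```python
import numpy as np


def stride_time_cv(stride_times_s):
    """Coefficient of variation of stride time, in percent.

    stride_times_s: sequence of consecutive stride durations in seconds.
    """
    t = np.asarray(stride_times_s, dtype=float)
    if t.size < 2:
        raise ValueError("at least two strides are needed")
    # Sample standard deviation (ddof=1) relative to the mean stride time.
    return 100.0 * t.std(ddof=1) / t.mean()


# Hypothetical stride times for a short walk, in seconds.
example_cv = stride_time_cv([1.00, 1.10, 0.90, 1.00])
```

A larger value indicates less regular stride timing, so the same function can be applied to control and patient recordings to compare variability.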

A disruption in walking patterns can signal a disruption in cognitive abilities. Scherder et al. [35] introduced the term “Last-in-First-out” to describe the susceptibility to neurodegeneration of brain circuits that mature later in the developmental life cycle. This notion aids in the early prediction of many forms of dementia (neurodegenerative disorders). Zhu and Henry [36] noted that a correct gait pattern requires input not only from the parts of the nervous system responsible for sensation and movement, but also from cerebral processes such as cognition, decision-making, and spatial perception. Current research focuses on more complex walking difficulties that are strongly linked to disrupted connections between different areas of the brain, such as between the frontal and parietal lobes and between the frontal lobes and basal ganglia [37]. Disruption of cognitive function correlates directly with more severe, higher-level gait problems, and it is a primary indication of brain disease.

E-Health systems have been advancing rapidly to enhance the identification and treatment of diseases through disease management or integrated care techniques. The adoption of information technology can enhance physicians’ decision-making during diagnosis and treatment. Regrettably, a definitive diagnosis of NDDs may only be made after the patient’s death, by direct examination of the affected brain tissue [38]. The signs of illness are only apparent in the final stage, namely when gait abnormality occurs. At this point no cure is effective, which leaves the patient in a distressing situation.

3. Related Work and Research Gaps

Pattern recognition is currently receiving significant interest in the medical field because of its demonstrated superiority over traditional clinical statistical methods [39] in predicting clinical outcomes. For example, Lin [40] developed a framework combining regression trees and classification for liver disease, while Lee et al. [41] constructed a strategy based on feature selection and classification to detect lung nodules. Shi et al. [42] deployed artificial neural networks (ANNs) on electromyography (EMG) signals to classify term and preterm labour in rats with a 100% success rate. Similarly, Long et al. [43] used structural images and functional magnetic resonance imaging (fMRI) as features to classify Parkinson’s disease with an SVM classifier.

Within supervised machine learning, data patterns can be recognised using template matching, deep neural networks, and mathematical techniques [44]. A fundamental limitation of template matching is its inability to identify patterns that come from classes with significant variances between them. Given the sequential nature of our data, it is possible to include new datasets into the existing training sets by placing them between the classes. Neural networks exhibit “black box” behaviour, characterised by highly intricate nonlinear interactions between inputs and outputs, which makes the data difficult to grasp visually. Our objective is to identify any irregularities in the patterns of gait and electroencephalography (EEG) data that should closely correspond to the actual physical measures. In statistical analysis, by contrast, every data pattern is represented as a single point in a multi-dimensional space with separate regions for each class. Moreover, this technique preserves the physical interpretation of the features.

Researchers have suggested many approaches to identify neurodegenerative illnesses at an early stage, including assessing cognitive decline, analysing biomarkers, and detecting the presence of metabolites or genes [45]. Recently, early detection approaches such as neuroimaging and genetic analysis have become prevalent in identifying serious illnesses, including neurological conditions, tumours, and cystic fibrosis [46]. The Mini-Mental State Examination (MMSE) and the measurement of symptoms are widely used tools for diagnosing neurodegenerative illnesses [47]. In our previous work, we applied neural synchrony measurement techniques to the early detection of neurodegenerative diseases [48]. By comparing the right and left parts of the brain, Iram et al. [49] were able to differentiate between patients with moderate Alzheimer’s disease and healthy controls using phase synchrony, magnitude-squared coherence, and cross-correlation techniques.

However, computer algorithms and visualisation methods are thought to be essential in supporting early identification [50]. One example is the Common Spatial Patterns (CSP) method put forth by Woon et al. [45], which is considered a suitable technique for researching Alzheimer’s disease. CSP incorporates significant aspects of class labelling and dimensionality reduction, and it belongs to the class of techniques called Blind Source Separation (BSS).

Other algorithms, such as differentiality-based categorisation methods, have proven particularly helpful for the analysis of medical datasets. Linear and non-linear classification techniques, for instance, have been used extensively in seismic signal classification. Although several prototypes are available, the results demonstrate that the Nearest Neighbours classification algorithm outperforms density-orientated classifiers.

Although these methods have clear advantages, existing medical data classification systems are still inconsistent in the way they expose important hidden information, particularly when data from real-world application domains are used. The primary drawback of the methods discussed is that they take only a few classifiers into account. Moreover, many of them neglect pertinent and crucial details, including gender and age, which might have a big influence on outcomes. Furthermore, the performance evaluation may be more affected by other variables than by the collection of variables that determine total accuracy [51]. In our previous work [52], we also applied various machine learning algorithms to the diagnosis and detection of neurodegenerative diseases using multi-class datasets. However, further research is required to understand the criteria for selecting suitable evaluation techniques (a) when the datasets are multi-class and (b) when a large number of classifiers are used in the experiments.

4. Materials and Methods

4.1 Data Description

We gathered gait signals for healthy individuals and for patients with neurodegenerative diseases from Physionet [53]. The data used in this study were accessed for research purposes between 12/01/2025 and 26/05/2025. The authors did not have access to information that could identify individual participants during or after data collection. The dataset comprises 16 control subjects (CO; 14 females and 2 males), 20 Huntington’s patients (14 females and 6 males), 13 ALS patients (3 females and 10 males), and 15 Parkinson’s patients (5 females and 10 males) who have movement issues and are nearing the end of their illness. Stride signals were gathered for the left and right foot, together with the time, which is measured in milliseconds. A total duration of 10 seconds was set at the beginning. Each participant’s shoes were fitted with force-sensitive insoles to measure the temporal aspects of their stride. Force-sensitive resistors gathered the raw data, and their output was approximately proportional to the force applied under the foot during each cycle. These signals were used to obtain precise measurements of the timing of foot contact from one stride to the next. The data were sampled at 300 Hz using an analogue-to-digital converter and stored on an ankle-worn recorder. The Hoehn and Yahr score (1.5 $\leq$ Severity $\leq$ 4) is used to determine the degree of Parkinson’s disease in the participants; a higher score denotes more advanced disease. The Functional Capacity Measure (1 $\leq$ Severity $\leq$ 12) is used to determine the severity of Huntington’s disease in the participants; a lower score indicates greater progressive functional disability. For the participants with ALS, the severity value is the time since the onset of the disease (1 $\leq$ Severity $\leq$ 54); an arbitrary “0” is used as a placeholder for the healthy subjects.
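As an illustration of how foot-contact timing can be recovered from a force signal sampled at 300 Hz, the sketch below detects rising threshold crossings and converts the gaps between them into stride intervals. The threshold value, the function name `stride_intervals`, and the synthetic square-wave signal are assumptions; the procedure actually used to produce the dataset is not described in this text.

```python
import numpy as np

SAMPLING_RATE_HZ = 300.0  # sampling rate stated in the data description


def stride_intervals(force, threshold, fs=SAMPLING_RATE_HZ):
    """Return stride intervals in seconds from one foot's force signal.

    A foot strike is taken to be a sample above `threshold` that directly
    follows a sample at or below it (a rising edge). The threshold is a
    hypothetical, signal-dependent parameter.
    """
    f = np.asarray(force, dtype=float)
    above = f > threshold
    strike_indices = np.flatnonzero(~above[:-1] & above[1:]) + 1
    return np.diff(strike_indices) / fs


# Synthetic signal: the foot is loaded for 180 of every 300 samples,
# which corresponds to one stride per second at 300 Hz.
synthetic_force = (np.arange(1500) % 300 < 180).astype(float)
synthetic_intervals = stride_intervals(synthetic_force, threshold=0.5)
```

Real recordings are noisy, so a practical version would usually smooth the signal or enforce a minimum gap between strikes before taking differences.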

4.2 Visual Analysis of Gait Signals

Prior to classifying the features, we evaluated the gait signals to determine whether people with neurological disorders have a particular walking pattern. Ageing-related neurophysiological alterations affect the locomotor system’s capacity to produce correlations between stride intervals. We examined the gait signals of control subjects and of individuals with neurodegenerative disorders in order to validate the idea that gait patterns correlate with cerebral activities that change with age. The stride signals of the left and right foot showed that the duration of the gait cycle varies in a complex way from one stride to the next. For instance, Figure 2 shows no signs of variability in the gait of a healthy person, whereas Figure 3 displays “noisy” changes in the stride signals that exhibit some fractal features. However, normal ageing can also cause these fluctuations in movement signals, indicating neurological changes in the brains of healthy individuals as well, just not as a result of degeneration.

Figure 2. Visual analysis of gait signals for healthy person

Figure 3. Visual analysis of gait signals for neurodegenerative diseases (NDDs) patient

4.3 Features Selection

The feature set needed for an appropriate diagnosis of neurodegenerative illnesses is provided by the dataset that contains the eight features mentioned in Table 1. More precisely, this dataset is used to choose a classifier, train, test, and then assess the outcome to see if the right categorisation was made. The computation increases in direct proportion to the number of features in the dataset that are taken into account. Figure 4 illustrates the intricacy of classifying gait patterns for individual subjects with a scatter plot that uses just three characteristics. In this case, Feature 3 denotes “age”, which is thought to be a significant component in the course of the disease, whereas Features 1 and 2 are linked to “right and left foot movement signals”.

Table 1. Selection of features for classification in all subjects

Features	Healthy Subjects	Parkinson	Huntington	Amyotrophic Lateral Sclerosis (ALS)
Right feet signals (motion vector)	(-0.6739–0.5411)	(-0.9942–0.3693)	(-0.5576–0.6634)	(-0.9958–0.1645)
Left feet signals (motion vector)	(-0.5421–0.2069)	(-0.8065–0.78)	(-0.1.750–0.5834)	(-1.1159–0.2155)
Age (years)	20–74	44–80	29–71	36–70
Height (m)	1.67–1.94	1.67–2.13	1.57–2	1.57–1.88
Weight (Kg)	50–95	43–100	45–102	40.82–117.5
Time (s)	10	10	10	10
Walking speed (m/s)	0.91–1.54	0.5–1.33	0.56–1.82	0.77–1.302
BMI (kg/m$^2$)	14.9–25.2	14.5–26.6	16.2–32.2	16.6–37.1

Figure 4. Three-dimensional scatter plot to understand the complexity of multiclass dataset

Before classification, each class is given a unique label. Healthy subjects are assigned to “Class 1”, and the three disease groups are assigned to Classes 2, 3, and 4.

4.4 Classification Algorithms

We employed PRTools, a widely used MATLAB toolbox that provides an extensive collection of functions for pattern recognition. The toolbox includes a diverse range of established classification techniques, which makes it well suited to this task. The following sets of classifiers were used to classify the feature sets of all subjects.

4.4.1 Density-based classifiers

A density-based classifier is a machine learning technique that classifies data points according to the estimated probability density of each class. These classifiers first model the distribution of data points within each class and then assign a new data point to the class with the highest density at that point. Because they can represent the underlying distribution of the data more precisely, density-based classifiers are especially useful for handling complicated, non-linear decision boundaries. This makes them valuable in applications where the assumption of linearity does not hold, such as anomaly detection, image segmentation, and bioinformatics. The classifiers considered in this research work are shown in Figure 5.

Figure 5. Set of eleven classifiers
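The density-based idea described above can be sketched in a few lines. The example below is a minimal illustration, not the PRTools implementation used in this study: it assumes a normal density with uncorrelated features for each class, and the function names (`fit_diagonal_gaussian`, `log_density`, `predict`) and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def fit_diagonal_gaussian(samples, labels, var_floor=1e-6):
    """Estimate the prior, feature means and feature variances of each class."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    model = {}
    for label, rows in grouped.items():
        n = len(rows)
        dims = len(rows[0])
        means = [sum(r[d] for r in rows) / n for d in range(dims)]
        variances = [
            max(sum((r[d] - means[d]) ** 2 for r in rows) / n, var_floor)
            for d in range(dims)
        ]
        model[label] = (n / len(samples), means, variances)
    return model


def log_density(x, prior, means, variances):
    """Log of the prior times a product of independent 1-D normal densities."""
    total = math.log(prior)
    for value, mu, var in zip(x, means, variances):
        total += -0.5 * math.log(2 * math.pi * var) - (value - mu) ** 2 / (2 * var)
    return total


def predict(model, x):
    """Assign x to the class with the highest estimated log density."""
    return max(model, key=lambda label: log_density(x, *model[label]))


# Hypothetical toy data: (right-foot signal, left-foot signal, age in years).
toy_samples = [
    (0.10, 0.05, 25), (0.15, 0.00, 30), (0.05, 0.10, 35),
    (-0.60, -0.50, 65), (-0.70, -0.40, 70), (-0.55, -0.45, 60),
]
toy_labels = [1, 1, 1, 2, 2, 2]  # 1 = healthy, 2 = one disease class

toy_model = fit_diagonal_gaussian(toy_samples, toy_labels)
pred_healthy = predict(toy_model, (0.08, 0.04, 28))
pred_disease = predict(toy_model, (-0.62, -0.48, 66))
print(pred_healthy, pred_disease)
```

The variance floor is an assumption added to keep the sketch numerically safe when a feature is constant within a class.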

4.4.2 Linear classifiers

In a high-dimensional space, linear classifiers operate by identifying the hyperplane that best divides the data points of the various classes. The main objective is to find a linear decision boundary that can reliably differentiate between the classes based on their attributes. Because of their ease of use, efficiency, and interpretability, these models are highly regarded in a variety of domains, including spam detection, image recognition, and medical diagnosis. Even though they are simple, linear classifiers can be very powerful, particularly when there is a roughly linear relationship between the target variable and the data. The four (4) linear classifiers used are listed in Figure 5.

4.4.3 Non-linear classifiers

Non-linear classifiers are types of machine learning algorithms capable of capturing intricate relationships between features and target variables. Unlike linear classifiers, which presuppose a linear association between features and targets, non-linear classifiers possess the ability to model complex patterns and decision boundaries within the data. Non-linear classifiers deal with data that cannot be divided by a linear decision boundary or a straight line. These classifiers can use a variety of strategies, including neural networks, polynomial functions, and kernel methods, to describe intricate correlations between input characteristics and output classes.

Figure 5 presents a set of eleven (11) classifiers that are used in this research work.

5. Performance Evaluation Metrics and Results

The output of a classifier depends on a parameter called the decision threshold $t$ ($0 \leq t \leq 1$), which determines the final class membership of an item being classified [54]. An object is assigned to the class whose probabilistic likelihood meets this threshold. When dealing with multiclass or unbalanced datasets, this threshold may change. In this research work, we present two distinct types of indicators to illustrate and subsequently compare the evaluation results:

• A statistical analysis that assesses the outcomes using the F-Measure, Precision, Recall/Sensitivity, Specificity, and classification accuracy (Confusion Matrix).

• Type I and Type II Errors, which are used to detect the false positives and false negatives, respectively. A Type I error involves rejecting a true null hypothesis, whereas a Type II error involves failing to reject a false one.

Confusion Matrix: A confusion matrix, which displays the numbers of true positive (TP), true negative (TN), false positive (FP), and false negative (FN) predictions, is an essential tool for evaluating a classification model’s performance. This matrix aids in understanding the accuracy, precision, recall, and other critical metrics of the model. The classification accuracy derived from it is given below:

$\text{Accuracy (Confusion Matrix)}=\frac{TP+TN}{TP+FP+TN+FN}$

(1)

Precision: Precision is determined by the true positives and by the objects that are incorrectly classified as positive (false positives).

$\text { Precision }=\frac{T P}{T P+F P}$

(2)

Recall/Sensitivity and Specificity: Recall, also called sensitivity, is a function of the positive objects that are correctly classified (true positives) and those that are erroneously classified as negative (false negatives). Specificity describes the results in terms of the real negative items that are correctly identified.

Precision and recall are related to one another. Recall is the percentage of the output connected to the search item that is successfully retrieved, whereas precision is the percentage of the retrieved output that is relevant to the original query. The formulas for recall and specificity are given below:

$\text { Recall/Sensitivity }=\frac{T P}{T P+F N}$

(3)

$\text { Specificity }=\frac{T N}{T N+F P}$

(4)

F-Measure: Another popular assessment metric, which combines recall and precision into a single number, is the F-Measure. The formula is:

$\text{F-Measure}=\frac{2 \times \text{Recall} \times \text{Precision}}{\text{Recall}+\text{Precision}}$

(5)

Error Types: Two types of errors, Type I and Type II, are used to quantify the misclassifications. As defined in Eqs. (6) and (7), the Type I error equals $1-\text{Recall}$ and the Type II error equals $1-\text{Specificity}$.

$\text { Type I Error }=\frac{F N}{T P+F N}$

(6)

$\text { Type II Error }=\frac{F P}{T N+F P}$

(7)
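Eqs. (1) to (7) can be evaluated together from the four cells of a confusion matrix. The sketch below is a minimal illustration: the function name `evaluation_metrics` and the counts passed to it are hypothetical and are not the confusion matrix reported by the authors.

```python
def evaluation_metrics(tp, tn, fp, fn):
    """Evaluate Eqs. (1)-(7) from the four confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)                  # Eq. (1)
    precision = tp / (tp + fp)                                  # Eq. (2)
    recall = tp / (tp + fn)                                     # Eq. (3)
    specificity = tn / (tn + fp)                                # Eq. (4)
    f_measure = 2 * recall * precision / (recall + precision)   # Eq. (5)
    type_i_error = fn / (tp + fn)                               # Eq. (6)
    type_ii_error = fp / (tn + fp)                              # Eq. (7)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f_measure": f_measure,
        "type_i_error": type_i_error,
        "type_ii_error": type_ii_error,
    }


# Hypothetical counts for a test set of 40 subjects.
metrics = evaluation_metrics(tp=19, tn=7, fp=3, fn=11)
for name, value in metrics.items():
    print(f"{name}: {value:.2%}")
```

With these definitions, the Type I error and recall always sum to one, as do the Type II error and specificity, so the two error rates carry the same information as sensitivity and specificity.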

Figure 6 displays the accuracy results of the eleven classifiers utilised in this research.

Figure 6. Accuracy results of eleven classifiers

5.1 Results Discussion and Challenges

As previously stated, eleven classifiers from various classification categories were examined. Five of the eleven produced comparatively better results than the rest: the Parzen classifier (non-linear), Fisher’s classifier (linear), and three Bayes Normal classifiers, namely Bayes Normal-U (UDC), Bayes Normal-L (LDC), and the Quadratic Discriminant Classifier (QDC). The results of these five classifiers, calculated using a comprehensive set of evaluation metrics, are presented in Table 2. Surprisingly, the linear classifiers were unable to produce better outcomes. There are two possible reasons: (1) linear classification is not a good choice for multiclass datasets; (2) there are several inter-subject variances in our datasets. In the cases of LOGLC, FISHERC, NMC, and POLYC, this leads to redundant class probabilities inside the attribute space, ultimately resulting in a higher risk of misinterpretation.

Table 2. Evaluation results of the five best-performing classifiers

Metrics	Bayes Normal-U (UDC)	Bayes Normal-L (LDC)	Parzen Classifier	Quadratic Discriminant Classifier (QDC)	Fisher’s Classifier
Confusion matrix	65.00%	62.50%	60.00%	57.50%	57.50%
Precision	86.36%	85.71%	85.00%	72.41%	80.95%
Recall/sensitivity	63.33%	60.00%	56.67%	70.00%	56.67%
Specificity	70.00%	70.00%	40.00%	20.00%	60.00%
F-Measure	72.72%	70.34%	67.52%	70.42%	66.67%
Type I Error	36.67%	40.00%	43.33%	30.00%	43.33%
Type II Error	30.00%	30.00%	60.00%	80.00%	40.00%

Because our dataset includes multiple classes, it is exceedingly challenging to compare the results with earlier research, the majority of which focused on binary class datasets. We do, however, explain our findings from several angles. The UDC, LDC, Parzen, QDC, and Fisher’s classifiers have accuracy rates of 65%, 62.5%, 60%, 57.5%, and 57.5%, respectively, as determined by the confusion matrix.

It is also noticed that although the QDC and Fisher’s classifiers have the same accuracy, their results on the other evaluation measures, namely precision, sensitivity, specificity, F-measure, and both types of errors, are quite different.

Following are the main challenges that we have encountered in the evaluation of multi-class data classification:

Challenge 1: Because there are three distinct neurodegenerative diseases, it is difficult to determine true positive and false positive findings in terms of sensitivity and specificity. For instance, if we evaluate the sensitivity of UDC on two-class datasets instead of the four-class dataset, the results are significantly different. In the two-class setting, the sensitivity of UDC is 90%, 50%, and 50% for Healthy-Huntington, Healthy-Parkinson, and Healthy-ALS, respectively, whereas its overall sensitivity on the four-class dataset is 63.33%. Similarly, the sensitivity of LDC for Healthy-Huntington, Healthy-Parkinson, and Healthy-ALS is 40%, 80%, and 60%, respectively, and 60% overall. The overall sensitivity of PARZEN is 56.67%. The specificity for a particular disease can be calculated in the same way; overall, the specificity of UDC, LDC, and PARZEN is 70%, 70%, and 40%, respectively.

In a similar vein, the precision of PARZEN, LDC, and UDC is 85%, 85.71%, and 86.36%, respectively. The F-measure, which is computed from precision and recall, is 72.72%, 70.34%, and 67.52% for UDC, LDC, and PARZEN, respectively. Additionally, the Type I and Type II error rates for all of these classifiers are shown in Table 2.

Challenge 2: Table 2 also shows that although the accuracy of UDC is higher than that of the other classifiers, this is not the case for all of the other evaluation measures. For instance, if we take sensitivity into consideration, QDC outperforms all the other classifiers, including UDC. Similarly, as discussed earlier, although the accuracy of the QDC and Fisher’s classifiers is the same (57.5%), the sensitivity of QDC (70%) is higher than that of Fisher’s classifier (56.67%). In medical data, it is very important to correctly identify the true positive cases so that diseases are diagnosed in time for better treatment. Higher sensitivity means that most patients with the conditions are correctly identified, which reduces the risk of missed diagnosis.

Due to the above-stated challenges, it is very difficult to manually compare the results of all classifiers for all different evaluation techniques. This task becomes even more tedious if the number of classifiers is higher than what we have selected in our research.

Furthermore, relevant expertise is required all the time in order to compare the results of various evaluation techniques and to select the output of one particular classifier that performs better than others.

State-of-the-art studies also show that a classifier may perform well under one performance measure while performing poorly under another [55]. Although researchers have evaluated classification algorithms using many methodologies, there is no definitive criterion that surpasses the others. In the next section, we introduce an intelligent decision-making tool designed to compare the performance of various classifiers across different evaluation techniques. This tool facilitates the selection of the most effective classifier by systematically analysing and comparing the results from all evaluation methods.

6. Decision-Making Algorithm

As opposed to employing a single-criterion approach, a decision-making problem takes into account multiple criteria to ascertain the optimal alternative from a provided set. Presently, a multitude of multi-criteria decision-making (MCDM) techniques are being utilised to tackle a diverse array of challenges. A significant proportion of these methodologies are founded on analogous principles: an initial decision-making matrix is provided, which compares a number of alternatives against a variety of competing criteria, and the final ranking of alternatives produced by the MCDM method assists decision-makers in selecting the optimal alternative.

Assume that there are $m$ alternatives, $A=\left\{A_{1}, \ldots, A_{i}, \ldots, A_{m}\right\}$ $(m \geq 2)$, and a finite set of $n$ criteria, $C=\left\{C_{1}, \ldots, C_{j}, \ldots, C_{n}\right\}$ $(n \geq 2)$.

The LOPCOW-ERUNS method is described in the following steps.

Step 1:

Attain the decision matrix $X=\left[x_{i j}\right]_{m \times n}$.

LOPCOW method (for criteria weights)

Step 2:

Normalise the initial decision matrix using the linear max-min normalisation approach. The elements of the normalised decision matrix $R=\left[r_{i j}\right]_{m \times n}$ are obtained as follows:

$r_{i j}=\frac{x_{i j}-x_{\min }^{j}}{x_{\max }^{j}-x_{\min }^{j}} \quad (\text{when } j \in j^{+}, \text{ effect direction: maximise})$

(8)

$r_{i j}=\frac{x_{\max }^{j}-x_{i j}}{x_{\max }^{j}-x_{\min }^{j}} \quad (\text{when } j \in j^{-}, \text{ effect direction: minimise})$

(9)

Step 3:

Obtain the percentage value (PV) for the criteria. The PV for each criterion is calculated, as:

$P_{j}=\left|\ln \left(\frac{\sqrt{ \displaystyle \frac{ \displaystyle\sum_{i=1}^{m} r_{i j}^{2}}{m}}}{\sigma}\right) \cdot 100\right|$

(10)

where, $\sigma$ is the standard deviation of the performance values of the alternatives under a specific criterion.

Step 4:

The weight for the $j^{\text{th}}$ criterion is calculated by using:

$w_{j}=\frac{P_{j}}{ \displaystyle\sum_{j=1}^{n} P_{j}} $

(11)

where, $\displaystyle\sum_{j=1}^{n} w_{j}=1$.

ERUNS method (for ranking alternatives)

Step 5:

The standardisation of the elements of the matrix $X$ was performed using a function that enables the mapping of criterion intervals into an arbitrarily chosen interval [$\alpha, \beta$], where $\alpha$ represents the left limit of the interval, while $\beta$ represents the right limit of the interval. The standardisation of the elements of the initial decision matrix $X=\left[x_{i j}\right]_{m \times n}$ is carried out in two steps:

Step 5.1:

In the first step, by applying Eq. (12), the elements of the decision matrix are mapped into the interval $[\alpha, \beta]$.

$\varphi_{i j}=\left(\frac{x_{j}^{\text {min }}}{x_{i j}}\right)^{3} \beta+\frac{\alpha}{x_{j}^{\text {min }}}$

(12)

where, $x_{j}^{\min}= \displaystyle \min_{1 \leq i \leq m}\left(x_{i j}\right)$ represents the absolute minimum values from the matrix $X=\left[x_{i j}\right]_{m \times n}$, while $\alpha$ and $\beta$ represent the left and right limits of the standardised interval.

It is recommended that in decision problems where there are a large number of alternatives, an interval with a larger range should be adopted, since the standardised interval should have a sufficient range so that the standardised values could be distributed proportionally to their influence in the original criterion interval. Applying Eq. (12), we obtain a modified decision matrix (MDM) $X^{N}=\left[\varphi_{i j}\right]_{m \times n}$.

Step 5.2:

In the second step, if the criterion is of the max type, the values of $X^{N}=\left[\varphi_{i j}\right]_{m \times n}$ are modified by applying Eq. (13):

$\xi_{i j}=-\varphi_{i j}+\max _{1 \leq i \leq m}\left(\varphi_{i j}\right)+\min _{1 \leq i \leq m}\left(\varphi_{i j}\right) $

(13)

If the criterion is of the min type, it is assumed that $\xi_{i j}=\varphi_{i j}$. Thus, we get the final standardised decision matrix (SDM) $\mathbb{Q}=\left[\xi_{i j}\right]_{m \times n}$.

Step 6:

The weighted standardised decision matrix (WSDM) is represented by $V=\left[v_{i j}\right]_{m \times n}$ where the elements are derived using the softmax function to provide a concise representation of weighted relations between criteria:

$v_{i j}=\frac{\exp \left(f\left(\xi_{i j}\right) / k\right) w_{j}}{ \displaystyle \sum_{j=1}^{n} \exp \left(f\left(\xi_{i j}\right) / k\right) w_{j}} $

(14)

where $f\left(\xi_{i j}\right)=\xi_{i j} / \sum_{j=1}^{n} \xi_{i j}$, $j \in\{1,2,3, \ldots, n\}$, $w_{j}$ is the weight of the $j^{\text {th}}$ criterion, and $k>0$ is a modulation parameter. For easier calculation, we suggest adopting $k=1$; this parameter can also be varied to simulate different scenarios in a sensitivity analysis.

Step 7:

The utility degrees of the $i^{\text {th}}$ alternative with respect to the ideal and anti-ideal solutions are given by:

$U_{i}^{+}=\frac{\prod_{j=1}^{n}\left(\xi_{i j}\right)^{v_{i j}}}{\sum_{j=1}^{n} v_{j}^{+}}$

(15)

$U_{i}^{-}=-\frac{\sum_{j=1}^{n} v_{j}^{-}}{\prod_{j=1}^{n}\left(\xi_{i j}\right)^{v_{i j}}}+\max _{1 \leq i \leq m}\left(\frac{\sum_{j=1}^{n} v_{j}^{-}}{\prod_{j=1}^{n}\left(\xi_{i j}\right)^{v_{i j}}}\right)+\min _{1 \leq i \leq m}\left(\frac{\sum_{j=1}^{n} v_{j}^{-}}{\prod_{j=1}^{n}\left(\xi_{i j}\right)^{v_{i j}}}\right)$

(16)

where, $v_{j}^{+}= \displaystyle \max_{1 \leq i \leq m}\left(\xi_{i j} \cdot w_{j}\right)$ and $v_{j}^{-}= \displaystyle \min_{1 \leq i \leq m}\left(\xi_{i j} \cdot w_{j}\right)(i=1,2 \ldots m ; j=1,2 \ldots n)$.

Step 8:

Based on the aggregated levels of utility, we can define the final utility functions derived as:

$f\left(U_{i}^{+}\right)=\frac{U_{i}^{+}}{U_{i}^{+}+U_{i}^{-}}$

(17)

$f\left(U_{i}^{-}\right)=\frac{U_{i}^{-}}{U_{i}^{+}+U_{i}^{-}}$

(18)

The utility functions are additive in nature.

Step 9:

Using the values of the utility functions, rank the alternatives. The appraisal scores of the alternatives are calculated using Eq. (19):

$AS_{i}=\left(U_{i}^{+}+U_{i}^{-}\right) \frac{\left(1+f\left(U_{i}^{+}\right)\right)^{\delta}\left(1+f\left(U_{i}^{-}\right)\right)^{1-\delta}-\left(1-f\left(U_{i}^{+}\right)\right)^{\delta}\left(1-f\left(U_{i}^{-}\right)\right)^{1-\delta}}{\left(1+f\left(U_{i}^{+}\right)\right)^{\delta}\left(1+f\left(U_{i}^{-}\right)\right)^{1-\delta}+\left(1-f\left(U_{i}^{+}\right)\right)^{\delta}\left(1-f\left(U_{i}^{-}\right)\right)^{1-\delta}}$

(19)

The appraisal score has a parameter $\delta$ which is defined from the interval [0, 1]. Using the parameter $\delta$, the influence of utility function values in the final decision is defined. It is recommended that $\delta$ = 0.5 be adopted for the calculation of the final appraisal score. Thus, the equal influence of utility function values in the final decision is simulated. The final ranking of the alternatives is defined on the basis of the final appraisal score, where it is desirable that the alternative has the highest possible value.

6.1 Experimental Results

In this case, the alternatives are $A_{1}$ = UDC, $A_{2}$ = LDC, $A_{3}$ = Parzen classifier, $A_{4}$ = QDC, and $A_{5}$ = Fisher’s classifier. The criteria are $C_{1}$ = confusion matrix (accuracy), $C_{2}$ = precision, $C_{3}$ = recall/sensitivity, $C_{4}$ = specificity, $C_{5}$ = F-measure, $C_{6}$ = Type I error, and $C_{7}$ = Type II error.

Step 1:

Attain the decision matrix $X=\left[x_{i j}\right]_{m \times n}$ given in Table 3, which is extracted from Table 2.

Table 3. Initial decision matrix

	$\mathbf{C_1}$	$\mathbf{C_2}$	$\mathbf{C_3}$	$\mathbf{C_4}$	$\mathbf{C_5}$	$\mathbf{C_6}$	$\mathbf{C_7}$
$A_1 $	0.650	0.864	0.630	0.700	0.727	0.366	0.300
$A_2 $	0.625	0.857	0.600	0.700	0.700	0.400	0.300
$A_3 $	0.600	0.850	0.560	0.400	0.670	0.433	0.600
$A_4 $	0.575	0.724	0.700	0.200	0.700	0.300	0.800
$A_5 $	0.575	0.810	0.570	0.600	0.660	0.500	0.400

LOPCOW method (for criteria weights)

Step 2:

Normalise the decision-matrix by applying the linear max-min type of normalisation approach, $R=\left[r_{i j}\right]_{m \times n}$ given in Table 4.

Table 4. Normalised decision matrix

	$\mathbf{C_1}$	$\mathbf{C_2}$	$\mathbf{C_3}$	$\mathbf{C_4}$	$\mathbf{C_5}$	$\mathbf{C_6}$	$\mathbf{C_7}$
$A_1 $	1.000	1.000	0.500	1.000	1.000	0.670	1.000
$A_2 $	0.667	0.950	0.286	1.000	0.597	0.500	1.000
$A_3 $	0.333	0.900	0.000	0.400	0.149	0.335	0.400
$A_4 $	0.000	0.000	1.000	0.000	0.597	1.000	0.000
$A_5 $	0.000	0.614	0.071	0.800	0.000	0.000	0.800

Step 3:

Obtain the PV for the criteria. The PV for each criterion is given in Table 5.

Table 5. Percentage values for the criteria

$\mathbf{C_1}$	$\mathbf{C_2}$	$\mathbf{C_3}$	$\mathbf{C_4}$	$\mathbf{C_5}$	$\mathbf{C_6}$	$\mathbf{C_7}$
24.9496	63.8317	25.1295	54.5747	38.9514	47.8788	54.5747

Step 4:

The weight for the $j^{\text{th}}$ criterion is given in Table 6.

Table 6. Weights for the criteria

$\mathbf{C_1}$	$\mathbf{C_2}$	$\mathbf{C_3}$	$\mathbf{C_4}$	$\mathbf{C_5}$	$\mathbf{C_6}$	$\mathbf{C_7}$
0.0805	0.2060	0.0811	0.1761	0.1257	0.1545	0.1761
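Steps 2 to 4 can be checked numerically against Tables 5 and 6. The sketch below is a minimal illustration under two assumptions that are not stated explicitly in the text: criteria $C_1$ to $C_5$ are maximised while the two error rates $C_6$ and $C_7$ are minimised, and $\sigma$ is the sample standard deviation of the normalised values of each criterion. The function name `lopcow_weights` is hypothetical.

```python
import math
import statistics

# Initial decision matrix from Table 3 (rows A1..A5, columns C1..C7).
X = [
    [0.650, 0.864, 0.630, 0.700, 0.727, 0.366, 0.300],
    [0.625, 0.857, 0.600, 0.700, 0.700, 0.400, 0.300],
    [0.600, 0.850, 0.560, 0.400, 0.670, 0.433, 0.600],
    [0.575, 0.724, 0.700, 0.200, 0.700, 0.300, 0.800],
    [0.575, 0.810, 0.570, 0.600, 0.660, 0.500, 0.400],
]
# Assumption: C1..C5 are maximised; the error rates C6 and C7 are minimised.
MAXIMISE = [True, True, True, True, True, False, False]


def lopcow_weights(matrix, maximise):
    """Return the percentage values (Eq. 10) and the weights (Eq. 11)."""
    m = len(matrix)
    percentage_values = []
    for j, is_max in enumerate(maximise):
        column = [row[j] for row in matrix]
        lo, hi = min(column), max(column)
        if is_max:
            r = [(x - lo) / (hi - lo) for x in column]   # Eq. (8)
        else:
            r = [(hi - x) / (hi - lo) for x in column]   # Eq. (9)
        rms = math.sqrt(sum(value * value for value in r) / m)
        sigma = statistics.stdev(r)  # sample standard deviation (assumption)
        percentage_values.append(abs(math.log(rms / sigma) * 100))
    total = sum(percentage_values)
    return percentage_values, [p / total for p in percentage_values]


pvs, weights = lopcow_weights(X, MAXIMISE)
print([round(p, 4) for p in pvs])
print([round(w, 4) for w in weights])
```

Under these assumptions the computed values agree with Tables 5 and 6 to within rounding; using the population standard deviation instead gives different percentage values.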

ERUNS method (for ranking alternatives)

Step 5:

The standardisation of the elements of the matrix $X=\left[x_{i j}\right]_{m \times n}$ is carried out in two steps:

Step 5.1:

In the first step, by applying Eq. (12), the elements of the decision matrix are mapped into the interval $[1, 100]$. The resulting MDM $X^{N}=\left[\varphi_{i j}\right]_{m \times n}$ is given in Table 7.

Table 7. Modified decision matrix

	$\mathbf{C_1}$	$\mathbf{C_2}$	$\mathbf{C_3}$	$\mathbf{C_4}$	$\mathbf{C_5}$	$\mathbf{C_6}$	$\mathbf{C_7}$
$A_1 $	70.9642	60.2215	72.0189	7.3324	76.3370	58.4040	103.3333
$A_2 $	79.6079	61.6751	83.0894	7.3324	85.3332	45.5208	103.3333
$A_3 $	89.7527	63.1770	101.7857	17.5000	97.1040	36.5916	15.8333
$A_4 $	101.7391	101.3812	52.9857	105.0000	85.3332	103.3333	8.6068
$A_5$	101.7391	72.7915	96.6144	8.7037	101.5152	24.9333	45.5208

Step 5.2:

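Criteria $C_{1}$ to $C_{5}$ are of the max type, so their values in $X^{N}=\left[\varphi_{i j}\right]_{m \times n}$ are modified by applying Eq. (13), while the values of the min-type criteria $C_{6}$ and $C_{7}$ are left unchanged. The resulting SDM is given in Table 8.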

Table 8. Standardised decision matrix

	$\mathbf{C_1}$	$\mathbf{C_2}$	$\mathbf{C_3}$	$\mathbf{C_4}$	$\mathbf{C_5}$	$\mathbf{C_6}$	$\mathbf{C_7}$
$A_1 $	101.7391	101.3812	82.7525	105.0000	101.5152	58.4040	103.3333
$A_2 $	93.0954	99.9276	71.6820	105.0000	92.5190	45.5208	103.3333
$A_3 $	82.9506	98.4257	52.9857	94.8324	80.7481	36.5916	15.8333
$A_4 $	70.9642	60.2215	101.7857	7.3324	92.5190	103.3333	8.6068
$A_5 $	70.9642	88.8112	58.1571	103.6287	76.3370	24.9333	45.5208
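The two standardisation steps can be illustrated on single columns of Table 3. The sketch below is a minimal example with $\alpha=1$ and $\beta=100$; the function name `eruns_standardise` is hypothetical, and treating $C_1$ as a max-type and $C_7$ as a min-type criterion is an assumption based on the values in Tables 7 and 8.

```python
def eruns_standardise(column, is_max, alpha=1.0, beta=100.0):
    """Apply Eq. (12) to one criterion column, then Eq. (13) if it is max-type."""
    x_min = min(column)
    phi = [(x_min / x) ** 3 * beta + alpha / x_min for x in column]  # Eq. (12)
    if not is_max:
        return phi, list(phi)  # min-type criterion: xi equals phi
    hi, lo = max(phi), min(phi)
    xi = [-p + hi + lo for p in phi]  # Eq. (13)
    return phi, xi


# Columns C1 (accuracy) and C7 (Type II error) of Table 3, rows A1..A5.
c1 = [0.650, 0.625, 0.600, 0.575, 0.575]
c7 = [0.300, 0.300, 0.600, 0.800, 0.400]

phi_c1, xi_c1 = eruns_standardise(c1, is_max=True)
phi_c7, xi_c7 = eruns_standardise(c7, is_max=False)
print([round(v, 4) for v in phi_c1])
print([round(v, 4) for v in xi_c1])
print([round(v, 4) for v in xi_c7])
```

Eq. (13) mirrors the mapped values within their own range, so the alternative with the largest original value of a max-type criterion receives the largest standardised value.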

Step 6:

The $V=\left[v_{i j}\right]$ matrix with $k$ = 1 is given in Table 9.

Table 9. Weighted standardised decision matrix

	$\mathbf{C_1}$	$\mathbf{C_2}$	$\mathbf{C_3}$	$\mathbf{C_4}$	$\mathbf{C_5}$	$\mathbf{C_6}$	$\mathbf{C_7}$
$A_1 $	0.0814	0.2082	0.0797	0.1790	0.1271	0.1462	0.1785
$A_2 $	0.0810	0.2095	0.0788	0.1806	0.1263	0.1438	0.1801
$A_3 $	0.0832	0.2202	0.0786	0.1868	0.1293	0.1445	0.1575
$A_4 $	0.0828	0.2068	0.0894	0.1570	0.1357	0.1709	0.1574
$A_5 $	0.0808	0.2148	0.0792	0.1895	0.1276	0.1406	0.1674
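The softmax weighting of Eq. (14) can be illustrated for a single alternative. The sketch below is a minimal example that takes the row of $A_1$ from Table 8 and the weights from Table 6; the function name `weighted_row` is hypothetical.

```python
import math


def weighted_row(xi_row, weights, k=1.0):
    """Apply Eq. (14) to one row of the standardised decision matrix."""
    row_total = sum(xi_row)
    f = [x / row_total for x in xi_row]  # f(xi_ij) as defined after Eq. (14)
    numerators = [math.exp(value / k) * w for value, w in zip(f, weights)]
    denominator = sum(numerators)
    return [num / denominator for num in numerators]


# Row A1 of Table 8 and the criteria weights of Table 6.
xi_a1 = [101.7391, 101.3812, 82.7525, 105.0000, 101.5152, 58.4040, 103.3333]
w = [0.0805, 0.2060, 0.0811, 0.1761, 0.1257, 0.1545, 0.1761]

v_a1 = weighted_row(xi_a1, w, k=1.0)
print([round(v, 4) for v in v_a1])
```

Because each row is divided by its own sum, the elements of every row of $V$ add up to one, and they stay close to the original weights when $k=1$.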

Step 7 and Step 8:

The utility degrees and utility function values of the alternatives are given in Table 10.

Table 10. Utility degrees and utility function values of the alternatives

	$\boldsymbol{U^{+}}$	$\boldsymbol{U^{-}}$	$\boldsymbol{f(U^{+})}$	$\boldsymbol{f(U^{-})}$
$A_1 $	0.9048	0.9829	0.4793	0.5207
$A_2 $	0.8458	0.9539	0.4699	0.5300
$A_3 $	0.5660	0.7341	0.4354	0.5646
$A_4 $	0.3828	0.4159	0.4793	0.5207
$A_5 $	0.6201	0.7919	0.4391	0.5609
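The utility degrees of Eqs. (15) and (16) can be recomputed from the rounded entries of Tables 6, 8, and 9. The sketch below is a minimal illustration; the function name `utility_degrees` is hypothetical, and small differences from Table 10 are expected because the inputs are rounded to four decimals.

```python
import math

# Standardised decision matrix from Table 8 (rows A1..A5, columns C1..C7).
XI = [
    [101.7391, 101.3812, 82.7525, 105.0000, 101.5152, 58.4040, 103.3333],
    [93.0954, 99.9276, 71.6820, 105.0000, 92.5190, 45.5208, 103.3333],
    [82.9506, 98.4257, 52.9857, 94.8324, 80.7481, 36.5916, 15.8333],
    [70.9642, 60.2215, 101.7857, 7.3324, 92.5190, 103.3333, 8.6068],
    [70.9642, 88.8112, 58.1571, 103.6287, 76.3370, 24.9333, 45.5208],
]
# Weighted standardised decision matrix from Table 9.
V = [
    [0.0814, 0.2082, 0.0797, 0.1790, 0.1271, 0.1462, 0.1785],
    [0.0810, 0.2095, 0.0788, 0.1806, 0.1263, 0.1438, 0.1801],
    [0.0832, 0.2202, 0.0786, 0.1868, 0.1293, 0.1445, 0.1575],
    [0.0828, 0.2068, 0.0894, 0.1570, 0.1357, 0.1709, 0.1574],
    [0.0808, 0.2148, 0.0792, 0.1895, 0.1276, 0.1406, 0.1674],
]
# Criteria weights from Table 6.
W = [0.0805, 0.2060, 0.0811, 0.1761, 0.1257, 0.1545, 0.1761]


def utility_degrees(xi, v, w):
    """Return the lists of U+ (Eq. 15) and U- (Eq. 16) for all alternatives."""
    columns = list(zip(*xi))
    # Sums of the ideal and anti-ideal weighted values over all criteria.
    v_plus = sum(max(col) * w[j] for j, col in enumerate(columns))
    v_minus = sum(min(col) * w[j] for j, col in enumerate(columns))
    products = [
        math.prod(x ** e for x, e in zip(xi_row, v_row))
        for xi_row, v_row in zip(xi, v)
    ]
    u_plus = [p / v_plus for p in products]
    ratios = [v_minus / p for p in products]
    u_minus = [-r + max(ratios) + min(ratios) for r in ratios]
    return u_plus, u_minus


u_plus, u_minus = utility_degrees(XI, V, W)
for name, up, um in zip(["A1", "A2", "A3", "A4", "A5"], u_plus, u_minus):
    print(name, round(up, 4), round(um, 4))
```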

Step 9:

The appraisal scores of the alternatives are calculated by setting the value of $\delta$ = 0.5 and given in Table 11.

Table 11. Appraisal scores and ranking of the alternatives

	Appraisal Scores	Ranking
$A_1 $	0.9444	1
$A_2 $	0.9009	2
$A_3 $	0.6537	4
$A_4 $	0.3997	5
$A_5 $	0.7095	3
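Steps 8 and 9 can be reproduced directly from the utility degrees in Table 10. The sketch below is a minimal illustration with $\delta=0.5$; the function name `appraisal_score` is hypothetical, and the last digit may differ from Table 11 because the inputs are rounded.

```python
def appraisal_score(u_plus, u_minus, delta=0.5):
    """Evaluate Eqs. (17)-(19) for one alternative."""
    total = u_plus + u_minus
    f_plus = u_plus / total    # Eq. (17)
    f_minus = u_minus / total  # Eq. (18)
    a = (1 + f_plus) ** delta * (1 + f_minus) ** (1 - delta)
    b = (1 - f_plus) ** delta * (1 - f_minus) ** (1 - delta)
    return total * (a - b) / (a + b)  # Eq. (19)


# Utility degrees (U+, U-) from Table 10.
utilities = {
    "A1": (0.9048, 0.9829),
    "A2": (0.8458, 0.9539),
    "A3": (0.5660, 0.7341),
    "A4": (0.3828, 0.4159),
    "A5": (0.6201, 0.7919),
}

scores = {name: appraisal_score(up, um) for name, (up, um) in utilities.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
for name in ranking:
    print(name, round(scores[name], 4))
```

Changing `delta` shifts the influence between the two utility functions, which is one way to run the sensitivity analysis mentioned in Step 9.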

The results of the decision-making algorithm are detailed in Table 11. The two top-ranked classifiers are UDC and LDC, which are also the two classifiers with the highest accuracy rates, suggesting that they perform better on accuracy and on the other evaluation metrics combined. This finding further supports the classification results previously presented in Table 2.

An interesting observation from the decision-making algorithm’s output is that Fisher’s classifier is ranked third rather than fifth, in contrast to a manual ranking based solely on accuracy. The criteria weights in Table 6 give specificity and Type II error (0.1761 each) more than twice the weight of accuracy (0.0805). As previously discussed, misclassification errors such as “false negative” results are particularly critical in the context of medical data. Although Fisher’s classifier has a lower accuracy than the Parzen classifier (57.5% versus 60%), it has comparable sensitivity, higher specificity (60% versus 40%), and a lower Type II error rate (40% versus 60%). This likely explains why Fisher’s classifier is ranked above the Parzen classifier.

6.2 Discussion

The findings of this study highlight the complexity of multi-class differential diagnosis in neurodegenerative diseases, where density-based classifiers, particularly the UDC model, achieved the highest accuracy of 65.0% compared to linear and non-linear alternatives. This performance advantage suggests that the feature space of gait dynamics in Parkinson’s, Huntington’s, and ALS patients follows probabilistic distributions that are better captured by density estimation than by linear decision boundaries, which performed suboptimally due to high inter-subject variance and redundant class probabilities. However, a critical outcome of this research is the validation of the LOPCOW-ERUNS decision support framework, which demonstrated that raw accuracy is insufficient for clinical safety. By systematically weighting conflicting performance metrics, the automated system ranked Fisher’s classifier third, surpassing models with higher accuracy due to its superior sensitivity and lower Type II error rate, thereby prioritising the minimisation of missed diagnoses, which is paramount in medical settings.

Despite the robustness of the decision-making framework, the study is subject to specific limitations regarding the dataset and feature scope. The experimental analysis relied on a relatively small cohort of 64 subjects with notable demographic imbalances, such as a control group comprising 14 females and only 2 males, which may introduce bias and affect the generalisability of the classification boundaries. Additionally, the restriction of the input vector to eight features derived solely from gait stride signals and demographics (e.g., age and BMI) limits the system’s ability to fully characterise the multifaceted nature of neurodegeneration. Future iterations of this work aim to address these constraints by integrating multi-modal data sources, including neuroimaging (fMRI, CT) and biochemical markers, to enhance predictive precision and developing mobile-based monitoring to support continuous clinical assessment.

7. Conclusion and Future Work

In conclusion, this study identifies several significant features that influence the progression of NDDs, including age, gender, weight, BMI, and disease severity. These critical aspects are often overlooked in existing literature, and our research aims to address this gap. Due to the multiclass nature of the datasets, we utilised a comprehensive set of evaluation metrics to compare the performance of eleven different classifiers. This research examines the challenge of determining the most effective classifier by assessing performance metrics and errors, recognising that individual classifiers may excel in different metrics. In order to address this issue, we have deployed decision-making algorithms to automate the selection of the most appropriate classification algorithm based on evaluation techniques used in this study.

Our research is an initial effort to diagnose neurodegenerative disorders.

In our future work, we aim to develop advanced diagnostic methods to identify complicated gait problems that are challenging for medical practitioners to detect. Our goals involve creating new approaches for diagnosing, evaluating rehabilitation, and planning therapy. We also want to assist in clinical assessments and thoroughly evaluate these analytical tools through well-designed clinical studies. In addition, we expect that the accuracy of predicting a wider range of diseases will be improved by integrating characteristics from other data sources, such as skeletal data, biochemical data, RGB images, fMRI scans and CT images. Furthermore, we will enhance our model to enable regular patient monitoring on mobile devices. This will establish a basis for and maybe inspire further utilisation of machine learning in computer-aided diagnosis systems.

Author Contributions

Conceptualization, S.I. and R.M.; methodology, S.I. and H.M.A.F.; validation, H.M.A.F.; formal analysis, S.I.; investigation, S.I. and F.J.; resources, R.M.; data curation, F.J.; writing—original draft preparation, S.I.; writing—review and editing, H.M.A.F.; visualization, S.I.; project administration, F.J. All authors have read and agreed to the published version of the manuscript.

Data Availability

We used open-source data, available to everyone at the following link: https://www.physionet.org/content/gaitndd/1.0.0/.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. J. Cohen-Mansfield, M. S. Marx, and A. S. Rosenthal, “A description of agitation in a nursing home,” J. Gerontol., vol. 44, no. 3, pp. M77–M84, 1989.

2. H. C. Kales, L. N. Gitlin, and C. G. Lyketsos, “Assessment and management of behavioural and psychological symptoms of dementia,” Br. Med. J., vol. 350, p. h369, 2015.

3. V. Valkanova and K. P. Ebmeier, “What can gait tell us about dementia? Review of epidemiological and neuropsychological evidence,” Gait Posture, vol. 53, pp. 215–223, 2017.

4. L. Bahureksa, B. Najafi, A. Saleh, M. Sabbagh, D. Coon, M. J. Mohler, and M. Schwenk, “The impact of mild cognitive impairment on gait and balance: A systematic review and meta-analysis of studies using instrumented assessment,” Gerontology, vol. 63, no. 1, pp. 67–83, 2016.

5. O. Beauchet, C. Annweiler, M. L. Callisaya, A. M. De Cock, J. L. Helbostad, R. W. Kressig, V. Srikanth, J. P. Steinmetz, H. M. Blumen, J. Verghese, and G. Allali, “Poor gait performance and prediction of dementia: Results from a meta-analysis,” J. Am. Med. Dir. Assoc., vol. 17, no. 6, pp. 482–490, 2016.

6. T. Zhou, M. Liu, K. H. Thung, and D. Shen, “Latent representation learning for Alzheimer’s disease diagnosis with incomplete multi-modality neuroimaging and genetic data,” IEEE Trans. Med. Imaging, vol. 38, no. 10, pp. 2411–2422, 2019.

7. M. Mora Pinzon, J. Krainer, T. LeCaire, S. Houston, G. Green-Harris, N. Norris, S. Barnes, L. R. Clark, C. E. Gleason, B. P. Hermann, et al., “The Wisconsin Alzheimer’s Institute dementia diagnostic clinic network: A community of practice to improve dementia care,” J. Am. Geriatr. Soc., vol. 70, no. 7, pp. 2121–2133, 2022.

8. J. Stevenson-Hoare, A. Heslegrave, G. Leonenko, D. Fathalla, E. Bellou, L. Luckcuck, R. Marshall, R. Sims, B. P. Morgan, J. Hardy, et al., “Plasma biomarkers and genetics in the diagnosis and prediction of Alzheimer’s disease,” Brain, vol. 146, no. 2, pp. 690–699, 2023.

9. J. Ren, H. Li, A. Wang, K. Saho, and L. Meng, “Radar-based gait analysis by transformer-liked network for dementia diagnosis,” Biomed. Signal Process. Control, vol. 91, p. 105917, 2024.

10. M. Montero-Odasso, Y. Sarquis-Adamson, M. Speechley, M. Borrie, V. Hachinski, J. Wells, P. Riccio, M. Schapira, E. Sejdic, R. Camicioli, et al., “Association of dual-task gait with incident dementia in mild cognitive impairment: Results from the gait and brain study,” JAMA Neurol., vol. 74, no. 7, pp. 857–865, 2017.

11. M. Banaie, M. Pooyan, and M. Mikaili, “Introduction and application of an automatic gait recognition method to diagnose movement disorders that arose of similar causes,” Expert Syst. Appl., vol. 38, no. 6, pp. 7359–7363, 2011.

12. G. Demir, P. Chatterjee, and D. Pamucar, “Sensitivity analysis in multi-criteria decision making: A state-of-the-art research perspective using bibliometric analysis,” Expert Syst. Appl., vol. 237, p. 121660, 2024.

13. H. C. Sox, M. C. Higgins, D. K. Owens, and G. S. Schmidler, Medical Decision Making. John Wiley & Sons, 2024.

14. F. Ecer and D. Pamucar, “A novel LOPCOW-DOBI multi-criteria sustainability performance assessment methodology: An application in developing country banking sector,” Omega, vol. 112, p. 102690, 2022.

15. F. Ecer, H. Küçükönder, S. K. Kaya, and Ö. F. Görçün, “Sustainability performance analysis of micromobility solutions in urban transportation with a novel IVFNN-Delphi-LOPCOW-CoCoSo framework,” Transp. Res. Part A Policy Pract., vol. 172, p. 103667, 2023.

16. B. Nila and J. Roy, “A new hybrid MCDM framework for third-party logistic provider selection under sustainability perspectives,” Expert Syst. Appl., vol. 234, p. 121009, 2023.

17. V. Simic, S. Dabic-Miletic, E. B. Tirkolaee, Ž. Stević, A. Ala, and A. Amirteimoori, “Neutrosophic LOPCOW-ARAS model for prioritizing industry 4.0-based material handling technologies in smart and sustainable warehouse management systems,” Appl. Soft Comput., vol. 143, p. 110400, 2023.

18.

S. Biswas, D. Pamucar, S. Dawn, and V. Simic, “Evaluation based on relative utility and nonlinear standardization (ERUNS) method for comparing firm performance in energy sector,” Decis. Making Adv., vol. 2, no. 1, pp. 1–21, 2024. [Google Scholar] [Crossref]

19.

A. Wimo, B. Winblad, and L. Jönsson, “An estimate of the total worldwide societal costs of dementia in 2005,” Alzheimer’s Dement., vol. 3, no. 2, pp. 81–91, 2007. [Google Scholar] [Crossref]

20.

A. Wimo, B. Winblad, and L. Jönsson, “The worldwide societal costs of dementia: Estimates for 2009,” Alzheimer’s Dement., vol. 6, no. 2, pp. 98–103, 2010. [Google Scholar] [Crossref]

21.

A. Abbott, “Dementia: A problem for our age,” Nature, vol. 475, pp. S2–S4, 2011. [Google Scholar] [Crossref]

22.

R. Barker, 2030: The Future of Medicine: Avoiding a Medical Meltdown. Oxford University Press, 2011. [Google Scholar]

23.

R. A. Armstrong, N. J. Cairns, R. Patel, P. L. Lantos, and M. N. Rossor, “Relationships between β-amyloid (Aβ) deposits and blood vessels in patients with sporadic and familial Alzheimer’s disease,” Neuropathol. Appl. Neurobiol., vol. 207, no. 3, pp. 171–174, 1996. [Google Scholar] [Crossref]

24.

H. Braak and E. Braak, “Neuropathological stageing of Alzheimer-related changes,” Acta Neuropathol., vol. 82, pp. 239–259, 1991. [Google Scholar] [Crossref]

25.

J. Parkinson, “An essay on the shaking palsy,” J. Neuropsychiatry Clin. Neurosci., vol. 14, no. 2, pp. 223–236, 2002. [Google Scholar] [Crossref]

26.

H. Allain, D. Bentué-Ferrer, and Y. Akwa, “Disease-modifying drugs and Parkinson’s disease,” Prog. Neurobiol., vol. 84, no. 1, pp. 25–39, 2008. [Google Scholar] [Crossref]

27.

L. Findley, M. Aujla, P. G. Bain, M. Baker, C. Beech, C. Bowman, J. Holmes, W. K. Kingdom, D. G. MacMahon, V. Peto, and J. R. Playfer, “Direct economic impact of Parkinson’s disease: A research survey in the United Kingdom,” Mov. Disord., vol. 18, no. 10, pp. 1139–1145, 2003. [Google Scholar] [Crossref]

28.

J. G. G. Hou and E. C. Lai, “Non-motor symptoms of Parkinson’s disease,” Int. J. Gerontol., vol. 1, no. 2, pp. 53–64, 2007. [Google Scholar] [Crossref]

29.

S. Dufrasne, M. Roy, M. Galvez, and D. S. Rosenblatt, “Experience over fifteen years with a protocol for predictive testing for Huntington disease,” Mol. Genet. Metab., vol. 102, no. 4, pp. 494–504, 2011. [Google Scholar] [Crossref]

30.

D. Langbehn, R. Brinkman, D. Falush, J. Paulsen, and M. Hayden, “A new model for prediction of the age of onset and penetrance for Huntington’s disease based on CAG length,” Clin. Genet., vol. 65, no. 4, pp. 267–277, 2004. [Google Scholar] [Crossref]

31.

D. Neela and K. Rangarajan, “Hybrid workflow net based architecture for modeling Huntington’s disease,” in Proceedings of Sixth International Conference on Bio-Inspired Computing: Theories and Applications (BIC-TA), Penang, Malaysia, 2011, pp. 319–323. [Google Scholar] [Crossref]

32.

P. K. Kasi, L. S. Krivickas, M. Meister, E. Chew, P. Bonato, M. Schmid, G. Kamen, L. Pu, and E. A. Clancy, “Characterization of motor unit behavior in patients with amyotrophic lateral sclerosis,” in Proceedings of 4th International IEEE/EMBS Conference on Neural Engineering, Antalya, 2009, pp. 10–13. [Google Scholar] [Crossref]

33.

F. Gros-Louis, C. Gaspar, and G. A. Rouleau, “Genetics of familial and sporadic amyotrophic lateral sclerosis,” Biochim. Biophys. Acta Mol. Basis Dis., vol. 1762, no. 11–12, pp. 956–972, 2006. [Google Scholar] [Crossref]

34.

M. Hausdorff, A. Lertratanakul, E. Cudkowicz, L. Peterson, D. Kaliton, and L. Goldberger, “Dynamic markers of altered gait rhythm in amyotrophic lateral sclerosis,” J. Appl. Physiol., vol. 88, no. 6, pp. 2045–2053, 2000. [Google Scholar] [Crossref]

35.

E. Scherder, L. Eggermont, C. Visscher, P. Scheltens, and D. Swaab, “Understanding higher level gait disturbances in mild dementia in order to improve rehabilitation: ‘Last in–first out’,” Neurosci. Biobehav. Rev., vol. 35, no. 3, pp. 699–714, 2011. [Google Scholar] [Crossref]

36.

Y. F. Zhu and J. L. Henry, “Excitability of A(β) sensory neurons is altered in an animal model of peripheral neuropathy,” BMC Neurosci., vol. 13, p. 15, 2012. [Google Scholar] [Crossref]

37.

L. E. Hebert, B. Laurel, S. Paul, and D. A. Evans, “Annual incidence of Alzheimer disease in the United States projected to the years 2000 through 2050,” Alzheimer Dis. Assoc. Disord., vol. 15, no. 4, pp. 169–173, 2001. [Google Scholar] [Crossref]

38.

R. S. Turner, “Biomarkers of Alzheimer’s disease and mild cognitive impairment: Are we there yet?,” Exp. Neurol., vol. 183, no. 1, pp. 7–10, 2003. [Google Scholar] [Crossref]

39.

L. Bouarfa, A. Schneider, H. Feussner, N. Navab, H. U. Lemke, P. P. Jonker, and J. Dankelman, “Prediction of intraoperative complexity from preoperative patient data for laparoscopic cholecystectomy,” Artif. Intell. Med., vol. 52, no. 3, pp. 169–176, 2011. [Google Scholar] [Crossref]

40.

R. H. Lin, “An intelligent model for liver disease diagnosis,” Artif. Intell. Med., vol. 47, no. 1, pp. 53–62, 2009. [Google Scholar] [Crossref]

41.

M. C. Lee, L. Boroczky, K. Sungur-Stasik, A. D. Cann, A. C. Borczuk, S. M. Kawut, and C. A. Powell, “Computer-aided diagnosis of pulmonary nodules using a two-step approach for feature selection and classifier ensemble construction,” Artif. Intell. Med., vol. 50, no. 1, pp. 43–53, 2010. [Google Scholar] [Crossref]

42.

S. Q. Shi, W. L. Maner, L. B. Mackay, and R. E. Garfield, “Identification of term and preterm labor in rats using artificial neural networks on uterine electromyography signals,” Am. J. Obstet. Gynecol., vol. 198, no. 2, p. 235, 2008. [Google Scholar] [Crossref]

43.

D. Long, J. Wang, M. Xuan, Q. Gu, X. Xu, D. Kong, and M. Zhang, “Automatic classification of early Parkinson’s disease with multi-modal MR imaging,” PLoS ONE, vol. 7, no. 11, p. e47714, 2012. [Google Scholar] [Crossref]

44.

C. A. Miller and M. K. Hinders, “Classification of flaw severity using pattern recognition for guided wave-based structural health monitoring,” Ultrasonics, vol. 54, no. 1, pp. 247–258, 2014. [Google Scholar] [Crossref]

45.

W. L. Woon, A. Cichocki, F. Viallate, and T. Musha, “Techniques for early detection of Alzheimer’s disease using spontaneous EEG recordings,” Physiol. Meas., vol. 28, no. 4, pp. 335–347, 2007. [Google Scholar] [Crossref]

46.

S. Iram, D. Al-Jumeily, P. Fergus, M. Randles, and M. Davies, “E-Health: The potential of linked data and stream reasoning for personalised healthcare,” in Proceedings of 2011 Developments in E-systems Engineering (DeSE), Dubai, United Arab Emirates, 2011, pp. 46–49. [Google Scholar] [Crossref]

47.

K. van der Hiele, A. Vein, R. Reijntjes, R. Westendorp, E. Bollen, M. Buchem, J. van Dijk, and H. Middelkoop, “EEG correlates in the spectrum of cognitive decline,” Clin. Neurophysiol., vol. 118, no. 9, pp. 1931–1939, 2007. [Google Scholar] [Crossref]

48.

D. Al-Jumeily, S. Iram, F. Vialatte, P. Fergus, and A. Hussain, “A novel method of early diagnosis of Alzheimer’s disease based on EEG signals,” Sci. World J., vol. 2015, no. 1, p. 931387, 2015. [Google Scholar] [Crossref]

49.

S. Iram, F. B. Vialatte, and M. I. Qamar, “Early diagnosis of neurodegenerative diseases from gait discrimination to neural synchronization,” in Applied Computing in Medicine and Health, Morgan Kaufmann, 2016, pp. 1–26. [Google Scholar] [Crossref]

50.

H. M. Shakeel, S. Iram, H. Al-Aqrabi, T. Alsboui, and R. Hill, “A comprehensive state-of-the-art survey on data visualization tools: Research developments, challenges and future domain specific visualization framework,” IEEE Access, vol. 10, pp. 96581–96601, 2022. [Google Scholar] [Crossref]

51.

U. Knauer and B. Meffert, “Evaluation based combining of classifiers for monitoring honeybees,” in Proceedings of 2009 Workshop on Applications of Computer Vision (WACV), Snowbird, UT, USA, 2009, pp. 1–6. [Google Scholar] [Crossref]

52.

S. Iram, P. Fergus, D. Al-Jumeily, A. Hussain, and M. Randles, “A classifier fusion strategy to improve the early detection of neurodegenerative diseases,” Int. J. Artif. Intell. Soft Comput., vol. 5, no. 1, pp. 23–44, 2015. [Google Scholar] [Crossref]

53.

A. L. Goldberger, L. A. N. Amaral, L. Glass, J. M. Hausdorff, P. C. Ivanov, R. G. Mark, J. E. Mietus, G. B. Moody, C. K. Peng, and H. E. Stanley, “PhysioBank, PhysioToolkit, and PhysioNet: Components of a new research resource for complex physiologic signals,” Circulation, vol. 101, no. 23, pp. e215–e220, 2000. [Google Scholar] [Crossref]

54.

N. Seliya, T. Khoshgoftaar, and J. Van Hulse, “Aggregating performance metrics for classifier evaluation,” in Proceedings of 2009 IEEE International Conference on Information Reuse & Integration, Las Vegas, NV, USA, 2009, pp. 35–40. [Google Scholar] [Crossref]

55.

J. Lever, M. Krzywinski, and N. Altman, “Classification evaluation,” Nat. Methods, vol. 13, pp. 603–604, 2016. [Google Scholar] [Crossref]

Cite this:

Iram, S., Farid, H. M. A., Javid, F., & Mishra, R. (2026). Comprehensive Evaluation of Multi-Class Classifiers in the Early Detection of Neurodegenerative Diseases through Decision Support System. Int. J. Comput. Methods Exp. Meas., 14(2), 188-206. https://doi.org/10.56578/ijcmem140202

©2026 by the author(s). Published by Acadlore Publishing Services Limited, Hong Kong. This article is available for free download and can be reused and cited, provided that the original published version is credited, under the CC BY 4.0 license.

Figure 1. Flow chart of research methodology

Table 1. Selection of features for classification in all subjects
