The early and accurate diagnosis of neurodegenerative diseases presents a significant clinical challenge, particularly in distinguishing between conditions with overlapping symptoms. Much of the existing research has focused on binary classification, which inadequately addresses the multi-class nature of real-world differential diagnosis. This study comprehensively evaluates multi-class machine learning classifiers for the early detection of neurodegenerative diseases using gait signal data. Furthermore, we propose and implement a novel decision support system that automates the selection of the most effective classifier according to defined clinical priorities. We used a public gait dynamics dataset from PhysioNet, comprising data from healthy individuals and patients diagnosed with Parkinson’s Disease, Huntington’s Disease, and Amyotrophic Lateral Sclerosis (ALS), forming a four-class classification problem. The feature set combined gait signals with demographic variables such as age and body mass index. Eleven classifiers, categorised as density-based, linear, and non-linear, were trained and evaluated. To automate the selection of the optimal model, a decision-making framework assigned weights to the evaluation metrics and ranked the classifiers accordingly. Classifier performance varied across the evaluation metrics. The Bayes Normal-U (UDC) classifier achieved the highest accuracy at 65.0%, with a precision of 86.4%, sensitivity of 63.0%, and specificity of 70.0%. The Bayes Normal-L (LDC) classifier yielded an accuracy of 62.5%, with 85.7% precision, 60.0% sensitivity, and 70.0% specificity. The implemented decision support system ranked the UDC classifier as the optimal choice.
Notably, the system ranked Fisher’s classifier third, ahead of classifiers with higher accuracy, because it prioritised Fisher’s higher sensitivity (57.5%) and lower Type II error rate relative to those classifiers; both are critical for reducing missed diagnoses in a clinical setting. Accuracy alone is therefore an insufficient metric for evaluating classifiers in complex, multi-class medical diagnostic scenarios. Our proposed decision support framework provides a robust and automated methodology for selecting the most clinically relevant classifier by systematically balancing multiple performance indicators. This approach enhances the transparency and reliability of machine learning in clinical decision-making and contributes to the development of more effective, deployable diagnostic tools for neurodegenerative diseases.
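
The idea of weighting evaluation metrics and ranking classifiers can be illustrated with a minimal sketch. The abstract does not specify the decision-making method, so the plain weighted sum below is an assumption chosen for simplicity, not the framework used in the study. The metric values are the ones reported above for UDC and LDC; the weights and the function name `rank_classifiers` are hypothetical, with sensitivity weighted most heavily to reflect a clinical priority on avoiding missed diagnoses.

```python
from typing import Dict, List, Tuple

# Metrics (in percent) as reported in the abstract for the two classifiers
# whose results are given in full.
METRICS: Dict[str, Dict[str, float]] = {
    "UDC": {"accuracy": 65.0, "precision": 86.4, "sensitivity": 63.0, "specificity": 70.0},
    "LDC": {"accuracy": 62.5, "precision": 85.7, "sensitivity": 60.0, "specificity": 70.0},
}

# Hypothetical clinical priorities (assumption): sensitivity counts double.
WEIGHTS: Dict[str, float] = {
    "accuracy": 0.2,
    "precision": 0.2,
    "sensitivity": 0.4,
    "specificity": 0.2,
}


def rank_classifiers(
    metrics: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> List[Tuple[str, float]]:
    """Return (classifier, score) pairs sorted from best to worst.

    The score is a weighted average of the metrics, with the weights
    normalised so that they sum to one.
    """
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    scores = {
        name: sum(weights[m] * values[m] for m in weights) / total
        for name, values in metrics.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


ranking = rank_classifiers(METRICS, WEIGHTS)
for position, (name, score) in enumerate(ranking, start=1):
    print(f"{position}. {name}: {score:.2f}")
```

Under these assumed weights UDC ranks above LDC, which agrees with the ordering reported above. Changing the weights changes the ranking criteria, which is how a classifier with lower accuracy but higher sensitivity can move ahead of a more accurate one.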