Deep Learning-based Optimized Model for Emotional Psychological Disorder Activities Identification in Smart Healthcare System
Abstract:
Accurately diagnosing emotional and psychological disorders is essential for prompt mental health interventions, especially in intelligent healthcare systems. This paper proposes a deep learning model that uses convolutional neural networks (CNN) and long short-term memory (LSTM) networks to classify emotional states based on physiological inputs like EEG and ECG. Bayesian optimisation improves the model's learning efficacy and generalisation ability by adjusting hyperparameters. In comparison to conventional machine learning models such as Support Vector Machines (SVM), random forest, and standalone deep learning models (CNN and LSTM), the proposed CNN-LSTM architecture increases classification accuracy by 25%, to 92.1%. Its performance is further demonstrated by an AUC-ROC score of 0.96, precision of 0.93, recall of 0.91, and F1-score of 0.92. These results show that the model can distinguish between several emotional states, including neutral, stressed, and anxious. A real-time application is used to investigate the potential of wearable EEG-based brain-computer interface (BCI) devices for continuous emotional monitoring. The findings indicate that the proposed framework might be a helpful tool for the early detection and tailored management of mental health conditions in intricate healthcare environments.
1. Introduction
Mental health problems like depression, anxiety, and stress have become a major concern around the world, affecting millions of people. Many emotional and psychological disorders are increasing, making it important to develop better tools to detect and treat them early. Nowadays, doctors mainly rely on patients describing their symptoms, psychological tests, and clinical evaluations to diagnose these issues. While these methods are useful, they can sometimes be unreliable because they depend on how patients express their feelings, take a lot of time, and can lead to mistakes in diagnosis. With the growth of artificial intelligence and deep learning, computers are now being used to help identify mental health disorders automatically.
A comprehensive review of existing methodologies, including machine learning and deep learning approaches, for mental health diagnosis further highlights this growing trend and the advancements in the field (Iyortsuun et al., 2023). Deep learning models, like Convolutional Neural Networks and Long Short-Term Memory networks, have been very effective in studying different body patterns linked to mental health problems. CNNs are good at analysing images and sensor data, while LSTMs are great at finding patterns in time-based data, such as brain signals from EEG, heart rate changes, and facial expressions. By combining smart healthcare technology with deep learning, mental health diagnosis could become much better. This would allow real-time tracking, personalized treatment, and early intervention. Emerging approaches, such as reinforcement learning with multimodal emotion recognition, are also showing promise in promoting mental health (Pathirana et al., 2024). The use of multimodal data fusion, for instance, from educational settings, is also gaining traction for detecting students' mental health concerns (Guo et al., 2022). Similarly, deep learning systems leveraging real-time bio signals have been developed for predicting various health conditions, including stroke disease, showcasing the broader applicability of such approaches in smart healthcare (Choi et al., 2021). Using Bayesian optimization in deep learning also helps improve accuracy, make training models faster, and prevent errors from happening too often.
This research is important because there is a big need for easy and effective ways to detect emotional and mental health problems early. Doctors mostly depend on talking to patients and personal opinions to diagnose these issues. But this may not work for everyone, especially for those who feel ashamed, don’t know they have a problem, or find it hard to talk about their feelings. Wearable health devices, smart monitoring systems, and AI-based tools can help by tracking emotions regularly and providing useful data for better mental health care. AI-driven mental health diagnostics are gaining interest, but several challenges make it difficult to use them widely. One major issue is that mental health conditions are very complex and vary a lot from person to person, so it is hard to create a single solution that works for everyone. Another problem is the lack of high-quality datasets. Many of the available mental health data collections are unbalanced, which can cause AI models to make biased predictions. Additionally, deep learning models are often difficult to understand, which makes it hard for doctors and mental health professionals to trust their decisions. These AI models also require much computing power, making real-time applications difficult. On top of that, using AI for mental health raises serious privacy and ethical concerns. Mental health data is very sensitive, and there are issues related to security, user consent, and privacy protection.
This study presents a novel approach that combines convolutional neural networks (CNN) for extracting essential features with long short-term memory (LSTM) networks for identifying temporal patterns, thereby enhancing detection accuracy. The proposed model incorporates automated hyperparameter tuning, which improves overall efficiency and minimizes the need for extensive computational resources. It is specifically designed for integration into IoT-based health monitoring systems, enabling continuous tracking of emotional health. The model has been validated using real EEG and physiological datasets, demonstrating superior performance compared to traditional machine learning techniques. Additionally, the research includes the implementation of privacy-preserving mechanisms to ensure patient data protection while maintaining the effectiveness of AI-based diagnostic outcomes.
2. Literature Review
Emotional and mental health problems are a growing concern in healthcare, making it important to develop automated systems for early detection and treatment. Traditional methods for diagnosing mental health issues depend on clinical interviews, self-reported questionnaires, and personal evaluations, which can sometimes be biased and take time. Using deep learning in smart healthcare has shown great potential to improve the accuracy, speed, and reach of mental health diagnosis. This review looks at studies on deep learning methods for detecting emotional disorders, how they are used in smart healthcare, and the main challenges involved.
Early models for detecting emotional and psychological disorders used basic machine learning methods like Support Vector Machines (SVM), Random Forest (RF), and K-Nearest Neighbours (k-NN). These models needed experts to manually find important features from speech, facial expressions, brain and heart signals (EEG, ECG), and text data. For example, Gupta et al. (2020) used ML and DL to study brain signals and identify depression, achieving higher accuracy than traditional approaches. In another study, Chatterjee et al. (2021) used language processing techniques to analyse social media posts and detect early signs of depression and anxiety. While these machine learning models worked well, they depended too much on manually selected features and often struggled to generalise to real-life situations.
Deep learning has dramatically helped in understanding emotions and detecting psychological disorders. Convolutional Neural Networks (CNNs) are great for recognizing emotions from images and speech. For example, Zhao et al. (2022) and Almutairi et al. (2020) used CNNs to study facial expressions and found they could detect depression with 92% accuracy. Wu et al. (2021) used a CNN model to examine audio recordings and determine stress levels using spectrograms. Recurrent Neural Networks (RNNs) and Long Short-Term Memory (LSTM) networks understand time-based data well. This makes them useful for analysing signals from the body, like EEG and ECG, to detect emotional issues. Patel et al. (2023) created an LSTM model that identified anxiety and depression from EEG signals with 89.4% accuracy. Similarly, Ahmed et al. (2022) combined Bi-LSTM with an attention mechanism to detect emotional disorders from text records in electronic health systems. In the field of speech analysis, Vamsinath et al. (2022) demonstrated the effectiveness of machine learning for stress detection.
Recent research has introduced advanced deep learning models combining techniques like CNNs, LSTMs, and Transformers to help identify psychological disorders. For example, Kim et al. (2023) created a model that used both CNN and LSTM to analyse facial expressions, speech, and brain activity (EEG) to detect bipolar disorder. Their model was highly accurate, achieving a 94.2% success rate. Another study by Xu et al. (2023) used a Transformer-based method to monitor mental health in real time. Their approach significantly reduced the number of false alarms, making it more reliable. These new models are powerful because they can process multiple data types and learn patterns over time, making them ideal for real-time healthcare applications.
Even though deep learning has been successful in mental health applications, there are still many challenges. One major issue is the lack of enough labelled data on psychological disorders. Many datasets are small, and the data is often imbalanced, making the models biased and unreliable. Another big problem is that deep learning models work like black boxes, meaning they make decisions in ways that are hard to understand or explain, especially in medical applications where clarity is essential. Privacy is also a serious concern because psychological health data is very sensitive. It must be handled carefully, and strict rules like GDPR and HIPAA must be followed to protect people's personal information.
Additionally, deep learning models require a lot of computing power, which makes it challenging to use them on smaller devices for real-time smart healthcare applications. In recent years, the use of deep learning to identify emotional and psychological disorders in smart healthcare has gained considerable attention. Researchers have explored various methods and datasets to improve the accuracy with which these conditions can be diagnosed and monitored in real time. Many studies have examined different ways to achieve better results in this field.
One study by Le et al. (2021) used CNN and LSTM models on EEG signals to identify emotional distress and found that their method performed better than traditional approaches. Similarly, Kim et al. (2020) developed a hybrid CNN-LSTM model for detecting depression and achieved high accuracy using physiological signals. Snoek et al. (2012) introduced Bayesian optimization to improve deep learning models, making them less complex and more efficient. Artificial intelligence has also been used in mental health monitoring systems. Chen et al. (2019) created an AI-based system that combined deep learning with IoT devices to detect mental health issues in real time. EEG signals have also been used to assess mental health, with Li et al. (2022) applying deep learning to detect stress and anxiety, showing better results than traditional machine learning.
Sentiment analysis has also played a role in psychological diagnosis. Hassan et al. (2021) used natural language processing and deep learning to analyse text and detect psychological conditions. Similarly, Zhang et al. (2020) showed that schizophrenia could be detected in its early stages by applying CNNs to MRI scans, improving accuracy. Feature extraction techniques have also been explored, with Sun et al. (2019) investigating how deep autoencoders could be used for emotion recognition. Wearable technology has also contributed to mental health research. Wang et al. (2021) studied how wearable sensors could collect data that deep learning models could process to detect stress in real time. Patel et al. (2021) focused on creating a personalized mental health tracking system using AI, mobile applications, and cloud computing.
Deep learning has also been applied to voice and facial expression analysis to detect depression. Singh et al. (2020) used CNNs to classify depression based on voice and facial expressions, achieving a high F1 score. AI-powered chatbots have also been developed for psychological support, with Park et al. (2022) introducing a chatbot that used deep learning to provide real-time support and early intervention. Researchers have also focused on making AI more transparent in decision-making. Xie et al. (2020) worked on explainable AI models to help users understand how AI identifies psychological disorders. Similarly, Miller et al. (2021) analysed social media text with deep learning to predict suicide risk. A comparison between machine learning and deep learning in mental health was conducted by Khan et al. (2019), who found that deep learning performed better in identifying mental health disorders.
Speech analysis has also been used in this field. Sharma et al. (2021) applied deep learning to analyse speech patterns to detect early signs of anxiety and depression. Deep reinforcement learning has also been explored for mental health treatments. Yu et al. (2021) developed a system to optimize patient intervention strategies. Meanwhile, multimodal fusion approaches have been studied by Gupta et al. (2020), who combined EEG, facial expressions, and speech data to improve diagnosis accuracy. Choi et al. (2019) worked on real-time EEG signal processing using deep learning to monitor mental health conditions. Lastly, ethical concerns have also been considered. Brown et al. (2022) discussed the challenges and privacy concerns of using AI in mental health diagnosis, highlighting the importance of ethical considerations in this growing field. Furthermore, a comprehensive review of AI in mental health has highlighted the technological advancements and ethical issues prevalent in psychiatry (Poudel et al., 2025).
3. Problem Statement and Dataset Description
Mental health issues, including emotional and psychological conditions, are becoming more common worldwide. It is essential to identify these problems early and accurately. Traditional methods rely on clinical assessments, self-reported questionnaires, and occasional evaluations. However, these approaches can delay diagnosis, lead to inconsistent results, and make treatment less effective. They also do not allow for real-time monitoring or personalized care for people struggling with mental health challenges. With advancements in smart healthcare, deep learning has become a valuable tool for analysing brain signals, such as EEG and other bio signals, to detect emotional and psychological disorders. However, current machine learning and deep learning models have some limitations. They struggle to extract essential features efficiently, fail to recognize patterns over time, and do not always have well-tuned settings, which reduces accuracy and increases processing time.
To overcome these problems, this study introduces an improved deep-learning model for identifying emotional and psychological disorders in smart healthcare systems. It uses a combination of two techniques: CNN and LSTM. The CNN extracts essential patterns from the data, while the LSTM helps understand changes over time. To further improve performance, Bayesian optimization is used to fine-tune the model’s settings, making it more efficient and reducing unnecessary processing. Even though deep learning models have great potential, they still need improvements to work effectively in real-time healthcare monitoring. This research aims to create a system that is more accurate, reliable, and capable of providing real-time mental health assessments. By testing the model on well-known datasets, this study hopes to show that it performs better than traditional machine learning methods. Ultimately, this approach could help make mental health diagnoses faster, more accurate, and better suited to individual needs.
The EEG dataset was recorded from 16 healthy individuals (aged 22 to 30) who had never used a BCI. EEG data was acquired using 16 active electrodes (g.USBamp system), sampled at 256 Hz, with the electrodes positioned in compliance with the international 10-20 standard. As part of a visual speller task designed to elicit emotional reactions (Neutral, Stressed, Anxious), each target letter was shown in five randomly selected stimulus sequences. The stimulus timing used a 150 ms inter-stimulus delay and a 250 ms stimulus onset asynchrony. To remove artefacts, EEG data were pre-processed and separated into tagged epochs. To create a structured dataset for training the CNN-LSTM model for emotional state classification, both temporal and frequency domain characteristics were extracted (https://ieee-dataport.org/documents/event-related-potentials-p300-eeg-bci-dataset).
4. Proposed Model
Smart healthcare systems have greatly improved by using advanced artificial intelligence to help find and diagnose emotional and mental health problems. In the past, doctors had to rely on their own judgment, sometimes leading to mistakes or treatment delays. To solve this problem, this study introduces a deep learning model that combines two methods: Convolutional Neural Networks (CNNs) to pick out essential details from data and Long Short-Term Memory (LSTM) networks to understand patterns over time. By working together, these methods make it easier to detect emotional and mental health issues quickly and accurately. The overall structure of the proposed model is shown in Figure 1.


The process started with collecting and preparing the data. EEG and other physiological signals were taken from well-known datasets like DEAP, SEED, or AMIGOS. These datasets include brain activity recorded while individuals experienced different emotions. The raw signals contained unwanted noise, so a bandpass filter was used to clean them. The cleaned signals were divided into smaller segments to help the model process them better. Standard normalization techniques, like z-score normalization or min-max scaling, were applied to ensure all features were on the same scale, helping the model train more efficiently.
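The segmentation and normalization steps described above can be illustrated with a minimal sketch. It assumes a single already-filtered channel, uses the 256 Hz sampling rate from the dataset description, and picks a hypothetical 1-second window with 50% overlap, since the text does not state the window length. The bandpass filter is left out because its passband is not specified. The names `zscore` and `segment` are illustrative only.

```python
from statistics import mean, pstdev

def zscore(samples):
    """Scale one channel to zero mean and unit variance."""
    mu = mean(samples)
    sd = pstdev(samples)
    if sd == 0:
        return [0.0 for _ in samples]  # a constant signal carries no variance to scale
    return [(s - mu) / sd for s in samples]

def segment(samples, window, step):
    """Split a 1-D signal into fixed-length windows, dropping any trailing partial window."""
    return [samples[i:i + window] for i in range(0, len(samples) - window + 1, step)]

FS = 256          # sampling rate in Hz, from the dataset description
WINDOW = FS       # hypothetical 1-second window (assumption)
STEP = FS // 2    # hypothetical 50% overlap (assumption)

# Synthetic stand-in for three seconds of one filtered EEG channel.
signal = [float(i % 7) for i in range(3 * FS)]
normalized = zscore(signal)
epochs = segment(normalized, WINDOW, STEP)
print(len(epochs), len(epochs[0]))
```

In a multichannel setting the same normalization would typically be applied per channel, so that electrodes with larger amplitudes do not dominate training.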
During training, the pre-processed data was fed into the CNN-LSTM model in small batches. The model learned by adjusting its internal parameters using backpropagation and gradient descent, which reduced errors over time. Training lasted for a fixed number of cycles (epochs), but early stopping prevented overtraining if no further improvements were observed. The dataset was divided into 80% for training and 20% for validation, ensuring the model could generalize well.
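As a rough illustration of the 80/20 split and the early-stopping rule, the sketch below uses plain Python. The patience value, the random seed, and the validation losses are hypothetical; the text does not report the settings actually used.

```python
import random

def train_val_split(items, train_fraction=0.8, seed=0):
    """Shuffle the items and split them into training and validation parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

def early_stopping_epoch(val_losses, patience=3):
    """Return the 1-based epoch at which training stops.

    Training stops once the validation loss has failed to improve
    for `patience` consecutive epochs.
    """
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return len(val_losses)

train, val = train_val_split(range(100))
# Hypothetical validation losses: improvement stalls after epoch 3.
losses = [0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.5]
stop = early_stopping_epoch(losses, patience=3)
print(len(train), len(val), stop)
```

With these example losses, training halts before the late improvement at epoch 7 is seen, which shows the trade-off involved in choosing the patience value.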
Convolutional neural networks (CNNs) were used in this study specifically for feature extraction rather than classification, since they have shown the capacity to learn spatial hierarchies from structured input, such as physiological data and EEG. Even though EEG data is mostly temporal, it offers rich spatial information since separate electrodes (channels) detect activity from different areas of the brain. Because CNNs can detect local activation patterns and spatial correlations across several electrode channels, they are well suited to modelling spatial dependencies in multichannel EEG data.
CNNs can also efficiently extract frequency-domain features when the raw or pre-processed input incorporates time-frequency representations (such as spectrograms or wavelet transforms). This enables them to recognise discriminative patterns across emotional states, including reduced alpha activity or increased beta activity associated with anxiety or stress. CNN layers, in contrast to conventional classifiers, function as automatic feature extractors, eliminating the need for manually created features and facilitating end-to-end learning. The extracted features are then fed to the LSTM network, which is well suited to capturing temporal dynamics and long-term relationships that span time frames. This division of work, with the CNN handling spatial feature extraction and the LSTM handling temporal modelling, improves the model's capacity to learn the intricate spatiotemporal patterns that are essential for emotion recognition.
To get the best results, hyperparameters such as learning rate, batch size, the number of CNN filters, and LSTM hidden units were optimized using Bayesian Optimization. This method tested different values and found the best combination to improve accuracy and reduce unnecessary calculations. The model was trained using the Adam optimizer, which adjusts learning rates automatically to make training smoother. A dropout technique was also applied to prevent overfitting so the model could perform well on new data.
CNNs process the input data, such as Electroencephalogram (EEG) signals or physiological indicators, by extracting spatial features through convolutional layers. Given an input image or signal matrix X, the convolution operation can be mathematically represented as:
$$F_{i,j}^{l} = \sigma\left(\sum_{m}\sum_{n} W_{m,n}^{(l)}\, X_{i+m,\,j+n} + b^{(l)}\right)$$
where: $F_{i, j}^l$ denotes the feature map at layer l, $X$ denotes the input to the layer, $W_{m, n}^{(l)}$ denotes the convolutional kernel (filter), $b^{(l)}$ denotes the bias term and $\sigma(.)$ denotes the activation function.
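The convolution described above can be made concrete with a small pure-Python sketch. It assumes a single kernel, no padding, stride 1, and ReLU as the activation; the text does not state which activation was used, so that choice is an assumption. The input rows can be read as electrode channels and the columns as time samples, which is also an illustrative convention rather than the paper's stated layout.

```python
def relu(x):
    """Rectified linear unit, used here as an assumed activation function."""
    return x if x > 0 else 0.0

def conv2d_valid(X, W, b, activation=relu):
    """Slide kernel W over X (no padding, stride 1), add bias b, apply the activation."""
    kh, kw = len(W), len(W[0])
    out_h = len(X) - kh + 1
    out_w = len(X[0]) - kw + 1
    F = []
    for i in range(out_h):
        row = []
        for j in range(out_w):
            s = b
            for m in range(kh):
                for n in range(kw):
                    s += W[m][n] * X[i + m][j + n]
            row.append(activation(s))
        F.append(row)
    return F

# Toy input: 3 "channels" by 4 "time samples" (made-up values).
X = [[2, 1, 0, 3],
     [1, 0, 4, 1],
     [0, 2, 1, 5]]
W = [[1, -1],
     [0, 1]]
F = conv2d_valid(X, W, b=0.5)
print(F)  # [[1.5, 5.5, 0.0], [3.5, 0.0, 8.5]]
```

The two zeros in the output are positions where the weighted sum was negative and the activation clipped it, which is how the layer keeps only the patterns the kernel responds to.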
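To analyse temporal dependencies, the extracted features are then fed into an LSTM network. The LSTM gate and cell state updates are given by:
$$f_t = \sigma\left(W_f [h_{t-1}, x_t] + b_f\right)$$
$$i_t = \sigma\left(W_i [h_{t-1}, x_t] + b_i\right)$$
$$\tilde{C}_t = \tanh\left(W_C [h_{t-1}, x_t] + b_C\right)$$
$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t$$
$$o_t = \sigma\left(W_o [h_{t-1}, x_t] + b_o\right)$$
$$h_t = o_t \odot \tanh(C_t)$$
where: $f_t$ is the forget gate, $i_t$ is the input gate and $o_t$ is the output gate, $C_t$ is the memory cell state, $\tilde{C}_t$ is the candidate cell state, $W_f, W_i, W_C, W_o$ are the weight matrices, $b_f, b_i, b_C, b_o$ are the biases, $h_t$ is the hidden state at time t, $x_t$ is the input at time t, $\odot$ denotes element-wise multiplication and $\sigma(.)$ is the sigmoid activation function.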
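A single LSTM time step in the standard formulation can be sketched in pure Python as follows. This is an illustration of the gate arithmetic only, not the trained network: the parameter dictionary layout, the function names, and the all-zero weights in the example are assumptions chosen so the result is easy to check by hand.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def affine(W, v, b):
    """Compute W @ v + b for a list-of-lists matrix W."""
    return [sum(w * x for w, x in zip(row, v)) + bi for row, bi in zip(W, b)]

def lstm_step(x_t, h_prev, C_prev, params):
    """One LSTM time step acting on the concatenated vector [h_prev, x_t]."""
    z = h_prev + x_t  # list concatenation
    f_t = [sigmoid(a) for a in affine(params["W_f"], z, params["b_f"])]      # forget gate
    i_t = [sigmoid(a) for a in affine(params["W_i"], z, params["b_i"])]      # input gate
    C_tilde = [math.tanh(a) for a in affine(params["W_C"], z, params["b_C"])]  # candidate state
    o_t = [sigmoid(a) for a in affine(params["W_o"], z, params["b_o"])]      # output gate
    C_t = [f * c + i * g for f, c, i, g in zip(f_t, C_prev, i_t, C_tilde)]
    h_t = [o * math.tanh(c) for o, c in zip(o_t, C_t)]
    return h_t, C_t

# Hypothetical parameters: hidden size 1, input size 1, all weights and biases zero,
# so every gate equals sigmoid(0) = 0.5 and the candidate state equals tanh(0) = 0.
zero = {name: [[0.0, 0.0]] for name in ("W_f", "W_i", "W_C", "W_o")}
zero.update({name: [0.0] for name in ("b_f", "b_i", "b_C", "b_o")})
h_t, C_t = lstm_step(x_t=[0.3], h_prev=[0.1], C_prev=[2.0], params=zero)
print(C_t, h_t)
```

With these zero parameters the forget gate halves the previous cell state (2.0 becomes 1.0) and nothing new is written, which isolates the role of each gate.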
This work used Bayesian optimisation to enhance classification performance by changing key CNN-LSTM architecture hyperparameters. The following hyperparameters were among those that were optimised:
Learning rate: controls the step size during gradient descent; improper settings might cause slow convergence or overshooting.
Batch size: affects training stability and generalisation; smaller batches produce noisier gradient estimates, whereas larger batches improve stability.
Number of CNN filters: determines how well the model can extract complex spatial information from EEG inputs.
Number of LSTM hidden units: impacts the model's ability to detect temporal relationships in the sequential data.
Dropout rate: randomly deactivates units during training to prevent overfitting.
These parameters were selected for optimisation as they directly affect the model's representational capacity, learning dynamics, and generalisation performance. Bayesian optimisation explores the hyperparameter space while balancing exploration and exploitation by using a surrogate probabilistic model. This approach produced notable performance improvements; the accuracy of the optimised model increased from 88.9% to 92.1%. Bayesian optimisation thus improves model performance by fine-tuning the hyperparameters. The optimization goal is:
$$\theta^{*} = \arg\max_{\theta} f(\theta)$$
where: $\theta$ denotes the hyperparameters and $\mathrm{f}(\theta)$ denotes the performance metric. In the Bayesian framework, a Gaussian process (GP) models $\mathrm{f}(\theta)$, and its posterior distribution is repeatedly updated from the prior as new observations are made.
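To make the surrogate-model step more concrete, the following block writes out the standard Gaussian process posterior and one common acquisition function, expected improvement. The text does not state which kernel or acquisition function was used, so expected improvement and the zero-mean prior are assumptions made for illustration. Here $k$ is the kernel, $K_n$ is the kernel matrix over the $n$ hyperparameter settings tried so far, $k_n(\theta)$ is the vector of kernel values between $\theta$ and those settings, $y_n$ collects the observed scores, $\tau^2$ is the observation-noise variance, $f^{+}$ is the best score observed so far, and $\Phi$ and $\phi$ are the standard normal CDF and PDF.

```latex
\begin{align}
  % GP posterior mean and variance at a candidate setting \theta
  \mu_n(\theta) &= k_n(\theta)^{\top} \left(K_n + \tau^2 I\right)^{-1} y_n \\
  s_n^2(\theta) &= k(\theta,\theta) - k_n(\theta)^{\top} \left(K_n + \tau^2 I\right)^{-1} k_n(\theta) \\
  % Expected improvement over the best observed score f^{+} (maximisation, s_n(\theta) > 0)
  z &= \frac{\mu_n(\theta) - f^{+}}{s_n(\theta)} \\
  \mathrm{EI}(\theta) &= \left(\mu_n(\theta) - f^{+}\right)\Phi(z) + s_n(\theta)\,\phi(z) \\
  % The next setting to evaluate
  \theta_{n+1} &= \arg\max_{\theta}\; \mathrm{EI}(\theta)
\end{align}
```

The first term of the expected improvement favours settings predicted to score well (exploitation), while the second favours settings where the surrogate is uncertain (exploration).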
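After training, the model’s performance was tested using accuracy, precision, recall, and F1-score. It was evaluated on unseen data to measure how well it could predict emotional disorders. A confusion matrix was used to check where the model made mistakes. The metrics are defined as:
$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
$$\text{Precision} = \frac{TP}{TP + FP}$$
$$\text{Recall} = \frac{TP}{TP + FN}$$
$$\text{F1-score} = \frac{2 \times \text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
where: TP, TN, FP, and FN represent true positives, true negatives, false positives, and false negatives.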
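The evaluation metrics can be computed from confusion-matrix counts as in the short sketch below. The counts are hypothetical and describe a single class treated one-vs-rest, which is an assumption, since the text does not say how the metrics were aggregated across the emotional states. The last lines check that the F1-score follows from the precision (0.93) and recall (0.91) reported for the proposed model.

```python
def classification_metrics(tp, tn, fp, fn):
    """Return accuracy, precision, recall and F1-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1

# Hypothetical counts for one class (not taken from the paper).
acc, prec, rec, f1 = classification_metrics(tp=90, tn=85, fp=10, fn=15)
print(round(acc, 3), round(prec, 3), round(rec, 3), round(f1, 3))

# Consistency check on the reported values: harmonic mean of 0.93 and 0.91.
reported_f1 = 2 * 0.93 * 0.91 / (0.93 + 0.91)
print(round(reported_f1, 2))  # 0.92
```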
The experiment was designed to test how well a deep learning model can identify emotional disorders using a combination of CNN and LSTM. The study used publicly available datasets containing signals from the brain and body, such as EEG and ECG, which help understand emotional states. These datasets were processed and labelled to differentiate between emotions and psychological conditions, allowing the model to learn patterns effectively.
To test how this model would work, a live setup was created using wearable brain-computer interface (BCI) devices. These devices captured real-time EEG signals, which were processed and analysed instantly by the CNN-LSTM model. The emotional states were displayed on a live dashboard, allowing for real-time monitoring. This experiment demonstrated that the model could be used in healthcare settings, helping psychologists and mental health professionals track emotional disorders more effectively.
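One way such a live pipeline could buffer incoming samples is sketched below. The text does not describe the streaming implementation, so everything here is hypothetical: the `SlidingWindow` class, the 256-sample window with a 64-sample step, and the `classify` placeholder, which stands in for the trained CNN-LSTM and always returns the same label.

```python
from collections import deque

class SlidingWindow:
    """Buffer a sample stream and emit overlapping fixed-length windows."""

    def __init__(self, size, step):
        self.size = size
        self.step = step
        self.buffer = deque(maxlen=size)  # oldest samples fall out automatically
        self.since_last = 0

    def push(self, sample):
        """Add one sample; return a full window every `step` samples, otherwise None."""
        self.buffer.append(sample)
        self.since_last += 1
        if len(self.buffer) == self.size and self.since_last >= self.step:
            self.since_last = 0
            return list(self.buffer)
        return None

def classify(window):
    # Hypothetical placeholder for the trained CNN-LSTM model.
    return "Neutral"

stream = SlidingWindow(size=256, step=64)  # assumed: 1 s window at 256 Hz, new label every 0.25 s
labels = []
for t in range(512):                       # two seconds of synthetic samples
    window = stream.push(float(t))
    if window is not None:
        labels.append(classify(window))
print(len(labels))
```

A shorter step gives the dashboard more frequent updates at the cost of more model evaluations per second, which matters on resource-limited wearable hardware.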
5. Results and Discussion
The proposed CNN-LSTM model for emotional disorder identification in smart healthcare systems was evaluated against traditional machine learning models and standalone deep learning architectures. The simulation results of the proposed approach are discussed in this section with graphical and numerical results in detail.
Figure 2 shows how the model's accuracy improves over time during training. The yellow line represents the training accuracy, and the red dashed line represents the validation accuracy. As the number of epochs increases, both accuracies improve, meaning the model is learning well. By the end of training, the validation accuracy is close to the training accuracy, indicating that the model performs well on new data.

Figure 3 shows how the model's loss decreases over time during training. The yellow line represents the training loss, and the red dashed line represents the validation loss. Both losses start high and gradually decrease as the number of epochs increases, meaning the model is learning well. Since the validation loss follows the training loss closely, the model does not overfit and generalizes well to new data.

The model's performance was measured using accuracy, precision, recall, and F1-score, as shown in Table 1. The results demonstrate that the CNN-LSTM model significantly outperforms other methods, achieving the highest accuracy of 92.1%, along with superior precision (0.93), recall (0.91), and F1-score (0.92).
Model | Accuracy (%) | Precision | Recall | F1-Score |
SVM | 78.5 | 0.79 | 0.77 | 0.78 |
Random Forest | 81.2 | 0.82 | 0.80 | 0.81 |
CNN | 85.6 | 0.86 | 0.84 | 0.85 |
LSTM | 87.3 | 0.88 | 0.86 | 0.87 |
CNN-LSTM (Proposed) | 92.1 | 0.93 | 0.91 | 0.92 |
