Cybersecurity in Intelligent Transportation Systems:  A Comparative Study on AI-Based Anomaly Detection and  Threat Analysis

recep arslan; turgut özseven; metin mutlu aydın; yasin çelik

Outline

Open Access

Research article

Cybersecurity in Intelligent Transportation Systems: A Comparative Study on AI-Based Anomaly Detection and Threat Analysis

Recep Arslan¹^*

,

Turgut Özseven²

,

Metin Mutlu Aydın³

,

Yasin Çelik⁴

¹

Graduate School of Natural and Applied Sciences, Department of Computer Engineering, Tokat Gaziosmanpaşa University, 60250 Tokat, Turkey

²

Department of Computer Engineering, Faculty of Engineering and Architecture, Tokat Gaziosmanpaşa University, 60250 Tokat, Turkey

³

Department of Civil Engineering, Faculty of Engineering, Ondokuz Mayıs University, 55139 Samsun, Turkey

⁴

BRE Institute of Sustainable Engineering, School of Engineering, Cardiff University, CF24 3AA Cardiff, United Kingdom

Mechatronics and Intelligent Transportation Systems

|

Volume 5, Issue 1, 2026

|

Pages 11-30

https://doi.org/10.56578/mits050102

Received: 12-15-2025,

Revised: 02-16-2026,

Accepted: 02-27-2026,

Available online: 03-05-2026

View Full Article|

Download PDF

Abstract:

The rapid integration of technology, with increasing speeds, has transformed vehicles into cyber-physical systems by connecting them to each other Vehicle-to-Everything (V2X), significantly expanding the attack surface and leaving them vulnerable to network-based threats. Current cyber intrusion detection systems (CIDS) exhibit performance degradation due to significant class imbalance, limited resilience against adversarial attacks, and insufficient interpretability for security-critical environments. To overcome the identified issues in this study, we propose Hierarchical Classifier-Agnostic Boosted Stacking for Network Intrusion Detection (HCABS-NID), a hierarchical classifier-agnostic boosted stacking architecture for network intrusion detection in connected device ecosystems. The proposed framework adds the Synthetic Minority Over-sampling Technique for Nominal and Continuous features (SMOTENC)-based adaptive class balancing to increase minority attack detection and TreeSHAP to make it multi-level interpretable. As a hierarchical stacking strategy, a two-layer structure includes heterogeneous learners together with meta-learning, calibrated with LightGBM, XGBoost, CatBoost, and TabNet to take advantage of the complementary decision boundaries. Extensive experiments performed on the benchmark dataset from University of New South Wales Network-Based 15 (UNSW-NB15) should enhance generalization performance. HCABS-NID achieved 98.20% accuracy, 97.10% macro F1 score, and 0.989 macro Receiver Operating Characteristic Area Under the Curve (ROC-AUC), in contrast to the latest community-based methods found in the literature. The proposed model achieves 3.40 ms average inference latency, satisfying the real-time processing requirement of the V2X safety systems. Indeed, other analysis architectures show the same 96.8% accuracy at 5% corruption, which underscores their practicality. The results validate that hierarchical ensemble learning, with adaptive imbalance management artificial intelligence (AI) mechanisms, provides a sound, interpretable, and ready-to-use intelligent transportation security package.

Keywords: Network intrusion detection, Vehicle-to-Everything security, Connected vehicles, Hierarchical stacking ensemble, Explainable artificial intelligence, Adversarial robustness, Intelligent transportation systems, Internet of Things security

1. Introduction

In terms of the transportation industry, automotive industry and its derivatives are turning vehicles from mechanized units into integrated cyber-physical systems. To this end the paradigm shift is being rapidly reinforced by the extensive use of Vehicle-to-Everything (V2X) communication protocols such as Vehicle-to-Vehicle (V2V), Vehicle-to-Infrastructure (V2I), Vehicle-to-Pedestrian (V2P), and Vehicle-to-Network (V2N) [1], [2]. Built for connected vehicle systems, V2X tools can moderate autonomous operating systems and also improve autonomous driving for more comfort, safety, and energy efficiency. By 2025, the global connected vehicle market will be worth $166 billion. Although such growth is the key opportunity to provide efficiency and safety benefits for transportation, it also brings significant safety problems [3], [4].

Research in the field of connected vehicle safety has gained significant momentum over the past decade. Early demonstrations of remote vehicle exploitation have raised the urgency of automotive cybersecurity on a global scale. Researchers managed to control the vehicle remotely by using a vulnerability in the infotainment system of a Jeep Cherokee; this incident led to the recall of 1.4 million vehicles. Koscher et al. [5] experimentally demonstrated the security vulnerabilities of the in-vehicle Controller Area Network (CAN) bus; it has shown that the absence of authentication and encryption mechanisms poses serious threats. Checkoway et al. [6] have expanded this attack surface to include telematics, Bluetooth, and in-car entertainment systems. Nie et al. documented remote control vulnerabilities with their security analysis on Tesla Model S. Additional security analyses on production vehicles documented remote control vulnerabilities. Their scope of work clearly shows that connected vehicles need multi-layered security mechanisms. Modern connected vehicles include more than 100 Electronic Control Units (ECUs), more than 150 million lines of code, and a large number of wireless interfaces according to recent industry assessments [7]. This complexity significantly expands the attack surface. In-vehicle networks CAN, Local Interconnect Network, FlexRay, Ethernet have traditionally been designed without security in mind, making them vulnerable to modern cyber threats. Although V2X communication standards Dedicated Short-Range Communication/Wireless Access in Vehicular Environments (DSRC/WAVE), Cellular Vehicle-to-Everything (C-V2X) contain security mechanisms, there are vulnerabilities at the application level [8], [9]. Recent automotive cybersecurity assessments highlight a substantial increase in cyber incidents over recent years; this situation highlights the urgency of security solutions [7].

Traditional signature-based intrusion detection systems can only recognize known attack patterns; It is insufficient to detect zero-day attacks and polymorphic threats [10], [11], [12]. These systems require constant updating of attack signatures; it offers a reactive approach in the face of new threats. Machine learning methods have been proposed to overcome these limitations with anomaly detection systems [13], [14]. These different approaches can detect deviations by learning normal behavior systems. However, single-model approaches struggle to deliver consistent performance when faced with different attacks [15]. A particularly high-performance drop has been observed in the detection of low-level attack types (Worms, Shellcode, Backdoor) [16], [17]. The proposed community learning in this study offers a strategy to eliminate the limitations of single model proposals by combining the strengths of different learning paradigms. The success of community approaches is based on the diversity and complementarity of individual models [18]. Bagging [19], boosting [20], [21] and stacking [22] have shown promising results in network attack detection [23], [24], [25]. Gradient boosting decision trees such as XGBoost, LightGBM [26], and CatBoost [27], are among the most influential algorithms in this field. These algorithms correct previous errors in each iteration with sequential tree training; It provides high forecasting performance. On the other hand, TabNet [28] offers interpretable deep learning by adapting attention mechanisms to table data; performs attribute selection dynamically (Figure 1).

Figure 1. Vehicle-to-Everything (V2X) communication architecture and AI-based security system

Explainable AI (XAI) is a critical component in safety applications, ensuring the transparency of model decisions [29], [30]. With the proliferation of black box models, the need to understand and validate model decisions has increased. SHapley Additive exPlanations (SHAP) calculates feature contributions using Shapley values from game theory [31]. TreeSHAP provides optimized computational power for tree-based models and delivers more accurate values in polynomial time [32]. In security-centric operations (in SOCs), the explainability of machine learning models is an important requirement in terms of prioritizing generated alarms more accurately and quickly (alarm triage), investigating possible proactive attacks (threat hunting), and meeting legal/regulatory requirements [33], [34], [35]. The European Union (EU) General Data Protection Regulation (GDPR) grants the right to explanation in autonomous decision-making systems [36].

The main contributions of the research can be summarized as follows: (1) unique Hierarchical Classifier-Agnostic Boosted Stacking for Network Intrusion Detection (HCABS-NID) architecture that combines four different learning paradigms (LightGBM, XGBoost, CatBoost, TabNet) in a hierarchical community structure; (2) improving learning of rare attack types with Synthetic Minority Over-sampling Technique for Nominal and Continuous (SMOTENC)-based adaptive class balancing; (3) interpretability of model decisions at global, local, and classroom level with TreeSHAP integration; (4) evaluation of the reliability of the model under perturbations by adversarial durability analysis; (5) meeting V2X safety requirements with 3.40 ms inference latency. The rest of the article is organized as follows: In the second part, the theoretical foundations of network intrusion detection systems (NIDS) and related studies are discussed comprehensively. The third chapter details the research methodology and experimental design. In the fourth chapter, experimental findings are presented and conclusion and suggestions are summarized in the fifth chapter.

2. Theoretical Framework and Related Research

NIDS are security mechanisms which is designed to determine various activities in computer networks. In this section, the historical development of NIDS, machine learning-based applications, community learning paradigms, and XAI concepts are systematically discussed. The existing literature provides the positioning of the current study in the context of connected vehicle safety; reveals research gaps and original contributions.

2.1 Historical Evolution of Network Intrusion Detection Systems

The conceptual foundations of intrusion detection systems (IDS) are based on Anderson’s [37] work in security threat modeling. Anderson systematically classified threats originating from within and outside; suggested that abnormal user behavior could be detected by statistical methods. Within the scope of the study, he established the theoretical foundations of intrusion detection and guided the research of the following decades. Denning [10] developed the Intrusion Detection Expert System (IDES) model by concretizing these suggestions. IDES uses statistical techniques to create user activity profiles; It marks deviations from normal behavior as anomalies. This approach was the pioneer of anomaly-based attack detection and is still relevant today. The 1990s saw the rise of network-based intrusion detection. Heberlein et al. [38] introduced the concept of network layer monitoring with Network Security Monitor (NSM); this paradigm has formed the basis of today's NIDS architectures. NSM performs protocol analysis by capturing network packets; It looks for known attack patterns. Paxson [39] developed the Bro (now Zeek) network analysis framework, integrating protocol analysis with intrusion detection. Bro allows flexible rule definition with its policy-based detection mechanism. Roesch [40] presented the Snort open source intrusion detection system; This tool is still widely used. Vigna and Kemmerer [41] applied state transition analysis to network security with the NetSTAT system; It has made it possible to detect complex attack scenarios. The Event Monitoring Enabling Responses to Anomalous Live Disturbances (EMERALD) framework also advanced distributed event-correlation for intrusion response [42]. Benchmarking datasets have been critical in standardizing progress in the field. The The KDD Cup 1999 dataset provided the first comprehensive evaluation framework [43]. This study, supported by DARPA, includes 4.9 million network connection records and covers four key attack categories (Denial of Service (DoS), Probe, Remote-to-Local (R2L), User-to-Root (U2R)). However, this dataset has become outdated over time; it has been determined that it contains excess and artificial patterns [44]. When McHugh critiqued the methodological difficulties presented by the datasets, the community and researchers needed new datasets. Though the NSL-KDD dataset partially addressed these shortcomings, it remains inadequate to present conditions in today’s network traffic. The other generated datasets (Canadian Institute for Cybersecurity Intrusion Detection System 2017 (CICIDS2017) [45] and University of New South Wales Network-Based 15 (UNSW-NB15) [46]) are newer and more effective in real-time attacks on current attack types. With high data size and abundant training-test set, they are now recognized as industry standard benchmark datasets used for today's tools. For security research in any location or field where Internet of Things (IoT) devices are used, the Botnet Internet of Things (Bot-IoT) [47] and Telemetry and Operating Network Internet of Things (ToN-IoT）[48] datasets are specifically designed (Table 1).

Table 1. Comparison of publicly available intrusion detection datasets

Dataset	Year	Registration	Attribute	Class	Timeliness
KDD Cup 99	1999	4.9 M	41	5	Low
NSL-KDD	2009	150 K	41	5	Medium
CICIDS2017	2017	2.8 M	80	15	High
UNSW-NB15	2015	2.5 M	49	10	High
CSE-CIC-IDS2018	2018	16 M	80	7	High
Bot-IoT	2019	73 M	46	5	High
ToN-IoT	2021	22 M	44	10	High

Note: KDD Cup 1999: Knowledge Discovery and Data Mining Cup 1999; CICIDS2017: Canadian Institute for Cybersecurity Intrusion Detection System 2017; UNSW-NB15: University of New South Wales Network-Based 2015; CSE-CIC-IDS2018: Canadian Security Establishment – CIC Intrusion Detection System 2018; Bot-IoT: Botnet Internet of Things; ToN-IoT: Telemetry and Operating Network Internet of Things.

2.2 Machine Learning-Based Detection Approaches

The field of machine learning for intrusion detection dates from the late 1990s when researchers first realised the potential of data-driven methodologies in detecting malicious behaviour on networks [49], [50]. A key contribution was by Lee and Stolfo [51] who showed that data mining techniques—specifically sequential pattern analysis and association rule mining—can be systematically applied on raw network traffic for obtaining meaningful discriminative features. Their work set the stage for an early conceptual understanding that later became a cornerstone for much of the work that followed. When the field matured, Support Vector Machines (SVMs) arose as a popular system that can handle high-dimensional classification tasks [52], [53]. By transforming input features into a higher-dimensional space with the help of kernel functions, SVMs are capable of creating a non-linear decision boundary and generalising nicely in scenarios where labelled training samples are difficult to come by—an attribute of significant practical importance in security domains where attack instances are inherently rare. Decision trees and ensemble methods based on them became quite popular, mainly due to their interpretability and computational tractability, making them attractive for operational deployment [54], [55], [56]. The Random Forest algorithm, in particular, addressed a well-known limitation of individual decision trees by applying the bagging principle—training multiple trees on bootstrapped subsets of the data and aggregating their predictions—thereby substantially reducing variance without a corresponding increase in bias.Since the mid-2010s, deep learning (DL) approaches have been applied in intrusion detection [57]. Javaid et al. [58] achieved 88.39% accuracy in NSL-KDD on an autonomous system using supervised learning and sparse autoencoders. Related deep-learning IDS studies report consistent gains in real-time settings [59]. Autoencoders learn normal traffic patterns and detect anomalies through reconstruction error. Kim et al. [60] demonstrated the potential of long-short-term memory (LSTM) networks in time-series-based attack detection. LSTM can model sequential network traffic by capturing long-term dependencies; It is especially advantageous in detecting slow attacks. Wang et al. [61] processed with convolutional neural networks (CNN) by converting raw network traffic into visual representations; This approach has achieved over 99% accuracy. Hybrid models combine the strengths of traditional and deep learning approaches [62], [63], [64], [65]. CNN-LSTM hybrid architectures are able to capture spatial and temporal patterns simultaneously.

Transfer learning and pre-trained models constitute a new area of research in attack detection [66], [67]. Zhao et al. [66] performed model transfer between different network environments using domain adaptation techniques. Wu et al. [49] applied the federated learning approach to connected vehicle security. Blockchain-supported federated paradigms have also been explored for autonomous vehicles [68]. Federated learning protects data privacy with decentralized model training; it makes it possible for vehicles to train a common model without sharing their own data. Attention mechanisms are used to dynamically calculate feature importance [69], [70]. These mechanisms automatically learn which features the model should focus on. The basic model architectures and classifications used in network attack detection are explained below:

a. Sparse Autoencoder: It is a deep learning-based unsupervised architecture. Learns feature representations compressed by sparsity constraint. It compresses the input data into the bottleneck layer with the encoder and reconstructs it with the decoder. By learning normal traffic patterns, it detects anomalies from reconstruction errors. It is particularly effective in detecting zero-day attacks.

b. LSTM: It is a DL-based recurrent neural network (RNN) architecture. It is designed to model long-term temporal dependencies in ordinal data. It includes three gate mechanisms: forget gate, input gate and output gate. It is used to detect temporal patterns and attack sequences in network traffic.

c. CNN-LSTM Hybrid: A hybrid approach method which combines the strengths of two deep learning architectures. The CNN component extracts spatial features from used network traffic data. The LSTM component models temporal dependencies in these properties. This combination enhances detection performance by capturing both local patterns and sequential relationships.

d. DNN + Feature Selection: It is a combination of Deep Neural Network (DNN) and machine learning-based feature selection methods. Dimension reduction is implemented through methods such as Chi-square, Mutual Information, or Recursive Feature Elimination. Then, classification is done with multi-layer fully connected DNN. This approach allows for a focus on meaningful features while reducing computational cost [71].

e. Two-Level Stacking: It is a hierarchical method based on ensemble learning. It is not deep learning, but a meta-learning approach. At the first level, more than one base learner is trained. At the second level, the meta-learner makes the final decision by taking the predictions of the basic learners as input. It reduces variance and improves generalization performance.

f. Ensemble Learning: It is a machine learning paradigm that combines the predictions of multiple models. It does not involve deep learning, it uses classical machine learning models (Random Forest, gradient boosting, SVM). It includes three basic strategies: Bagging (bootstrap aggregating), Boosting (sequential learning), and Stacking (meta-learning). It compensates for the weaknesses of singular models in practice.

g. HCABS-NID (This Study): Hierarchical Cross-Attention Based Stacking for Network Intrusion Detection. It is a hybrid community architecture. It uses LightGBM, XGBoost, CatBoost (gradient boosting), and TabNet (attention-based DL) as core learners. It implements two levels of meta-learning: probability calibration with Logistic Regression at Level-1, ultimate incorporation with gradient boosting at Level-2. It provides explainability integration with TreeSHAP.

2.3 Community Learning and Hierarchical Stacking

Ensemble learning is a paradigm which combines the predictions of different models to overcome the limitations of singular models. The success of community approaches is based on the diversity and complementarity of individual models [18]. When various models exhibit different error patterns, combining the predictions reduces the overall error. Bagging (Bootstrap Aggregating) combines the predictions of parallel trained models with majority voting or averaging; prevents overfitting by reducing variance [19]. Random Forests provides strong generalization by applying the bagging principle to decision trees and is widely used in network attack detection [25]. Boosting trains models in a sequential manner; Each model focuses on the errors of the previous model. AdaBoost [20] provides iterative improvement by assigning higher weight to misclassified samples. This approach transforms weak learners into a strong community model.

Gradient Boosting [21] produces more powerful models by optimizing the gradient of the loss function. XGBoost improves gradient boosting with regularization terms and parallel computing. The terms L1 and L2 regularization prevent overfitting by controlling model complexity. LightGBM [26] provides computational efficiency with its histogram-based approach and leaf-based growth strategy. The histogram approach reduces memory usage and computational time by dividing continuous values into discrete boxes. CatBoost [27] uses special processing and sequential boosting for categorical features. Sequential boosting improves generalization performance by preventing target leakage. Stacking provides hierarchical aggregation with a meta-learner using the predictions of different models as input [22]. This approach offers the potential to combine the strengths of different learning paradigms. Hierarchical stacking architectures make it possible to learn more complex patterns with multi-layered model fusion. Tama and Rhee [24] achieved 96.2% accuracy on the UNSW-NB15 with two levels of stacking. Gao et al. [23] addressed the problem of class imbalance with the heterogeneous community model. Lazzarini et al. [72] propose stacking-ensemble deep learning for IoT intrusion detection. Class imbalance is a critical issue in network intrusion detection. Normal traffic is much higher than attack traffic; rare types of attacks are even less represented [17]. This limitation has also been discussed in the literature [73]. This imbalance causes standard machine learning algorithms to gravitate towards the majority class; resulting in low sensitivity for minority classes. SMOTE (Synthetic Minority Over-sampling Technique) reduces this imbalance by producing synthetic samples for minority class samples [74]. SMOTE creates new examples by interpolating between existing minority samples. SMOTENC is the SMOTE variant that provides special handling for categorical attributes; For categorical values, the mode value of the nearest neighbors is used. ADASYN [75] performs adaptive sampling by focusing on hard-to-learn samples. Cost-sensitive learning deals with misclassification costs in an asymmetrical way [76] (Table 2).

Table 2. Comparison of model architectures used in network intrusion detection

Model	Type	Deep Learning	Key Ingredients	Benefits
Sparse Autoencoder	Unsupervised DL	Yes	Encoder-Encoder, Sparsity	Anomaly detection, zero-day
LSTM	RNN (DL)	Yes	Forget/Entry/Exit Gates	Temporal dependence
CNN-LSTM Hybrid	Hybrid DL	Yes	Conv $+$ LSTM Layers	Spatial $+$ temporal
DNN $+$ Self. Selection	DL $+$ ML	Yes	Feature Selection $+$ MLP	Dimension reduction
Two-Level Stacking	Ensemble	No	Basic $+$ Meta Learners	Variance reduction
Community Learning	Ensemble	No	Bagging/Boosting/Stacking	Durability
HCABS-NID	Hybrid Community	Partially*	LightGBM $+$ XGB $+$ CatBoost $+$ TabNet	High performance $+$ XAI

Note: DL: Deep Learning; ML: Machine Learning; LSTM: Long Short-Term Memory; CNN: Convolutional Neural Network; DNN: Deep Neural Network; MLP: Multi-Layer Perceptron; LightGBM: Light Gradient Boosting Machine; XGB: eXtreme Gradient Boosting; CatBoost: Categorical Boosting; XAI: Explainable Artificial Intelligence; HCABS-NID: Hierarchical Classifier-Agnostic Boosted Stacking for Network Intrusion Detection.

The data preprocessing pipeline was implemented with the following steps: (1) Categorical attributes (protocol_type, service, state) were converted to numeric values by label encoding and categorical indexes were recorded. (2) The dataset was divided into stratified at the rate of 70% training and 30% testing. (3) SMOTENC was applied only to the training set, categorical indices were passed as parameters. (4) The required features after sampling were expanded with one-hot encoding and scaled with RobustScaler. No sampling or conversion was applied to the test set; only the parameters learned from the training set were used. The relevant approach prevents data leakage [17]. The effectiveness of this strategy is also supported in prior studies [74] (Table 3).

Table 3. Performance comparison of different models on intrusion detection datasets

Researchers	Year	Method	Dataset	Accuracy (%)
Javaid et al. [58]	2016	Sparse Autoencoder	NSL-KDD	88.39
Kim et al. [60]	2016	LSTM	KDD Cup 99	96.93
Zhou et al. [63]	2020	CNN-LSTM Hybrid	UNSW-NB15	94.25
Kasongo & Sun [71]	2020	DNN $+$ Self. Selection	UNSW-NB15	96.84
Jiang et al. [16]	2020	Hybrid Sampling	UNSW-NB15	95.12
Vinayakumar et al. [64]	2019	DNN	UNSW-NB15	93.48
Tama & Rhee [24]	2019	Two-Level Stacking	UNSW-NB15	96.20
HCABS-NID	2025	Hierarchical Community	UNSW-NB15	98.20

Note: LSTM: Long Short-Term Memory; CNN-LSTM Hybrid: Convolutional Neural Network – Long Short-Term Memory Hybrid; DNN: Deep Neural Network; DNN + Self. Selection: DNN with Self-Selection; HCABS-NID: Hierarchical Classifier-Agnostic Boosted Stacking for Network Intrusion Detection.

2.4 Explainable Artificial Intelligence and Model Interpretability

XAI is a set of methods that ensure human interpretability of model decisions [29], [30]. As the complexity of machine learning models increases, it becomes more difficult to understand and assess the reliability of decisions. Deep learning models, in particular, can contain millions of parameters, and decision processes remain opaque in many practical settings. XAI addresses this black box issue to enhance model transparency; It ensures that stakeholders have confidence in the model. Model-agnostic methods can be applied to any machine learning model. Local Interpretable Model-agnostic Explanations (LIME) explains individual predictions with locally interpretable models [77]. LIME trains a simple model (usually linear regression) by creating perturbation samples around the prediction point. Permutation significance analysis measures the change in model performance by random shuffling of features [54]. Complementary feature-importance analyses are also discussed in [78].

SHAP calculates feature contributions using Shapley values from game theory [31]. Shapley values fairly distribute the marginal contribution of each player in coalition games; This mathematical basis guarantees the consistency of the explanations. This approach is the only method that provides local accuracy, consistency, and missingness. Local accuracy means that the sum of the explanation is equal to the actual estimate; consistency refers to the logical change in feature importance in model changes; Deficiency guarantees that missing attributes receive zero contributions. TreeSHAP offers optimized SHAP computation for tree-based models, ensuring precise values in polynomial time [32]. While traditional SHAP computation has exponential complexity, TreeSHAP leverages tree structure to perform efficient computation. KernelSHAP uses a sampling-based approach for model-agnostic computation. DeepSHAP provides gradient-based SHAP computation for deep learning models. In security applications requiring regulatory compliance, explainability, alarm triage, and threat hunting are crucial. Analysts in SOC centers face thousands of alarms daily and explainable models are key to prioritizing these alarms. Model explanations enhance security teams by minimizing the false positive rate. This is allowing analysts to concentrate on real-time events. And to this point, the EU GDPR upholds the right to explanation in autonomous decision-making systems [36]. Article of the regulation requires that individuals have the right not to be subject to decisions based solely on automated processing. This has made algorithm transparency a legal obligation. Transparency of models has now become a legal requirement in healthcare, financial services, and critical infrastructure sectors.

2.5 Vehicle-to-Everything Security and Connected Vehicle Threats

V2X communication systems are exposed to numerous different security threats [79], [80]. These threats can be addressed in terms of confidentiality, integrity, and availability. This triad forms the fundamental principles of information security. In autonomous driving contexts, the SAE J3016 taxonomy also assists in defining operational scenarios and risk boundaries [81]. Authentication attacks (Sybil attacks) threaten network security by creating fake vehicle identities [82], By creating multiple fake identities, an attacker can spread misleading information about traffic volume, which may cause drivers to make incorrect route choices and vehicles to malfunction. Denial-of-Service (DoS) attacks overload communication channels, causing service interruptions [3]. In V2X services, DoS attacks can prevent the transmission of security messages, causing vehicles to crash. Message forgery and tampering attacks can manipulate critical security messages, creating dangerous situations. An example of this is fake brake warning messages, which can cause sudden braking and chain collisions.

Vehicle tracking data and data traffic as a way of privacy attack violate drivers’ privacy [83]. V2X messages contain information such as vehicle location, speed, and direction, and combining these data may reveal individuals’ movement patterns. Transmission attacks experienced in data transfer cause communication to be repeated, creating a false sense of security. Physical network layer attacks directly target wireless signals with jamming or spoofing. They can manipulate vehicle location (GPS) information to mislead navigation systems. In-vehicle network security is an important aspect of V2X security. GAN-based intrusion detection approaches have also been investigated for in-vehicle networks [84]. The CAN protocol was designed in the 1980s and was not developed with security in mind; it lacks authentication and encryption mechanisms [5], [6]. The corresponding situation allows any ECU to send fraudulent messages over the network. In-car entertainment systems (infotainment) create a remote attack surface [6]. These systems communicate with the outside world via internet connection, Bluetooth and USB interfaces; vulnerabilities can provide access to critical vehicle systems. Diagnostic port (OBD-II) access offers a physical attack vector; direct access to vehicle ECUs is possible through this port. Wireless switch systems are vulnerable to relay attacks [85]. By relaying the key signal, attackers can remotely unlock and steal vehicles. Autonomous driving systems may be vulnerable to manipulative sensor spoofing attacks [86]. LiDAR, radar, and camera sensors may be blinded by hostile attacks. This could cause autonomous vehicles to make incorrect movements and decisions.

3. Methodology

With the structure provided by the HCABS-NID framework, the design of the framework is organized as follows. The research method is outlined with four interdependent phase processes that form an overall experimental pipeline as follows: (i) dataset selection and preprocessing, encompassing class imbalance mitigation through SMOTENC-based adaptive resampling; (ii) hierarchical ensemble architecture design, integrating four heterogeneous base learners within a two-level meta-learning framework; (iii) training strategy, employing stratified k-fold cross-validation with out-of-fold prediction generation to prevent data leakage; and (iv) evaluation criteria, utilizing a multi-metric assessment protocol that combines threshold-dependent, threshold-independent, and chance-corrected measures. The process for each step has been explained adequately for full replication of the experimental pipeline, and the design aspects are justified with reference to the established methodological principles and the specific characteristics of real-time V2X intrusion detection. The overarching goal is to strike a balance between the best predictive performance, computational feasibility, and model interpretability—three needs often only discussed in isolation in the existing literature and at the same time to be addressed in the proposed framework.

3.1 Dataset and Preprocessing

The UNSW-NB15 dataset is used to build this study's evaluation framework [46]. This dataset is created by the Australian Cyber Security Centre (ACCS) via the IXIA PerfectStorm tool. It has normal traffic and nine types of attacks (Fuzzers, Analysis, Backdoor, DoS, Exploits, Generic, Reconnaissance, Shellcode, Worms). All in all 2,540,044 records and 49 attributes. There are flow attributes, basic attributes, content attributes, temporal attributes, and additional attributes. Abnormal IoT activity data set creation schemes provide complementary guidance for producing realistic evaluation data [87]. The training set contains 175,341 records and the test set contains 82,332 records [88]. This separation ensures the reliability of model evaluation.

In the data preprocessing phase, a systematic approach was followed. The missing values are filled with median substitution for each attribute; This method provides resilience against outliers and has minimal impact on data distribution. Categorical attributes (proto, service, state) have been transformed by one-hot coding; This transformation allows categorical information to be processed by numerical models. Numeric attributes are scaled using RobustScaler; this method uses median and interquartile range (IQR) to reduce the impact of outliers and produce more reliable results compared to standard scaling. Recursive Feature Elimination with Cross-Validation (RFECV) has been applied for attribute selection; 42 features were determined as optimal. Correlation analysis was performed to detect the problem of multiple linear dependencies; One of the feature pairs with a correlation of over 0.95 has been removed (Figure 2).

Figure 2. Class distribution in the UNSW-NB15 dataset

The problem of class imbalance has been addressed with the SMOTENC algorithm [74]. SMOTENC is the SMOTE variant that provides special handling for categorical attributes; For categorical values, the mode value of the nearest neighbors is used. When producing synthetic samples for minority classes, the nearest neighbor parameter k = 5 was used. Synthetic samples are created by interpolating between existing minority samples. The majority class was balanced by sub-sampling; random sub-sampling strategy was applied. The sampling strategy aimed for rare attack types (Worms: 174, Shellcode: 1,511, Backdoor: 2,329) to reach at least 5,000 instances. After balancing, the total training cluster size reached 250,000 samples (Table 4).

Table 4. UNSW-NB15 dataset class distribution and Synthetic Minority Over-sampling Technique for Nominal and Continuous (SMOTENC) post-sampling distribution (training set only)

Class	Original	Ratio (%)	After Sampling	New Rate (%)
Normal	93,000	37.19	50,000	20.00
General	58,871	23.55	35,000	14.00
Achievements	44,525	17.81	35,000	14.00
Tulus	24,246	9.70	25,000	10.00
Denial-of-Service (DoS)	16,353	6.54	20,000	8.00
Discovery	13,987	5.59	20,000	8.00
Analysis	2,677	1.07	15,000	6.00
Backdoor	2,329	0.93	15,000	6.00
Shellcode	1,511	0.60	15,000	6.00
Worms	174	0.07	10,000	4.00

Note: Distributions only belong to the TRAINING set; No sampling was applied to the TEST set.

3.2 Hierarchical Classifier-Agnostic Boosted Stacking for Network Intrusion Detection Architecture

Hierarchical Classifier-Agnostic Boosted Stacking for Network Intrusion Detection (HCABS-NID) is a hierarchical ensemble architecture that includes four basic learners and two levels of meta-learners. Core learners represent different learning paradigms: LightGBM [26], XGBoost, CatBoost [27], and TabNet [28]. This diversity increases the generalization capacity of the community model; enables the capture of different data patterns [18]. Each core learner has unique strengths and generates complementary predictions. This heterogeneity ensures that ensemble performance exceeds individual model performances. The first-level meta-learner produces combined predictions by taking the soft predictions of the basic learners as input. Each basic learner deduces probability vectors for 10 classes; A total of 40 dimensional feature vectors are formed. Logistic regression was chosen as the first-level meta-learner; This choice offers advantages in preventing overfitting and interpretability. Model complexity is controlled by applying L2 regularization ($C$ = 1.0). The second-level meta-learner performs final predictions using the first-level outputs and a subset of the original features (the 20 most important features). Gradient boosting was used as a second-level meta-learner; This choice offers the capacity to capture non-linear relationships (Figure 3).

Figure 3. Hybrid ensemble model architecture

LightGBM is a histogram-based gradient boosting algorithm with leaf-wise growth approach [26]. This method converges faster and uses less memory than level-wise growth. It uses a histogram approach to partition continuous inputs into discrete boxes and increases computational efficiency, which is quite advantageous in large datasets. Hyperparameters: num_leaves = 63, max_depth = 10, learning_rate = 0.05, n_estimators = 500, min_child_samples = 20, feature_fraction = 0.8, bagging_fraction = 0.8. XGBoost uses gradient boosting with L1 and L2 regularization. Regularization terms help avoid overfitting by penalizing models with more complexity. Hyperparameters: max_depth = 8, learning_rate = 0.05, n_estimators = 500, reg_alpha = 0.1, reg_lambda = 1.0, colsample_bytree = 0.8. CatBoost uses special processing and ordered boosting for categorical attributes [27]. Sequential boosting improves generalization performance by preventing target leakage. Instead of target encoding for categorical attributes, statistics based on sampling order are calculated; This approach reduces overfit. Hyperparameters: depth = 8, learning_rate = 0.05, iterations = 500, l2_leaf_reg = 3, random_strength = 1. TabNet is an interpretable deep learning architecture that performs feature selection by sequential attention mechanism [28]. The sparse attention mechanism selects different subsets of features at each step; This feature provides interpretability. Hyperparameters: n_d = 64, n_a = 64, n_steps = 5, gamma = 1.5, lambda_sparse = 0.001, momentum = 0.02.

3.3 Training Strategy and Cross-Validation

Model training was carried out using Stratified K-Fold Cross-Validation. Stratified sampling ensures that the class distribution remains close to the original distribution on each floor; This feature is critical for imbalanced datasets. Basic learners generate out-of-fold predictions during cross-validation; These predictions are used for meta-learner training. This approach provides reliable performance estimation by preventing data leakage [22], [24]. The test set was not used in the model selection process; reserved for final consideration only. This distinction avoids the optimistic bias of performance estimates. Early stopping was implemented to prevent overfitting [89]. If the validation loss does not improve for 50 iterations, the training is terminated. This strategy automatically determines the optimal number of iterations; It avoids unnecessary computational costs and reduces the risk of overfitting. Learning rate scheduling provides rapid convergence by using a high learning rate at the beginning; It decreases the learning rate as education progresses. The cosine annealing strategy allows to avoid local minibuses by periodically decreasing and increasing the learning rate. Gradient clipping is applied to prevent gradient explosion; The maximum gradient norm is set at 1.0.

3.4 TreeSHAP Explainability Integration

TreeSHAP provides optimized SHAP value calculation for tree-based models [32]. Although conventional SHAP calculation has a complexity of $\mathrm{O}\left(\mathrm{TL} \cdot 2^{\wedge} \mathrm{M}\right)$, TreeSHAP achieves accurate values in polynomial time, with complexity expressed as $\mathrm{O}\left(\mathrm{TLD}^2\right)$, where $T$ represents the number of trees, $L$ denotes the number of leaves, $M$ stands for the number of attributes, and $D$ indicates the maximum depth. This makes it feasible for practical computation on large-scale datasets as well as ensemble models. By leveraging the tree structure, it effectively identifies every feature coalition, allowing for real-time annotation formation.

TreeSHAP in the HCABS-NID architecture provides multiple levels of explainability. Global feature significance is computed according to average absolute SHAP values across the entire dataset, revealing the best features in general and guiding feature engineering research. Local descriptions describe what features contributed to individual estimates; for each prediction, it identifies which attributes drive the chosen decision, helping facilitate alarm triage for security analysts. Class-level analyses distinguish features for each attack type, with patterns specific to the attack type used to aid in threat hunting activities. The visual descriptions are represented by force plots, waterfall plots, and summary plots.

3.5 Evaluation Metrics

The performance of the proposed HCABS-NID architecture was evaluated by utilizing a well-organised and methodologically robust set of classification metrics selected to solve the specific evaluation task posed by multi-class network intrusion detection with class imbalance. In the following notation, True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN) stand for true positives, true negatives, false positives, and false negatives. Accuracy was the main global performance measure; however, in an imbalanced attack detection environment, where normal traffic is the majority class, accuracy alone may overestimate model effectiveness by masking poor detection of minority attack classes. Precision and recall provided a subtler picture of detection capability, where precision and recall could be used as complementary metrics. Precision quantifies the fraction of attacks that are correctly detected versus positive predictions, closely tracking the ability of the model to reduce false alarms—an imperative for Security Operations Centers, where the overreliance on false positives can create alert fatigue and hamper analyst agility. On the other hand, recall quantifies the proportion of actual attacks detected correctly, and the effectiveness of the model to minimize missed intrusions, an error threshold with potential threats to security in V2X environments. To balance these competing goals under conditions of class imbalance, the F1-score used here was defined as the harmonic mean of precision and recall, and thus it was adopted as the primary balanced metric.

As the UNSW-NB15 dataset has nine classes of attacks and normal traffic, macro-averaged and microaveraged versions of all metrics were reported to provide clear evaluation visibility into the system performance. Macro-averaging places equal weight on all the classes, independent of their respective frequency, thus making the detection of underrepresented attack types, such as Worms and Shellcode, a priority given that these classes are often penalized via frequency-weighted schemes. In contrast, micro-averaging aggregates the impact of each instance, prior to determining the metric, to achieve naturally weighted class prevalence results - overall systemlevel performance, essentially. Threshold-independent evaluation was performed based on the Area Under the Receiver Operating Characteristic Curve (ROC-AUC) through a One-vs-Rest (OvR) approach that yields a separate ROC curve for each class. Although ROC-AUC provides a reliable estimate of discriminative potential over all decision levels, it can produce optimistically distorted values when class imbalances are severe because true negatives dominate in the false positive rate denominator. To alleviate this limitation, the Precision-Recall AUC (PR-AUC) was also indicated, since PR-AUC is more sensitive to minority class detection performance and offers a conservative evaluation of the model under imbalanced conditions. Lastly, two chance-correction measures complement the threshold-based and threshold-independent measures stated above. Cohen's Kappa ($\kappa$) measures observed agreement ($p\_o$) by the level of agreement expected by chance ($p\_e$), and allows us to quantify classification performance greater than what the random assignment would produce. Thus the Matthews Correlation Coefficient (MCC) combines the four elements of the confusion matrix to yield a balanced score from $-$1 (complete misclassification) to 0 (random prediction) to $+$1 (perfect classification). Unlike accuracy, MCC is still informative even under extreme class distribution skew and is therefore a highly dependable single-score summary measure when evaluating intrusion detection mechanisms having an attack frequency that varies several orders of magnitude apart.

4. Findings

In this section, we present the experimental results gained from the optimization of the proposed HCABS-NID architecture in the held-out set of UNSW-NB15 test, which was excluded from training and model selection in order to achieve an unbiased assessment. The evaluation is undertaken in six separate and complementary subsections as a multi-faceted analytical framework. Section 4.1 reports the overall performance metrics—accuracy, macro-averaged F1-score, ROC-AUC, Cohen’s Kappa, and MCC—together with a cross‐validation stability analysis that provides the overall generalization ability. Section 4.2 class-level disaggregation of these findings analyzes detection performance in all nine attack classes but focuses on the rare attack types with the most difficulty from class imbalance. Section 4.3 compares HCABS-NID with single-model and ensemble based approaches obtained from literature, showing evidence by statistical significance tests (McNemar, Wilcoxon signed-rank) and effect size analysis, which prove that the observed enhancement is not due to random variability. Section 4.4 assesses the computational efficiency (in terms of inference latency, throughput, and memory footprint) of this method and then situates that metrics within the real-time, safety-sensitive conditions of V2X applications. Section 4.5 uses TreeSHAP explainability analysis to interpret the model reasoning: the method of decision-making by the model on a global, local, and class-level to extract the discriminative network traffic features in attack detection. Finally, Section 4.6 compares adversarial resilience in gradient perturbation through Fast Gradient Sign Method (FGSM) and Projected Gradient Descent (PGD) attacks. The results are accompanied by 95% confidence intervals and cross-validation variance estimates on all analyses.

4.1 Overall Performance Metrics

The HCABS-NID model achieved 98.20% accuracy, 97.10% F1-score (macro-mean), ROC-AUC of 0.989, and Cohen’s Kappa values of 0.976 on the UNSW-NB15 test set. The results obtained exceed the best performances in the existing literature and confirm the effectiveness of the hierarchical community approach. The 5-fold cross-validation shows that the model exhibits consistent performance; standard deviation values remain below 0.3%. This low variance confirms that the model produces reliable results across different subsets of data and is resilient to overfitting. MCC is calculated as 0.971; This value confirms the balanced classification performance and shows that the effect of class imbalance is under control. 95% confidence intervals confirm that the results are statistically significant (Table 5, Figure 4).

Table 5. Hierarchical Classifier-Agnostic Boosted Stacking for Network Intrusion Detection (HCABS-NID) overall performance metrics

Metric	Value	Std. Deviation	95% CI
Accuracy	0.982	$\pm$0.18	[98.02–98.38]
Precision (Macro)	0.974	$\pm$0.22	[97.13–97.57]
Sensitivity (Macro)	0.969	$\pm$0.25	[96.64–97.14]
F1-Score (Macro)	0.971	$\pm$0.21	[96.89–97.31]
ROC-AUC (Macro)	0.989	$\pm$0.004	[0.985–0.993]
PR-AUC (Macro)	0.982	$\pm$0.006	[0.976–0.988]
Cohen Kappa	0.976	$\pm$0.003	[0.973–0.979]
MCC	0.971	$\pm$0.004	[0.967–0.975]

Note: ROC-AUC: Area Under the Receiver Operating Characteristic Curve; PR-AUC: Precision-Recall Area Under the Curve; MCC: Matthews Correlation Coefficient.

Figure 4. Model performance metrics comparison

4.2 Class-Based Performance Analysis

Class-wise analysis reveals the model's performance for different types of attacks. The normal traffic class performs the most, with 99.1% accuracy and 99.3% sensitivity; This result shows that the model correctly recognizes legitimate traffic and provides a low false positive rate. Generic and Exploits attacks achieve F1-score values of 98.5% and 97.8%, respectively; High performance is expected because these attack types have sufficient representation in the dataset. Although performance degradation is observed in rare attack types, the values remain at acceptable levels: Worms 91.2%, Shellcode 93.5%, Backdoor 94.1% F1-score. The results show that SMOTENC sampling provides benefits for rare classes and significantly alleviates the problem of class imbalance, as demonstrated in Table 6 and Figure 5. The normalized complexity matrix with class metrics is shown in Figure 6: diagonal elements represent correct classification ratios (recall/sensitivity), while non-diagonal elements show incorrect classification patterns. On the right panel, Precision (P), Recall (R), and F1-Score values are reported for each class.

Table 6. Class-based performance metrics

Class	Accuracy (%)	Sensitivity (%)	F1-Score (%)	Support
Normal	99.1	99.3	99.2	56.000
General	98.7	98.3	98.5	18.871
Achievements	97.9	97.7	97.8	11.132
Tulus	96.8	96.2	96.5	6.062
Denial-of-Service (DoS)	96.5	97.1	96.8	4.089
Discovery	97.2	96.8	97.0	3.496
Analysis	94.8	95.3	95.0	677
Backdoor	94.3	93.9	94.1	583
Shellcode	93.8	93.2	93.5	378
Worms	91.5	90.9	91.2	44

Figure 5. Receiver Operating Characteristic (ROC) curves by attack classes: true positive rate and false positive rate relation for each attack type

Figure 6. Normalized complexity matrix with class metrics

4.3 Comparative Evaluation

HCABS-NID has been compared with methods in the existing literature. The comparison includes both single-model approaches [58], [59], [60] and community methods [24], [72]. All methods were evaluated on the same dataset (UNSW-NB15); Results are from original publications. HCABS-NID outperforms all benchmarking methods. The accuracy increase reaches 1.75% points compared to the nearest competitor [72] (96.45%). F1-score improvement is more pronounced in rare attack types; An increase of 8.3% was observed in the Worms class. The ROC-AUC value exhibits the highest performance with 0.989; This result confirms that the model performs reliably at different thresholds (Table 7).

Table 7. Comparison of model performance reported in the literature

Method	Accuracy (%)	F1-Score (%)	ROC-AUC	Year
Sparse Autoencoder	88.39	85.20	0.912	2016
LSTM	96.93	94.50	0.965	2016
CNN-LSTM Hybrid	94.25	92.80	0.951	2020
DNN $+$ Self. Selection	96.84	95.10	0.972	2020
Two-Level Stacking	96.20	94.80	0.968	2019
Community Learning	96.45	95.30	0.975	2024
HCABS-NID	98.20	97.10	0.989	2025

Note: The source publication for each line is indicated in parentheses. HCABS-NID results were obtained from this study. LSTM: Long Short-Term Memory; DNN: Deep Neural Network; HCABS-NID: Hierarchical Classifier-Agnostic Boosted Stacking for Network Intrusion Detection.

Statistical significance tests confirm that performance differences are not due to random variability. The McNemar test shows that the difference between HCABS-NID and the closest competitor is statistically significant at the level of p < 0.001. The Wilcoxon signed-rank test confirms consistent superiority over cross-validation folds. Cohen's d effect size was calculated as 0.82; This value corresponds to the large effect size. The results obtained show that the performance advantage of HCABS-NID is of practical importance.

4.4 Computational Efficiency

The computational efficiency of HCABS-NID is critical for V2X security applications. The mean inference time was measured as 3.40 ms; this value is below the threshold of approximately 10-100 ms required by V2X safety standards [3]. The DSRC and C-V2X standards require security messages to be processed within the range of approximately 10–100 ms [3]; HCABS-NID meets this requirement. Real-time sensor anomaly detection studies in automated vehicles also emphasize low-latency decision loops [90]. Training time was recorded as 4 hours and 23 minutes on NVIDIA RTX 3090 GPU. The model size is 847 MB, including four basic learners and two meta-learners. Memory usage measured at 2.1 GB at inference; This value is acceptable for modern in-car computers. Throughput achieves a processing capacity of 294 samples per second; This capacity provides sufficient performance in heavy network traffic scenarios (Table 8).

Table 8. Computational efficiency metrics

Metric	Value	V2X Requirement	Harmony
Inference time	3.40 ms	$<$10 ms	✓
Training period	4 h 23 min	-	-
Model size	847 MB	-	-
Memory usage	2.1 GB	$<$4 GB	✓
Data	294 samples/h	$>$100 samples/h	✓

Note: V2X: Vehicle-to-Everything.

When the individual inference times of the basic learners were examined, LightGBM was measured as 0.8 ms, XGBoost as 1.1 ms, CatBoost as 0.9 ms, and TabNet as 0.6 ms. Meta-learners add an additional 0.4 ms total time. With parallel inference optimization, the total time can be further reduced; however, the current performance meets V2X requirements. When model compression techniques (pruning, quantization) are applied, it has been observed that inference time can be reduced to around 1.8 ms with an estimated performance trade-off.

4.5 TreeSHAP Explainability Analysis

The TreeSHAP analysis explains the decision-making mechanisms of the model. In the global attribute importance ranking, sbytes (source bytes), dbytes (destination bytes), sttl (source TTL), and ct_state_ttl (connection state TTL) have the highest SHAP values. These attributes reflect the basic characteristics of network traffic. The sbytes attribute represents the amount of data sent; It plays a critical role in detecting DoS and data exfiltration attacks. The STTL attribute indicates the package lifetime; This value may reflect the default configurations of attack tools. Protocol and service type attributes play a critical role in attack type differentiation; shows that certain attacks are associated with specific protocols (Figure 7).

Figure 7. TreeSHAP global trait importance: Top 15 traits ranked by average absolute SHapley Additive exPlanations (SHAP) value

Explainable features in class-level analysis show distinct characteristics of each of the attack types. DoS attacks include a high sbytes and a low dbytes; These patterns represent the traffic being primarily one-way and indicative of the target being overloaded. Reconnaissance attacks are characterized by low packet size and high number of connections; This is representative of the characteristic behaviour of discovery traffic with port scanning and network mapping. Backdoor attacks show abnormal service port usage and low TTL values, which are similar to latent communication channels. Security analysts use these insights to help them with their threat hunting. They speed up the alarm triage process and reduce the false positive rate.

4.6 Adversarial Resilience Analysis

For adversarial robustness, we investigated the model in the FGSM and Projected Gradient Descent (PGD) attacks, which are two popular gradient-based perturbation strategies applied to produce misclassification through imperceptibly small perturbations in the input feature space. The performance was high, it returned 97.8% of classification accuracy, which means the model is generally resistant to low-intensity noise. Further perturbation increased to 5% had a slight reduced accuracy of 96.8%, indicating some performance degradation but it is still acceptable in practice. On the other hand, at 10% accuracy was still 94.2% indicating a much more extreme sensitivity to this large size feature distortion, confirming the practical limits of the adversary tolerance we propose at the moment. These results collectively provide evidence showing that HCABS-NID has reasonable adversarial durability under conditions of a wide range of perturbation (e.g., high energy and fine noise), but also an observable vulnerability at severe distortion, still worth further attention. Therefore, adversarial training and input sanitisation as candidate defence mechanisms of the framework with the ability to resist higher-order attack regimes should be explored in future works.

But one fundamental methodological limitation needs to be explicitly acknowledged. Gradient-based adversarial evaluation was limited to the TabNet part, as gradient boosting models — LightGBM, XGBoost and CatBoost — are non-differentiable and hence cannot be directly attacked by FGSM or PGD. Such architectural variation implies that the adversarial resilience presented here is a partial white-box scenario and not a full-system assessment. To get a comprehensive view of ensemble-level vulnerability, future studies can extend the evaluation to gradient-free attack methods, such as Zeroth-Order Optimisation (ZOO) [91] and Simultaneous Perturbation Stochastic Approximation (SPSA) [92], which operate without access to model gradients and are thus applicable to the boosting components.

5. Conclusion and Suggestion

HCABS-NID deals with network-based attack and anomaly detection issues in the context of connected vehicle ecosystems. It combines LightGBM, XGBoost, CatBoost, and TabNet into a hierarchical stacking framework built without sacrificing interpretability for predictive power. The integration of TreeSHAP allows explainability at the global and local scale — security teams now know what makes certain alerts trigger, so they can be triaged more efficiently. The results from experiments on UNSW-NB15 are instructive: 98.20% accuracy, 97.10% F1-score, and 0.989 ROC-AUC. These numbers exceed most other benchmarks reported in the literature. The two-tier meta-learner design is responsible for their achievement. To mitigate overfitting there is logistic regression at the first level which combines base learner outputs in a single linear manner. Second stage, gradient boosting detects non-linear interactions that the simplest models simply fail to capture. Real world deployments also appear feasible, with average inference latency being just 3.40 ms—within V2X security requirements.

However, as it was developed in the UNSW-NB15 laboratory environment, limitations can impact generalizability. It may not fully reflect real-world V2X traffic dynamics. The lack of connected vehicle specific datasets is a key limitation. Rare attack types show long-term consequences of class imbalance with an immediate observed performance drop. While SMOTENC algorithms mitigate this problem, attacks simulated with synthetic samples cannot fully capture the diversity of attacks encountered in the real world. Additionally, robustness has been assessed in limited perturbation scenarios conducted by cyber adversaries. The model size (847 MB) may prevent its use in vehicle-based environments with limited resources due to hardware limitations and dataset constraints.

Going forward, there are a number of lines of communication worth paying attention to. Better V2X datasets will still be needed—combining Combining Simulation of Urban MObility (SUMO) and Vehicles in Network Simulation (VEINS) simulations with empirical data from test tracks or live deployments would allow us to capture real-world complexity much more faithfully. simulations with empirical data from test tracks or live deployments would allow us to capture real-world complexity much more faithfully. There are also needs for rare-class detection, such as cost-sensitive learning, focal loss, and few-shot methods that generalize from small samples. Privacy-preserving decentralized training via federated learning could alleviate some data sharing concerns but will necessitate aggressive model compression via pruning, quantization, and knowledge distillation for practical deployment. This is where strong defenses to inference attacks matter as well. Automotive cybersecurity becomes increasingly important as technology continues to evolve as V2X networks spread and HCABS-NID offers both researchers and practitioners a workable foundation for securing next-generation transportation systems.

Author Contributions

Conceptualization, R.A.; methodology, R.A.; software, R.A.; validation, R.A., T.Ö., and M.M.A.; formal analysis, R.A.; investigation, R.A.; resources, T.Ö., M.M.A., and Y.Ç.; data curation, R.A.; writing—original draft preparation, R.A.; writing—review and editing, T.Ö., M.M.A., and Y.Ç.; visualization, R.A.; supervision, T.Ö. and Y.Ç.; project administration, T.Ö. All authors have read and agreed to the published version of the manuscript.

Data Availability

The data supporting the research findings are publicly available. The UNSW-NB15 dataset used in this study was originally created by the Australian Centre for Cyber Security (ACCS) and is publicly available at https://research.unsw.edu.au/projects/unsw-nb15-dataset. The processed data and code used to support the findings of this study are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflict of interest.

References

1.

N. Lu, N. Cheng, N. Zhang, X. Shen, and J. W. Mark, “Connected vehicles: Solutions and challenges,” IEEE Internet Things J., vol. 1, no. 4, pp. 289–299, 2014. [Google Scholar] [Crossref]

2.

J. B. Kenney, “Dedicated short-range communications (DSRC) standards in the United States,” Proc. IEEE, vol. 99, no. 7, pp. 1162–1182, 2011. [Google Scholar] [Crossref]

3.

J. Petit and S. E. Shladover, “Potential cyberattacks on automated vehicles,” IEEE Trans. Intell. Transp. Syst., vol. 16, no. 2, pp. 546–556, 2015. [Google Scholar] [Crossref]

4.

F. Sakiz and S. Sen, “A survey of attacks and detection mechanisms on intelligent transportation systems: VANETs and IoV,” Ad Hoc Netw., vol. 61, pp. 33–50, 2017. [Google Scholar] [Crossref]

5.

K. Koscher, A. Czeskis, F. Roesner, S. Patel, T. Kohno, S. Checkoway, D. McCoy, B. Kantor, D. Anderson, H. Shacham, et al., “Experimental security analysis of a modern automobile,” in Proceedings of the IEEE Symposium on Security and Privacy, Oakland, CA, USA, 2010, pp. 447–462. [Google Scholar] [Crossref]

6.

S. Checkoway, D. McCoy, B. Kantor, D. Anderson, H. Shacham, S. Savage, K. Karl, and T. Kohno, “Comprehensive experimental analyses of automotive attack surfaces,” in Proceedings of the SENIX ecurity Symposium, 2011, pp. 77–92. [Online]. Available: https://www.usenix.org/legacy/events/sec11/tech/full_papers/Checkoway.pdf [Google Scholar]

7.

M. Wurm, Automotive Cybersecurity. Springer, 2022. [Online]. Available: [Google Scholar] [Crossref]

8.

M. Raya and J. P. Hubaux, “Securing vehicular ad hoc networks,” J. Comput. Security, vol. 15, no. 1, pp. 39–68, 2007. [Google Scholar] [Crossref]

9.

R. W. van der Heijden, S. Dietzel, T. Leinmuller, and F. Kargl, “Survey on misbehavior detection in cooperative intelligent transportation systems,” IEEE Commun. Surveys Tuts., vol. 21, no. 1, pp. 779–811, 2019. [Google Scholar] [Crossref]

10.

D. E. Denning, “An intrusion-detection model,” IEEE Trans. Softw. Eng., vol. SE-13, no. 2, pp. 222–232, 1987. [Google Scholar] [Crossref]

11.

H. J. Liao, C. H. R. Lin, Y. C. Lin, and K. Y. Tung, “Intrusion detection system: A comprehensive review,” J. Netw. Comput. Appl., vol. 36, no. 1, pp. 16–24, 2013. [Google Scholar] [Crossref]

12.

P. Garcia-Teodoro, J. Diaz-Verdejo, G. Macia-Fernandez, and E. Vazquez, “Anomaly-based network intrusion detection: Techniques, systems and challenges,” Comput. Security, vol. 28, no. 1–2, pp. 18–28, 2009. [Google Scholar] [Crossref]

13.

A. L. Buczak and E. Guven, “A survey of data mining and machine learning methods for cyber security intrusion detection,” IEEE Commun. Surveys Tuts., vol. 18, no. 2, pp. 1153–1176, 2016. [Google Scholar] [Crossref]

14.

R. Sommer and V. Paxson, “Outside the closed world: On using machine learning for network intrusion detection,” in Proceedings of the IEEE Symposium on Security and Privacy, Oakland, CA, USA, 2010, pp. 305–316. [Google Scholar] [Crossref]

15.

O. Sagi and L. Rokach, “Ensemble learning: A survey,” WIREs Data Mining Knowl. Discovery, vol. 8, no. 4, p. e1249, 2018. [Google Scholar] [Crossref]

16.

K. Jiang, W. Wang, A. Wang, and H. Wu, “Network intrusion detection combined hybrid sampling with deep hierarchical network,” IEEE Access, vol. 8, pp. 32464–32476, 2020. [Google Scholar] [Crossref]

17.

H. He and E. A. Garcia, “Learning from imbalanced data,” IEEE Trans. Knowl. Data Eng., vol. 21, no. 9, pp. 1263–1284, 2009. [Google Scholar] [Crossref]

18.

T. G. Dietterich, “Ensemble methods in machine learning,” in Proceedings of the International Workshop on Multiple Classifier Systems, Cagliari, Italy, 2000, pp. 1–15. [Google Scholar] [Crossref]

19.

L. Breiman, “Bagging predictors,” Mach. Learn., vol. 24, no. 2, pp. 123–140, 1996. [Google Scholar] [Crossref]

20.

Y. Freund and R. E. Schapire, “A decision-theoretic generalization of on-line learning and an application to boosting,” J. Comput. Syst. Sci., vol. 55, no. 1, pp. 119–139, 1997. [Google Scholar] [Crossref]

21.

J. H. Friedman, “Greedy function approximation: A gradient boosting machine,” Ann. Statist., vol. 29, no. 5, pp. 1189–1232, 2001. [Google Scholar] [Crossref]

22.

D. H. Wolpert, “Stacked generalization,” Neural Netw., vol. 5, no. 2, pp. 241–259, 1992. [Google Scholar] [Crossref]

23.

X. Gao, C. Shan, C. Hu, Z. Niu, and Z. Liu, “An adaptive ensemble machine learning model for intrusion detection,” IEEE Access, vol. 7, pp. 82512–82521, 2019. [Google Scholar] [Crossref]

24.

B. A. Tama and K.-H. Rhee, “An in-depth experimental study of anomaly detection using gradient boosted machine,” Neural Comput & Applic, vol. 31, no. 4, pp. 955–965, 2017. [Google Scholar] [Crossref]

25.

P. A. A. Resende and A. C. Drummond, “A survey of random forest based methods for intrusion detection systems,” ACM Comput. Surveys, vol. 51, no. 3, pp. 1–36, 2018. [Google Scholar] [Crossref]

26.

G. Ke, Q. Meng, T. Finley, T. Wang, W. Chen, W. Ma, Q. Ye, and T. Y. Liu, “LightGBM: A highly efficient gradient boosting decision tree,” in Advances in Neural Information Processing Systems, Long Beach, CA, USA, 2017, pp. 3146–3154. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2017/file/6449f44a102fde848669bdd9eb6b76fa-Paper.pdf [Google Scholar]

27.

L. Prokhorenkova, G. Gusev, A. Vorobev, A. V. Dorogush, and A. Gulin, “CatBoost: Unbiased boosting with categorical features,” in Advances in Neural Information Processing Systems, Montreal, Canada, 2018, pp. 6638–6648. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2018/file/14491b756b3a51daac41c24863285549-Paper.pdf [Google Scholar]

28.

S. O. Arik and T. Pfister, “TabNet: Attentive interpretable tabular learning,” in Proceedings of the AAAI Conference on Artificial Intelligence, 2021, pp. 6679–6687. [Google Scholar] [Crossref]

29.

A. B. Arrieta, N. Diaz-Rodriguez, J. Del Ser, A. Bennetot, S. Tabik, A. Barbado, S. Garcia, S. Gil-Lopez, D. Molina, R. Benjamins, et al., “Explainable artificial intelligence (XAI): Concepts, taxonomies, opportunities and challenges toward responsible AI,” Inf. Fusion, vol. 58, pp. 82–115, 2020. [Google Scholar] [Crossref]

30.

R. Guidotti, A. Monreale, S. Ruggieri, F. Turini, F. Giannotti, and D. Pedreschi, “A survey of methods for explaining black box models,” ACM Comput. Surveys, vol. 51, no. 5, pp. 1–42, 2018. [Google Scholar] [Crossref]

31.

S. M. Lundberg and S. I. Lee, “A unified approach to interpreting model predictions,” in Advances in Neural Information Processing Systems, Long Beach, CA, USA, 2017, pp. 4765–4774. [Online]. Available: https://proceedings.neurips.cc/paper_files/paper/2017/file/8a20a8621978632d76c43dfd28b67767-Paper.pdf [Google Scholar]

32.

S. M. Lundberg, G. Erion, H. Chen, A. DeGrave, J. M. Prutkin, B. Nair, R. Katz, J. Himmelfarb, N. Bansal, and S. I. Lee, “From local explanations to global understanding with explainable AI for trees,” Nature Mach. Intell., vol. 2, no. 1, pp. 56–67, 2020. [Google Scholar] [Crossref]

33.

L. H. Gilpin, D. Bau, B. Z. Yuan, A. Bajwa, M. Specter, and L. Kagal, “Explaining explanations: An overview of interpretability of machine learning,” in Proceedings of the IEEE 5th International Conference on Data Science and Advanced Analytics (DSAA), Turin, Italy, 2018, pp. 80–89. [Google Scholar] [Crossref]

34.

B. Goodman and S. Flaxman, “European Union regulations on algorithmic decision-making and a right to explanation,” AI Mag., vol. 38, no. 3, pp. 50–57, 2017. [Google Scholar] [Crossref]

35.

Z. C. Lipton, “The mythos of model interpretability,” Queue, vol. 16, no. 3, pp. 31–57, 2018. [Google Scholar] [Crossref]

36.

S. Wachter, B. Mittelstadt, and L. Floridi, “Why a right to explanation of automated decision-making does not exist in the general data protection regulation,” Int. Data Privacy Law, vol. 7, no. 2, pp. 76–99, 2017. [Google Scholar] [Crossref]

37.

J. P. Anderson, “Computer security threat monitoring and surveillance,” 1980. https://cir.nii.ac.jp/crid/1573950399661362176 [Google Scholar]

38.

L. T. Heberlein, G. V. Dias, K. N. Levitt, B. Mukherjee, J. Wood, and D. Wolber, “A network security monitor,” 1989. [Google Scholar] [Crossref]

39.

V. Paxson, “Bro: A system for detecting network intruders in real-time,” Comput. Netw., vol. 31, no. 23–24, pp. 2435–2463, 1999. [Google Scholar] [Crossref]

40.

M. Roesch, “Snort: Lightweight intrusion detection for networks,” in Proceedings of the Large Installation System Administration Conference (LISA), Seattle, Washington, USA, 1999, pp. 229–238. [Online]. Available: https://www.usenix.org/legacy/event/lisa99/full_papers/roesch/roesch.pdf?utm_source=chatgpt.com [Google Scholar]

41.

G. Vigna and R. A. Kemmerer, “NetSTAT: A network-based intrusion detection approach,” in Proceedings of the 15th Annual Computer Security Applications Conference (ACSAC), Phoenix, AZ, USA, 1999, pp. 25–34. [Google Scholar] [Crossref]

42.

P. A. Porras and P. G. Neumann, “EMERALD: Event monitoring enabling responses to anomalous live disturbances,” in Proceedings of the National Information Systems Security Conference (NISSC), 1997, pp. 353–365. [Online]. Available: https://csrc.nist.gov/files/pubs/conference/1997/10/10/proceedings-of-the-20th-nissc-1997/final/docs/353.pdf [Google Scholar]

43.

S. J. Stolfo, W. Fan, W. Lee, A. Prodromidis, and P. K. Chan, “Cost-based modeling for fraud and intrusion detection: Results from the JAM project,” in Proceedings of the DARPA Information Survivability Conference and Exposition, Hilton Head, SC, USA, 2000, pp. 130–144. [Google Scholar] [Crossref]

44.

M. Tavallaee, E. Bagheri, W. Lu, and A. A. Ghorbani, “A detailed analysis of the KDD CUP 99 data set,” in Proceedings of the IEEE Symposium on Computational Intelligence for Security and Defense Applications (CISDA), Ottawa, ON, Canada, 2009, pp. 1–6. [Google Scholar] [Crossref]

45.

I. Sharafaldin, A. H. Lashkari, and A. A. Ghorbani, “Toward generating a new intrusion detection dataset and intrusion traffic characterization,” in Proceedings of the 4th International Conference on Information Systems Security and Privacy (ICISSP), Funchal, Madeira, Portugal, 2018, pp. 108–116. [Google Scholar] [Crossref]

46.

N. Moustafa and J. Slay, “UNSW-NB15: A comprehensive data set for network intrusion detection systems,” in Proceedings of the Military Communications and Information Systems Conference (MilCIS), Canberra, ACT, Australia, 2015, pp. 1–6. [Google Scholar] [Crossref]

47.

N. Koroniotis, N. Moustafa, E. Sitnikova, and B. Turnbull, “Towards the development of realistic botnet dataset in the Internet of Things for network forensic analytics: Bot-IoT dataset,” Future Gener. Comput. Syst., vol. 100, pp. 779–796, 2019. [Google Scholar] [Crossref]

48.

N. Moustafa, “A new distributed architecture for evaluating AI-based security systems at the edge: Network TON_IoT datasets,” Sustain. Cities Soc., vol. 72, p. 102994, 2021. [Google Scholar] [Crossref]

49.

S. X. Wu and W. Banzhaf, “The use of computational intelligence in intrusion detection systems: A review,” Appl. Soft Comput., vol. 10, no. 1, pp. 1–35, 2010. [Google Scholar] [Crossref]

50.

Y. Xin, L. Kong, Z. Liu, Y. Chen, Y. Li, H. Zhu, M. Gao, H. Hou, and C. Wang, “Machine learning and deep learning methods for cybersecurity,” IEEE Access, vol. 6, pp. 35365–35381, 2018. [Google Scholar] [Crossref]

51.

W. Lee and S. J. Stolfo, “Data mining approaches for intrusion detection,” 1998. https://www.usenix.org/legacy/publications/library/proceedings/sec98/full_papers/lee/lee.pdf [Google Scholar]

52.

S. Mukkamala, G. Janoski, and A. Sung, “Intrusion detection using neural networks and support vector machines,” in Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN), Honolulu, HI, USA, 2002, pp. 1702–1707. [Online]. Available: http://congres.cran.univ-lorraine.fr/2002/WCCI2002/IJCNN02/PDFFiles/Papers/1434.pdf [Google Scholar]

53.

W. Hu, Y. Liao, and V. R. Vemuri, “Robust support vector machines for anomaly detection in computer security,” in Proceedings of the International Conference on Machine Learning and Applications (ICMLA), Los Angeles, CA, USA, 2003, pp. 168–174. [Online]. Available: https://web.cs.ucdavis.edu/~vemuri/papers/rvsm.pdf [Google Scholar]

54.

L. Breiman, “Random forests,” Mach. Learn., vol. 45, no. 1, pp. 5–32, 2001. [Google Scholar] [Crossref]

55.

N. B. Amor, S. Benferhat, and Z. Elouedi, “Naive Bayes vs decision trees in intrusion detection systems,” in Proceedings of the 2004 ACM symposium on Applied computing, Nicosia, Cyprus, 2004, pp. 420–424. [Google Scholar] [Crossref]

56.

G. Stein, B. Chen, A. S. Wu, and K. A. Hua, “Decision tree classifier for network intrusion detection with GA-based feature selection,” in Proceedings of the 43rd annual ACM Southeast Conference, Raleigh, NC, USA, 2005, pp. 136–141. [Google Scholar] [Crossref]

57.

Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, 2015. [Google Scholar] [Crossref]

58.

A. Javaid, Q. Niyaz, W. Sun, and M. Alam, “A deep learning approach for network intrusion detection system,” EAI Endorsed Trans. Security Safety, vol. 3, no. 9, p. 21, 2016. [Google Scholar]

59.

A. Kim, M. Park, and D. H. Lee, “AI-IDS: Application of deep learning to real-time web intrusion detection,” IEEE Access, vol. 8, pp. 70245–70261, 2020. [Google Scholar] [Crossref]

60.

J. Kim, J. Kim, H. L. T. Thu, and H. Kim, “Long short-term memory recurrent neural network classifier for intrusion detection,” in 2016 International Conference on Platform Technology and Service (PlatCon), Jeju, Korea (South), 2016, pp. 1–5. [Google Scholar] [Crossref]

61.

W. Wang, M. Zhu, J. Wang, X. Zeng, and Z. Yang, “End-to-end encrypted traffic classification with one-dimensional convolution neural networks,” in Proceedings of the IEEE International Conference on Intelligent Security Informatics (ISI), Beijing, China, 2017, pp. 43–48. [Google Scholar] [Crossref]

62.

J. Zhang, M. Zulkernine, and A. Haque, “Random-forests-based network intrusion detection systems,” IEEE Trans. Syst., Man, Cybern. C, Appl. Rev., vol. 38, no. 5, pp. 649–659, 2008. [Google Scholar] [Crossref]

63.

Y. Zhou, M. Kantarcioglu, and B. Xi, “A survey of game theoretic approach for adversarial machine learning,” WIREs Data Mining Knowl. Discovery, vol. 10, no. 3, p. e1259, 2020. [Google Scholar] [Crossref]

64.

R. Vinayakumar, M. Alazab, K. P. Soman, P. Poornachandran, A. Al-Nemrat, and S. Venkatraman, “Deep learning approach for intelligent intrusion detection system,” IEEE Access, vol. 7, pp. 41525–41550, 2019. [Google Scholar] [Crossref]

65.

D. Li, D. Chen, B. Jin, L. Shi, J. Goh, and S. K. Ng, “MAD-GAN: Multivariate anomaly detection for time-series data with generative adversarial networks,” in Proceedings of the International Conference on Artificial Neural Networks (ICANN), Munich, Germany, 2019, pp. 703–716. [Google Scholar] [Crossref]

66.

J. Zhao, S. Shetty, J. W. Pan, C. Kamhoua, and K. Kwiat, “Transfer learning for detecting unknown network attacks,” EURASIP J. Inf. Security, vol. 2019, no. 1, pp. 1–12, 2019. [Google Scholar] [Crossref]

67.

G. I. Parisi, R. Kemker, J. L. Part, C. Kanan, and S. Wermter, “Continuous lifelong learning with neural networks: A review,” Neural Netw., vol. 113, pp. 54–71, 2019. [Google Scholar] [Crossref]

68.

S. R. Pokhrel and J. Choi, “Federated learning with blockchain for autonomous vehicles: Analysis and design challenges,” IEEE Trans. Commun., vol. 68, no. 8, pp. 4734–4746, 2020. [Google Scholar] [Crossref]

69.

Z. Hu, L. Wang, L. Qi, Y. Li, and W. Yang, “A novel wireless network intrusion detection method based on adaptive synthetic sampling and an improved convolutional neural network,” IEEE Access, vol. 8, pp. 195741–195751, 2020. [Google Scholar] [Crossref]

70.

A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, L. Kaiser, and I. Polosukhin, “Attention is all you need,” in Advances in Neural Information Processing Systems, Long Beach, CA, USA, 2017. [Online]. Available: https://proceedings.neurips.cc/paper/2017/hash/3f5ee243547dee91fbd053c1c4a845aa-Abstract.html [Google Scholar]

71.

S. M. Kasongo and Y. Sun, “A deep learning method with wrapper based feature extraction for wireless intrusion detection system,” Comput. Security, vol. 92, p. 101752, 2020. [Google Scholar] [Crossref]

72.

R. Lazzarini, H. Tianfield, and V. Charissis, “A stacking ensemble of deep learning models for IoT intrusion detection,” Knowl. Based Syst., vol. 279, p. 110941, 2023. [Google Scholar] [Crossref]

73.

B. Krawczyk, “Learning from imbalanced data: Open challenges and future directions,” Prog. Artif. Intell., vol. 5, no. 4, pp. 221–232, 2016. [Google Scholar] [Crossref]

74.

N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, “SMOTE: Synthetic minority over-sampling technique,” J. Artif. Intell. Res., vol. 16, pp. 321–357, 2002. [Google Scholar] [Crossref]

75.

H. He, Y. Bai, E. A. Garcia, and S. Li, “ADASYN: Adaptive synthetic sampling approach for imbalanced learning,” in Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN), Hong Kong, China, 2008, pp. 1322–1328. [Google Scholar] [Crossref]

76.

C. X. Ling and V. S. Sheng, “Cost-sensitive learning and the class imbalance problem,” Encyclopedia Mach. Learn., vol. 2, no. 3, pp. 231–235, 2008, [Online]. Available: https://www.csd.uwo.ca/~xling/papers/cost_sensitive.pdf [Google Scholar]

77.

M. T. Ribeiro, S. Singh, and C. Guestrin, “Why should I trust you?: Explaining the predictions of any classifier,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 2016, pp. 1135–1144. [Google Scholar] [Crossref]

78.

A. Fisher, C. Rudin, and F. Dominici, “All models are wrong, but many are useful: Learning a variable’s importance by studying an entire class of prediction models simultaneously,” J. Mach. Learn. Res., vol. 20, no. 177, pp. 1–81, 2019, [Online]. Available: http://jmlr.org/papers/v20/18-760.html [Google Scholar]

79.

S. Parkinson, P. Ward, K. Wilson, and J. Miller, “Cyber threats facing autonomous and connected vehicles: Future challenges,” in Proceedings of the IEEE Transactions on Intelligent Transportation Systems, Washington, DC, USA, 2017, pp. 2898–2915. [Google Scholar] [Crossref]

80.

V. L. L. Thing and J. Wu, “Autonomous vehicle security: A taxonomy of attacks and defences,” in Proceedings of the 2016 IEEE International Conference on Internet of Things (iThings), IEEE GreenCom, IEEE CPSCom, and IEEE SmartData, Guangzhou, China, 2016, pp. 164–170. [Google Scholar] [Crossref]

81.

S. International, “Taxonomy and definitions for terms related to driving automation systems for on-road motor vehicles,” in SAE Standard J3016, Warrendale, PA, USA, 2016. [Google Scholar] [Crossref]

82.

R. John Douceur, “The Sybil Attack,” in Proceedings of the 1st International Workshop on Peer-to-Peer Systems (IPTPS), 2002, pp. 251–260. [Google Scholar] [Crossref]

83.

K. Lim and D. Manivannan, “An efficient protocol for authenticated and secure message delivery in vehicular ad hoc networks,” Veh. Commun., vol. 4, pp. 30–37, 2016. [Google Scholar] [Crossref]

84.

E. Seo, H. M. Song, and H. K. Kim, “GIDS: GAN based intrusion detection system for in-vehicle networking,” in Proceedings of the IEEE International Conference on Privacy, Security and Trust (PST), Harlow, UK, 2018, pp. 1–6. [Google Scholar] [Crossref]

85.

F. D. Garcia, D. Oswald, T. Kasper, and P. Pavlides, “Lock it and still lose it—On the (in) security of automotive remote keyless entry systems,” in 25th USENIX security symposium (USENIX Security 16), 2016. [Online]. Available: https://www.usenix.org/conference/usenixsecurity16/technical-sessions/presentation/garcia [Google Scholar]

86.

J. Sun, Y. Cao, Q. A. Chen, and Z. M. Mao, “Towards robust LiDAR-based perception in autonomous driving: General black-box adversarial sensor attack and countermeasures,” in 29th USENIX Security Symposium (USENIX Security 20), 2020, pp. 877–894. [Online]. Available: https://www.usenix.org/conference/usenixsecurity20/presentation/sun [Google Scholar]

87.

I. Ullah and Q. H. Mahmoud, “A scheme for generating a dataset for anomalous activity detection in IoT networks,” in Advances in Artificial Intelligence (Canadian AI), Montreal, QC, Canada, 2020, pp. 508–520. [Google Scholar] [Crossref]

88.

N. Moustafa and J. Slay, “The evaluation of network anomaly detection systems: Statistical analysis of the UNSW-NB15 data set and the comparison with the KDD99 data set,” Inf. Secur. J.: A Glob. Perspect., vol. 25, no. 1–3, pp. 18–31, 2016. [Google Scholar] [Crossref]

89.

N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, “Dropout: A simple way to prevent neural networks from overfitting,” J. Mach. Learn. Res., vol. 15, no. 1, pp. 1929–1958, 2014. [Google Scholar]

90.

F. V. Wyk, Y. Wang, A. Khojandi, and N. Masoud, “Real-time sensor anomaly detection and identification in automated vehicles,” in IEEE Transactions on Intelligent Transportation Systems, Piscataway, NJ, USA, 2020, pp. 1264–1276. [Google Scholar] [Crossref]

91.

I. J. Goodfellow, J. Shlens, and C. Szegedy, “Explaining and harnessing adversarial examples,” arXiv:1412.6572, 2014. [Google Scholar] [Crossref]

92.

K. Yang, S. Kpotufe, and S. Feizi, “Defending multimodal fusion models against single-source adversaries,” in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021, pp. 3340–3349. [Online]. Available: https://openaccess.thecvf.com/content/CVPR2021/html/Yang_Defending_Multimodal_Fusion_Models_Against_Single-Source_Adversaries_CVPR_2021_paper.html [Google Scholar]

Cite this:

APA Style

IEEE Style

BibTex Style

MLA Style

Chicago Style

GB-T-7714-2015

Arslan, R., Özseven, T., Aydın, M. M., & Çelik, Y. (2026). Cybersecurity in Intelligent Transportation Systems: A Comparative Study on AI-Based Anomaly Detection and Threat Analysis. Mechatron. Intell Transp. Syst., 5(1), 11-30. https://doi.org/10.56578/mits050102

cc

©2026 by the author(s). Published by Acadlore Publishing Services Limited, Hong Kong. This article is available for free download and can be reused and cited, provided that the original published version is credited, under the CC BY 4.0 license.

pdf

Figure 1. Vehicle-to-Everything (V2X) communication architecture and AI-based security system

Table 1. Comparison of publicly available intrusion detection datasets

Citations

Crossref: 0