Optimization of Last-Mile Joint Delivery in Express Logistics Based on Complex Networks and Machine Learning
Abstract:
With the rapid expansion of e-commerce, last-mile delivery in express logistics faces significant challenges, including low efficiency and high operational costs. Taking the Xiqing District of Tianjin as a case study, this research proposes a three-stage framework integrating complex network theory and machine learning. First, the Louvain algorithm is employed to partition the delivery areas, raising modularity to 0.789. Second, an eXtreme Gradient Boosting (XGBoost) model is used to predict terminal service modes, achieving an accuracy of 87.8%. Finally, a route planning model is solved using Particle Swarm Optimization (PSO). To validate these methods, a three-day logistics system simulation was conducted in AnyLogic to evaluate different delivery policies. The results show that, compared with traditional independent delivery, the joint delivery approach reduces total costs by 25.32%. Furthermore, a carbon emission accounting model indicates an estimated 25% reduction in daily carbon emissions, so the approach yields both economic and environmental benefits.
1. Introduction
According to the latest statistics, the express delivery industry continued to grow in 2025. The total volume of postal industry delivery services in China reached 216.5 billion items, a year-on-year increase of 11.5%. Of this, the volume of express delivery services reached 199.0 billion items, a year-on-year increase of 13.7%. The annual revenue of the postal industry was 1.8 trillion yuan, and the revenue of express delivery services was 1.5 trillion yuan, with year-on-year increases of 6.4% and 6.5%, respectively. The quality of express delivery services also improved. In December 2025, public satisfaction with express delivery services was estimated at 85 points, an increase of 0.6 points year-on-year, and the 72-hour on-time delivery rate in key regions was estimated at 86.6%, an increase of 1.7 percentage points year-on-year.
Although the scale of the express delivery industry continues to expand, the construction of the last-mile delivery system has not kept pace, forming an increasingly prominent contradiction. With the acceleration of urbanization and continuous population growth, problems such as delivery congestion and increased customer complaints have appeared in urban areas, while in rural areas, due to the sparse distribution of delivery outlets, the problem of insufficient delivery coverage has become more serious. The last-mile delivery service of express logistics is facing unprecedented pressure, and it is urgent to improve service efficiency and quality. In order to solve this problem, promoting the optimization of the last-mile delivery system has become an urgent task for the development of the industry. Compared with the independent delivery mode in which enterprises deliver separately without sharing resources, joint delivery [1] is a cooperative logistics mode aimed at reducing costs and improving efficiency through resource sharing [2] and management optimization [3]. Only by accelerating the construction of the last-mile joint delivery mode and improving the coverage and response speed of delivery services can the actual needs of the public be effectively met.
In terms of the division of last-mile delivery regions, traditional studies mostly rely on administrative boundary divisions or geographic spatial partitioning based on clustering algorithms such as $k$-means [4] and Fuzzy C-Means (FCM) [5]. In recent years, with the rise of complex network theory, some scholars have begun to attempt to use network topology structures to optimize logistics networks. For example, some studies abstract logistics nodes as complex networks [6] and analyze their small-world characteristics [7] and scale-free characteristics [8]. However, most existing region division methods only consider geographic distance and often ignore the actual interaction strength and community structure characteristics among nodes within the logistics network. As a result, although the division results are compact in physical space, there are frequent cross-region interactions and low resource coordination in actual delivery circulation.
For the selection of last-mile delivery service modes, existing literature mostly focuses on qualitative analysis or simple statistical regression. With the development of big data technology, machine learning algorithms (such as random forest [9], support vector machines [10], and neural networks [11]) have gradually been applied to logistics demand forecasting and customer churn prediction [12]. However, research on fine-grained classification prediction for “last-mile service mode selection” (such as home delivery, smart locker self-pickup, and station collection) is relatively scarce. Most path planning studies assume that the customer's service acceptance mode is either known in advance or randomly distributed, and do not use machine learning to mine the implicit user behavior preferences and community attribute features in historical data. As a result, the allocation of delivery resources often lags behind actual changes in demand.
In the field of joint delivery path optimization, heuristic algorithms such as Genetic Algorithm (GA) [13], Ant Colony Optimization (ACO) [14], and Particle Swarm Optimization (PSO) [15] have been widely applied and achieved significant results [16]. Existing studies mostly focus on improving algorithm convergence speed and avoiding local optimal solutions, such as PSO algorithms with improved inertia weights. Although the optimization of a single link is relatively mature, there are still few studies that conduct full-chain collaborative optimization of “region division—mode prediction—path planning.”
In view of this, this paper proposes a joint optimization framework integrating the Louvain algorithm of complex networks, the eXtreme Gradient Boosting (XGBoost) machine learning model, and an improved PSO algorithm, aiming to make up for the deficiencies of existing research in mining regional topological structures and accurately predicting service modes, and to realize cost reduction and efficiency improvement in the joint last-mile delivery of express logistics.
2. Related Theories and Methods
Complex networks [17] are structures composed of a large number of nodes and their interconnections, and play an important role in many fields. In network analysis, nodes are the basic units in the network structure, which can represent individuals, devices, genes, etc.; edges are used to connect nodes and usually represent relationships, communication lines, or gene interactions. A path is a sequence of edges connecting two nodes in the network, in which the shortest path is the connection with the least number of edges. The connectivity of a network describes the reachability between nodes; if there exists a path between any pair of nodes, the network is a connected network.
The Louvain algorithm is a community detection algorithm [18], which is specifically used to find clustered communities in graph-structured data. A community can be regarded as a group of nodes that are highly interconnected in the graph, and the goal of the Louvain algorithm is to discover these communities so as to better understand the structure of the graph and the relationships between nodes. The Louvain algorithm performs community detection based on modularity maximization. Modularity measures the difference between the connection density of nodes within a community and that of random connections. By maximizing modularity, the Louvain algorithm can find the natural aggregation of nodes, where nodes within communities are densely connected, while connections between communities are relatively sparse. In this way, the algorithm reveals the structure and information flow in complex networks.
Modularity is one of the important indicators used to evaluate the quality of network community structure. It is used to evaluate whether the division of nodes in a network forms a good community structure, and is usually used in community detection algorithms to measure the density of connections within communities relative to random graphs.
The formula of modularity is shown in Eq. (1):
$$Q=\frac{1}{2 m} \sum_{i, j}\left[A_{i j}-\frac{k_i k_j}{2 m}\right] \delta\left(C_i, C_j\right) \tag{1}$$
where $A_{i j}$ indicates whether there is an edge between node $i$ and node $j$ (1 if there is an edge, 0 otherwise); $k_i$ and $k_j$ are the degrees of node $i$ and node $j$, respectively (i.e., the number of connections of each node); $m$ is the total number of edges in the graph; and $\delta\left(C_i, C_j\right)$ is an indicator function that equals 1 if $i$ and $j$ belong to the same community and 0 otherwise. Modularity thus measures how far the edge density inside the communities of a partition deviates from the density expected in a random network with the same node degrees. The fast evaluation formula of modularity is shown in Eq. (2):
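As a minimal sketch of how modularity can be evaluated, the code below computes Eq. (1) directly over node pairs and also in a community-wise form, $Q=\sum_c\left[\Sigma_{in}^c /(2 m)-\left(\Sigma_{tot}^c /(2 m)\right)^2\right]$, which is the form commonly used for fast evaluation. The symbols $\Sigma_{in}^c$ and $\Sigma_{tot}^c$, the function names, and the toy graph (two triangles joined by one edge) are assumptions made for illustration; the graph is assumed to be simple and unweighted.

```python
from collections import defaultdict

def modularity_pairwise(edges, community):
    """Eq. (1): Q = (1/2m) * sum_ij [A_ij - k_i*k_j/(2m)] * delta(C_i, C_j)."""
    adj = defaultdict(int)      # A_ij for an undirected simple graph
    degree = defaultdict(int)   # k_i
    for u, v in edges:
        adj[(u, v)] += 1
        adj[(v, u)] += 1
        degree[u] += 1
        degree[v] += 1
    m = len(edges)
    q = 0.0
    for i in community:
        for j in community:
            if community[i] == community[j]:
                q += adj[(i, j)] - degree[i] * degree[j] / (2 * m)
    return q / (2 * m)

def modularity_by_community(edges, community):
    """Community-wise form: Q = sum_c [in_c/(2m) - (tot_c/(2m))^2]."""
    m = len(edges)
    inside = defaultdict(int)   # twice the number of edges inside community c
    total = defaultdict(int)    # sum of the degrees of the nodes in community c
    for u, v in edges:
        total[community[u]] += 1
        total[community[v]] += 1
        if community[u] == community[v]:
            inside[community[u]] += 2
    return sum(inside[c] / (2 * m) - (total[c] / (2 * m)) ** 2 for c in total)

# Hypothetical toy network: two triangles connected by the edge (2, 3).
toy_edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
toy_partition = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
q_pairwise = modularity_pairwise(toy_edges, toy_partition)
q_fast = modularity_by_community(toy_edges, toy_partition)
```

Both functions return the same value for the toy partition; the community-wise form only needs one pass over the edge list, which is why it is preferred inside an iterative algorithm such as Louvain.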
XGBoost is a machine learning algorithm based on the gradient boosting framework. It builds a strong learner by iteratively training a series of decision trees as weak learners. Following a forward stagewise scheme, each new tree is fitted to the negative gradient of the loss with respect to the current prediction (for squared-error loss, the residual between the true value and the predicted value), so that the ensemble approximates the true value step by step. In modern logistics systems, last-mile delivery is highly dynamic and user demand is heterogeneous, so traditional linear programming or simple statistical models struggle to capture high-dimensional, nonlinear feature relationships. Introducing the XGBoost framework into logistics scheduling can improve the decision-making efficiency of the system under complex constraints [19].
In this study, XGBoost is used to construct a classification prediction model. By analyzing historical delivery data and outlet characteristics, the optimal service mode (such as home delivery, smart locker self-pickup, and station collection) in a specific region is identified. This data-driven prediction method makes up for the deficiency of the “static mode assumption” in traditional path planning models, thereby matching delivery resources to actual demand in a complex urban joint delivery system.
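The residual-fitting idea can be illustrated with a deliberately small sketch. It is plain gradient boosting with one-split regression stumps and squared-error loss on a hypothetical one-dimensional dataset; it is not XGBoost itself, which additionally uses second-order gradient information and regularization. All function names and data values are assumptions made for illustration.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the residuals by least squares."""
    best = None
    for t in sorted(set(xs))[:-1]:          # candidate thresholds (right side never empty)
        left = [r for x, r in zip(xs, residuals) if x <= t]
        right = [r for x, r in zip(xs, residuals) if x > t]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        err = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if best is None or err < best[0]:
            best = (err, t, lv, rv)
    _, t, lv, rv = best
    return lambda x: lv if x <= t else rv

def boost(xs, ys, rounds=20, learning_rate=0.3):
    """Forward stagewise boosting: each stump fits the current residuals."""
    base = sum(ys) / len(ys)
    stumps = []

    def predict(x):
        return base + learning_rate * sum(s(x) for s in stumps)

    losses = []
    for _ in range(rounds):
        # For squared-error loss the negative gradient is the residual y - f(x).
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        losses.append(sum(r * r for r in residuals) / len(ys))
        stumps.append(fit_stump(xs, residuals))
    return predict, losses

# Hypothetical data with a step between x = 4 and x = 5.
xs = [1, 2, 3, 4, 5, 6, 7, 8]
ys = [1.0, 1.2, 0.9, 1.1, 3.0, 3.2, 2.9, 3.1]
model, losses = boost(xs, ys)
```

The recorded training loss does not increase from one round to the next, which is the "gradual approximation" described above.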
The basic idea of the PSO algorithm is to simulate group behaviors in nature (such as bird flock foraging and fish schooling) and to search for the optimal solution through information sharing and cooperation among particles. PSO is widely applied in fields such as function optimization, machine learning, path planning, and image processing. The core of PSO lies in the update of particle velocity and position. The velocity update in Eq. (3) guides the search by combining the particle's current velocity and position with its own best position and the best position found by the swarm:
$$v_i^{t+1}=w v_i^t+c_1 r_1\left(p_i^t-x_i^t\right)+c_2 r_2\left(g^t-x_i^t\right) \tag{3}$$
where $v_i^{t+1}$ is the velocity of particle $i$ at generation $t+1$; $x_i^t$ is its position at generation $t$; $p_i^t$ and $g^t$ are the particle's own historical best position and the global best position, respectively; $w$ is the inertia weight, which controls the influence of the current velocity on the new velocity; $c_1$ and $c_2$ are learning factors, which control the dependence of the particle on its own historical best position and on the global best position; and $r_1$ and $r_2$ are two random numbers, usually generated in the range [0,1], which introduce a degree of randomness and thereby enhance the global exploration ability. The position update formula is shown in Eq. (4):
$$x_i^{t+1}=x_i^t+v_i^{t+1} \tag{4}$$
Variants such as adaptive PSO, hybrid PSO [20], and multi-objective PSO have improved the global search ability and convergence speed of PSO.
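The velocity and position updates of Eqs. (3) and (4) can be sketched as follows. This is a basic PSO minimizing a hypothetical test function (the sum of squares); the parameter values for $w$, $c_1$, and $c_2$, the search range, and the names `pbest` and `gbest` for the personal and global best positions are assumptions for illustration, not the settings used in this study.

```python
import random

def pso(objective, dim, n_particles=20, iters=100, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = random.Random(seed)
    x = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in x]                      # personal best positions
    pbest_val = [objective(p) for p in x]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # global best position and value
    history = [gbest_val]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update, Eq. (3)
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (gbest[d] - x[i][d]))
                # Position update, Eq. (4)
                x[i][d] += v[i][d]
            val = objective(x[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = x[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history

def sphere(p):
    """Hypothetical test objective with its minimum at the origin."""
    return sum(c * c for c in p)

best_pos, best_val, history = pso(sphere, dim=2)
```

Because the global best is only replaced by a better value, `history` never increases; how quickly it falls depends on the inertia weight and learning factors, which is what the improved-inertia-weight variants adjust.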
3. Model Construction
(1) The express deliveries involved in this study are standard express parcels, and it is assumed that all express goods are homogeneous products.
(2) Each express parcel is delivered only once per day, and all parcels will be directly delivered to the corresponding last-mile joint delivery outlet or returned directly, without considering special situations such as re-delivery.
(3) The locations and demand volumes of last-mile joint delivery outlets are known, and the business volume of each outlet is relatively stable within a certain period, without large fluctuations.
(4) The efficiency of delivery vehicles arriving at last-mile joint delivery outlets is directly affected by factors such as travel distance and road conditions.
(5) The fixed cost, operating cost, and maximum express handling capacity of candidate locations for the collaborative sorting and joint delivery grassroots distribution centers are all known.
(6) Transportation cost and congestion cost are both linearly proportional to the vehicle travel distance.
(7) The starting point of all delivery tasks is the collaborative sorting and joint delivery grassroots distribution center, and delivery vehicles must return to the center after completing the tasks.
(8) The travel distance of vehicles must not exceed the specified maximum travel range.
(9) The carbon emissions of delivery vehicles are mainly determined by the travel mileage, and it is assumed that the carbon emission factor per unit mileage of all light logistics trucks is a constant.
This study collected the longitude and latitude coordinates of the distribution of 86 delivery nodes from 4 companies in Xiqing District. ArcMap was used to generate the location map of the region, as shown in Figure 1, to visually display the geographic distribution and relative positional relationships of each delivery node.

This study constructs a three-level optimization framework of “region division—mode prediction—path planning,” as shown in Figure 2. First, a logistics network model is constructed based on complex network theory, where delivery nodes are abstracted as nodes and road relationships between nodes are abstracted as edges. Second, the Louvain algorithm is applied for community detection to realize intelligent region division. Then, the results are used as spatial constraints and input into the XGBoost model to predict the optimal service mode for each region. Finally, the parameters of the PSO path optimization model are adjusted according to the predicted modes to achieve full-chain optimization.
In view of the particularity of express logistics in Xiqing District, this study introduces three transformation rules according to the characteristics of logistics distribution:
(1) Each residential area, commercial area, and other delivery regions in Xiqing District are transformed into nodes in the network;

(2) The logistics relationships between delivery regions are transformed into edges in the network;
(3) The weights of edges correspond to the difficulty of delivery between regions, such as distance and traffic factors.
According to statistical data, the network model constructed in this study consists of 86 nodes and 134 edges, which reflects the road connection conditions among the 86 communities. To reflect the actual delivery difficulty, roads are divided into three types: main roads, secondary roads, and branch roads. The Analytic Hierarchy Process (AHP) and entropy weight method are combined for weighting, and the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) method is used to obtain the results to determine the edge weights. The final weights of the three types of roads are shown in Table 1.
Road Type | Weight | Description |
Type 1 | 0.2675 | Main roads, good road conditions |
Type 2 | 0.2760 | Secondary roads, moderate congestion |
Type 3 | 0.4565 | Branch roads, high delivery difficulty |
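The transformation rules and the road-type weights of Table 1 can be combined into a weighted network as in the sketch below. The edge rows are hypothetical, and the code assumes that the Table 1 weight is attached to each edge directly as a delivery-difficulty value; how the weights enter the community detection step is not specified here.

```python
# Road-type weights from Table 1 (1 = main road, 2 = secondary road, 3 = branch road).
ROAD_WEIGHT = {1: 0.2675, 2: 0.2760, 3: 0.4565}

def build_network(edge_rows):
    """Build an undirected weighted adjacency map from (node_u, node_v, road_type) rows."""
    graph = {}
    for u, v, road_type in edge_rows:
        weight = ROAD_WEIGHT[road_type]
        graph.setdefault(u, {})[v] = weight
        graph.setdefault(v, {})[u] = weight
    return graph

# Hypothetical rows; the case network has 86 nodes and 134 such rows.
rows = [(1, 2, 1), (2, 3, 3), (1, 3, 2), (3, 4, 3)]
network = build_network(rows)
```

A nested dictionary keeps the sketch dependency-free; the same rows could equally be loaded into a graph library before running community detection.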
Based on the above rules, the network modeling and visualization are completed, providing a basis for the subsequent Louvain community detection. Road conditions and routes were taken from Gaode Map to assign routes and set the community nodes, giving an edge list of 134 rows (one per edge). The resulting community network model of the region is visualized in Figure 3.
The traditional Louvain algorithm is prone to falling into local optimal solutions during the community merging stage. This study introduces a modularity gain acceleration strategy. According to the above method, two-stage community detection is carried out on the complex network. The modularity in the first stage is 0.685, and the result is not ideal, so a second network reconstruction is performed. Finally, the modularity is improved to 0.789, and the optimization effect is significant. The final division results are shown in Figure 4.
Mainstream community detection algorithms are also compared on the same network, as shown in Table 2. The Louvain algorithm shows outstanding performance in handling large-scale networks due to its high efficiency and scalability, and it can discover multi-level community structures.
After the screening and evaluation of multiple clustering algorithms, the region division model for joint last-mile delivery of express logistics based on the improved Louvain algorithm successfully realizes the community division of the complex network, and reasonably divides the 86 nodes into 11 closely related delivery regions in the case area, achieving fast and efficient processing of the delivery process.


Algorithm | Modularity | Number of Communities | Coverage | Internal Density | Scalability | Average Path Length |
Louvain | 0.789 | 11 | 1.0 | 0.388 | 0.132 | 1.961 |
Label propagation algorithm | 0.639 | 19 | 1.0 | 0.703 | 0.334 | 1.393 |
Local centrality-based community detection algorithm | 0.689 | 7 | 1.0 | 0.277 | 0.093 | 2.279 |
Girvan-Newman | 0.184 | 2 | 1.0 | 0.101 | 0.012 | 4.765 |
Walktrap | 0.107 | 17 | 1.0 | 0.107 | 0.817 | 1.500 |
Fast Newman | 0.113 | 13 | 1.0 | 0.086 | 0.763 | 5.231 |
Based on the aforementioned algorithm principles and schemes, this study collected relevant information from communities near some A/C/S/J outlets and ordinary communities in Xiqing District, Tianjin, through field investigation and map data analysis. The selected variables cover seven indicators, including community main type, number of residential areas, number of buildings, community coverage area, population density level, age of residential areas, and number of nearby subway stations. At the same time, combined with publicly available data on fresh express delivery, the relevant data of 204 last-mile communities were organized. These data will be used as samples input into the XGBoost algorithm to train the final classifier model.
The model uses OneHotEncoder for one-hot encoding to process categorical features, and divides 70% of the data into the training set and 30% into the test set for model construction and evaluation.
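The preprocessing step can be sketched without external libraries. The functions below are standard-library stand-ins for one-hot encoding and a shuffled 70/30 split; they are not the OneHotEncoder used in this study. The category values, the random seed, and the rule of rounding the test-set size up are assumptions for illustration.

```python
import math
import random

def one_hot(values):
    """Encode a categorical column as 0/1 indicator columns, one per category."""
    categories = sorted(set(values))
    encoded = [[1 if v == c else 0 for c in categories] for v in values]
    return encoded, categories

def train_test_split(rows, test_fraction=0.3, seed=42):
    """Shuffle row indices and hold out a fraction of the rows as the test set."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    n_test = math.ceil(test_fraction * len(rows))
    test = [rows[i] for i in idx[:n_test]]
    train = [rows[i] for i in idx[n_test:]]
    return train, test

# Illustrative values of a categorical feature such as "community main type".
community_type = [1, 1, 2, 1, 2, 3]
encoded, categories = one_hot(community_type)

# 204 sample indices standing in for the 204 last-mile communities.
train_rows, test_rows = train_test_split(list(range(204)))
```

With this rounding rule, the 204 samples are divided into 142 training rows and 62 test rows.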
It can be seen from Figure 5 that the error rate performance of Random Forest, Adaptive Boosting (AdaBoost), and XGBoost models shows significant differences under different numbers of iterations. XGBoost (green curve) exhibits the lowest and relatively stable error rate at all iteration stages. Especially at 50 iterations, its error rate remains below 0.1, showing strong model robustness and high efficiency.
Based on the above analysis, it is recommended to give priority to XGBoost as the main model in practical applications. According to the test results, when the mfinal parameter is set to 11, the accuracy of the test set reaches 0.878, and the combined classifier model is finally obtained. The indicator data of each algorithm are shown in Table 3.
Model | Accuracy | Recall (0) | F1-score (0) | Recall (1) | F1-score (1) | Recall (2) | F1-score (2) |
Random forest | 0.829 | 0.93 | 0.90 | 0.87 | 0.79 | 0.67 | 0.80 |
Adaptive boosting | 0.829 | 0.93 | 0.90 | 0.87 | 0.79 | 0.67 | 0.80 |
eXtreme Gradient Boosting | 0.878 | 0.93 | 0.90 | 0.80 | 0.83 | 0.92 | 0.92 |
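As Table 3 shows, XGBoost achieves the highest test accuracy (0.878, compared with 0.829 for both Random Forest and AdaBoost). The gain comes mainly from class 2, where recall rises from 0.67 to 0.92 and the F1-score from 0.80 to 0.92. Recall for class 1 is slightly lower (0.80 versus 0.87), although its F1-score is higher (0.83 versus 0.79).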
In this case, the feature data of the 11 communities are input into the previously trained classifier model to predict their service modes. As shown in Table 4, regions with predicted mode label 1 are regarded as large-scale logistics regions and adopt the exclusive operation strategy; regions with predicted mode label 2 are regarded as medium-scale logistics regions and adopt intelligent dynamic combined delivery; regions with predicted mode label 3 are regarded as small-scale logistics regions and adopt the smart locker strategy.

Community Main Type | Number of Residential Areas | Number of Buildings | Community Coverage Area | Population Density Level | Age of Residential Areas | Number of Nearby Subway Stations | Predicted Mode Label |
1 | 6 | 268 | 4.1 | 1 | 3 | 2 | 1 |
1 | 5 | 157 | 2.7 | 2 | 1 | 0 | 3 |
2 | 7 | 258 | 2.7 | 3 | 2 | 0 | 1 |
1 | 5 | 327 | 1.4 | 1 | 3 | 0 | 2 |
2 | 10 | 339 | 2.7 | 3 | 1 | 2 | 2 |
3 | 15 | 426 | 3.4 | 2 | 1 | 3 | 1 |
1 | 8 | 87 | 1.4 | 1 | 3 | 0 | 2 |
2 | 7 | 211 | 1.4 | 3 | 1 | 1 | 2 |
2 | 9 | 134 | 3.4 | 3 | 2 | 0 | 1 |
2 | 7 | 196 | 2 | 1 | 3 | 2 | 2 |
2 | 7 | 99 | 2 | 1 | 2 | 1 | 3 |
At the same time, the regions corresponding to specific community numbers are summarized, as shown in Table 5. Large-scale logistics regions (No. 1, 3, 6, 9) adopt the exclusive operation strategy for delivery, which can provide more professional and efficient services. Small-scale logistics regions (No. 2, 11) adopt the smart locker strategy, which is suitable for areas with small order volumes and improves delivery convenience. Medium-scale logistics regions (No. 4, 5, 7, 8, 10) adopt the intelligent dynamic combined delivery strategy, which can be flexibly adjusted according to real-time demand and can maximize delivery efficiency. The XGBoost model divides communities into three modes: exclusive operation, intelligent dynamic combination, and smart locker. In PSO path planning, a higher vehicle load rate threshold is set for exclusive operation regions, and the door-to-door time constraint is removed for smart locker regions, thereby achieving full-chain optimization adaptation.
Community Number | Community Logistics Scale Prediction Result |
1 | Large-scale logistics region (exclusive operation strategy) |
2 | Small-scale logistics region (smart locker strategy) |
3 | Large-scale logistics region (exclusive operation strategy) |
4 | Medium-scale logistics region (intelligent dynamic combined delivery) |
5 | Medium-scale logistics region (intelligent dynamic combined delivery) |
6 | Large-scale logistics region (exclusive operation strategy) |
7 | Medium-scale logistics region (intelligent dynamic combined delivery) |
8 | Medium-scale logistics region (intelligent dynamic combined delivery) |
9 | Large-scale logistics region (exclusive operation strategy) |
10 | Medium-scale logistics region (intelligent dynamic combined delivery) |
11 | Small-scale logistics region (smart locker strategy) |
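The mapping from predicted mode labels to operation strategies can be written out explicitly. The sketch below assumes that the rows of Table 4 are listed in community order (1 to 11), which is consistent with Table 5; the variable names are illustrative.

```python
STRATEGY = {
    1: "Large-scale logistics region (exclusive operation strategy)",
    2: "Medium-scale logistics region (intelligent dynamic combined delivery)",
    3: "Small-scale logistics region (smart locker strategy)",
}

# Predicted mode labels from the last column of Table 4, communities 1 to 11.
predicted_labels = [1, 3, 1, 2, 2, 1, 2, 2, 1, 2, 3]

# Community number -> strategy description (the content of Table 5).
assignment = {no: STRATEGY[label] for no, label in enumerate(predicted_labels, start=1)}

# Strategy label -> list of community numbers.
groups = {}
for no, label in enumerate(predicted_labels, start=1):
    groups.setdefault(label, []).append(no)

for label in sorted(groups):
    print(label, groups[label])
```

The grouping reproduces the three sets named in the text: communities 1, 3, 6, 9 (exclusive operation), 4, 5, 7, 8, 10 (intelligent dynamic combined delivery), and 2, 11 (smart locker).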
To realize resource optimization of last-mile joint delivery, this section first constructs a grassroots distribution center location selection model, and then conducts path planning based on the PSO algorithm. The location selection result is used as the delivery starting point of PSO to ensure full-chain connection.
Combined with the 11 communities divided by Louvain and the service modes predicted by XGBoost, this study uses the $k$-means clustering algorithm together with the Haversine formula to optimize the locations of smart lockers. For each candidate number of smart lockers (1 to 4), $k$-means groups the delivery points of a region into that many clusters, and the Haversine formula gives the great-circle distance from each point to each locker as an approximation of the actual travel distance.
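A minimal sketch of the distance step is given below: the Haversine great-circle distance between two longitude/latitude points, and the assignment of a point to its nearest locker. The mean Earth radius of 6,371 km, the function names, and the candidate locker positions are assumptions for illustration; the two node coordinates are those listed for nodes 55 and 57 in Table 6.

```python
import math

EARTH_RADIUS_M = 6_371_000  # assumed mean Earth radius in metres

def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in metres between two (longitude, latitude) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def nearest_locker(point, lockers):
    """Index of the locker with the smallest Haversine distance to the point."""
    return min(range(len(lockers)), key=lambda i: haversine_m(*point, *lockers[i]))

node_55 = (117.0323, 39.1275)   # (longitude, latitude), Table 6
node_57 = (117.0303, 39.1245)   # (longitude, latitude), Table 6
distance = haversine_m(*node_55, *node_57)

# Hypothetical locker positions, used only to show the assignment step.
lockers = [node_57, (117.0993, 39.0978)]
closest = nearest_locker(node_55, lockers)
```

In a $k$-means loop, this assignment step would alternate with recomputing each locker position from the points assigned to it. The Haversine distance is a straight-line measure on the sphere, so it understates distances along the road network.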
Mathematical Model:
Here, Eq. (5) is the objective function, which minimizes the total cost. Eq. (6) indicates that the business demand of all last-mile joint delivery outlets is satisfied. Eq. (7) indicates that the maximum processing capacity of the selected candidate point for the collaborative sorting and joint delivery grassroots distribution center can meet the total demand of the last-mile joint delivery outlets. Eq. (8) ensures that the express parcels of the last-mile joint delivery outlets can be assigned to a candidate point for processing only if that candidate point is selected. Eq. (9) indicates that only one collaborative sorting and joint delivery grassroots distribution center is selected. Eq. (10) indicates that each last-mile joint delivery outlet must be served, and can only be served, by one collaborative sorting and joint delivery grassroots distribution center.
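Because only one distribution center is selected and every outlet is served by it, the location model can be illustrated by direct enumeration of the candidates. The sketch below assumes a total cost made up of fixed cost, operating cost, and a transportation cost proportional to demand and distance, in line with the model assumptions; the exact objective of Eq. (5), the candidate data, and all names are hypothetical.

```python
def select_center(candidates, outlets, unit_cost=1.0):
    """Pick the single feasible candidate with the lowest total cost."""
    total_demand = sum(o["demand"] for o in outlets)
    best_name, best_cost = None, float("inf")
    for c in candidates:
        # Capacity must cover the total demand of all outlets (cf. Eq. (7)).
        if c["capacity"] < total_demand:
            continue
        transport = sum(unit_cost * o["demand"] * o["dist"][c["name"]] for o in outlets)
        cost = c["fixed"] + c["operating"] + transport
        if cost < best_cost:
            best_name, best_cost = c["name"], cost
    return best_name, best_cost

# Hypothetical candidates (costs in yuan, capacity in parcels).
candidates = [
    {"name": "A", "fixed": 300, "operating": 100, "capacity": 500},
    {"name": "B", "fixed": 200, "operating": 80, "capacity": 150},
    {"name": "C", "fixed": 350, "operating": 90, "capacity": 600},
]
# Hypothetical outlets with demand (parcels) and distance to each candidate (km).
outlets = [
    {"demand": 100, "dist": {"A": 2.0, "B": 1.0, "C": 1.0}},
    {"demand": 120, "dist": {"A": 3.0, "B": 1.0, "C": 1.5}},
]
chosen, chosen_cost = select_center(candidates, outlets)
```

In this toy instance candidate B is cheapest to open but cannot handle the total demand of 220 parcels, so the choice falls between A and C on total cost.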
The location selection model is applied to regions G2, G4, G5, G7, G8, G10, and G11, and the results are as follows.
According to Figure 6 and Table 6, the placement under different numbers of smart lockers in the G2 region shows that selecting 2 smart lockers is a reasonable decision. The selection distribution diagrams of other regions are not shown additionally. The G4 region is configured with 2 smart lockers to cover most buildings; the G5 region is configured with 3 smart lockers to cover most buildings; the G7 region is configured with 2 smart lockers to improve service coverage; the G8 region is configured with 3 smart lockers to meet the needs of most buildings; the G10 region is recommended to be configured with 2 smart lockers to achieve moderate coverage; the G11 region is configured with 3 smart lockers to ensure extensive service. Such configurations aim to effectively balance service coverage and cost, improve the efficiency and economy of express delivery services, and meet the needs of buildings within the regions.

Index | No. | Longitude | Latitude | Distance to Locker1 | Distance to Locker2 | Distance to Locker3 | Distance to Locker4 |
0 | 2 | 117.0601 | 39.1021 | 1445.388 | 2289.447 | 0 | 0 |
1 | 9 | 117.0993 | 39.0978 | 5567.843 | 2110.434 | 1052.048 | 0 |
2 | 29 | 117.0825 | 39.1171 | 3586.371 | 616.541 | 1051.98 | 0 |
3 | 48 | 117.0300 | 39.1419 | 2498.775 | 551.508 | 551.508 | 551.50 |
4 | 54 | 117.0177 | 39.1355 | 3700.108 | 1109.043 | 1109.043 | 1109.043 |
5 | 55 | 117.0323 | 39.1275 | 2028.524 | 581.407 | 581.407 | 581.407 |
6 | 57 | 117.0303 | 39.1245 | 2231.502 | 501.522 | 501.522 | 501.522 |
This path optimization model aims to optimize delivery routes and reduce transportation cost and time consumption by calculating the distance matrix between delivery points and combining it with optimization algorithms.
Mathematical Model:
where $c_{j k}$ is the unit transportation cost from the integrated sorting and distribution hub ($j$) to the last-mile joint delivery outlet ($k$) (yuan/(parcel·km)); $s_{j k}$ is the congestion cost coefficient (set with reference to the peak and off-peak time coefficients of working days in Xiqing District, in the range [1.0, 1.5]); and $m_{j k}$ is the fixed cost coefficient of vehicle operation (yuan), which reflects vehicle depreciation, maintenance, and dispatching costs.
Eq. (11) is the objective function, which minimizes the total transportation cost of the delivery routes, including transportation cost and congestion cost. Eq. (12) ensures that the demand of each last-mile outlet ($k$) is satisfied by the delivery quantity from the distribution hubs ($j$). Eq. (13) indicates that the maximum processing capacity of the selected distribution center can meet the total demand of all outlets. Eq. (14) ensures that only the selected distribution centers can be allocated express parcels for processing. Eq. (15) specifies that the demand of each outlet is served by only one distribution center. Eq. (16) ensures that all delivery quantities are non-negative, which conforms to practical operation requirements, and that the weight coefficients of all distribution hubs sum to 1, which keeps the model consistent. Finally, to further reduce empty driving and ineffective transportation, a certain proportion of the vehicles of each distribution hub is required to operate as close to full load as possible. Together, these formulas ensure the efficient operation and cost control of the delivery network.
In view of the high energy consumption problem caused by the traditional single objective of cost minimization, this study introduces carbon emission accounting indicators after the completion of path planning to evaluate the green environmental benefits of joint delivery. The total carbon emissions are calculated as in Eq. (18):
$$E=e_f \sum_{(j, k)} d_{j k} \tag{18}$$
where the sum runs over the route segments actually traveled; $E$ is the total carbon emissions generated by a single delivery process (kg CO$_2$); $d_{j k}$ is the travel distance between nodes $j$ and $k$ (km); and $e_f$ is the carbon emission factor per unit mileage of standard light logistics vehicles. Referring to relevant carbon emission standards, this study takes $e_f= 0.24 \mathrm{~kg} \mathrm{CO}_2 / \mathrm{km}$. This indicator, together with transportation cost, constitutes a comprehensive evaluation system for assessing the performance of the delivery model.
In the solution process of this model, the PSO algorithm is used. The path optimization regions are G1, G3, G4, G5, G6, G7, G8, G9, and G10. The following will present the model analysis process and results, where the specific coordinates of the distribution centers can refer to the calculation method of the location selection model.
Figure 7 shows the geographic layout of a regional delivery network constructed using the AnyLogic model, clearly indicating the location information of key delivery points. Next, path optimization will be carried out for each divided community set, and a three-day continuous simulation experiment will be conducted.
In each iteration, particles approach the global optimum through the velocity update mechanism, and finally form an optimal path. These nodes are the position update results of particles in different iterations, reflecting the process of searching for the optimal solution. The optimal paths of regions G1, G3, G6, and G9 are shown in Figure 8.
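One common way to apply a continuous PSO to routing is random-key decoding: each particle holds one real number per outlet, and the visiting order is obtained by sorting those numbers. The sketch below uses this encoding as an assumption, since the route encoding used in this study is not described; the planar coordinates, the parameter values, and the function names are likewise hypothetical. The route starts and ends at the distribution center, as required by the model assumptions.

```python
import math
import random

def tour_length(order, depot, points):
    """Length of the closed route depot -> outlets in the given order -> depot."""
    path = [depot] + [points[i] for i in order] + [depot]
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))

def decode(keys):
    """Random-key decoding: visit outlets in ascending order of their key."""
    return sorted(range(len(keys)), key=lambda i: keys[i])

def pso_route(depot, points, n_particles=30, iters=200, w=0.7, c1=1.5, c2=1.5, seed=1):
    rng = random.Random(seed)
    n = len(points)

    def cost(keys):
        return tour_length(decode(keys), depot, points)

    x = [[rng.random() for _ in range(n)] for _ in range(n_particles)]
    v = [[0.0] * n for _ in range(n_particles)]
    pbest = [k[:] for k in x]
    pbest_val = [cost(k) for k in x]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    initial_val = gbest_val
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(n):
                r1, r2 = rng.random(), rng.random()
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (gbest[d] - x[i][d]))
                x[i][d] += v[i][d]
            val = cost(x[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = x[i][:], val
    return decode(gbest), gbest_val, initial_val

# Hypothetical planar coordinates in km: one distribution center and six outlets.
depot = (0.0, 0.0)
points = [(2, 1), (5, 4), (1, 6), (6, 1), (3, 3), (0, 4)]
order, best_len, initial_len = pso_route(depot, points)
```

The decoded order is always a valid permutation of the outlets, so no repair step is needed after the continuous position update; constraints such as the maximum travel range would have to be added as penalties or feasibility checks in `cost`.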

Based on the simulation results of AnyLogic, after optimization by the three-level framework, the performance of the last-mile delivery system in Xiqing District is significantly improved, as shown in Table 7. Taking the G1 region (Zhongbei Town and surrounding communities) as an example, the total length of the optimized joint delivery path is reduced from 145.6 km under the independent delivery mode to 108.7 km. Substituting into the aforementioned carbon emission accounting formula, it can be seen that a single delivery in the G1 region alone can reduce approximately 8.86 kg of carbon dioxide emissions. The cost per delivery route and carbon footprint are both significantly reduced.
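The G1 figures can be checked with a few lines of arithmetic, taking total emissions as the emission factor multiplied by the distance traveled. The function name is illustrative; the distances and the factor $e_f = 0.24$ kg CO$_2$/km are those given in the text.

```python
EMISSION_FACTOR = 0.24  # kg CO2 per km (e_f in the text)

def carbon_emissions(distance_km, factor=EMISSION_FACTOR):
    """Total emissions of a route as emission factor times distance traveled."""
    return factor * distance_km

independent_km, joint_km = 145.6, 108.7          # G1 path lengths
saved_km = independent_km - joint_km
reduction_kg = carbon_emissions(independent_km) - carbon_emissions(joint_km)
rate = saved_km / independent_km

print(f"{saved_km:.1f} km, {reduction_kg:.2f} kg CO2, {rate:.2%}")
# prints: 36.9 km, 8.86 kg CO2, 25.34%
```

The 36.9 km saved corresponds to the 8.86 kg reduction quoted above and to the 25.34% path length optimization rate reported for G1 in Table 7.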

Region | Path Length Optimization Rate | Time Window Satisfaction Improvement | Average Vehicle Load Rate | Estimated Carbon Reduction (kg CO$_2$) |
G1 | 25.34% | 12.5% | 88.6% | 8.86 |
G3 | 22.10% | 9.8% | 85.2% | 7.42 |
G6 | 28.45% | 15.2% | 91.3% | 11.20 |
G9 | 21.80% | 11.4% | 84.7% | 6.55 |
In the initial express logistics background, after incorporating the corresponding company information, a comparative experiment of joint delivery models is carried out. By independently running the delivery schemes of four companies and combining the above mathematical models, the Vehicle Routing Problem (VRP) routes and costs of each company are calculated. In these independent delivery schemes, each company plans its vehicle delivery routes according to its own business needs and customer distribution, and reduces its independent operating transportation cost through optimization algorithms. Each company is responsible for delivering goods from the warehouse to multiple customer nodes without sharing resources with other companies. According to the above research, the complete optimization diagram of joint delivery based on region division and the Pickup and Delivery Vehicle Routing Problem (PO-VRP) model is shown in Figure 9 and Figure 10.


As shown in Table 8, compared with independent delivery, joint delivery based on region division and the PO-VRP model reduces the vehicle routing cost by 13.19%, the overall delivery cost by 25.32%, and the delivery cost by 19.73%. Therefore, adopting a joint delivery strategy for common customers is an effective way for logistics enterprises to reduce delivery costs.
Delivery Mode | C1 | C2 | C3 | C4 | TC | E (kg CO2)
A Independent delivery | 300 | 504.03 | 12.83 | 0 | 816.86 | 58.2 |
S Independent delivery | 300 | 532.32 | 13.54 | 0 | 845.87 | 61.3 |
J Independent delivery | 200 | 448.65 | 11.42 | 0 | 660.07 | 47.3 |
C Independent delivery | 300 | 571.97 | 14.58 | 0 | 886.56 | 64.7 |
Total Independent delivery | 1100 | 2056.97 | 52.38 | 0 | 3209.36 | 231.5 |
Joint delivery based on region division and pickup and delivery vehicle routing problem model | 1200 | 565.39 | 14.36 | 700.05 | 2479.80 | 173.6
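
The carbon figures in Table 8 can be checked with a few lines of arithmetic. The sketch below sums the E column of the four independent delivery rows and compares the total with the joint delivery value. The variable names are illustrative; the numbers are taken directly from the table.

```python
# E column of Table 8 for the four independent delivery schemes.
independent_emissions = {"A": 58.2, "S": 61.3, "J": 47.3, "C": 64.7}
joint_emissions = 173.6  # E value of the joint delivery row

total_independent = sum(independent_emissions.values())
reduction = (total_independent - joint_emissions) / total_independent

print(f"independent total: {total_independent:.1f}")
print(f"joint delivery:    {joint_emissions:.1f}")
print(f"reduction:         {reduction:.2%}")
# prints a reduction of 25.01%
```

The result is consistent with the estimated 25% reduction in daily carbon emissions reported for the joint delivery approach.
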
Figure 11 shows that as the fixed vehicle cost increases, the total delivery cost of both strategies rises. However, the joint delivery mode (blue line) always maintains a lower total delivery cost than the independent delivery mode (black line). In addition, the gap between the two costs widens as the fixed vehicle cost increases, which indicates that the advantage of the cost-sharing strategy is greater when fixed costs are high.
As shown in Figure 12, as demand increases, the joint delivery mode shows more stable costs and a flatter cost growth curve. In scenarios with large demand fluctuations, the joint delivery strategy has clear advantages in cost saving and resource allocation. The sensitivity analysis of demand fluctuation therefore indicates that the joint delivery mode adapts better and controls costs more effectively under uncertain market demand.


4. Conclusion
The three-stage optimization framework proposed in this study verifies, in theory, the collaborative potential of complex networks and machine learning. In addition, the delivery modes that the XGBoost model predicts for communities of different scales (an exclusive operation strategy for large-scale regions, intelligent dynamic combined delivery for medium-scale regions, and a smart locker strategy for small-scale regions), together with the sensitivity analysis, provide operational guidance for the express logistics industry.
First, for old communities with large demand fluctuations, logistics enterprises should give priority to the intelligent dynamic combined delivery strategy. This strategy allows resources to be adjusted flexibly, for example by adding shared vehicles and temporary stations during peak periods, and avoids the idle or overloaded resources that occur under the traditional exclusive operation mode. In this way, enterprises can reduce the risk of delivery delays and improve customer satisfaction without large-scale infrastructure investment.
Second, in an environment with high fixed costs, managers of logistics enterprises should actively promote joint delivery cooperation among multiple enterprises. By sharing grassroots sorting centers and optimizing routes, enterprises can split vehicle maintenance and operation costs and achieve cost reductions of more than 25%. It is recommended that enterprises establish cross-company data sharing platforms and use XGBoost prediction tools to adjust delivery modes in real time to cope with market uncertainty.
In addition, this study incorporates carbon emission reduction indicators into the evaluation of delivery optimization in this region, demonstrating that the joint delivery mode can deliver not only business benefits for enterprises (cost reduction) but also significant environmental value (carbon reduction). Future research can introduce a dynamic carbon tax mechanism to provide more comprehensive quantitative decision support for low-carbon city construction.
Finally, government policymakers can promote the implementation of last-mile joint delivery through subsidy mechanisms. For example, financial support can be provided for the construction and maintenance of grassroots sorting centers to encourage resource integration among enterprises. This can not only alleviate urban traffic congestion, but also improve rural coverage and promote sustainable logistics development. Overall, these insights emphasize the shift from data-driven optimization to collaborative optimization, helping the industry cope with the challenges brought by the growth of e-commerce.
Author Contributions: Conceptualization, L.L.; methodology, L.L. and P.Z.; software, P.Z. and J.X.; validation, P.Z. and J.X.; formal analysis, P.Z. and J.X.; investigation, J.X.; resources, L.L.; data curation, P.Z. and J.X.; writing – original draft preparation, J.X.; writing – review and editing, L.L. and P.Z.; visualization, P.Z.; supervision, L.L.; project administration, L.L. All authors have read and agreed to the published version of the manuscript.
Data Availability Statement: The data used to support the research findings are available from the corresponding author upon request.
Conflicts of Interest: The authors declare no conflict of interest.
