Optimization of Last-Mile Joint Delivery in Express Logistics Based on Complex Networks and Machine Learning

Open Access

Research article

Liang Liu^*, Penghui Zhang, Jiawei Xu

School of Economics and Management, Tianjin Polytechnic University, 300387 Tianjin, China

Journal of Operational and Strategic Analytics | Volume 4, Issue 1, 2026 | Pages 1-16

https://doi.org/10.56578/josa040101

Received: 11-19-2025, Revised: 12-27-2025, Accepted: 01-07-2026, Available online: 01-11-2026


Abstract:

With the rapid expansion of e-commerce, last-mile delivery in express logistics faces significant challenges, including low efficiency and high operational costs. Taking the Xiqing District of Tianjin as a case study, this research proposes a three-stage framework integrating complex network theory and machine learning. First, the Louvain algorithm is employed to achieve intelligent partitioning of delivery areas, raising modularity to 0.789. Second, an eXtreme Gradient Boosting (XGBoost) model is used to predict terminal service modes, achieving an accuracy of 87.8%. Finally, a route planning model is constructed using Particle Swarm Optimization (PSO). To validate these methods, a three-day logistics system simulation was conducted in AnyLogic to evaluate the effectiveness of different delivery policies. The results demonstrate that, compared to traditional independent delivery, the joint delivery approach reduces total costs by 25.32%. Furthermore, a carbon emission accounting model is introduced, which indicates an estimated 25% reduction in daily carbon emissions, so that the approach yields both economic and environmental benefits.

Keywords: Joint delivery, Complex network, Machine learning, Louvain algorithm, Particle Swarm Optimization

1. Introduction

According to the latest statistics, the express delivery industry achieved new breakthroughs again in 2025. The data show that in 2025, the total volume of postal industry delivery services in China reached 216.5 billion items, with a year-on-year increase of 11.5%. Among them, the volume of express delivery services reached 199.0 billion items, with a year-on-year increase of 13.7%. The annual revenue of the postal industry was 1.8 trillion yuan, and the revenue of express delivery services was 1.5 trillion yuan, with year-on-year increases of 6.4% and 6.5%, respectively. The quality of express delivery services was further improved. In December 2025, the public satisfaction of express delivery services is estimated to be 85 points, an increase of 0.6 points year-on-year. The 72-hour on-time delivery rate in key regions is estimated to be 86.6%, an increase of 1.7 percentage points year-on-year.

Although the scale of the express delivery industry continues to expand, the construction of the last-mile delivery system has not kept pace, forming an increasingly prominent contradiction. With the acceleration of urbanization and continuous population growth, problems such as delivery congestion and increased customer complaints have appeared in urban areas, while in rural areas, due to the sparse distribution of delivery outlets, the problem of insufficient delivery coverage has become more serious. The last-mile delivery service of express logistics is facing unprecedented pressure, and it is urgent to improve service efficiency and quality. In order to solve this problem, promoting the optimization of the last-mile delivery system has become an urgent task for the development of the industry. Compared with the independent delivery mode in which enterprises deliver separately without sharing resources, joint delivery [1] is a cooperative logistics mode aimed at reducing costs and improving efficiency through resource sharing [2] and management optimization [3]. Only by accelerating the construction of the last-mile joint delivery mode and improving the coverage and response speed of delivery services can the actual needs of the public be effectively met.

In terms of the division of last-mile delivery regions, traditional studies mostly rely on administrative boundary divisions or geographic spatial partitioning based on clustering algorithms such as $k$-means [4] and Fuzzy C-Means (FCM) [5]. In recent years, with the rise of complex network theory, some scholars have begun to attempt to use network topology structures to optimize logistics networks. For example, some studies abstract logistics nodes as complex networks [6] and analyze their small-world characteristics [7] and scale-free characteristics [8]. However, most existing region division methods only consider geographic distance and often ignore the actual interaction strength and community structure characteristics among nodes within the logistics network. As a result, although the division results are compact in physical space, there are frequent cross-region interactions and low resource coordination in actual delivery circulation.

For the selection of last-mile delivery service modes, existing literature mostly focuses on qualitative analysis or simple statistical regression. With the development of big data technology, machine learning algorithms (such as Random Forest [9], support vector machine [10], and Neural Networks [11]) have gradually been applied to logistics demand forecasting and customer churn prediction [12]. However, research on fine-grained classification prediction for “last-mile service mode selection” (such as home delivery, smart locker self-pickup, and station collection) is relatively scarce. Most path planning studies usually assume that the customer service acceptance mode is statically known or randomly distributed, and fail to use machine learning techniques to mine the implicit user behavior preferences and community attribute features in historical data, resulting in the allocation of delivery resources often lagging behind the actual demand changes.

In the field of joint delivery path optimization, heuristic algorithms such as Genetic Algorithm (GA) [13], Ant Colony Optimization (ACO) [14], and Particle Swarm Optimization (PSO) [15] have been widely applied and achieved significant results [16]. Existing studies mostly focus on improving algorithm convergence speed and avoiding local optimal solutions, such as PSO algorithms with improved inertia weights. Although the optimization of a single link is relatively mature, there are still few studies that conduct full-chain collaborative optimization of “region division—mode prediction—path planning.”

In view of this, this paper proposes a joint optimization framework integrating the Louvain algorithm of complex networks, the eXtreme Gradient Boosting (XGBoost) machine learning model, and an improved PSO algorithm, aiming to make up for the deficiencies of existing research in mining regional topological structures and accurately predicting service modes, and to realize cost reduction and efficiency improvement in the joint last-mile delivery of express logistics.

2. Related Theories and Methods

2.1 Complex Networks and Louvain Algorithm

Complex networks [17] are structures composed of a large number of nodes and their interconnections, and play an important role in many fields. In network analysis, nodes are the basic units of the network structure and can represent individuals, devices, genes, etc.; edges connect nodes and usually represent relationships, communication lines, or gene interactions. A path is a sequence of edges connecting two nodes in the network, and the shortest path between two nodes is the path with the fewest edges. The connectivity of a network describes the reachability between nodes; if there exists a path between every pair of nodes, the network is connected.

The Louvain algorithm is a community detection algorithm [18], which is specifically used to find clustered communities in graph-structured data. A community can be regarded as a group of nodes that are highly interconnected in the graph, and the goal of the Louvain algorithm is to discover these communities so as to better understand the structure of the graph and the relationships between nodes. The Louvain algorithm performs community detection based on modularity maximization. Modularity measures the difference between the connection density of nodes within a community and that of random connections. By maximizing modularity, the Louvain algorithm can find the natural aggregation of nodes, where nodes within communities are densely connected, while connections between communities are relatively sparse. In this way, the algorithm reveals the structure and information flow in complex networks.

Modularity is one of the important indicators used to evaluate the quality of network community structure. It is used to evaluate whether the division of nodes in a network forms a good community structure, and is usually used in community detection algorithms to measure the density of connections within communities relative to random graphs.

The formula of modularity is shown in Eq. (1):

$Q=\frac{1}{2 m} \sum_{i, j}\left[A_{i j}-\frac{k_i k_j}{2 m}\right] \delta\left(C_i, C_j\right)$

(1)

where $A_{i j}$ indicates whether there is an edge between node $i$ and node $j$ (1 if there is an edge, 0 otherwise); $k_i$ and $k_j$ are the degrees of node $i$ and node $j$, respectively (i.e., the number of connections of each node); $m$ is the total number of edges in the graph; and $\delta\left(C_i, C_j\right)$ is an indicator function equal to 1 if $i$ and $j$ belong to the same community and 0 otherwise. Modularity thus measures how far the edge density within the communities of a partition deviates from the density expected in a random network with the same node degrees. The modularity gain $\Delta Q$ obtained by moving node $i$ into a community $C$ can be evaluated quickly with Eq. (2), where $\Sigma_{in}$ is the total weight of the edges inside $C$, $\Sigma_{tot}$ is the total weight of the edges incident to nodes in $C$, and $k_{i, in}$ is the total weight of the edges from node $i$ to nodes in $C$:

$\Delta Q=\left[\frac{\Sigma_{in}+k_{i, in}}{2 m}-\left(\frac{\Sigma_{tot}+k_i}{2 m}\right)^2\right]-\left[\frac{\Sigma_{in}}{2 m}-\left(\frac{\Sigma_{tot}}{2 m}\right)^2-\left(\frac{k_i}{2 m}\right)^2\right]=\frac{k_{i, in}}{2 m}-\frac{\Sigma_{tot} \cdot k_i}{2 m^2}$

(2)
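As a rough illustration of Eqs. (1) and (2), the sketch below evaluates modularity and the modularity gain for an unweighted toy graph in plain Python. The toy graph (two triangles joined by one edge), the function names `modularity` and `delta_q`, and their argument names are illustrative assumptions; this is not the study's implementation or data.

```python
from itertools import product


def modularity(edges, community):
    """Evaluate Eq. (1) for an undirected, unweighted graph.

    edges: list of (u, v) pairs, each undirected edge listed once.
    community: dict mapping every node to its community label.
    """
    m = len(edges)
    adjacent = set(edges) | {(v, u) for u, v in edges}
    degree = dict.fromkeys(community, 0)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    total = 0.0
    for i, j in product(community, repeat=2):
        if community[i] == community[j]:  # delta(C_i, C_j) = 1
            a_ij = 1.0 if (i, j) in adjacent else 0.0
            total += a_ij - degree[i] * degree[j] / (2 * m)
    return total / (2 * m)


def delta_q(sigma_in, sigma_tot, k_i, k_i_in, m):
    """Evaluate the bracketed difference in Eq. (2) for moving node i into a community."""
    after = (sigma_in + k_i_in) / (2 * m) - ((sigma_tot + k_i) / (2 * m)) ** 2
    before = sigma_in / (2 * m) - (sigma_tot / (2 * m)) ** 2 - (k_i / (2 * m)) ** 2
    return after - before


# Hypothetical toy graph: two triangles joined by the edge (2, 3).
toy_edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
toy_partition = {0: "A", 1: "A", 2: "A", 3: "B", 4: "B", 5: "B"}
q = modularity(toy_edges, toy_partition)
```

For this toy partition the sum works out to $Q = 5/14 \approx 0.357$, whereas placing all six nodes in a single community gives $Q = 0$, which is why a partition with dense communities and sparse links between them scores higher.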

2.2 eXtreme Gradient Boosting Machine Learning Model

XGBoost is a machine learning algorithm based on the gradient boosting framework. It builds a model by iteratively training a series of decision trees: each new weak learner is fitted to the negative gradient of the loss of the existing model (for a squared-error loss, the residual between the true value and the predicted value), so that the model is optimized step by step. In modern logistics systems, last-mile delivery is highly dynamic and user demand is heterogeneous, and traditional linear programming or simple statistical models struggle to capture high-dimensional, nonlinear feature relationships. Introducing the XGBoost machine learning framework into logistics scheduling can significantly improve the decision-making efficiency of the system under complex constraints [19].

XGBoost constructs a strong learner by iteratively training a series of weak learners. Its core idea is to adopt a forward stepwise algorithm and gradually approximate the true value by continuously fitting the negative gradient residuals generated by the model in the previous round. In this study, XGBoost is used to construct a classification prediction model. By analyzing historical delivery data and outlet characteristics, the optimal service mode (such as home delivery, smart locker self-pickup, and station collection) in a specific region is identified. This data-driven prediction method makes up for the deficiency of the “static mode assumption” in traditional path planning models, thereby realizing the precise matching between delivery resources and actual demand in a complex urban joint delivery system.

2.3 Particle Swarm Optimization Algorithm

The basic idea of the PSO algorithm is to simulate group behaviors in nature (such as bird flock foraging and fish schooling) and to search for the optimal solution through information sharing and cooperation among particles. PSO is widely applied in fields such as function optimization, machine learning, path planning, and image processing. The core of PSO lies in the update of particle velocity and position. The velocity update in Eq. (3) guides the search by combining the particle's current velocity and position, its own historical best position, and the best position found by the group.

$v_i^{t+1}=w \cdot v_i^t+c_1 \cdot r_1 \cdot\left(\mathrm{pbest}_i-x_i^t\right)+c_2 \cdot r_2 \cdot\left(\mathrm{gbest}-x_i^t\right)$

(3)

where $v_i^{t+1}$ represents the velocity of particle $i$ at generation $t+1$ and $x_i^t$ its position at generation $t$; $w$ is the inertia weight, which controls the influence of the current velocity on the new velocity; $\mathrm{pbest}_i$ is the best position found so far by particle $i$, and $\mathrm{gbest}$ is the best position found so far by the whole swarm; $c_1$ and $c_2$ are learning factors, which control the dependence of the particle on its own historical best position and on the global best position; $r_1$ and $r_2$ are two random numbers, usually generated in the range [0,1], which introduce a certain degree of randomness and thereby enhance the global exploration ability. The position update formula is shown in Eq. (4):

$x_i^{t+1}=x_i^t+v_i^{t+1}$

(4)
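The update rules in Eqs. (3) and (4) can be sketched in a few lines of Python. The parameter values ($w = 0.7$, $c_1 = c_2 = 1.5$), the swarm size, the search range, and the `sphere` test objective below are assumptions chosen for illustration only; they are not the settings used in this study.

```python
import random


def update_particle(x, v, pbest, gbest, w, c1, c2, r1, r2):
    """Apply Eq. (3) and then Eq. (4) to one coordinate of one particle."""
    v_new = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    x_new = x + v_new
    return x_new, v_new


def pso_minimize(objective, dim, n_particles=20, n_iter=50,
                 w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimal PSO loop; returns the best position and the best-value history."""
    rng = random.Random(seed)
    xs = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    pbest = [list(x) for x in xs]
    pbest_val = [objective(x) for x in xs]
    best = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = list(pbest[best]), pbest_val[best]
    history = [gbest_val]
    for _ in range(n_iter):
        for p in range(n_particles):
            for d in range(dim):
                xs[p][d], vs[p][d] = update_particle(
                    xs[p][d], vs[p][d], pbest[p][d], gbest[d],
                    w, c1, c2, rng.random(), rng.random())
            value = objective(xs[p])
            if value < pbest_val[p]:  # update the particle's own best
                pbest[p], pbest_val[p] = list(xs[p]), value
                if value < gbest_val:  # update the swarm's best
                    gbest, gbest_val = list(xs[p]), value
        history.append(gbest_val)
    return gbest, history


def sphere(x):
    """Hypothetical test objective: sum of squares, minimized at the origin."""
    return sum(xi * xi for xi in x)


best_position, best_history = pso_minimize(sphere, dim=2)
```

Because the global best is only replaced when a strictly better value is found, the recorded best value never increases from one iteration to the next. In a routing setting the continuous position would additionally have to be decoded into a visiting order, which this sketch does not attempt.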

Together, Eqs. (3) and (4) make the update of particle velocity and position explicit. Variants such as adaptive PSO, hybrid PSO [20], and multi-objective PSO have improved the global search ability and convergence speed of PSO.

3. Model Construction

3.1 Model Assumptions

(1) The express deliveries involved in this study are standard express parcels, and it is assumed that all express goods are homogeneous products.

(2) Each express parcel is delivered only once per day, and all parcels will be directly delivered to the corresponding last-mile joint delivery outlet or returned directly, without considering special situations such as re-delivery.

(3) The locations and demand volumes of last-mile joint delivery outlets are known, and the business volume of each outlet is relatively stable within a certain period, without large fluctuations.

(4) The efficiency of delivery vehicles arriving at last-mile joint delivery outlets is directly affected by factors such as travel distance and road conditions.

(5) The fixed cost, operating cost, and maximum express handling capacity of candidate locations for the collaborative sorting and joint delivery grassroots distribution centers are all known.

(6) Transportation cost and congestion cost are both linearly proportional to the vehicle travel distance.

(7) The starting point of all delivery tasks is the collaborative sorting and joint delivery grassroots distribution center, and delivery vehicles must return to the center after completing the tasks.

(8) The travel distance of vehicles must not exceed the specified maximum travel range.

(9) The carbon emissions of delivery vehicles are mainly determined by the travel mileage, and it is assumed that the carbon emission factor per unit mileage of all light logistics trucks is a constant.

3.2 Case Description

This study collected the longitude and latitude coordinates of the distribution of 86 delivery nodes from 4 companies in Xiqing District. ArcMap was used to generate the location map of the region, as shown in Figure 1, to visually display the geographic distribution and relative positional relationships of each delivery node.

Figure 1. Distribution map of all communities in Xiqing District and its surrounding areas

3.3 Method Framework Construction

This study constructs a three-level optimization framework of “region division—mode prediction—path planning,” as shown in Figure 2. First, a logistics network model is constructed based on complex network theory, where delivery nodes are abstracted as nodes and road relationships between nodes are abstracted as edges. Second, the Louvain algorithm is applied for community detection to realize intelligent region division. Then, the results are used as spatial constraints and input into the XGBoost model to predict the optimal service mode for each region. Finally, the parameters of the PSO path optimization model are adjusted according to the predicted modes to achieve full-chain optimization.

3.4 Complex Network Construction

In view of the particularity of express logistics in Xiqing District, this study introduces three transformation rules according to the characteristics of logistics distribution:

(1) Each residential area, commercial area, and other delivery regions in Xiqing District are transformed into nodes in the network;

(2) The logistics relationships between delivery regions are transformed into edges in the network;

(3) The weights of edges correspond to the difficulty of delivery between regions, such as distance and traffic factors.

Figure 2. Overall optimization framework flowchart

According to statistical data, the network model constructed in this study consists of 86 nodes and 134 edges, which reflects the road connection conditions among the 86 communities. To reflect the actual delivery difficulty, roads are divided into three types: main roads, secondary roads, and branch roads. The Analytic Hierarchy Process (AHP) and the entropy weight method are combined for weighting, and the Technique for Order Preference by Similarity to Ideal Solution (TOPSIS) method is then applied to determine the edge weights. The final weights of the three types of roads are shown in Table 1.

Table 1. Summary of road types in the network model

Road Type	Weight	Description
Type 1	0.2675	Main roads, good road conditions
Type 2	0.2760	Secondary roads, moderate congestion
Type 3	0.4565	Branch roads, high delivery difficulty

Based on the above rules, the network modeling and visualization are completed, providing a basis for subsequent Louvain community detection. Road conditions and routes from the Gaode Map are used to complete the complex network modeling, route assignment, and community node settings, giving a total of 134 rows (one per edge). The resulting community network model of the region is visualized in Figure 3.
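The transformation rules and the weights in Table 1 can be pictured as a small data structure. The sketch below assumes that every edge row carries exactly one road type and that the Table 1 weight is used directly as the edge weight; the row layout `(node_a, node_b, road_type)` and the sample rows are hypothetical, since the study's edge list is not reproduced here.

```python
# Edge weights per road type, taken from Table 1.
ROAD_WEIGHT = {1: 0.2675, 2: 0.2760, 3: 0.4565}


def build_weighted_network(edge_rows):
    """Turn rows of (node_a, node_b, road_type) into a weighted adjacency dict."""
    adjacency = {}
    for a, b, road_type in edge_rows:
        weight = ROAD_WEIGHT[road_type]
        adjacency.setdefault(a, {})[b] = weight
        adjacency.setdefault(b, {})[a] = weight  # roads are treated as undirected
    return adjacency


def weighted_degree(adjacency, node):
    """Sum of the weights of all edges incident to a node."""
    return sum(adjacency[node].values())


# Hypothetical rows; the study's edge list has 134 such rows for 86 nodes.
sample_rows = [(1, 2, 1), (2, 3, 3), (1, 3, 2)]
network = build_weighted_network(sample_rows)
```

A structure of this kind is what a weighted community detection routine would take as input, with the weighted degree playing the role of $k_i$ in Eq. (1).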

3.5 Region Division Model Based on Louvain Algorithm

The traditional Louvain algorithm is prone to falling into local optimal solutions during the community merging stage. This study introduces a modularity gain acceleration strategy. According to the above method, two-stage community detection is carried out on the complex network. The modularity in the first stage is 0.685, and the result is not ideal, so a second network reconstruction is performed. Finally, the modularity is improved to 0.789, and the optimization effect is significant. The final division results are shown in Figure 4.

At the same time, mainstream clustering algorithms are compared together, as shown in Table 2. The Louvain algorithm shows outstanding performance in handling large-scale networks due to its high efficiency and scalability, and it can discover multi-level community structures.

After the screening and evaluation of multiple clustering algorithms, the region division model for joint last-mile delivery of express logistics based on the improved Louvain algorithm successfully realizes the community division of the complex network, and reasonably divides the 86 nodes into 11 closely related delivery regions in the case area, achieving fast and efficient processing of the delivery process.

Figure 3. Network connection diagram

Figure 4. Modular aggregation scheme designed by the Louvain algorithm

Table 2. Aggregation index comparison of different algorithms

Algorithm	Modularity	Number of Communities	Coverage	Internal Density	Scalability	Average Path Length
Louvain	0.789	11	1.0	0.388	0.132	1.961
Label propagation algorithm	0.639	19	1.0	0.703	0.334	1.393
Local centrality-based community detection algorithm	0.689	7	1.0	0.277	0.093	2.279
Girvan-Newman	0.184	2	1.0	0.101	0.012	4.765
Walktrap	0.107	17	1.0	0.107	0.817	1.500
Fast Newman	0.113	13	1.0	0.086	0.763	5.231

3.6 Service Mode Prediction Model Based on XGBoost

3.6.1 Data collection

Based on the aforementioned algorithm principles and schemes, this study collected relevant information from communities near some A/C/S/J outlets and ordinary communities in Xiqing District, Tianjin, through field investigation and map data analysis. The selected variables cover seven indicators, including community main type, number of residential areas, number of buildings, community coverage area, population density level, age of residential areas, and number of nearby subway stations. At the same time, combined with publicly available data on fresh express delivery, the relevant data of 204 last-mile communities were organized. These data will be used as samples input into the XGBoost algorithm to train the final classifier model.

3.6.2 Model training and optimization

The model uses OneHotEncoder for one-hot encoding to process categorical features, and divides 70% of the data into the training set and 30% into the test set for model construction and evaluation.
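The two preprocessing steps can be illustrated with the standard library alone. The sketch below is a plain-Python stand-in for one-hot encoding and a 70/30 split, not the scikit-learn `OneHotEncoder` pipeline used in the study; the category codes, the seed, and the helper names `one_hot` and `split_train_test` are assumptions.

```python
import math
import random


def one_hot(value, categories):
    """Encode one categorical value as a 0/1 vector over the known categories."""
    return [1 if value == c else 0 for c in categories]


def split_train_test(samples, test_fraction=0.3, seed=42):
    """Shuffle a copy of the samples and split it into training and test sets."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_test = math.ceil(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical category codes for the "community main type" feature.
MAIN_TYPES = [1, 2, 3]
encoded = one_hot(2, MAIN_TYPES)

# 204 placeholder sample indices, the number of communities in Section 3.6.1.
train_set, test_set = split_train_test(range(204))
```

With 204 placeholder samples and the test size rounded up, this gives 142 training and 62 test samples; the exact split sizes used in the study are not stated.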

It can be seen from Figure 5 that the error rate performance of Random Forest, Adaptive Boosting (AdaBoost), and XGBoost models shows significant differences under different numbers of iterations. XGBoost (green curve) exhibits the lowest and relatively stable error rate at all iteration stages. Especially at 50 iterations, its error rate remains below 0.1, showing strong model robustness and high efficiency.

Based on the above analysis, it is recommended to give priority to XGBoost as the main model in practical applications. According to the test results, when the mfinal parameter is set to 11, the accuracy of the test set reaches 0.878, and the combined classifier model is finally obtained. The indicator data of each algorithm are shown in Table 3.

Table 3. Indicator data corresponding to algorithms

Model	Accuracy	Recall (0)	F1-score (0)	Recall (1)	F1-score (1)	Recall (2)	F1-score (2)
Random forest	0.829	0.93	0.90	0.87	0.79	0.67	0.80
Adaptive boosting	0.829	0.93	0.90	0.87	0.79	0.67	0.80
eXtreme Gradient Boosting	0.878	0.93	0.90	0.80	0.83	0.92	0.92
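The indicators in Table 3 follow the usual definitions of accuracy, per-class recall, and per-class F1-score. The sketch below computes them from two label lists; the labels are made up for illustration and are not the study's test set, and the function name `classification_report` is an assumption.

```python
def classification_report(y_true, y_pred):
    """Return accuracy plus per-class recall and F1-score."""
    labels = sorted(set(y_true) | set(y_pred))
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    report = {"accuracy": correct / len(y_true)}
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {"recall": recall, "f1": f1}
    return report


# Hypothetical labels, not the study's test set.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 2, 2, 2]
report = classification_report(y_true, y_pred)
```

In this toy case one class-1 sample is predicted as class 2, which lowers the recall of class 1 to 0.5 and the precision of class 2 to 2/3, so a per-class table exposes errors that the overall accuracy alone would hide.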

Compared with Random Forest and AdaBoost, XGBoost achieves the highest test accuracy in Table 3 (0.878 versus 0.829) and clearly better recall and F1-score on class 2 (0.92 and 0.92 versus 0.67 and 0.80), at the cost of a slightly lower recall on class 1 (0.80 versus 0.87).

3.6.3 Prediction performance of application model based on eXtreme Gradient Boosting

In this case, the data of the 11 communities are input into the previously constructed combined classifier model to predict the service mode of each community. As shown in Table 4, regions with predicted mode label 1 are regarded as large-scale logistics regions and adopt the exclusive operation strategy; regions with predicted mode label 2 are regarded as medium-scale logistics regions and adopt intelligent dynamic combined delivery; regions with predicted mode label 3 are regarded as small-scale logistics regions and adopt the smart locker strategy.

Figure 5. Classifier number iteration diagram

Note: AdaBoost = Adaptive Boosting; XGBoost = eXtreme Gradient Boosting.

Table 4. Logistics scale prediction results

Community Main Type	Number of Residential Areas	Number of Buildings	Community Coverage Area	Population Density Level	Age of Residential Areas	Number of Nearby Subway Stations	Predicted Mode Label
1	6	268	4.1	1	3	2	1
1	5	157	2.7	2	1	0	3
2	7	258	2.7	3	2	0	1
1	5	327	1.4	1	3	0	2
2	10	339	2.7	3	1	2	2
3	15	426	3.4	2	1	3	1
1	8	87	1.4	1	3	0	2
2	7	211	1.4	3	1	1	2
2	9	134	3.4	3	2	0	1
2	7	196	2	1	3	2	2
2	7	99	2	1	2	1	3

At the same time, the regions corresponding to specific community numbers are summarized, as shown in Table 5. Large-scale logistics regions (No. 1, 3, 6, 9) adopt the exclusive operation strategy for delivery, which can provide more professional and efficient services. Small-scale logistics regions (No. 2, 11) adopt the smart locker strategy, which is suitable for areas with small order volumes and improves delivery convenience. Medium-scale logistics regions (No. 4, 5, 7, 8, 10) adopt the intelligent dynamic combined delivery strategy, which can be flexibly adjusted according to real-time demand and can maximize delivery efficiency. The XGBoost model divides communities into three modes: exclusive operation, intelligent dynamic combination, and smart locker. In PSO path planning, a higher vehicle load rate threshold is set for exclusive operation regions, and the door-to-door time constraint is removed for smart locker regions, thereby achieving full-chain optimization adaptation.

Table 5. Text description of prediction results

Community Number	Community Logistics Scale Prediction Result
1	Large-scale logistics region (exclusive operation strategy)
2	Small-scale logistics region (smart locker strategy)
3	Large-scale logistics region (exclusive operation strategy)
4	Medium-scale logistics region (intelligent dynamic combined delivery)
5	Medium-scale logistics region (intelligent dynamic combined delivery)
6	Large-scale logistics region (exclusive operation strategy)
7	Medium-scale logistics region (intelligent dynamic combined delivery)
8	Medium-scale logistics region (intelligent dynamic combined delivery)
9	Large-scale logistics region (exclusive operation strategy)
10	Medium-scale logistics region (intelligent dynamic combined delivery)
11	Small-scale logistics region (smart locker strategy)
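The mapping from predicted labels to operation strategies in Tables 4 and 5 can be written down directly. The sketch below assumes that the rows of Table 4 are listed in community-number order (1 to 11), which is consistent with Table 5; the helper name `group_by_strategy` is illustrative.

```python
# Predicted mode labels for communities 1 to 11, read from Table 4.
PREDICTED_LABELS = [1, 3, 1, 2, 2, 1, 2, 2, 1, 2, 3]

# Label-to-strategy mapping as described in Section 3.6.3.
STRATEGY = {
    1: "exclusive operation",
    2: "intelligent dynamic combined delivery",
    3: "smart locker",
}


def group_by_strategy(labels):
    """Map each strategy name to the list of community numbers that use it."""
    groups = {name: [] for name in STRATEGY.values()}
    for number, label in enumerate(labels, start=1):
        groups[STRATEGY[label]].append(number)
    return groups


groups = group_by_strategy(PREDICTED_LABELS)
```

A grouping of this kind is a convenient input for the later stages, where the smart locker and combined delivery regions go to location selection and the exclusive and combined delivery regions go to path planning.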

3.7 Joint Delivery Location Selection and Path Planning Model

To realize resource optimization of last-mile joint delivery, this section first constructs a grassroots distribution center location selection model, and then conducts path planning based on the PSO algorithm. The location selection result is used as the delivery starting point of PSO to ensure full-chain connection.

3.7.1 Location selection model

Combined with the 11 communities divided by Louvain and the service modes predicted by XGBoost, this study uses the $k$-means clustering algorithm combined with the Haversine formula to optimize the location of smart lockers. The $k$-means algorithm groups the delivery points according to the candidate number of smart lockers (1 to 4), and the Haversine formula computes great-circle distances as an approximation of actual travel distances.
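The spherical distance computation described above can be sketched as follows. The Earth radius value, the locker coordinates, and the helper names `haversine_m` and `nearest_locker` are assumptions for illustration; only the two node coordinates are taken from Table 6, and the sketch covers the distance and assignment step rather than the full $k$-means iteration.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters (assumed value)


def haversine_m(lon1, lat1, lon2, lat2):
    """Great-circle distance in meters between two (longitude, latitude) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (math.sin(d_phi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def nearest_locker(point, lockers):
    """Return (index, distance) of the locker closest to a delivery point."""
    distances = [haversine_m(*point, *locker) for locker in lockers]
    index = min(range(len(lockers)), key=distances.__getitem__)
    return index, distances[index]


# Two node coordinates from Table 6; the locker positions are hypothetical.
node_48 = (117.0300, 39.1419)
node_54 = (117.0177, 39.1355)
lockers = [(117.0300, 39.1400), (117.0900, 39.1000)]
index, distance = nearest_locker(node_48, lockers)
```

Assigning every delivery point to its nearest locker and then moving each locker to the centroid of its points, repeated until the assignment stops changing, gives the $k$-means loop; great-circle distance ignores the road layout, so it is a lower bound on actual travel distance.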

Mathematical Model:

$\operatorname{Min} C=\sum_j F_j W_j+\sum_j V_j W_j+\sum_j P_j X_j W_j+\sum_j \sum_k c_{j k} D_k Z_{j k} d_{j k}+\sum_j \sum_k s_{j k} D_k Z_{j k} d_{j k}$

(5)

$\sum_{j \in J} X_{j k}=D_k \quad(\forall k \in K)$

(6)

$\sum_{k \in K} D_k \leq \sum_{j \in J} W_j C_j$

(7)

$Z_{j k} \leq W_j \quad(\forall j \in J, \forall k \in K)$

(8)

$\sum_{j \in J} W_j=1$

(9)

$\sum_{j \in J} Z_{j k}=1 \quad(\forall k \in K)$

(10)

where, Eq. (5) represents an objective function, and its purpose is to minimize the total cost. Eq. (6) indicates that the business demand of all last-mile joint delivery outlets is satisfied. Eq. (7) indicates that the maximum processing capacity of the selected candidate points for the collaborative sorting and joint delivery grassroots distribution center can meet the total demand of the last-mile joint delivery outlets. Eq. (8) ensures that only when the candidate point of the collaborative sorting and joint delivery grassroots distribution center is selected, all express parcels of the last-mile joint delivery outlets can be assigned to the distribution center for processing. Eq. (9) indicates that only one collaborative sorting and joint delivery grassroots distribution center needs to be selected. Eq. (10) indicates that each last-mile joint delivery outlet must be served and can only be served by one collaborative sorting and joint delivery grassroots distribution center.
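Because Eq. (9) allows only one distribution center, the model can in principle be solved by evaluating each candidate in turn. The sketch below does this under loud assumptions: $F_j$ and $V_j$ are read as fixed and operating costs and $C_j$ as capacity (following assumption (5) in Section 3.1), scalar unit costs stand in for $c_{jk}$ and $s_{jk}$, the term $P_j X_j W_j$ is omitted because its symbols are not defined in the text, and all names and numbers are hypothetical.

```python
import math


def select_center(candidates, demand, distance, unit_cost, congestion_cost):
    """Pick the single candidate with the lowest total cost (sum of W_j = 1).

    candidates: dict name -> {"fixed": F_j, "operating": V_j, "capacity": C_j}
    demand: dict outlet -> D_k
    distance: dict (name, outlet) -> d_jk
    unit_cost, congestion_cost: scalars standing in for c_jk and s_jk
    """
    total_demand = sum(demand.values())
    best_name, best_cost = None, math.inf
    for name, info in candidates.items():
        if total_demand > info["capacity"]:  # capacity constraint, Eq. (7)
            continue
        transport = sum(
            (unit_cost + congestion_cost) * demand[k] * distance[name, k]
            for k in demand)
        cost = info["fixed"] + info["operating"] + transport
        if cost < best_cost:
            best_name, best_cost = name, cost
    return best_name, best_cost


# Hypothetical candidates and outlets, not the study's data.
candidates = {
    "A": {"fixed": 100.0, "operating": 50.0, "capacity": 500},
    "B": {"fixed": 80.0, "operating": 40.0, "capacity": 200},
}
demand = {"k1": 120, "k2": 150}
distance = {("A", "k1"): 2.0, ("A", "k2"): 3.0,
            ("B", "k1"): 1.0, ("B", "k2"): 1.0}
chosen, chosen_cost = select_center(candidates, demand, distance, 0.5, 0.1)
```

In this toy instance candidate B is cheaper and closer but cannot handle the total demand of 270 parcels, so the capacity constraint forces the choice of A.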

The location selection model is solved for regions G2, G4, G5, G7, G8, G10, and G11, which yields the following results.

According to Figure 6 and Table 6, the placement under different numbers of smart lockers in the G2 region shows that selecting 2 smart lockers is a reasonable decision. The selection distribution diagrams of other regions are not shown additionally. The G4 region is configured with 2 smart lockers to cover most buildings; the G5 region is configured with 3 smart lockers to cover most buildings; the G7 region is configured with 2 smart lockers to improve service coverage; the G8 region is configured with 3 smart lockers to meet the needs of most buildings; the G10 region is recommended to be configured with 2 smart lockers to achieve moderate coverage; the G11 region is configured with 3 smart lockers to ensure extensive service. Such configurations aim to effectively balance service coverage and cost, improve the efficiency and economy of express delivery services, and meet the needs of buildings within the regions.

Figure 6. Location selection study of G2 region

Table 6. Haversine distance of G2 region under given location points

Index	No.	Longitude	Latitude	Distance to Locker1	Distance to Locker2	Distance to Locker3	Distance to Locker4
0	2	117.0601	39.1021	1445.388	2289.447	0	0
1	9	117.0993	39.0978	5567.843	2110.434	1052.048	0
2	29	117.0825	39.1171	3586.371	616.541	1051.98	0
3	48	117.0300	39.1419	2498.775	551.508	551.508	551.50
4	54	117.0177	39.1355	3700.108	1109.043	1109.043	1109.043
5	55	117.0323	39.1275	2028.524	581.407	581.407	581.407
6	57	117.0303	39.1245	2231.502	501.522	501.522	501.522

3.7.2 Path planning model

This path optimization model aims to optimize delivery routes and reduce transportation cost and time consumption by calculating the distance matrix between delivery points and combining it with optimization algorithms.

Mathematical Model:

$\operatorname{Minimize} C=\sum_{j \in J} \sum_{k \in K}\left(c_{j k} \cdot z_{j k} \cdot d_{j k}+s_{j k} \cdot z_{j k} \cdot d_{j k}+m_{j k} \cdot z_{j k}\right)$

(11)

where $c_{j k}$ is the unit transportation cost from the integrated sorting and distribution hub ($j$) to the last-mile joint delivery outlet ($k$) (yuan/(parcel·km)); $s_{j k}$ is the congestion cost coefficient (its value refers to the peak and off-peak time coefficients of working days in Xiqing District and lies in the range [1.0, 1.5]); $m_{j k}$ is the fixed cost coefficient of vehicle operation (unit: yuan), which reflects vehicle depreciation, maintenance, and dispatching costs.

$\sum_{j \in J} z_{j k}=k D_k \quad \forall k \in K$

(12)

$\sum_{k \in K} z_{j k} \leq C_j \quad \forall j \in J$

(13)

$z_{j k} \leq W_j \cdot k D_k \quad \forall j \in J, \forall k \in K$

(14)

$z_{j k} \geq 0 \quad \forall j \in J, \forall k \in K$

(15)

$\sum_{j \in J} W_j=1$

(16)

$\sum_{k \in K} z_{j k} \geq \alpha \cdot C_j \quad \forall j \in J$

(17)

Eq. (11) is the objective function, which minimizes the total cost of the delivery routes, comprising transportation cost, congestion cost, and the fixed cost of vehicle operation. Eq. (12) ensures that the demand of each last-mile outlet ($k$) is satisfied by the delivery quantities from the distribution hubs ($j$). Eq. (13) ensures that the total quantity handled by each distribution hub does not exceed its maximum processing capacity $C_j$. Eq. (14) ensures that only the selected distribution center can be allocated express parcels for processing. Eq. (15) ensures that all delivery quantities are non-negative, which conforms to practical operation requirements. Eq. (16) requires the selection variables $W_j$ of all distribution hubs to sum to 1, so that, as in Eq. (9), a single distribution center is selected and each outlet is served by that center only. Finally, to further reduce empty driving and ineffective transportation, Eq. (17) requires the quantity handled by each distribution hub to reach at least a proportion $\alpha$ of its capacity, so that vehicles operate as close to full load as possible. These constraints jointly ensure the efficient operation and cost control of the delivery network.

3.7.3 Carbon emission accounting indicator design

In view of the high energy consumption problem caused by the traditional single objective of cost minimization, this study introduces carbon emission accounting indicators after the completion of path planning to evaluate the green environmental benefits of joint delivery. The calculation formula of total carbon emissions is as follows:

$E=\sum_{j \in J} \sum_{k \in K}\left(d_{j k} \cdot z_{j k} \cdot e_f\right)$

(18)

In Eq. (18), $E$ is the total carbon emissions generated by a single delivery process (kg CO$_2$); $d_{j k}$ is the travel distance between nodes; $e_f$ is the carbon emission factor per unit mileage of standard light logistics vehicles. Referring to relevant carbon emission standards, this study takes $e_f= 0.24 \mathrm{~kg} \mathrm{CO}_2 / \mathrm{km}$. This indicator, together with transportation cost, constitutes a comprehensive evaluation system for assessing the performance of the delivery model.
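As a minimal sketch of Eq. (18), the double sum can be written directly in Python. The function name `total_emissions` and the toy distances are hypothetical. The sketch also assumes that $z_{j k}$ is read as the number of trips (or a 0/1 usage indicator) on the arc from $j$ to $k$, so that the result comes out in kg CO$_2$; this reading is an assumption, not something the text states.

```python
E_F = 0.24  # kg CO2 per km, the emission factor adopted in the text


def total_emissions(d, z, e_f=E_F):
    """Eq. (18): sum over hubs j and outlets k of d[j][k] * z[j][k] * e_f."""
    return sum(
        d_jk * z_jk * e_f
        for d_row, z_row in zip(d, z)
        for d_jk, z_jk in zip(d_row, z_row)
    )


# Toy data (invented): one hub, two outlets at 2 km and 5 km, one trip each
d = [[2.0, 5.0]]
z = [[1, 1]]
print(round(total_emissions(d, z), 2))  # 7 km travelled in total
```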

The PSO algorithm is used to solve this model. The path optimization regions are G1, G3, G4, G5, G6, G7, G8, G9, and G10. The model analysis process and results are presented below; the coordinates of the distribution centers are obtained with the calculation method of the location selection model.

Figure 7 shows the geographic layout of a regional delivery network constructed using the AnyLogic model, clearly indicating the location information of key delivery points. Next, path optimization will be carried out for each divided community set, and a three-day continuous simulation experiment will be conducted.

In each iteration, particles approach the global optimum through the velocity update mechanism and finally form an optimal path. The nodes on each path are the position update results of particles over successive iterations, reflecting the process of searching for the optimal solution. The optimal paths of regions G1, G3, G6, and G9 are shown in Figure 8.
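The velocity and position update described above can be illustrated with a basic global-best PSO. This is a generic sketch under stated assumptions, not the route-planning model of this study: it minimizes a continuous test function rather than a delivery route, and the inertia weight `w`, the acceleration coefficients `c1` and `c2`, the search box, and the function names are all illustrative choices.

```python
import random


def pso_minimize(f, dim, n_particles=20, iters=50, w=0.7, c1=1.5, c2=1.5,
                 lo=-5.0, hi=5.0, seed=0):
    """Minimise f over the box [lo, hi]^dim with a basic global-best PSO."""
    rng = random.Random(seed)
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]            # personal best positions
    pbest_val = [f(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    history = [gbest_val]                  # global best value per iteration
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull to personal and global best
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Position update, clipped to the search box
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = f(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


def sphere(x):
    """Toy objective with its minimum at the origin."""
    return sum(v * v for v in x)


best_x, best_val, history = pso_minimize(sphere, dim=2)
print(best_x, best_val)
```

Because the global best is only replaced by a strictly better value, `history` never increases, which mirrors the convergence behaviour described in the text. Applying the same loop to routing would additionally require an encoding from particle positions to visit orders, which is not specified here.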

Figure 7. Regional network based on AnyLogic modeling

Based on the AnyLogic simulation results, the three-level optimization framework significantly improves the performance of the last-mile delivery system in Xiqing District, as shown in Table 7. Taking the G1 region (Zhongbei Town and surrounding communities) as an example, the total length of the optimized joint delivery path falls from 145.6 km under the independent delivery mode to 108.7 km. Substituting this 36.9 km saving into the aforementioned carbon emission accounting formula shows that a single delivery in the G1 region alone reduces carbon dioxide emissions by approximately 8.86 kg. Both the cost per delivery route and the carbon footprint are significantly reduced.
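The G1 figures can be reproduced with a few lines of arithmetic. This sketch assumes that the saving is simply the path-length difference multiplied by the emission factor $e_f$ (one vehicle pass over the route); the variable names are illustrative.

```python
E_F = 0.24              # kg CO2 per km, the emission factor adopted in the text
independent_km = 145.6  # G1 path length under independent delivery
joint_km = 108.7        # G1 path length under optimized joint delivery

saved_km = independent_km - joint_km
path_reduction_pct = 100 * saved_km / independent_km
saved_co2_kg = saved_km * E_F

print(f"{saved_km:.1f} km, {path_reduction_pct:.2f}%, {saved_co2_kg:.2f} kg CO2")
# prints: 36.9 km, 25.34%, 8.86 kg CO2
```

Both results agree with the G1 row of Table 7 (a path length optimization rate of 25.34% and an estimated carbon reduction of 8.86).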

Figure 8. Optimal delivery paths under Particle Swarm Optimization (PSO) algorithm in G1, G3, G6, and G9 regions

Table 7. Comparative analysis of simulation data for four typical regions G1, G3, G6, and G9

Region	Path Length Optimization Rate	Time Window Satisfaction Improvement	Average Vehicle Load Rate	Estimated Carbon Reduction (kg CO$_2$ per delivery)
G1	25.34%	12.5%	88.6%	8.86
G3	22.10%	9.8%	85.2%	7.42
G6	28.45%	15.2%	91.3%	11.20
G9	21.80%	11.4%	84.7%	6.55

3.8 Comparative Analysis of Joint Delivery Modes

A comparative experiment on delivery modes is carried out against the baseline express logistics setting, after incorporating the information of the companies involved. The delivery schemes of four companies are first run independently, and the Vehicle Routing Problem (VRP) routes and costs of each company are calculated with the above mathematical models. In these independent delivery schemes, each company plans its vehicle delivery routes according to its own business needs and customer distribution, and reduces its own transportation cost through optimization algorithms. Each company delivers goods from its warehouse to multiple customer nodes without sharing resources with other companies. The resulting optimization schemes under independent delivery and under joint delivery based on region division and the Pickup and Delivery Vehicle Routing Problem (PO-VRP) model are shown in Figure 9 and Figure 10, respectively.

Figure 9. Community express optimization schemes under independent delivery

Figure 10. Community express optimization schemes under joint delivery modes

As shown in Table 8, compared with independent delivery, joint delivery based on region division and the PO-VRP model reduces the vehicle routing cost by 13.19%, the overall delivery cost by 25.32%, and the allocation cost by 19.73%. Therefore, adopting a joint delivery strategy for common customers is an effective way for logistics enterprises to reduce delivery costs.

Table 8. Cost analysis under different delivery modes

Delivery Mode	C1	C2	C3	C4	TC	E (kg CO$_2$)
A Independent delivery	300	504.03	12.83	0	816.86	58.2
S Independent delivery	300	532.32	13.54	0	845.87	61.3
J Independent delivery	200	448.65	11.42	0	660.07	47.3
C Independent delivery	300	571.97	14.58	0	886.56	64.7
Total Independent delivery	1100	2056.97	52.38	0	3209.36	231.5
Joint delivery based on region division and pickup and delivery vehicle routing problem model	1200	565.39	14.36	700.05	3079.80	173.6

Note: A = A Company; S = S Company; J = J Company; C = C Company; TC = Total Cost; E = Carbon Emissions.
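The carbon emission column of Table 8 can be cross-checked with a short calculation. The sketch below only re-uses the E values listed in the table; the variable names are illustrative.

```python
# Carbon emissions E from Table 8, by delivery mode
independent_e = {"A": 58.2, "S": 61.3, "J": 47.3, "C": 64.7}
joint_e = 173.6

total_independent_e = sum(independent_e.values())
reduction_pct = 100 * (total_independent_e - joint_e) / total_independent_e

print(f"{total_independent_e:.1f} -> {joint_e:.1f}: {reduction_pct:.1f}% lower")
# prints: 231.5 -> 173.6: 25.0% lower
```

The four company values add up to the 231.5 reported for total independent delivery, and the relative reduction of about 25% matches the carbon emission reduction stated in the abstract.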

3.9 Sensitivity Analysis

It can be seen from Figure 11 that with the increase of fixed vehicle cost, the total delivery cost of both strategies shows an upward trend. However, the joint delivery mode represented by the blue line always maintains a lower total delivery cost and is more economical compared with the independent delivery mode represented by the black line. In addition, as the fixed vehicle cost increases, the gap between the two costs gradually widens, which indicates that when the fixed cost is high, the advantage of the cost-sharing strategy is greater.

As shown in Figure 12, as demand increases, the joint delivery mode shows higher cost stability and a flatter cost growth curve. In scenarios with large demand fluctuations, the joint delivery strategy has clear advantages in cost saving and resource allocation. The sensitivity analysis of demand fluctuation indicates that, when facing uncertain market demand, the joint delivery mode has stronger adaptability and cost control ability.

Figure 11. Sensitivity analysis trend chart based on fixed cost of delivery entities

Figure 12. Sensitivity analysis trend chart of delivery cost under demand fluctuation

4. Conclusion

The three-level optimization framework proposed in this study verifies, in theory, the collaborative potential of complex networks and machine learning. In addition, the delivery modes predicted by the XGBoost model for communities of different scales (an exclusive operation strategy for large-scale regions, intelligent dynamic combined delivery for medium-scale regions, and a smart locker strategy for small-scale regions), together with the sensitivity analysis, provide operational guidance for the express logistics industry.

First, for old communities with large demand fluctuations, logistics enterprises should give priority to adopting the intelligent dynamic combined delivery strategy. This can flexibly adjust resource allocation, for example, increasing shared vehicles and temporary stations during peak periods, and avoiding resource idleness or overload under the traditional exclusive operation mode. In this way, enterprises can reduce the risk of delivery delay and improve customer satisfaction without large-scale infrastructure investment.

Second, in an environment with high fixed costs, logistics enterprise managers should actively promote joint delivery cooperation among multiple enterprises. By sharing grassroots sorting centers and optimizing routes, enterprises can share vehicle maintenance and operation costs and achieve cost reductions of more than 25%. It is recommended that enterprises establish cross-company data sharing platforms and use XGBoost prediction tools to dynamically adjust modes in real time to cope with market uncertainty.

In addition, this study incorporates carbon emission reduction indicators into the evaluation of delivery optimization in this region, demonstrating that the joint delivery mode can not only achieve corporate business benefits (cost reduction), but also create significant ecological environmental value (carbon reduction). Future research can further introduce a dynamic carbon tax mechanism to provide more comprehensive quantitative decision support for low-carbon city construction.

Finally, government policy makers can promote the implementation of last-mile joint delivery through subsidy mechanisms. For example, financial support can be provided for the construction and maintenance of grassroots sorting centers to encourage resource integration among enterprises. This can not only alleviate urban traffic congestion, but also improve rural coverage and promote sustainable logistics development. Overall, these insights emphasize the transformation from data-driven to collaborative optimization, helping the industry cope with the challenges brought by the growth of e-commerce.

Author Contributions

Conceptualization, L.L.; methodology, L.L. and P.Z.; software, P.Z. and J.X.; validation, P.Z. and J.X.; formal analysis, P.Z. and J.X.; investigation, J.X.; resources, L.L.; data curation, P.Z. and J.X.; writing—original draft preparation, J.X.; writing—review and editing, L.L. and P.Z.; visualization, P.Z.; supervision, L.L.; project administration, L.L. All authors have read and agreed to the published version of the manuscript.

Data Availability

The data used to support the research findings are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflict of interest.



©2026 by the author(s). Published by Acadlore Publishing Services Limited, Hong Kong. This article is available for free download and can be reused and cited, provided that the original published version is credited, under the CC BY 4.0 license.
