Optimizing Time-Based Heuristics for Resilient VMI Replenishment: A Simulation-Optimization Approach


Open Access

Research article


Jamal Musbah¹, Ibrahim Badi¹*, Mouhamed Bayane Bouraima²

¹ School of Engineering and Applied Sciences, Libyan Academy, 2449 Misrata, Libya

² School of Civil Engineering, Southwest Jiao Tong University, 610031 Chengdu, China

Journal of Industrial Intelligence | Volume 3, Issue 2, 2025 | Pages 105-124

https://doi.org/10.56578/jii030204

Received: 04-27-2025, Revised: 06-09-2025, Accepted: 06-16-2025, Available online: 06-24-2025


Abstract:

In critical supply chains like pharmaceuticals, balancing operational cost with service resilience is paramount. While complex adaptive models dominate academic literature on inventory routing, the potential of simpler, managerially intuitive heuristics remains underexplored, creating a gap between theory and practice. This study investigates whether a rigorously optimized, simple time-based heuristic can achieve superior performance and robustness compared to a state-of-the-art, multi-parameter adaptive policy within a stochastic Vendor-Managed Inventory (VMI) system. We formalize a time-to-stockout rule into a novel, single-parameter metaheuristic called the Optimized Urgency Threshold (OUT) policy. Using a simulation-optimization framework powered by a Genetic Algorithm, we benchmarked the OUT policy against a non-optimized heuristic and a complex Dynamic Inertial policy across five problem instances subjected to environmental shocks. The OUT policy demonstrated superior performance, achieving the lowest average total cost (€ 58,595.46) and reducing stockouts by 66.3% compared to the Dynamic Inertial model. Sensitivity analysis confirmed the OUT policy's balanced robustness to demand and capacity shocks, whereas the complex policy exhibited service failures under demand surges. Our findings show that a parsimonious, optimized heuristic can outperform a complex adaptive model, challenging the assumption that parametric complexity is necessary for high performance in stochastic IRPs. The OUT policy provides a transparent, effective, and easily implementable solution for enhancing supply chain resilience and mitigating stockouts.

Keywords: Vendor-Managed Inventory, Inventory Routing Problem, Simulation Optimization, Metaheuristic, Supply Chain Resilience, Pharmaceutical Logistics

1. Introduction

The fundamental challenge in modern supply chain management lies in navigating the trade-off between economic efficiency and service level resilience, a dichotomy that is particularly acute in pharmaceutical distribution where stockouts can directly impact patient outcomes [1], [2]. The COVID-19 pandemic exposed systemic vulnerabilities, with global medicine shortages becoming a critical public health issue, prompting calls from organizations like the WHO for more resilient and responsive supply chains [3]. In this context, collaborative strategies such as Vendor-Managed Inventory (VMI) have become essential for mitigating the bullwhip effect and enhancing operational efficiency by centralizing replenishment decisions [4], [5], [6]. This centralization is operationalized through the Inventory Routing Problem (IRP), a complex optimization task that seeks to minimize total system costs while ensuring product availability. In the pharmaceutical sector, the imperative to prevent stockouts elevates the service level from a simple performance metric to a critical component of public health infrastructure [7], [8].

To address the stochastic IRP, academic literature has primarily developed along two distinct streams: the development of sophisticated, multi-parameter adaptive policies, and the application of simpler, more intuitive heuristics. This bifurcation forms the basis of the comparison in this study. The first stream focuses on creating complex, state-dependent policies that dynamically adjust their parameters based on real-time system information, such as the inertial models that use smoothed signals to filter demand noise and have been shown to be effective in various dynamic control contexts [9], [10], [11]. The second stream employs practical, often time-based heuristics, such as rules based on a pharmacy’s time-to-stockout, which are computationally simpler and more managerially intuitive [12].

Despite the richness of these two streams, neither fully addresses the practical trade-offs faced in pharmaceutical VMI systems. The first stream reflects the prevailing paradigm in advanced IRP policy design: multi-parameter adaptive models optimized via computationally intensive methods. However, this very complexity can make their logic opaque to practitioners and hinder implementation and interpretation, a significant barrier to real-world adoption [13], [14], [15]. Conversely, simple time-based heuristics are rarely subjected to rigorous optimization in the academic literature; their core parameters are often set by arbitrary rules, which leaves their true potential unknown and prevents a fair comparison against optimized benchmarks [16]. This leads to a critical and unresolved question in the field.

It remains an open question whether the added complexity of state-of-the-art adaptive models is truly necessary, or whether a simple-yet-optimized heuristic can achieve superior performance. This study addresses this gap by asking: what is the true potential of a time-based policy when it is subjected to the same rigorous optimization as its more complex counterparts? To answer this question, we formalize and rigorously optimize a simple time-based heuristic. We introduce the Optimized Urgency Threshold (OUT) policy, a novel metaheuristic that transforms a simple time-based rule into a fully optimizable model with a single parameter. Through a comprehensive simulation-optimization study, we benchmark the OUT policy against both a non-optimized heuristic and a highly complex Dynamic Inertial policy. We demonstrate that this simple-yet-optimized model achieves a superior balance of cost and service level, challenging the prevailing assumption that parametric complexity is a prerequisite for high performance in stochastic VMI systems.

2. Literature Review

Research on the IRP aims to integrate two of the most critical functions in logistics management: inventory control and vehicle routing. The field is extensive, and comprehensive reviews by Mosca et al. [17] and, more recently, various researchers provide a thorough classification of IRP variants and solution methodologies [18], [19], [20]. In this section, we first review the evolution of replenishment policies within the IRP literature, highlighting the established strengths and weaknesses of the dominant adaptive control paradigm. We then critically examine the role of simpler, time-based heuristics, identifying a significant gap between their practical relevance and their formal optimization in academic research. Finally, we establish simulation-optimization as the state-of-the-art methodology for the fair and robust comparison of inventory policies, thereby positioning our contribution.

2.1 The State-of-the-Art in Adaptive Inventory Control

To overcome the limitations of rigid, static (s,S) policies in stochastic environments [21], [22], a significant stream of research has focused on dynamic and adaptive policies. These models adjust their replenishment decisions based on real-time system state information. The methodologies range from formal dynamic programming approaches to sophisticated adaptive metaheuristics. For example, Johnn et al. [23] developed an adaptive large neighborhood search (ALNS) for the IRP that modifies its own search operators based on their past success, demonstrating the principle of responsive adaptation. Building on this, recent studies have extended ALNS-based frameworks to address increasingly realistic and uncertain environments. One approach integrates forecasting models with ALNS to anticipate demand and minimize container overflow and route failures, showing superior robustness when tested on real-world data [24]. Another line of work combines chance-constrained programming with adaptive local and large neighborhood search heuristics to generate cyclic delivery schedules under non-stationary and correlated demand, achieving near-optimal results for large-scale IRP instances [25]. Similarly, Alarcon Ortega et al. [10] modeled intra-day stochastic demand using a finite-horizon dynamic program with iterative lookahead and ALNS, reporting more than 20% cost savings compared to traditional per-period planning. Extending this direction, Cuellar-Usaquén et al. [11] proposed a stochastic lookahead approach integrating purchasing, inventory, and routing decisions under uncertain demand, prices, and supply volumes, employing adaptive learning to approximate routing costs efficiently.

Collectively, these contributions underscore the growing reliance on adaptive metaheuristics, stochastic programming, and dynamic policies to capture real-world uncertainty in inventory routing. Parallel to this, exploratory work with Deep Reinforcement Learning (DRL) has emerged, where agents learn replenishment strategies directly from simulated environments [23]. While DRL holds promise, adaptive heuristics such as ALNS remain more interpretable, computationally efficient, and closer to industrial practice, making them appropriate benchmarks for simulation-based evaluations.

A notable feature of adaptive policies is their ability to respond to system-wide conditions, such as aggregate inventory levels or backorders across the network. The Dynamic Inertial policy, employed in this study as a benchmark, exemplifies this principle by using an exponentially smoothed signal of system-wide demand urgency to balance responsiveness with robustness to daily fluctuations. This reflects the prevailing paradigm in advanced IRP policy design: multi-parameter adaptive models optimized via computationally intensive methods. For instance, one study integrated VMI with a Consignment Stock policy under a Robust Stochastic Optimization framework with Conditional Value-at-Risk (CVaR), achieving a 14.8% cost reduction in a healthcare supply chain while highlighting the parametric complexity of such adaptive approaches [26]. However, despite their effectiveness, these models face criticism for their parametric complexity, which can hinder implementation and interpretation in practice while demanding significant computational investment [27].

2.2 The Role of Heuristics and Time-Based Metrics in Logistics

Parallel to the development of complex adaptive models, a more practical stream of research and industry application has focused on simpler, more intuitive heuristics. A common and managerially resonant metric is the concept of days of supply or its inverse, the time-to-stockout (TTS). This metric translates a raw inventory quantity into a more actionable piece of information: time. While foundational IRP literature focused primarily on inventory levels [28], the use of time-based metrics is implicit in many practical systems and has been recognized as a key performance indicator in healthcare contexts [29]. This is because TTS serves as a powerful, forward-looking indicator of risk. Indeed, recent work has focused on developing advance stockout risk estimation systems for inventory control, where the time until the next stockout is a primary output [30].

However, in the academic literature, replenishment rules based on these metrics are often presented as fixed, non-optimized heuristics. For example, a system might be designed to simply serve the N most urgent customers each day based on their TTS, a logic we implement in our VMI Urgency Heuristic benchmark. This concept of using an urgency or criticality function to create a real-time replenishment sequence has proven highly effective in practice. For instance, Cao et al. [31] developed a priority-based policy for central fill pharmacy systems where a criticality function, incorporating inventory levels and consumption rates (the core components of TTS), was able to prevent over 90% of inventory shortages. Similarly, studies on emergency shipments often trigger replenishment when the inventory level falls below a certain threshold within a review period, which is conceptually a time-based trigger [32].

While the logic of these heuristic and time-based rules is sound and service-oriented, their academic treatment often leaves their core parameters (e.g., the number of customers to visit, the criticality threshold) to be set by arbitrary means or simple rules. Their performance against rigorously optimized policies is therefore rarely evaluated on a level playing field, as their full potential remains untapped. Mesquita and Tomotani [33] made progress in this regard, but broader comparative analyses remain scarce. For instance, Askin and Xia [34] developed hybrid heuristics for the infinite-horizon IRP, primarily focusing on routing tours and visit frequencies, yet without directly contrasting such heuristics with adaptive, multi-parameter policies. Our study addresses this gap by rigorously benchmarking an optimized time-based heuristic (the OUT policy) against a state-of-the-art adaptive model (Dynamic Inertial policy) under identical stochastic conditions. This direct comparison demonstrates that optimized simplicity can outperform parametric complexity, providing a novel contribution to IRP policy design.

2.3 Simulation-Optimization for Fair Policy Comparison

Given the NP-hard nature of the IRP and the added complexity of stochastic demand, analytical solutions are intractable for problems of realistic scale [35], [36]. Consequently, simulation-optimization has become the gold standard for designing and evaluating inventory policies [37], [38], due to its inherent capability to model complex stochastic systems and evaluate policy performance under uncertainty. Within this framework, metaheuristics such as Genetic Algorithms (GAs) are widely and successfully used [39], [40]. GAs are particularly favored because they do not require gradient information, which is often unavailable in complex simulations, and can effectively explore large, non-convex solution spaces.

Crucially, this approach allows for a fair and robust comparison of different policy architectures. As argued by Sörensen [41], for a comparison to be scientifically valid, each heuristic or policy must be tuned to its highest potential. Simply comparing a new, highly tuned algorithm to a poorly parameterized benchmark is a common methodological flaw. By using a GA to optimize the parameters of all tunable policies under investigation (both the complex adaptive model and our proposed simple time-based model) and benchmarking these optimized policies against the fixed, non-optimized heuristic, which serves as a practical baseline, we adhere to this rigorous standard.

2.4 Research Gap and Contribution

The literature reveals a clear and compelling research gap at the intersection of these themes. While the field has produced increasingly sophisticated adaptive policies, their high parametric complexity often creates a barrier to practical implementation [27]. In parallel, simple and intuitive time-based heuristics have demonstrated strong performance in practical settings [31], but are seldom subjected to rigorous, comparative optimization within academic literature. Their true potential relative to state-of-the-art adaptive models therefore remains an open and critical question.

This study directly addresses this gap. While some studies have explored heuristic tuning or comparison of inventory policies (e.g., [16], [42]) or VMI systems (e.g., [43], [44]), none, to our knowledge, has both (i) formalized a time-based heuristic into a single-parameter, rigorously optimized policy (the OUT policy) and (ii) directly benchmarked its performance against a state-of-the-art multi-parameter adaptive model under identical stochastic conditions. This combination of formalization, parsimony, and direct comparative benchmarking, which challenges the complexity paradigm, is the core of this paper's novelty. Table 1 summarizes key studies that highlight this gap and situate our research within the broader literature.

Table 1. Comparative summary of relevant literature (2018–2024) and this study’s contribution

Author(s) and Year	Core Focus	Methodology	Contribution to This Study’s Foundation
[35]	IRP state-of-the-art	Literature synthesis	Establishes the broad academic context and importance of advanced IRP models, including dynamic and stochastic variants.
[10]	Stochastic IRP with intra-day demand	Stochastic lookahead & ALNS	Represents the state-of-the-art in complex, adaptive solution methods that the dynamic inertial policy embodies.
[31]	Pharmacy replenishment	Simulation & priority heuristic	Provides empirical evidence for the real-world effectiveness of urgency-based (time-sensitive) heuristics in a pharmaceutical setting.
[14]	Multi-product IRP solution	Modified adaptive GA	Validates the use of GAs as a state-of-the-art method for optimizing policies in complex IRPs, justifying our methodological choice.
[45]	VMI for resilient supply chains	Robust stochastic optimization	Highlights the growing importance of service level and robustness, motivating the need for better-performing policies.
This study	Optimizing time-based heuristics	Simulation-optimization (GA)	Introduces the first formalization of a time-based heuristic as an optimizable policy and proves its superiority over complex adaptive models.

3. Modeling Framework and Policy Formulation

This section details the formal mathematical model of the multi-period IRP that underpins our simulation. It then provides a rigorous formulation of the three distinct policy architectures under investigation: the state-of-the-art adaptive benchmark, a practical time-based heuristic, and our novel, optimized time-based policy. Finally, it outlines the simulation-optimization framework used to determine the optimal parameters for each policy.

3.1 General Problem Formulation

We model a multi-period stochastic IRP defined over a discrete time horizon $T = \{1, \dots, H\}$. A central depot (node 0) serves a set of $N$ pharmacies. The notation used throughout is summarized in Table 2.

Table 2. Notation list

Notation	Description
Indices and sets
$N$	The set of all pharmacies.
$T$	The set of all time periods in the planning horizon.
$i, j$	Index for transportation nodes (pharmacies), where $i, j \in N$.
$t$	Index for time periods (days), where $t \in T$.
State variables
$I_{it}$	Inventory level of pharmacy $i$ at the end of day $t$.
$B_{it}$	Backorder level (unmet demand) for pharmacy $i$ at the end of day $t$.
Stochastic variable
$D_{it}$	Stochastic demand at pharmacy $i$ on day $t$, drawn from $N(\mu_i, \sigma_i^2)$.
General system parameters
$\mu_i$	Mean daily demand for pharmacy $i$.
$dist_{ij}$	Distance between locations $i$ and $j$.
$Q$	Vehicle capacity (units).
$Cap_i$	Maximum storage capacity at pharmacy $i$.
Policy-specific parameters
$\beta$	Sensitivity parameter for Dynamic Inertial Policy $(0 < \beta < 1)$.
$\alpha$	Smoothing factor for EWMA Dynamic Inertial Policy $(0 < \alpha < 1)$.
Cost parameters
$C_h$	Unit holding cost [€/unit/day].
$C_s$	Unit stockout penalty [€/unit].
$C_t$	Transportation cost per kilometer [€/km].
Decision variable
$a_{it}$	Binary replenishment decision for pharmacy $i$ on day $t$.

Objective function:

The objective is to minimize the expected total system cost over the planning horizon, which combines transportation, inventory holding, and stockout costs:

$\text{Minimize } E\left[\sum_{t=1}^{H}\left(C_{\text{transport},t}+C_{\text{holding},t}+C_{\text{stockout},t}\right)\right]$

(1)

where, the daily costs are defined as:

Holding cost:

$C_{\text{holding},t}=C_h \times \sum_{i=1}^N I_{it}, \quad \text{where } I_{it}>0$

(2)

Transportation cost:

$C_{\text{transport},t}=C_t \times \text{VRP\_Cost}\left(V_t, dist, Q\right)$

(3)

Stockout cost:

$C_{\text {stockout},t}=C_s \times \sum_{i=1}^N \max \left(0, D_{i t}-I_{i, t-1}\right)$

(4)

The set of pharmacies visited on day $t$, $V_t$, is determined by the specific replenishment policy in effect:

$V_t=\left\{i \in N \mid a_{i t}=1\right\}$

(5)
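As an illustration of how Eqs. (1)-(4) combine on a single day, the following is a minimal sketch in plain Python. The function names, the argument layout, and the numeric values in the usage note are assumptions made for this example, not the authors' simulation code. The routed distance is taken as a given input, since it comes from the VRP heuristic described in Section 4.2.

```python
def holding_cost(inventory, c_h):
    """Eq. (2): C_h times the sum of positive end-of-day inventory levels."""
    return c_h * sum(level for level in inventory if level > 0)


def stockout_cost(demand, prev_inventory, c_s):
    """Eq. (4): C_s times the unmet demand max(0, D_it - I_{i,t-1}) over all pharmacies."""
    return c_s * sum(max(0.0, d - inv) for d, inv in zip(demand, prev_inventory))


def transport_cost(route_length_km, c_t):
    """Eq. (3): C_t times the routed distance (assumed to be supplied by a VRP solver)."""
    return c_t * route_length_km


def daily_cost(demand, prev_inventory, inventory, route_length_km, c_h, c_s, c_t):
    """One summand of Eq. (1): transport + holding + stockout cost for a single day."""
    return (
        transport_cost(route_length_km, c_t)
        + holding_cost(inventory, c_h)
        + stockout_cost(demand, prev_inventory, c_s)
    )
```

With hypothetical inputs of two pharmacies, demands `[10, 5]`, previous inventories `[8, 20]`, end-of-day inventories `[0, 15]`, a 12 km route, and unit costs `c_h = 0.1`, `c_s = 5.0`, `c_t = 1.5`, the three components are 18.0 (transport), 1.5 (holding), and 10.0 (stockout), for a daily total of 29.5.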

3.2 Policy Architectures

The core of this study lies in the comparison of three distinct policy architectures that determine the daily replenishment decisions ($a_{it}$). Each policy's logic is designed to address the challenge of replenishment from a different conceptual standpoint.

3.2.1 The dynamic inertial policy (state-of-the-art adaptive benchmark)

This policy represents a sophisticated, multi-parameter adaptive model. It adjusts the base reorder point ($s_i^{\text{base}}$) using a smoothed signal of system-wide stress. The parameter vector to be optimized comprises $N$ base reorder points ($s_i^{\text{base}}$), each optimized individually within the GA framework, plus $\beta$ and $\alpha$, for a total of $N+2$ parameters (52 parameters for $N=50$ pharmacies). First, the Demand-Weighted Urgency Index (DWUI) is calculated:

$U_{t,dw}=\frac{\sum_{i=1}^{N} \mu_i \cdot I\left(I_{i,t-1}<s_i^{\text{base}}\right)}{\sum_{j=1}^{N} \mu_j}$

(6)

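Here $I(\cdot)$ denotes the indicator function, which equals 1 when its argument is true and 0 otherwise (not to be confused with the inventory level $I_{it}$). This signal is then smoothed using an EWMA: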

$U_t^{\text {smooth }}=\alpha \cdot U_{t, d w}+(1-\alpha) \cdot U_{t-1}^{\text {smooth }}$

(7)

This smoothed signal modulates a dynamic reorder point, $s_{it}$, for each pharmacy:

$s_{i t}=s_i^{\text {base }} \times\left(1+\beta \cdot U_t^{\text {smooth }}\right)$

(8)

The final replenishment decision is then made by comparing the current inventory to this daily dynamic threshold. The parameter vector to be optimized is:

$\theta_{\text{inertial}}=\left[s_1^{\text{base}}, s_2^{\text{base}}, \ldots, s_N^{\text{base}}, \beta, \alpha\right]$

(9)

The complete operational logic of this policy is visually represented in Figure 1.

Figure 1. Operational framework of the dynamic inertial policy

The process begins with the collection of daily inventory and demand data from the pharmacy network (stage 1). These data are aggregated to compute the system-wide DWUI, represented as a dynamic gauge (stage 2). This raw urgency signal, which can be volatile, is then passed through an EWMA filter to produce a stable, smoothed signal (stage 3). The smoothed signal is used to calculate the adaptive replenishment trigger for the current day, and the pharmacies it selects are then routed by the VRP solver.
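The chain of Eqs. (6)-(8) can be sketched in a few lines of plain Python. This is an illustrative reading of the formulas, not the authors' code: the function names are hypothetical, and the final trigger is assumed to be the strict comparison $I_{i,t-1} < s_{it}$, which the text describes only as "comparing the current inventory to this daily dynamic threshold".

```python
def dwui(prev_inventory, base_reorder, mean_demand):
    """Eq. (6): demand-weighted share of pharmacies below their base reorder point."""
    urgent = sum(
        mu
        for inv, s_base, mu in zip(prev_inventory, base_reorder, mean_demand)
        if inv < s_base
    )
    return urgent / sum(mean_demand)


def ewma(raw_signal, prev_smooth, alpha):
    """Eq. (7): exponentially weighted moving average of the urgency signal."""
    return alpha * raw_signal + (1.0 - alpha) * prev_smooth


def dynamic_reorder_points(base_reorder, smooth_signal, beta):
    """Eq. (8): scale each base reorder point by (1 + beta * smoothed urgency)."""
    return [s_base * (1.0 + beta * smooth_signal) for s_base in base_reorder]


def inertial_decisions(prev_inventory, base_reorder, mean_demand,
                       prev_smooth, alpha, beta):
    """Return the binary visit decisions a_it and the updated smoothed signal."""
    u_raw = dwui(prev_inventory, base_reorder, mean_demand)
    u_smooth = ewma(u_raw, prev_smooth, alpha)
    s_dyn = dynamic_reorder_points(base_reorder, u_smooth, beta)
    # Assumed trigger: visit when inventory is strictly below the dynamic threshold.
    visits = [int(inv < s) for inv, s in zip(prev_inventory, s_dyn)]
    return visits, u_smooth
```

For example, with hypothetical inventories `[40, 100, 52]`, base reorder points `[50, 60, 50]`, mean demands `[10, 20, 10]`, a previous smoothed signal of 0, $\alpha = 0.5$, and $\beta = 0.8$, only the first pharmacy is below its base reorder point, so the DWUI is 10/40 = 0.25 and the smoothed signal is 0.125. Every reorder point is then scaled by 1.1, which lifts the third pharmacy's threshold from 50 to about 55 and triggers a visit there even though its inventory (52) is above the base level.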

3.2.2 The VMI urgency heuristic (non-optimized heuristic benchmark)

This policy uses a simple, time-based heuristic common in practice. It does not use reorder points. Instead, it calculates the TTS for each pharmacy:

$T_{s, i, t}=\frac{I_{i, t-1}}{\mu_i}$

(10)

The continuous, priority-driven replenishment cycle of this policy is visualized in Figure 2. Each day, all pharmacies in the network are ranked by their TTS, creating a dynamic service queue that prioritizes the most urgent locations. The central depot's daily logistics capacity is represented by a pre-determined, fixed parameter $p = 20\%$ of the total number of pharmacies, chosen to represent a typical operational capacity limit; with $N = 50$ pharmacies, this corresponds to the 10 most urgent locations being served each day. This policy has no optimizable parameters.

Figure 2. Conceptual framework of the VMI urgency heuristic
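The ranking rule of the VMI Urgency Heuristic can be sketched as follows. The function names are hypothetical, and rounding $p \cdot N$ to the nearest integer is an assumption; the paper only fixes $p = 20\%$.

```python
def time_to_stockout(prev_inventory, mean_demand):
    """Eq. (10): days of supply remaining at each pharmacy."""
    return [inv / mu for inv, mu in zip(prev_inventory, mean_demand)]


def urgency_heuristic_visits(prev_inventory, mean_demand, p=0.20):
    """Return the indices of the p-fraction of pharmacies with the lowest TTS."""
    tts = time_to_stockout(prev_inventory, mean_demand)
    n_visits = round(p * len(tts))  # assumed rounding rule
    ranked = sorted(range(len(tts)), key=lambda i: tts[i])  # most urgent first
    return sorted(ranked[:n_visits])
```

With hypothetical inventories `[30, 5, 80, 12, 50]` and mean demands `[10, 5, 20, 6, 10]`, the TTS values are 3, 1, 4, 2, and 5 days. Setting `p=0.4` on this five-pharmacy toy network selects the two most urgent pharmacies, indices 1 and 3.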

3.2.3 The OUT policy (proposed novel policy)

Our novel policy formalizes the time-based heuristic into a metaheuristic, transforming it into a parsimonious, optimizable model by encapsulating complex decision-making into a single optimized parameter, the Urgency Threshold ($U_{th}$). This integration of an optimization problem (finding the optimal $U_{th}$) within a heuristic framework for replenishment decisions is central to its matheuristic character. It uses the same TTS calculation from Eq. (10) but replaces the arbitrary ranking rule with the GA-optimized threshold $U_{th}$.

The replenishment decision rule is:

$a_{i t}=I\left(T_{s, i, t}<U_{t h}\right)$

(11)

The parameter vector to be optimized contains only this single variable:

$\theta_{\text {OUT }}=\left[U_{\text {th }}\right]$

(12)

Its operational logic, visualized in Figure 3, makes it highly interpretable and actionable precisely because of its parsimonious, single-parameter design and intuitive time-based rule. Optimized via a GA, the threshold ensures both technical rigor and managerial relevance, aligning closely with decision-makers' intuitive models.

Figure 3. Operational visualization of the OUT policy

For all policies, when a visit is triggered ($a_{it}=1$), the replenishment quantity $q_{it}$ is determined by an Order-Up-To logic, refilling the pharmacy to its maximum capacity, $Cap_i$.
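A minimal sketch of the OUT decision rule in Eq. (11), together with the Order-Up-To refill, is given below. The function names are hypothetical, and the refill quantity is assumed to be $Cap_i - I_{i,t-1}$, i.e., the gap between capacity and the inventory observed before the visit.

```python
def out_policy_visits(prev_inventory, mean_demand, u_th):
    """Eq. (11): visit pharmacy i when its time-to-stockout falls below U_th."""
    return [int(inv / mu < u_th) for inv, mu in zip(prev_inventory, mean_demand)]


def order_up_to_quantities(visits, prev_inventory, capacity):
    """Refill every visited pharmacy to its maximum capacity Cap_i (assumed gap rule)."""
    return [
        cap - inv if a == 1 else 0.0
        for a, inv, cap in zip(visits, prev_inventory, capacity)
    ]
```

For hypothetical inventories `[30, 5, 80]`, mean demands `[10, 5, 20]`, and a threshold of 2.5 days, the TTS values are 3, 1, and 4 days, so only the second pharmacy is visited. With capacities `[100, 40, 150]`, it receives 35 units.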

3.3 Simulation-Optimization Framework

The optimal parameter vectors $\theta$ for the Dynamic Inertial and OUT policies are found by solving the following optimization problem:

$\underset{\theta}{\text{minimize}} \quad E\left[C_{\text{total}}(\theta)\right]$

(13)

where, $C_{\text{total}}(\theta)$ is the total cost resulting from operating the system with a given parameter vector $\theta$.

Due to the stochastic and complex nature of the problem, this is solved using a simulation-optimization approach. A GA is employed to search the parameter space, where the fitness of each candidate vector $\theta$ is evaluated as the average total cost over multiple simulation replications.

4. Methodology

The experimental methodology is designed for rigor and reproducibility, following a structured, multi-stage process as visualized in the Simulation Methodology Framework (Figure 4).

Figure 4. Simulation methodology framework

4.1 Simulation Environment and Experimental Design

A discrete-time simulation model was developed in Python 3.9, leveraging the Pandas and NumPy libraries for data management and numerical operations.

Network Structure: The model simulates a VMI network consisting of a single central depot and $N=50$ pharmacies. The study utilizes five distinct problem instances, each with a unique spatial configuration of pharmacies, to ensure the findings are not specific to a single network topology.

Simulation Horizon: Each simulation run spans a time horizon of $H=30$ days (i.e., $T = \{1, \dots, 30\}$). This period is sufficiently long to observe the dynamic effects of the replenishment policies and the impact of demand stochasticity.

Stochastic Demand: Daily demand, $D_{it}$, at each pharmacy is modeled as a stochastic variable, drawn from a truncated Gaussian distribution with a unique mean ($\mu_i$) and standard deviation ($\sigma_i$) for each pharmacy. To ensure the robustness and generalizability of our findings, the entire set of experiments was conducted across 10 global random seeds, each generating a unique, year-long sequence of daily demands.

Replications: To account for short-term stochastic variations, the fitness evaluation for each candidate solution within the GA is based on the average performance across 5 distinct simulation replications, each using a different internal random seed. Each internal random seed is deterministically derived from the global seed by adding an incremental offset for each replication (e.g., global seed + replication index), ensuring independence while maintaining a reproducible sequence per global seed. This setup contributes to reducing variance and increasing confidence in fitness evaluations for each GA individual.
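The demand sampling and seed-derivation scheme described above could be sketched as follows, using only the Python standard library. Several details are assumptions made for illustration: the truncation is taken to be at zero and implemented by rejection sampling, the replication index is taken to start at 0, and `simulate` stands in for a hypothetical simulation function that returns a total cost.

```python
import random


def sample_demand(rng, mu, sigma):
    """Draw from N(mu, sigma^2) truncated at zero via rejection (assumed truncation rule)."""
    while True:
        demand = rng.gauss(mu, sigma)
        if demand >= 0.0:
            return demand


def replication_seeds(global_seed, num_replications=5):
    """Internal seeds derived as global seed + replication index (index assumed 0-based)."""
    return [global_seed + r for r in range(num_replications)]


def average_fitness(simulate, theta, global_seed, num_replications=5):
    """Average the simulated total cost of theta over the derived replication seeds."""
    costs = [
        simulate(theta, random.Random(seed))
        for seed in replication_seeds(global_seed, num_replications)
    ]
    return sum(costs) / len(costs)
```

Because each replication receives its own `random.Random` instance, the replications draw from separate streams, and rerunning with the same global seed reproduces the same sequence of costs.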

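The full experimental design thus consists of 10 global seeds $\times$ 5 instances $\times$ 3 policies, resulting in 150 independent runs and a comprehensive dataset for analysis. Because the VMI Urgency Heuristic has no optimizable parameters, GA optimization applies only to the runs of the Dynamic Inertial and OUT policies.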

4.2 Vehicle Routing Heuristic

On each simulated day, the set of pharmacies selected for replenishment, $V_t$, is passed to a VRP solver to determine the delivery routes and calculate the transportation cost. Given the NP-hard nature of the VRP, we employ a two-phase heuristic approach that is standard in IRP literature. To ensure full reproducibility, we explicitly detail the heuristic specifications as implemented in our simulation code.

Route Construction: An initial set of routes is built using a Nearest Neighbor heuristic. This greedy algorithm constructs routes iteratively by starting at the depot and sequentially adding the closest unvisited pharmacy that can be serviced without exceeding the vehicle's capacity ($Q$). This process continues until no additional customers from the daily visit list can be feasibly assigned, at which point the vehicle returns to the depot and a new route begins.
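The construction phase described above can be sketched as a capacity-aware nearest-neighbor loop. This is an illustrative version with hypothetical names; Euclidean distances and lexicographic tie-breaking are assumptions, and `deliveries` stands for the quantities to be dropped at each pharmacy on the daily visit list.

```python
import math


def nearest_neighbor_routes(depot, coords, deliveries, capacity):
    """Build routes greedily: always add the closest unvisited pharmacy that still fits.

    depot: (x, y) of the depot; coords: {pharmacy_id: (x, y)};
    deliveries: {pharmacy_id: quantity}; capacity: vehicle capacity Q.
    """
    unvisited = set(coords)
    routes = []
    while unvisited:
        route, load, position = [], 0.0, depot
        while True:
            feasible = [i for i in sorted(unvisited) if load + deliveries[i] <= capacity]
            if not feasible:
                break  # vehicle returns to the depot; a new route begins
            nxt = min(feasible, key=lambda i: math.dist(position, coords[i]))
            route.append(nxt)
            load += deliveries[nxt]
            position = coords[nxt]
            unvisited.remove(nxt)
        if not route:
            raise ValueError("a single delivery exceeds the vehicle capacity")
        routes.append(route)
    return routes
```

For a toy case with the depot at the origin, pharmacies 1, 2, 3 at (1, 0), (2, 0), (0, 5), deliveries of 4 units each, and a capacity of 8, the first vehicle serves pharmacies 1 and 2 and a second vehicle serves pharmacy 3.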

Route Improvement: The initial solution is refined using a Tabu Search metaheuristic that operates on a 2-opt neighborhood, systematically exploring improvements by swapping pairs of edges within each constructed route. To guide the search and prevent cycling, moves are stored in a tabu list with a fixed tenure of five iterations. An aspiration criterion is implemented, allowing a tabu move to be selected if it yields a solution better than the current best-known. The improvement process for each individual route terminates after 50 iterations, providing a balance between solution quality and computational efficiency. These precise specifications ensure the clarity and replicability of our VRP solution framework.
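A compact sketch of the improvement phase is shown below, using the stated settings (2-opt neighborhood, tabu tenure of five iterations, 50 iterations per route, aspiration on the best-known length). How a "move" is encoded in the tabu list is not specified in the text; this sketch assumes the pair of reversed positions $(i, j)$, and all function names are hypothetical.

```python
import math


def route_length(route, coords):
    """Total Euclidean length of a route given as a list of node ids."""
    return sum(math.dist(coords[a], coords[b]) for a, b in zip(route, route[1:]))


def two_opt_move(route, i, j):
    """Reverse the segment route[i..j], which exchanges two edges of the route."""
    return route[:i] + route[i:j + 1][::-1] + route[j + 1:]


def tabu_two_opt(route, coords, tenure=5, max_iter=50):
    """Improve one route (depot first and last) with tabu search over 2-opt moves."""
    current = list(route)
    best, best_len = list(route), route_length(route, coords)
    tabu_until = {}  # move (i, j) -> last iteration at which it is still tabu
    for it in range(max_iter):
        chosen, chosen_len, chosen_move = None, math.inf, None
        for i in range(1, len(current) - 2):
            for j in range(i + 1, len(current) - 1):
                neighbor = two_opt_move(current, i, j)
                length = route_length(neighbor, coords)
                is_tabu = tabu_until.get((i, j), -1) >= it
                # Aspiration: a tabu move is allowed only if it beats the best known.
                if is_tabu and not length < best_len:
                    continue
                if length < chosen_len:
                    chosen, chosen_len, chosen_move = neighbor, length, (i, j)
        if chosen is None:
            break  # every admissible move is tabu
        current = chosen  # accept the best admissible neighbor, even if worse
        tabu_until[chosen_move] = it + tenure
        if chosen_len < best_len:
            best, best_len = list(chosen), chosen_len
    return best, best_len
```

On a unit square with the depot at a corner, a route that crosses itself (length about 4.83) is untangled by a single 2-opt move into the perimeter tour of length 4. The later iterations may accept worse neighbors, which is the intended tabu behavior, but the best route found is retained and returned.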

4.3 Policy Optimization via Genetic Algorithm

A GA was chosen to optimize the parameter vectors $\theta$ of the Dynamic Inertial policy ($N+2$ parameters) and the OUT policy (a single parameter). The GA is particularly well-suited for this task for several reasons:

Global Search: It is a global search metaheuristic, making it less prone to getting trapped in local optima compared to simpler gradient-based methods.

Derivative-Free: It does not require gradient information, which is essential for simulation-based optimization problems where the fitness landscape is often a “black box”.

Robustness: Its population-based approach allows it to explore a wide range of solutions, making it effective for navigating the complex and non-convex search spaces typical of IRPs [14].

The GA was configured with the following parameters, determined through preliminary tuning experiments:

Population Size: 50 individuals.

Number of Generations: 40.

Selection: Tournament Selection (tournament size = 3).

Crossover: One-point crossover for the base reorder points $s_i^{base}$ (offspring inherits the prefix from one parent and the suffix from the other); for scalar parameters $(\beta, \alpha)$, offspring take the arithmetic mean of the parents.

Elitism: top 2 individuals copied unchanged to the next generation.

Mutation: A small, fixed mutation rate of 0.02 (2%).

Fitness evaluation: each individual evaluated by averaging performance over 5 simulation replications ($num\_replications = 5$).

The complete algorithmic framework for the GA is detailed in Algorithm 1.

Algorithm 1. Robust GA for policy optimization
Initialize population $P$ with $|P| = 50$ random solutions $\theta$.
for generation $g = 1$ to $G$ ($G = 40$) do
    for each individual $p \in P$ do
        $replication\_costs \leftarrow []$
        for replication $r = 1$ to $R$ ($R = 5$) do
            $cost \leftarrow RunSimulation(p, seed = r)$
            Append $cost$ to $replication\_costs$
        end for
        $fitness(p) \leftarrow Average(replication\_costs)$
    end for
    $P_{next} \leftarrow \emptyset$ // Built below via elitism, tournament selection, one-point crossover, and mutation
    Copy top-2 individuals from $P$ into $P_{next}$ // elitism
    while $|P_{next}| < |P|$ do
        Select $parent1, parent2$ via tournament selection (size = 3)
        $Offspring \leftarrow OnePointCrossover(parent1, parent2)$ // $s_i^{base}$ split at random point, $\beta$ and $\alpha$ averaged
        Mutate $Offspring$ (mutation rate = 0.02) and add it to $P_{next}$
    end while
    $P \leftarrow P_{next}$
    // Log best solution $\theta_{best}$ (every 10 generations)
end for
return $\theta^*$ // Best solution found
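The loop in Algorithm 1 can be sketched in Python for a single scalar parameter, as in the OUT policy's threshold. This is a minimal illustration under stated assumptions rather than the authors' implementation: `run_simulation` is a hypothetical stand-in for the VMI simulator, the search bounds `LOW` and `HIGH` are assumed, and the mutation operator (random reset) is an assumption because the text specifies only the mutation rate.

```python
import random
from statistics import mean

POP_SIZE, GENERATIONS, REPLICATIONS = 50, 40, 5
TOURNAMENT_SIZE, ELITE_COUNT, MUTATION_RATE = 3, 2, 0.02
LOW, HIGH = 0.0, 10.0  # assumed search bounds for the threshold


def run_simulation(threshold, seed):
    """Hypothetical stand-in for the simulator: noisy cost, cheapest near 2.0."""
    noise = random.Random(seed).gauss(0.0, 0.05)
    return (threshold - 2.0) ** 2 + noise


def fitness(threshold):
    # Average cost over a fixed set of seeds, so every candidate sees the same noise.
    return mean(run_simulation(threshold, seed=r) for r in range(1, REPLICATIONS + 1))


def tournament(scored, rng):
    # Return the cheapest of TOURNAMENT_SIZE randomly drawn individuals.
    return min(rng.sample(scored, TOURNAMENT_SIZE))[1]


def optimize(rng_seed=0):
    rng = random.Random(rng_seed)
    population = [rng.uniform(LOW, HIGH) for _ in range(POP_SIZE)]
    for _ in range(GENERATIONS):
        scored = sorted((fitness(ind), ind) for ind in population)
        next_population = [ind for _, ind in scored[:ELITE_COUNT]]  # elitism
        while len(next_population) < POP_SIZE:
            parent1, parent2 = tournament(scored, rng), tournament(scored, rng)
            child = (parent1 + parent2) / 2.0  # scalar gene: arithmetic mean
            if rng.random() < MUTATION_RATE:
                child = rng.uniform(LOW, HIGH)  # assumed mutation: random reset
            next_population.append(child)
        population = next_population
    return min(population, key=fitness)
```

Calling `optimize()` returns a threshold close to the toy optimum of 2.0. Reusing the seeds 1 to $R$ for every individual mirrors `seed = r` in Algorithm 1 and keeps fitness comparisons from being distorted by sampling noise.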

4.4 Statistical Analysis

Upon completion of all simulation runs, the resulting dataset was analyzed to determine the statistical significance of the performance differences between the policies. A one-way Analysis of Variance (ANOVA) was performed on the two primary KPIs: Final Optimized Cost and Final Stockout Quantity. When the ANOVA F-test was significant ($p < 0.05$), a Tukey's Honestly Significant Difference (HSD) post-hoc test was conducted to perform all pairwise comparisons between the policies. A significance level of $\alpha = 0.05$ was used for all statistical tests.
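As a rough illustration of the first step of this procedure, the one-way ANOVA F statistic can be computed from the between-group and within-group sums of squares. The sketch below uses only the Python standard library and toy numbers that are not results from the study. Obtaining the p-value and the Tukey HSD comparisons requires the F and studentized-range distributions, for which a statistics library (for example, SciPy's `scipy.stats.f_oneway` and `scipy.stats.tukey_hsd`) would normally be used.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples, one per policy."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Illustrative numbers only, not data from the experiments.
toy_costs = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_b, df_w = one_way_anova_f(toy_costs)
```

For the toy data the between-group sum of squares is 26 on 2 degrees of freedom and the within-group sum of squares is 6 on 6 degrees of freedom, giving $F = 13$.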

4.5 Sensitivity Analysis

To assess the robustness and reliability of the top-performing policies beyond their optimized baseline performance, a comprehensive post-hoc sensitivity analysis was conducted. This analysis serves two purposes: (1) to evaluate the policies' resilience to significant environmental shocks, and (2) to understand the sensitivity of each policy to perturbations in its own optimized parameters.

Environmental Sensitivity: The champion policies were subjected to three distinct environmental shock scenarios, simulated on a representative problem instance ($instance\_1$). Each scenario was run for 30 replications to ensure stable results. The scenarios were:

Demand Surge: The mean demand ($\mu_i$) for all customers was multiplicatively increased by 25% ($\mu_{i\text{-new}} = \mu_i \times 1.25$).
Variability Shock: The demand standard deviation ($\sigma_i$) for all customers was multiplicatively increased by 50% ($\sigma_{i\text{-new}} = \sigma_i \times 1.50$).
Capacity Crunch: The vehicle capacity ($Q$) was reduced by 25%.

The performance degradation, measured by the percentage increase in total cost, was used to quantify each policy's resilience.
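The three shocks amount to simple rescalings of the simulation inputs. A minimal sketch follows, assuming a hypothetical `Scenario` container for the per-customer demand parameters and the vehicle capacity; the numeric values in `baseline` are placeholders, not instance data from the study.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Scenario:
    mean_demand: tuple       # mu_i for each customer
    std_demand: tuple        # sigma_i for each customer
    vehicle_capacity: float  # Q


def demand_surge(s):
    # Mean demand of every customer scaled by 1.25.
    return replace(s, mean_demand=tuple(m * 1.25 for m in s.mean_demand))


def variability_shock(s):
    # Demand standard deviation of every customer scaled by 1.50.
    return replace(s, std_demand=tuple(sd * 1.50 for sd in s.std_demand))


def capacity_crunch(s):
    # Vehicle capacity reduced by 25%.
    return replace(s, vehicle_capacity=s.vehicle_capacity * 0.75)


# Placeholder values for illustration only.
baseline = Scenario(mean_demand=(20.0, 40.0), std_demand=(4.0, 8.0), vehicle_capacity=200.0)
```

Each function returns a new scenario and leaves the baseline untouched, so the same optimized policy can be evaluated against all three shocks from an identical starting point.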

Parameter Sensitivity: To assess the robustness of the optimized policies, we perturbed key parameters by $\pm 20\%$ from their optimal settings, one at a time, and measured resulting changes in total cost and stockout quantity. Since the VMI Urgency Heuristic lacks tunable parameters, it is excluded from this procedure. For the OUT policy, only its single threshold parameter is varied. For the Dynamic Inertial Policy, perturbations are applied separately to $\beta$ and $\alpha$ while keeping the base reorder points $s_i^{base}$ fixed at their optimal values, to isolate the effect of dynamic adjustment. A policy is considered robust if performance degrades smoothly (i.e., no abrupt jumps in cost or stockouts) under these deviations. This sensitivity analysis protocol mirrors recent state-of-the-art IRP studies—for example, Lücker et al. [46] conduct extensive parameter perturbations to test policy brittleness across benchmark instances.
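The one-at-a-time perturbation protocol can be sketched as follows. The `evaluate` callable stands in for a full simulation run, and `toy_cost` is a hypothetical cost surface used only to make the example runnable; neither is taken from the study.

```python
def one_at_a_time(evaluate, optimum, delta=0.20):
    """Perturb each parameter by -delta and +delta, holding the others at their optimum."""
    base_cost = evaluate(optimum)
    rows = []
    for name, value in optimum.items():
        for sign in (-1, +1):
            trial = dict(optimum, **{name: value * (1 + sign * delta)})
            cost = evaluate(trial)
            pct_change = 100.0 * (cost - base_cost) / base_cost
            rows.append((name, sign * delta, cost, pct_change))
    return rows


def toy_cost(params):
    # Hypothetical smooth cost surface with its minimum at the "optimal" settings.
    return 100.0 + 10.0 * (params["beta"] - 0.50) ** 2 + 5.0 * (params["alpha"] - 0.43) ** 2


rows = one_at_a_time(toy_cost, {"beta": 0.50, "alpha": 0.43})
```

Each row holds the parameter name, the relative perturbation, the resulting cost, and the percentage change in cost against the unperturbed optimum, which is the layout used for reporting parameter sensitivity.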

This two-pronged sensitivity analysis provides critical insights into the practical applicability of each policy, moving beyond simple cost-efficiency to evaluate their performance under the stress and uncertainty characteristic of real-world supply chains.

5. Results and Analysis

The experimental campaign was designed to rigorously evaluate the performance of the three distinct policy architectures: the adaptive benchmark (Dynamic Inertial), the non-optimized heuristic, and our proposed OUT policy. The analysis is presented in four stages: a comparative analysis of baseline performance, an evaluation using a unified KPI, a robustness assessment via environmental sensitivity analysis, and a final assessment of parameter sensitivity for the optimized policies.

5.1 Baseline Performance Analysis

The grand mean performance of the three policies, averaged across all seeds and instances, is summarized in Table 3. The results reveal a clear and significant performance stratification. The novel OUT policy achieved the lowest average total cost (€58,595.46), making it the most economical of the three. At the other end of the spectrum, the VMI Urgency Heuristic incurred the highest cost (€74,680.29), approximately 27.4% greater than that of the OUT policy (equivalently, the OUT policy is about 21.5% cheaper), primarily due to its aggressive, service-driven replenishment logic, which leads to higher transport and holding costs.

Table 3. Grand mean performance metrics per policy (baseline conditions)

Policy Model	Avg. Total Cost (€)	Std. Total Cost (€)	Avg. Stockout Qty. (units)	Std. Stockout Qty. (units)	Stockout Qty. (% of demand)
Dynamic (s, S) inertia	62,435.71	1,745.87	116.88	26.76	0.44
VMI urgency heuristic	74,680.29	3,125.95	2.80	1.82	0.01
OUT policy	58,595.46	1,109.51	39.29	11.66	0.14

Crucially, a stark inverse relationship between cost and service level is observed among the benchmarks. The VMI Urgency Heuristic, while most expensive, achieved a near-perfect service level with an average stockout quantity of only 2.80 units. The Dynamic Inertial policy offered a lower cost but at the expense of a significantly higher stockout level of 116.88 units. The OUT policy successfully breaks this trade-off, achieving a low stockout quantity of 39.29 units—a 66.3% reduction compared to the Dynamic Inertial policy—while also securing the lowest total cost.

To better understand the underlying factors contributing to total cost, we decomposed the average cost for each policy into its primary components (Transport, Holding, and Stockout Penalty costs), as depicted in Figure 5. This decomposition reveals that the main difference in total cost is not operational efficiency (with transport and holding costs remaining relatively stable across policies), but rather the capacity to prevent service failures. Specifically, the superior cost performance of the OUT and Dynamic Inertial policies is almost entirely explained by their substantial reduction in stockout penalty costs, underscoring the pivotal role of service level in determining overall economic efficiency.

Figure 5. Comparative breakdown of baseline costs by category, including penalty stockout cost

The same trade-off appears when performance is expressed as a service level. The VMI Urgency Heuristic achieved a near-perfect service level (99.99%), while the Dynamic Inertial policy offered a lower cost at the expense of a lower service level (99.56%). As illustrated by the service level comparison in Figure 6, the OUT policy broke this trade-off, achieving a high service level (99.86%) while securing the lowest total cost.

Figure 6. Benchmarking the average service level performance of the three policies

The distributions of the two primary KPIs are visualized in Figure 7. The box plots confirm the findings from Table 3, showing a clear cost advantage for the OUT policy (Figure 7a) and a dramatic service level advantage for the VMI Urgency Heuristic (Figure 7b). Crucially, the OUT policy exhibits a much tighter distribution for stockouts compared to the Dynamic Inertial policy, signifying more consistent and reliable performance.

(a)

(b)

Figure 7. Box plot analysis of (a) total cost; (b) stockout quantity for the three policies

To determine the statistical significance of these differences, a one-way ANOVA was performed, followed by a Tukey HSD post-hoc test. The ANOVA was highly significant for both Final Optimized Cost (p < 0.001) and Final Stockout Quantity (p < 0.001). The Tukey HSD results (Table 4) confirm that the OUT policy is statistically significantly cheaper than the VMI Urgency Heuristic and statistically superior to the Dynamic Inertial policy in reducing stockouts.

Table 4. Tukey HSD results for the three policies

Group1	Group2	Mean Diff.	P-adj	Lower	Upper	Reject
OUT Policy	VMI urgency heuristic	16399.53	0.0	15254.62	17544.4	True
OUT Policy	dynamic (s, S) inertia	4154.95	0.0	3010.04	5299.86	True
VMI urgency heuristic	dynamic (s, S) inertia	-12244.57	0.0	-13389.48	-11099.6	True

The trade-off between the policies is visualized on the efficient frontier in Figure 8. The plot shows that the OUT policy establishes a new dominant point on the frontier: it is both cheaper than the Dynamic Inertial policy and incurs fewer stockouts, so the Dynamic Inertial policy is dominated.

Figure 8. Cost–service trade-off and efficient frontier positioning of policies

5.2 Unified KPI Analysis

To synthesize the cost-service trade-off into a single, managerially relevant metric, we formulated a Unified KPI Score. This score represents a penalty-adjusted total cost, where a strategic weight is applied to each unit of stockout. For this analysis, a weight of €50 per unit was chosen, representing a moderate business impact for a service failure. The formula is:

$\text{Unified KPI Score} = \text{Final Optimized Cost} + (50 \times \text{Final Stockout Quantity})$

(14)
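As a quick check of Eq. (14), the score can be evaluated directly from the grand means in Table 3. The sketch below does this for the Dynamic Inertial policy and the VMI Urgency Heuristic; the function and variable names are illustrative.

```python
STOCKOUT_WEIGHT_EUR = 50.0  # strategic weight per unit of stockout, as in Eq. (14)


def unified_kpi(final_cost, stockout_qty, weight=STOCKOUT_WEIGHT_EUR):
    """Penalty-adjusted total cost: cost plus weight times stockout quantity."""
    return final_cost + weight * stockout_qty


# Average total cost (EUR) and average stockout quantity (units) from Table 3.
dynamic_score = unified_kpi(62_435.71, 116.88)
vmi_score = unified_kpi(74_680.29, 2.80)
```

Rounded to two decimals, these evaluate to €68,279.71 and €74,820.29, the values listed for the two policies in Table 5.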

The results of this analysis are presented in Table 5. When both cost and service level are considered under this unified metric, a clear stratification of policy performance emerges. The OUT Policy achieves the lowest Unified KPI Score of €60,068.76, making it the most balanced and effective overall strategy under this strategic valuation.

Table 5. Unified KPI score analysis (lower is better)

Policy Architecture	Unified KPI Score (€)
Dynamic (s, S) inertia	68,279.71
VMI Urgency Heuristic	74,820.29
OUT Policy	60,068.76

This demonstrates its superior ability to manage the cost-service trade-off effectively. The Dynamic Inertial policy, while incurring a higher raw operational cost than OUT, positioned itself as the second-best option with a Unified KPI Score of €68,279.71. The VMI Urgency Heuristic, despite its near-perfect service level (as noted in prior sections), is heavily penalized by its inherently high operational costs, resulting in the highest Unified KPI Score of €74,820.29 and making it the least effective option when stockout penalties are considered. The relative performance of each policy according to this unified KPI is visualized in Figure 9.

Figure 9. Bar chart of unified KPI scores for the three policies

Furthermore, to evaluate the robustness of the results, we analyzed the sensitivity of each policy’s total cost to variations in the stockout penalty. Figure 10 presents the simulated total costs of the three policies across a penalty range from €25 to €500 per unit. The OUT policy consistently exhibits the lowest and most stable cost trajectory, confirming its resilience. In contrast, the Dynamic Inertial policy shows a clear inflection point at approximately €125 per unit (five times the base penalty), beyond which its costs increase sharply and diverge from those of the other policies. The cost of the VMI Urgency Heuristic remains relatively flat, reflecting its near-perfect service level, but it is persistently higher than that of the OUT policy, rendering the heuristic uncompetitive across the entire range. The OUT policy maintains cost leadership up to the upper bound of the tested penalties, demonstrating its robustness under diverse managerial priorities. Overall, these results confirm that the OUT policy consistently offers the most favorable cost–service trade-off.

Figure 10. Simulated total cost vs. stock-out penalty cost for the three policies

5.3 Environmental Sensitivity and Robustness Analysis

The robustness of the policies was evaluated under three disruption scenarios. Table 6 reports the total cost of each policy under each scenario; the percentage increases quoted below are computed relative to each policy's own baseline.

Table 6. Environmental sensitivity–total cost (€) under each scenario

Policy	Baseline	Demand +25% (Surge)	Variability +50% (Shock)	Capacity -25% (Crunch)
VMI urgency heuristic	75,516.93	76,850.80	75,548.65	80,171.65
Optimized threshold	60,213.61	66,526.61	63,079.83	62,316.76
Dynamic inertial	63,581.98	75,594.42	65,349.30	66,030.95
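The percentage increases discussed in the text can be recomputed from the total costs in Table 6. The short sketch below does so; the dictionary keys are illustrative labels for the table's columns.

```python
# Total cost (EUR) per policy and scenario, copied from Table 6.
TOTAL_COST = {
    "VMI urgency heuristic": {"baseline": 75_516.93, "surge": 76_850.80,
                              "variability": 75_548.65, "capacity": 80_171.65},
    "Optimized threshold": {"baseline": 60_213.61, "surge": 66_526.61,
                            "variability": 63_079.83, "capacity": 62_316.76},
    "Dynamic inertial": {"baseline": 63_581.98, "surge": 75_594.42,
                         "variability": 65_349.30, "capacity": 66_030.95},
}


def pct_increase(costs):
    """Percentage cost increase of each shock scenario over the policy's own baseline."""
    base = costs["baseline"]
    return {name: 100.0 * (cost - base) / base
            for name, cost in costs.items() if name != "baseline"}


increase = {policy: pct_increase(costs) for policy, costs in TOTAL_COST.items()}
```

For example, `increase["Dynamic inertial"]["surge"]` is approximately 18.89, while the corresponding value for the Optimized Threshold policy is approximately 10.48.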

The results reveal distinct robustness profiles. The VMI Urgency Heuristic is almost completely immune to demand-side shocks. The Optimized Threshold policy demonstrates strong, balanced robustness across all scenarios. In stark contrast, the Dynamic Inertial policy proved to be brittle, with its cost increasing by a substantial 18.89% during the demand surge. This cost increase was driven by a sharp deterioration in service level, as shown in Table 7.

Table 7. Environmental sensitivity–absolute stockout quantity

Policy	Baseline	Demand +25% (Surge)	Variability +50% (Shock)	Capacity -25% (Crunch)
VMI urgency heuristic	0.00	1.10	0.00	0.00
Optimized threshold	77.40	60.60	189.20	77.40
Dynamic inertial	128.90	461.50	199.10	128.90

While the VMI Urgency Heuristic maintained its near-perfect service, the stockouts for the Dynamic Inertial policy more than tripled under the demand surge (from 128.90 to 461.50 units, a 258% increase). The Optimized Threshold policy, although its stockouts rose during the variability shock, kept service far better controlled during the critical demand surge, demonstrating greater resilience than the adaptive benchmark. The total-cost results are also visualized in Figure 11.

Figure 11. Environmental robustness analysis: total cost sensitivity across disruption scenarios

A final sensitivity analysis was conducted by perturbing the key parameters of the two optimized policies by $\pm 20\%$. The results are summarized in Table 8. The analysis reveals that the OUT policy exhibits a desirable asymmetric sensitivity. Being overly cautious (increasing $U_{th}$ by 20%) results in a negligible 1.27% cost increase while dramatically improving service. Conversely, being too aggressive (decreasing $U_{th}$ by 20%) is penalized with a 7.56% cost increase. In contrast, the Dynamic Inertial policy is almost completely insensitive to its own dynamic parameters (cost changes of at most 0.13% in magnitude), suggesting that its performance is primarily driven by its base-stock levels and has reached a performance plateau.

Table 8. Parameter sensitivity analysis for optimized policies

Policy	Scenario	Final Optimized Cost (€)	Final Stockout Quantity (units)	Cost Increase (%)
Optimized urgency	Optimal ($U_{th} = 2.11$)	60,184.89	77.10	0.00
	$U_{th}$ -20%	64,736.91	351.00	7.56
	$U_{th}$ +20%	60,947.89	19.75	1.27
Dynamic inertial	Optimal ($\beta = 0.50, \alpha = 0.43$)	64,148.96	133.45	0.00
	$\beta$ -20%	64,065.41	140.90	-0.13
	$\beta$ +20%	64,195.75	131.35	0.07
	$\alpha$ -20%	64,105.41	133.00	-0.07
	$\alpha$ +20%	64,101.51	133.85	-0.07

6. Discussion

This study embarked on a critical examination of inventory replenishment policies within VMI systems, specifically challenging the prevailing assumption that parametric complexity is a prerequisite for high performance in stochastic environments. Our findings robustly demonstrate that a simple, rigorously optimized time-based heuristic can not only match but often surpass the performance of more complex, state-of-the-art adaptive policies. This discussion delves into the interpretation of these results, their theoretical and managerial implications, and outlines avenues for future research.

6.1 Interpretation of Findings: Simplicity Outperforms Complexity

The most striking finding is the superior performance of the OUT policy. Across baseline conditions, the OUT policy achieved the lowest average total cost (€58,595.46) while maintaining a significantly lower stockout quantity (39.29 units, a 66.3% reduction) compared to the Dynamic Inertial policy. This directly challenges the established paradigm favoring multi-parameter adaptive models [10], [23]. The cost decomposition in Figure 5 highlights that the OUT policy's economic advantage primarily stems from its effective stockout prevention, underscoring the critical role of service level in overall cost efficiency, particularly in pharmaceutical supply chains where stockout penalties are severe. The VMI Urgency Heuristic, while achieving near-perfect service (2.80 units stockout), did so at a prohibitive operational cost (€74,680.29), confirming the inefficiency of non-optimized, rigid heuristics.

The unified KPI analysis further solidified the OUT policy’s effectiveness when stockouts are assigned an explicit strategic value (Table 5, Figure 9). Indeed, the OUT policy showed the lowest Unified KPI Score, demonstrating its superior balance of cost and service. Further sensitivity analysis on the stockout penalty (Figure 10) showed the OUT policy maintaining cost leadership across nearly the entire tested range. The Dynamic Inertial policy's performance rapidly deteriorated beyond a certain penalty threshold, indicating a fundamental brittleness in its underlying cost-service trade-off. This suggests that while adaptive policies aim for robustness, their inherent complexity can lead to unforeseen vulnerabilities under shifting cost structures.

The environmental sensitivity analysis (Table 6 and Table 7) highlights the distinctive robustness profile of the OUT policy. During a demand surge of +25%, the OUT policy resulted in only 60.60 stockouts, compared to 461.50 stockouts under the Dynamic Inertial policy, underscoring its superior capacity to absorb sudden disruptions. Under a +50% variability shock, the OUT policy recorded 189.20 stockouts, which accounts for merely 0.72% of total demand during the shock period. This reflects a controlled and predictable degradation of service rather than catastrophic failure, thereby sustaining operational stability. Such behavior aligns with recent findings in supply chain resilience research, which emphasize that policies built around clear threshold-based mechanisms often achieve a more stable balance between robustness and efficiency compared to dynamically adjusted strategies [46].

Finally, the parameter sensitivity analysis (Table 8) provided crucial insights into the "brittleness" of the optimized policies. The OUT policy exhibited a desirable asymmetric sensitivity: a cautious increase in the urgency threshold ($U_{th}$ +20%) led to a minor cost increase (1.27%) and significant service improvement, while an aggressive decrease was appropriately penalized (7.56% cost increase). In stark contrast, the Dynamic Inertial policy was remarkably insensitive to perturbations in its own dynamic parameters (β and α). This insensitivity, rather than indicating robustness, points to a potential performance plateau where further fine-tuning yields negligible gains. This finding has broader implications for resource allocation in research and development, suggesting diminishing returns for further investment in optimizing such complex models and thereby redirecting efforts towards the rigorous optimization of simpler, more interpretable models as discussed in foundational VMI literature like [47], [48]. It reinforces our hypothesis that the intellectual overhead of complex adaptive models might not translate into commensurate performance gains, especially when a simpler, well-optimized alternative exists.

Taken together, our analysis of robustness and parameter sensitivity offers a nuanced perspective on policy design. The observed “brittleness” of the Dynamic Inertial policy under demand surges and its insensitivity to parameter changes suggest diminishing returns in complex adaptive models, especially regarding their interpretability and tuning. In contrast, the OUT policy’s single, clear parameter provides enhanced managerial interpretability, significantly lowering the cognitive load for decision-makers. This highlights a direct benefit of simplicity and aligns with recent arguments in the supply chain literature that emphasize the value of parsimonious, interpretable models over opaque complexity [49].

6.2 Managerial Implications

The managerial implications of the OUT policy are profound, particularly for industries like pharmaceuticals where both cost efficiency and service level are paramount.

• Simplified Decision-Making and Implementation: The OUT policy offers a single, intuitive parameter (Urgency Threshold) for managers to optimize. Its operational logic, visualized in Figure 3, is highly interpretable and actionable because its parsimonious, single-parameter design and intuitive time-based rule align well with managers' mental models.

• Reduced Medicine Shortages and Enhanced Public Health: By significantly reducing stockout quantities while also lowering total costs, the OUT policy directly addresses the critical public health issue of medicine shortages, a vulnerability exposed during events like the COVID-19 pandemic [3]. For pharmaceutical distributors, this translates into more reliable patient access to essential medicines.

• Cost-Effective Resilience: The OUT policy's demonstrated robustness across various environmental shocks (demand surge, variability shock, capacity crunch) means that businesses can achieve higher supply chain resilience without incurring the higher operational costs typically associated with overstocking or overly complex systems. This is a crucial advantage in today's volatile global supply chain landscape.

• Strategic Resource Allocation: The “bubble of urgency” created by the OUT policy (Figure 3) allows for more efficient vehicle routing by concentrating visits to geographically clustered, at-risk pharmacies. This spatial clustering significantly improves vehicle utilization and reduces routing costs, a well-documented advantage in inventory routing problems, particularly when integrating VRP solutions [50], [51].

• Evidence-Based Policy Design: Our simulation-optimization framework provides a rigorous method for practitioners to validate and optimize simple heuristics. This approach empowers managers to move beyond arbitrary rule-setting and leverage data-driven insights to unlock the full potential of their existing operational rules.

7. Limitations and Future Research

Despite the rigor of our methodology, this study has several limitations that offer clear directions for future research.

Firstly, the current model assumes a fixed vehicle capacity and a single depot. For this foundational comparative study, a single-depot system and a homogeneous fleet were intentionally chosen to isolate the impact of the replenishment policy design and ensure computational tractability. Future research could explore multi-depot systems, heterogeneous fleets, and dynamic vehicle capacities. The fundamental concept of TTS within the OUT policy is inherently scalable, as each depot could independently manage its assigned pharmacies based on their local TTS, akin to decentralized inventory control strategies in multi-echelon systems (e.g., [52], [53]).

Secondly, demand stochasticity was modeled using a truncated Gaussian distribution. While robust for continuous demand, real-world pharmaceutical demand can exhibit seasonality, trends, and infrequent large spikes. Future work will investigate discrete demand patterns, which are characteristic of many pharmaceutical items, and specific non-stationary patterns such as sudden spikes or gradual trends. The OUT policy can directly accommodate such patterns by making its mean demand ($\mu_i$) calculation adaptive (e.g., using an exponentially weighted moving average, EWMA) to update the TTS dynamically [54]. Furthermore, the single-parameter nature of the OUT policy facilitates adaptation to non-stationary demand through periodic re-calibration of the optimal $U_{th}$ using the simulation-optimization framework, drawing inspiration from adaptive control literature [55].
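To illustrate the adaptive variant mentioned above, the sketch below updates a customer's mean demand with an EWMA and recomputes its TTS. It rests on assumptions that are not specified in this section: TTS is taken as current inventory divided by estimated mean demand per period, a visit is triggered when TTS is at or below the threshold, and the smoothing factor of 0.2 is arbitrary. All function names are hypothetical.

```python
def ewma_update(previous_mean, observed_demand, smoothing=0.2):
    """Exponentially weighted moving average of per-period demand."""
    return smoothing * observed_demand + (1.0 - smoothing) * previous_mean


def time_to_stockout(inventory, mean_demand):
    # Assumed definition: periods of cover at the current estimated demand rate.
    return float("inf") if mean_demand <= 0 else inventory / mean_demand


def needs_visit(inventory, mean_demand, urgency_threshold):
    # Assumed trigger: replenish when the cover falls to the threshold or below.
    return time_to_stockout(inventory, mean_demand) <= urgency_threshold


# Demand doubles from 10 to 20 units per period; the estimate adapts gradually.
mu = 10.0
for demand in (10.0, 20.0, 20.0):
    mu = ewma_update(mu, demand)

static_decision = needs_visit(25.0, 10.0, urgency_threshold=2.11)
adaptive_decision = needs_visit(25.0, mu, urgency_threshold=2.11)
```

With 25 units on hand, the static estimate of 10 units per period gives a TTS of 2.5 and no visit, whereas the adapted estimate (about 13.6 units per period) lowers the TTS below the threshold of 2.11 and flags the customer for replenishment.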

Thirdly, the study focused on cost and service level as primary KPIs. Future research could incorporate other critical factors such as product perishability, cold chain requirements, and carbon emissions. The clarity and parsimony of the OUT policy would simplify its integration into multi-objective optimization frameworks that explicitly consider such metrics (e.g., NSGA-II as in [56]), offering a more actionable path to broader impact compared to integrating complex adaptive models.

Fourthly, while the GA proved effective, exploring other metaheuristics or hybrid approaches (such as Particle Swarm Optimization (PSO) or hybrids that combine exact methods with heuristics [57]) could yield further improvements.

Finally, the study was simulation-based. While providing a controlled environment for rigorous comparison, real-world pilot implementation and validation of the OUT policy in a pharmaceutical VMI system would offer invaluable practical insights. A concrete plan for such validation would involve a phased approach, starting with a small-scale trial and outlining KPIs such as actual stockout rates, average inventory levels, and transportation costs before and after OUT policy implementation [58].

8. Conclusion

This study set out to critically evaluate the performance of simple, time-based inventory replenishment heuristics against complex, adaptive policies in stochastic VMI systems. Through a comprehensive simulation-optimization approach, we introduced and rigorously optimized the novel OUT policy. Our findings demonstrate that this parsimonious, time-based policy achieves a superior balance of cost efficiency and service level resilience, consistently outperforming both a non-optimized heuristic and a state-of-the-art multi-parameter adaptive benchmark.

The OUT policy's intellectual contribution lies in its ability to transform an intuitive time-to-stockout metric into a powerful, single-parameter decision rule. Methodologically, our use of a robust simulation-optimization framework ensures a fair and unbiased comparison, revealing that the added complexity of adaptive models does not always translate into superior performance or robustness. Managerially, the OUT policy offers a highly interpretable, easily implementable, and cost-effective solution for reducing medicine shortages and enhancing supply chain resilience in critical sectors like pharmaceuticals. It empowers practitioners to make data-driven decisions with a clear understanding of the cost-service trade-offs.

By successfully challenging the prevailing notion that complexity is synonymous with performance in stochastic VMI, this research opens new avenues for policy design, advocating for the strategic optimization of simpler, more intuitive heuristics. Future research should extend this work to multi-depot systems, incorporate more complex demand patterns, and validate the OUT policy in real-world implementations, further solidifying its role as a pragmatic and powerful tool for modern supply chain management.

Author Contributions

Conceptualization, J.M. and I.B.; methodology, J.M.; validation, J.M., M.B., and I.B.; formal analysis, J.M.; resources, J.M.; writing (original draft), J.M.; writing (review and editing), I.B. and M.B.; visualization, J.M.; project management, I.B. All authors have read and agreed to the published version of the manuscript.

Data Availability

The data are available from the corresponding author upon request.

Conflicts of Interest

The authors declare no conflict of interest.

References

1.

O. Badejo and M. Ierapetritou, “Enhancing pharmaceutical supply chain resilience: A multi-objective study with disruption management,” Comput. Chem. Eng., vol. 188, p. 108769, 2024.

2.

K. T. Getahun, A. I. Bilal, and J. Denny Cho, “Public sector pharmaceutical distribution system and its challenges: A case of a central Ethiopian Pharmaceuticals Supply Service and selected branches,” BMC Health Serv Res., vol. 25, no. 1, p. 278, 2025.

3.

World Health Organization, “COVID-19 pandemic significantly impacted access to medicines for noncommunicable diseases,” 2023, p. 10. [Online]. Available: https://www.who.int/news/item/22-03-2023-covid-19-pandemic-significantly-impacted-access-to-medicines-for-noncommunicable-diseases

4.

M. Chitsaz, J. Cordeau, and R. Jans, “A unified decomposition matheuristic for assembly, production, and inventory routing,” INFORMS J. Comput., vol. 31, no. 1, pp. 134–152, 2019.

5.

Y. Wang, “Inventory path optimization of VMI large logistics enterprises based on ant colony algorithm,” Mob. Inf. Syst., vol. 2022, no. 1, p. 5186552, 2022.

6.

J. Sim, “The impact of a vendor-managed inventory policy on the cash-bullwhip effect,” Int. J. Ind. Eng., vol. 31, no. 2, 2024.

7.

A. Paulina Avila-Torres and M. Nancy Arratia-Martinez, “Fuzzy inventory-routing problem with priority customers,” Soft Comput., vol. 28, no. 13, pp. 7947–7961, 2024.

8.

S. Charaf, D. Taş, P. Simme Douwe Flapper, and T. Van Woensel, “A branch-and-price algorithm for the two-echelon inventory-routing problem,” Comput. Ind. Eng., vol. 196, p. 110463, 2024.

9.

A. Gruler, J. Panadero, J. de Armas, A. José Moreno Pérez, and A. Angel Juan, “A variable neighborhood search simheuristic for the multiperiod inventory routing problem with stochastic demands,” Int. Trans. Oper. Res., vol. 27, no. 1, pp. 314–335, 2020.

10.

J. Emilio Alarcon Ortega, S. Malicki, F. Karl Doerner, and S. Minner, “Stochastic inventory routing with dynamic demands and intra-day depletion,” Comput. Oper. Res., vol. 163, p. 106503, 2024.

11.

D. Cuellar-Usaquén, W. Marlin Ulmer, C. Gomez, and D. Álvarez-Martínez, “Adaptive stochastic lookahead policies for dynamic multi-period purchasing and inventory routing,” Eur. J. Oper. Res., vol. 318, no. 3, pp. 1028–1041, 2024.

12.

Y. Hong and J. Chen, “Graph database to enhance supply chain resilience for Industry 4.0,” Int. J. Inf. Syst. Supply Chain Manag., vol. 15, no. 1, pp. 1–19, 2021.

13.

F. Goodarzian, H. Hosseini-Nasab, J. Muñuzuri, and M. Fakhrzad, “A multi-objective pharmaceutical supply chain network based on a robust fuzzy model: A comparison of meta-heuristics,” Appl. Soft Comput., vol. 92, p. 106331, 2020.

14.

M. Mahjoob, S. S. Fazeli, S. Milanlouei, L. S. Tavassoli, and M. Mirmozaffari, “A modified adaptive genetic algorithm for multi-product multi-period inventory routing problem,” Sustain. Oper. Comput., vol. 3, pp. 1–9, 2022.

15.

F. Goodarzian, V. Kumar, and P. Ghasemi, “A set of efficient heuristics and meta-heuristics to solve a multi-objective pharmaceutical supply chain network,” Comput. Ind. Eng., vol. 158, p. 107389, 2021.

16.

Z. Dai, K. Gao, and B. C. Giri, “A hybrid heuristic algorithm for cyclic inventory-routing problem with perishable products in VMI supply chain,” Expert Syst. Appl., vol. 153, p. 113322, 2020.

17.

A. Mosca, N. Vidyarthi, and A. Satir, “Integrated transportation–Inventory models: A review,” Oper. Res. Perspect., vol. 6, p. 100101, 2019.

18.

L. Liu, L. Lee, H. Seow, and C. Y. Chen, “Logistics center Location-Inventory-Routing problem optimization: A systematic review using PRISMA method,” Sustainability, vol. 14, no. 23, p. 15853, 2022.

19.

H. Shaabani, “A literature review of the perishable inventory routing problem,” Asian J. Shipp. Logist., vol. 38, no. 3, pp. 173–185, 2022.

20.

T. Iswari, A. Caris, and K. Braekers, “Analyzing the benefits of a city hub: An inventory and routing perspective,” Comput. Ind. Eng., vol. 185, p. 109629, 2023.

21.

S. Sanjari, T. Basar, and S. Yüksel, “Isomorphism properties of optimality and equilibrium solutions under equivalent information structure transformations: Stochastic dynamic games and teams,” SIAM J. Control Optim., vol. 61, pp. 3102–3130, 2023.

22.

S. Rostami, S. Creemers, and R. Leus, “Maximizing the net present value of a project under uncertainty: Activity delays and dynamic policies,” Eur. J. Oper. Res., vol. 317, pp. 16–24, 2024.

23.

S. N. Johnn, V. A. Darvariu, J. Handl, and J. Kalcsics, “A graph reinforcement learning framework for neural adaptive large neighbourhood search,” Comput. Oper. Res., vol. 172, p. 106791, 2024. [Google Scholar] [Crossref]

24.

I. Markov, M. Bierlaire, J.-F. Cordeau, Y. Maknoon, and S. Varone, “Waste collection inventory routing with non-stationary stochastic demands,” 2020. [Online]. Available: https://repository.tudelft.nl/record/uuid:9902404c-eab1-41b6-a9ab-0328949d1860 [Google Scholar]

25.

S. Malicki and S. Minner, “Cyclic inventory routing with dynamic safety stocks under recurring non-stationary interdependent demands,” Comput. Oper. Res., vol. 131, p. 105247, 2021. [Google Scholar] [Crossref]

26.

R. Lotfi, P. MohajerAnsari, M. M. Sharifi Nevisi, M. Afshar, S. M. Reza Davoodi, and S. S. Ali, “A viable supply chain by considering vendor-managed-inventory with a consignment stock policy and learning approach,” Results Eng., vol. 21, p. 101609, 2024. [Google Scholar] [Crossref]

27.

P. Karakostas, A. Sifaleras, and C. Michael Georgiadis, “Adaptive variable neighborhood search solution methods for the fleet size and mix pollution location-inventory-routing problem,” Expert Syst. Appl., vol. 153, p. 113444, 2020. [Google Scholar] [Crossref]

28.

A. Federgruen and P. Zipkin, “An efficient algorithm for computing optimal (s, S) policies,” Oper. Res., vol. 32, no. 6, pp. 1268–1285, 1984. [Google Scholar] [Crossref]

29.

S. Adirektawon, A. Theeraroungchaisri, and C. Rungpetch Sakulbumrungsil, “Efficiency of inventory in Thai hospitals: Comparing traditional and vendor-managed inventory systems,” Logistics, vol. 8, no. 3, p. 89, 2024. [Google Scholar] [Crossref]

30.

M. Hekimoğlu, A. Gürhan Kök, and M. Şahin, “Stockout risk estimation and expediting for repairable spare parts,” Comput. Oper. Res., vol. 138, p. 105562, 2022. [Google Scholar] [Crossref]

31.

N. Cao, A. Marcus, L. Altarawneh, and S. Kwon, “Priority-based replenishment policy for robotic dispensing in central fill pharmacy systems: A simulation-based study,” Health Care Manag. Sci., vol. 26, pp. 344–362, 2023. [Google Scholar] [Crossref]

32.

S. Poormoaied, Z. Atan, and T. van Woensel, “Quantity-based emergency shipment policies,” IISE Trans., vol. 54, pp. 1186–1198, 2022. [Google Scholar] [Crossref]

33.

M. A. Mesquita and J. V. Tomotani, “Simulation-optimization of inventory control of multiple products on a single machine with sequence-dependent setup times,” Comput. Ind. Eng., vol. 174, p. 108793, 2022. [Google Scholar] [Crossref]

34.

R. G. Askin and M. Xia, “Hybrid heuristics for infinite period inventory routing problem,” Proceedings of the 2012 Industrial and Systems Engineering Research Conference. Orlando, FL, USA, 2012. [Online]. Available: https://www.semanticscholar.org/paper/Hybrid-Heuristics-for-Infinite-Period-Inventory-Askin-Xia/183075b65df8cfacef35d3e8c0d63e031f275c4f [Google Scholar]

35.

M. Fathi, M. Khakifirooz, A. Diabat, and H. Chen, “An integrated queuing-stochastic optimization hybrid Genetic Algorithm for a location-inventory supply chain network,” Int. J. Prod. Econ., vol. 237, p. 108139, 2021. [Google Scholar] [Crossref]

36.

C. Archetti, L. Peirano, and M. Grazia Speranza, “Optimization in multimodal freight transportation problems: A survey,” Eur. J. Oper. Res., vol. 299, no. 1, pp. 1–20, 2022. [Google Scholar] [Crossref]

37.

Z. Liu and T. Nishi, “Surrogate-assisted evolutionary optimization for perishable inventory management in multi-echelon distribution systems,” Expert Syst. Appl., vol. 238, p. 122179, 2024. [Google Scholar] [Crossref]

38.

H. D. Perez, C. D. Hubbs, C. Li, and I. Grossmann, “Algorithmic approaches to inventory management optimization,” Processes, vol. 9, p. 102, 2021. [Google Scholar] [Crossref]

39.

J. S. R. Daniel and C. Rajendran, “A simulation-based genetic algorithm for inventory optimization in a serial supply chain,” Int. Trans. Oper. Res., vol. 12, pp. 101–127, 2005. [Google Scholar] [Crossref]

40.

M. Fathi, M. Khakifirooz, A. Diabat, and H. Chen, “An integrated queuing-stochastic optimization hybrid Genetic Algorithm for a location-inventory supply chain network,” Int. J. Prod. Econ., vol. 237, p. 108139, 2021. [Google Scholar] [Crossref]

41.

K. Sörensen, “Metaheuristics—The metaphor exposed,” Int. Trans. Oper. Res., vol. 22, no. 1, pp. 3–18, 2015. [Google Scholar] [Crossref]

42.

M. Akbari Kaasgari, D. M. Imani, and M. Mahmoodjanloo, “Optimizing a vendor managed inventory (VMI) supply chain for perishable products by considering discount: Two calibrated meta-heuristic algorithms,” Comput. Ind. Eng., vol. 103, pp. 227–241, 2017. [Google Scholar] [Crossref]

43.

K. Govindan, “The optimal replenishment policy for time-varying stochastic demand under vendor managed inventory,” Eur. J. Oper. Res., vol. 242, no. 2, pp. 402–423, 2015. [Google Scholar] [Crossref]

44.

A. A. Taleizadeh, I. Shokr, I. Konstantaras, and M. VafaeiNejad, “Stock replenishment policies for a vendor-managed inventory in a retailing system,” J. Retail. Consum. Serv., vol. 55, p. 102137, 2020. [Google Scholar] [Crossref]

45.

J. Zhao, C. Archetti, T. A. Pham, and T. Vidal, “Large neighborhood and hybrid genetic search for inventory routing problems,” arXiv preprint, 2025. [Google Scholar] [Crossref]

46.

F. Lücker, A. Timonina-Farkas, and R. W. Seifert, “Balancing resilience and efficiency: A literature review on overcoming supply chain disruptions,” Prod. Oper. Manag., vol. 34, no. 6, pp. 1495–1511, 2025. [Google Scholar] [Crossref]

47.

S. Welikala, H. Lin, and P. J. Antsaklis, “Inventory consensus control in supply chain networks using dissipativity-based control and topology co-design,” arXiv preprint, 2025. [Google Scholar] [Crossref]

48.

S. Maitra, “A system-dynamic based simulation and Bayesian optimization for inventory management,” arXiv preprint, 2024. [Google Scholar] [Crossref]

49.

W. Cao and X. Wang, “Brittleness evolution model of the supply chain network based on adaptive agent graph theory under the COVID-19 pandemic,” Sustainability, vol. 14, no. 19, p. 12211, 2022. [Google Scholar] [Crossref]

50.

V. Zaruba, L. Potrashkova, O. Khoroshevskyi, and T. Chmeruk, “Construction of adaptive inventory management models for a trading enterprise under unstable conditions,” East.-Eur. J. Enterp. Technol., vol. 4, no. 4 (136), p. 6, 2025. [Google Scholar] [Crossref]

51.

S. Çetinkaya, F. Mutlu, and C. Y. Lee, “A comparison of outbound dispatch policies for integrated inventory and transportation decisions,” Eur. J. Oper. Res., vol. 171, no. 3, pp. 1094–1112, 2006. [Google Scholar] [Crossref]

52.

M. K. Şahin and H. Yaman, “A branch and price algorithm for the heterogeneous fleet multi-depot multi-trip vehicle routing problem with time windows,” Transp. Sci., vol. 56, pp. 1636–1657, 2022. [Google Scholar] [Crossref]

53.

Y. Wang, M. Gou, S. Luo, J. Fan, and H. Wang, “The multi-depot pickup and delivery vehicle routing problem with time windows and dynamic demands,” Eng. Appl. Artif. Intell., vol. 139, p. 109700, 2025. [Google Scholar] [Crossref]

54.

S. Paull, A. Bubak, and H. Stuckenschmidt, “Machine learning for master production scheduling: Combining probabilistic forecasting with stochastic optimisation,” Expert Syst. Appl., vol. 271, p. 126586, 2025. [Google Scholar] [Crossref]

55.

R. R. Corsini, A. Costa, and J. M. Framinan, “An adaptive product changeover policy for a capacitated two-product supply chain in a non-stationary demand environment,” Int. J. Manag. Sci. Eng. Manag., vol. 19, no. 2, pp. 155–166, 2024. [Google Scholar] [Crossref]

56.

M. Sebatjane, “Inventory optimisation in a two-echelon cold chain: Sustainable lot-sizing and shipment decisions under carbon cap emissions regulations,” Ann. Oper. Res., 2025. [Google Scholar] [Crossref]

57.

S. Faramarzi-Oghani, P. Dolati Neghabadi, E. Talbi, and R. Tavakkoli-Moghaddam, “Meta-heuristics for sustainable supply chain management: A review,” Int. J. Prod. Res., vol. 61, no. 6, pp. 1979–2009, 2023. [Google Scholar] [Crossref]

58.

E. Razavian, A. Alem Tabriz, M. Zandieh, and M. R. Hamidizadeh, “An integrated material-financial risk-averse resilient supply chain model with a real-world application,” Comput. Ind. Eng., vol. 161, p. 107629, 2021. [Google Scholar] [Crossref]

Cite this:

Musbah, J., Badi, I., & Bouraima, M. B. (2025). Optimizing Time-Based Heuristics for Resilient VMI Replenishment: A Simulation-Optimization Approach. J. Ind. Intell., 3(2), 105-124. https://doi.org/10.56578/jii030204

©2025 by the author(s). Published by Acadlore Publishing Services Limited, Hong Kong. This article is available for free download and can be reused and cited, provided that the original published version is credited, under the CC BY 4.0 license.

Figure 1. Operational framework of the dynamic inertial policy

Table 1. Comparative summary of relevant literature (2018–2024) and this study’s contribution
