Systemic Determinants of Equipment Failure in Paper Mills: A Hybrid FAST-PCA Approach for Maintenance Optimization
Abstract:
Equipment failure in paper mills represents a critical barrier to operational efficiency and the adoption of Industry 4.0 principles. To address this, a systematic literature review was conducted to identify the multifactorial determinants of such failures. A novel hybrid methodology was proposed, integrating the Functional Analysis Systems Technique (FAST), enhanced by Lean 5S (Sort “Seiri”, Set in Order “Seiton”, Shine “Seiso”, Standardize “Seiketsu”, Sustain “Shitsuke”) principles, to structure the qualitative data collection. The analysis was performed using a Pugh matrix, followed by a Principal Component Analysis (PCA) to extract knowledge systematically. This approach facilitated the development of a conceptual model for downtime causation. The PCA results indicate that two principal components collectively explain 58.5% of the observed variance in failure data. The first component was strongly correlated with maintenance practices and operational errors, while the second was associated with intrinsic equipment characteristics and their operating conditions. This data-driven modeling elucidates underlying correlations between disparate factors, providing a robust foundation for prioritizing targeted maintenance optimization actions. This research contributes to the field of industrial intelligence by demonstrating an original methodology for transforming qualitative systematic review data into a quantifiable analytical framework. The application of PCA to this corpus enables the identification of multidimensional interactions that are frequently overlooked in conventional analyses, thereby enriching root-cause failure analysis and informing strategic decision-making for predictive maintenance. The identified factors underscore the imperative of a balanced integration between technical data and human factors for the successful digital transformation of production systems.
1. Introduction
In a global industrial context marked by the transition to Industry 4.0, manufacturing plants, and paper mills in particular, face critical operational challenges. Among these, the increase in technical downtime due to equipment failure is a major problem, directly impacting profitability and competitiveness [1]. These interruptions compromise not only productivity but also the quality of finished products, threatening the company’s value chain [2]. While the existing literature has broadly identified the causes of failure, it often presents a fragmented view of the contributing factors and neglects their systemic interactions. Furthermore, studies on this topic in the paper-making sector remain scarce and lack a quantitative methodology for analyzing qualitative research data. The present study addresses this gap.
This study focuses on the following question: What are the systemic determinants of equipment failure and how do they interact to amplify downtime? The main objective is to provide a conceptual modeling of the major contributing factors through a hybrid approach combining functional modeling and quantitative analysis of review data. To answer this, this research relies on a two-phase approach. The first consists of a systematic literature review, structured using FAST, whose principles are complemented by those of Lean 5S. The second phase is dedicated to a detailed analysis of the extracted knowledge via a statistical analysis incorporating a Pugh matrix and PCA.
The central hypothesis posits that equipment failure is primarily influenced by two interdependent dimensions: maintenance practices and human errors, and the physical characteristics and operating conditions of the equipment. PCA was used in this study to validate this hypothesis by quantifying the contribution of each dimension to the overall variance, providing an intuitive visualization of action priorities, aligned with the needs of intelligent decision systems.
The main contribution of this work lies in demonstrating a rigorous and reproducible methodological approach to transform qualitative literature data into actionable quantitative information. The application of PCA to data structured by FAST and the Pugh matrix allows for the identification and visualization of underlying correlations that would not be apparent with a simple review. This systemic factors model is an asset for the implementation of predictive maintenance strategies by providing a scientific basis for the prioritization of actions within the context of Industry 4.0.
2. Methodology
The methodology for this review is structured around FAST, whose process is illustrated in Figure 1. This technique served to structure the logic of the research process [3]. It was complemented by the principles of Lean 5S [4], applied not physically but as a framework for managing and structuring the literature data. This analogy helped ensure rigor in collecting and preparing the dataset. Each step, the tools used, and their link to Lean 5S are explained below.

The process began with a systematic literature review, using two complementary approaches. The primary search was conducted via bibliographic search engines such as Google Scholar, ScienceGate, and Academia to identify the determinants of failures with the query: (“technical downtime” OR “equipment failure”) AND (“causes” OR “root causes”). In parallel, an exploratory search was carried out to contextualize and enrich the analysis using terms such as (“benchmarking” OR “best practices”) AND “pulp and paper industry”. The articles resulting from this second search were used to understand the industry’s standards and best practices, strengthening the validity of the proposed data matrix for the assessment.
The following inclusion criteria were applied to all identified articles: scientific articles or reviews with full text, published in English between 2013 and 2025, and dealing with paper mill processes. Titles, keywords, and abstracts were then screened, a step analogous to Seiri (Sort): irrelevant studies were eliminated, leaving a corpus of 50 scientific articles [5–54]. This final corpus, including studies of both equipment failure and benchmarking, served as the basis for the analysis.
The data collection phase consisted of extracting the key variables from each of the 50 selected articles. These variables, such as equipment obsolescence, maintenance practices, human errors, environmental conditions, and others, were identified as potential failure factors. To quantify the qualitative data from the literature, a data extraction matrix was built, inspired by the Pugh matrix. This matrix allowed for the synthesis of information by coding the impact and frequency of each variable within each article [55]. The inclusion of benchmarking works was essential to refine this coding, as it provided a reference base for evaluating best maintenance practices and expected performance in the sector. This process is similar to Seiton (Set In Order), as the use of bibliographic management software like Zotero allowed for structured organization and management of references. It also aligns with the principle of Seiso (Shine), as it allowed for the purification of raw data and its transformation into a quantifiable and usable format for the rest of the analysis.
Each variable was coded as follows: the variables were evaluated on a 3-point Likert scale to measure their impact (1 for ‘low impact’, 2 for ‘moderate impact’, and 3 for ‘high impact’) based on how they were described in each article, and specific variables were coded in a binary fashion (0 for ‘absent’, and 1 for ‘present’) to determine the frequency of their mention in the sample. To ensure the objectivity and consistency of this process, the coding was performed independently by two researchers. The criteria for this coding were defined beforehand. For example, a factor was coded with a ‘high impact’ (3) if the context of the article explicitly designated it as a root or major cause of failure. Conversely, a ‘low impact’ (1) was assigned when the factor was mentioned as a minor influence or a secondary condition. A consensus was then established for each variable to resolve any disagreements, thus guaranteeing the objectivity of the process.
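The double-coding step described above can be illustrated with a minimal sketch. The article identifiers, variable names, and codes below are hypothetical and serve only to show how two independent coding sheets might be compared before the consensus discussion; this is not the authors' actual procedure or data.

```python
# Allowed codes under the scheme described in the text.
IMPACT_CODES = {1, 2, 3}    # 3-point Likert scale: low, moderate, high impact
PRESENCE_CODES = {0, 1}     # binary coding: absent, present


def find_disagreements(coder_a, coder_b):
    """Return the (article, variable) keys on which the two coders differ."""
    return sorted(key for key in coder_a if coder_a[key] != coder_b.get(key))


# Hypothetical impact codes assigned independently by two researchers.
coder_a = {
    ("A01", "human_error"): 3,
    ("A01", "machine_age"): 1,
    ("A02", "human_error"): 2,
}
coder_b = {
    ("A01", "human_error"): 3,
    ("A01", "machine_age"): 2,
    ("A02", "human_error"): 2,
}

# Every code must belong to the Likert scale before comparison.
assert all(code in IMPACT_CODES for code in coder_a.values())
assert all(code in IMPACT_CODES for code in coder_b.values())

disagreements = find_disagreements(coder_a, coder_b)
agreement_rate = 1 - len(disagreements) / len(coder_a)

print("Items needing consensus:", disagreements)
print(f"Raw agreement rate: {agreement_rate:.2f}")
```

Only the flagged items would then need to be discussed to reach a consensus value.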
This extraction matrix thus transformed qualitative information from the 50 articles into a quantitative dataset. The dataset was then imported into SPSS software for statistical analysis, enabling dimensionality reduction and rigorous identification of the root causes of equipment failure.
PCA is a multivariate statistical method designed to reduce the dimensionality of a dataset by transforming a set of correlated variables into a new set of uncorrelated variables, known as principal components [56]. This approach facilitates data interpretation by emphasizing the most essential information [57]. As part of this research, PCA was used on the data obtained from the literature review to determine the hidden dimensions of equipment failure. This process took place in several stages, which echo the principles of Seiketsu (Standardize) and Shitsuke (Sustain).
The PCA process was conducted in several steps:
• Data standardization: Since the variables were evaluated on different scales, specifically Likert scales and a binary coding, they were first standardized to prevent any one variable from having a disproportionate influence. This standardization was performed by subtracting the mean of each variable and dividing by its standard deviation, according to Equation (1). This step is a direct application of Seiketsu (Standardize), as it formalizes the process and makes it uniform.
$$Z_{ij} = \frac{X_{ij} - \mu_j}{\sigma_j} \quad (1)$$
where $Z_{ij}$ is the standardized value, $X_{ij}$ is the value of observation $i$ for variable $j$, $\mu_j$ is the mean of variable $j$, and $\sigma_j$ is the standard deviation of variable $j$.
• Calculation of the correlation matrix: Once the data were standardized, the correlation matrix $R$ was computed. This symmetric matrix was used to assess the interrelationships between the studied variables [58].
• Extraction of principal components: PCA involves solving the eigenvalue and eigenvector problem of the correlation matrix $R$, according to Equation (2).
$$R\,v = \lambda\,v \quad (2)$$
where $\lambda$ is the eigenvalue (representing the variance explained by the component) and $v$ is the associated eigenvector (representing the component’s direction). The eigenvalues were sorted in descending order, and the corresponding eigenvectors formed the axes of the new variables, the principal components.
• Component selection and interpretation: The number of components to retain was determined using the Kaiser criterion (eigenvalues greater than 1) and by analyzing the scree plot of eigenvalues. Component interpretation was facilitated by applying Varimax rotation, an orthogonal rotation that maximizes the variance of the squared factor loadings so that each variable loads strongly on as few components as possible. The contribution of each variable to a component was assessed based on its factor loading.
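The standardization and correlation steps listed above can be sketched in a few lines of plain Python. The two coded variables below are hypothetical, and the use of the sample standard deviation (division by n - 1) is an assumption; the study itself performed these steps in SPSS.

```python
import math


def standardize(columns):
    """Z-score each column: subtract its mean, divide by its sample std."""
    z_columns = []
    for col in columns:
        n = len(col)
        mean = sum(col) / n
        std = math.sqrt(sum((x - mean) ** 2 for x in col) / (n - 1))
        z_columns.append([(x - mean) / std for x in col])
    return z_columns


def correlation_matrix(z_columns):
    """Correlation matrix R computed from already standardized columns."""
    n = len(z_columns[0])
    return [
        [sum(a * b for a, b in zip(ci, cj)) / (n - 1) for cj in z_columns]
        for ci in z_columns
    ]


# Hypothetical codes for six articles (illustration only).
impact_likert = [3, 1, 2, 3, 2, 1]     # 3-point Likert impact score
mention_binary = [1, 0, 1, 1, 0, 0]    # binary presence code

z = standardize([impact_likert, mention_binary])
R = correlation_matrix(z)

for row in R:
    print([round(value, 4) for value in row])
```

After standardization both variables have mean 0 and unit variance, so the Likert and binary scales contribute on an equal footing to R.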
PCA thus allowed us to reduce the complexity of the dataset by identifying the latent factors that explain the majority of the variance in equipment failure. This approach proved particularly relevant for prioritizing critical factors and guiding the proposal of optimized maintenance strategies, which aligns with the principle of Shitsuke (Sustain) from Lean 5S.
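As a rough illustration of the extraction and selection steps, the sketch below solves the eigenvalue problem for a small, hypothetical 3 x 3 correlation matrix using power iteration with deflation, then applies the Kaiser criterion. The matrix values and the choice of power iteration are assumptions made for a dependency-free example; statistical packages such as SPSS rely on more robust numerical routines.

```python
import math


def mat_vec(m, v):
    """Multiply a square matrix by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def power_iteration(m, iters=500):
    """Dominant eigenpair of a symmetric positive semi-definite matrix."""
    v = [1.0 / (i + 1) for i in range(len(m))]
    for _ in range(iters):
        w = mat_vec(m, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    lam = sum(a * b for a, b in zip(v, mat_vec(m, v)))  # Rayleigh quotient
    return lam, v


def eigen_decompose(r):
    """All eigenpairs, found one at a time by deflating the matrix."""
    m = [row[:] for row in r]
    size = len(r)
    pairs = []
    for _ in range(size):
        lam, v = power_iteration(m)
        pairs.append((lam, v))
        # Remove the component just found: m <- m - lam * v v^T
        m = [[m[i][j] - lam * v[i] * v[j] for j in range(size)]
             for i in range(size)]
    return sorted(pairs, key=lambda pair: pair[0], reverse=True)


# Hypothetical correlation matrix for three variables (illustration only).
R = [
    [1.0, 0.6, 0.3],
    [0.6, 1.0, 0.2],
    [0.3, 0.2, 1.0],
]

pairs = eigen_decompose(R)
eigenvalues = [lam for lam, _ in pairs]
retained = [lam for lam in eigenvalues if lam > 1.0]   # Kaiser criterion
explained = [100 * lam / len(R) for lam in eigenvalues]

print("Eigenvalues:", [round(lam, 4) for lam in eigenvalues])
print("Retained components:", len(retained))
print("% of variance:", [round(pct, 2) for pct in explained])
```

Because the eigenvalues of a correlation matrix sum to the number of variables, each eigenvalue divided by that number gives the share of variance carried by its component.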
3. Results
The selection process began with 252 articles. After removing duplicates, 171 articles remained. Screening of titles and keywords led to the exclusion of 98 articles. A thorough review of the remaining 73 references resulted in the final selection of 50 articles for analysis.
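The counts reported above can be cross-checked with simple arithmetic. The sketch below only derives the figures that the text leaves implicit (duplicates removed and references excluded at the full-review stage); the variable names are illustrative.

```python
# Counts stated in the selection process.
identified = 252
after_duplicate_removal = 171
excluded_at_screening = 98
final_corpus = 50

# Derived counts.
duplicates_removed = identified - after_duplicate_removal
assessed_in_full = after_duplicate_removal - excluded_at_screening
excluded_after_full_review = assessed_in_full - final_corpus

print("Duplicates removed:", duplicates_removed)
print("References reviewed in full:", assessed_in_full)
print("Excluded after full review:", excluded_after_full_review)
```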
The relevance of PCA was validated by two statistical tests, as shown in Table 1. Bartlett’s test of sphericity demonstrated a significant correlation between variables (p < 0.001), and the Kaiser-Meyer-Olkin (KMO) index yielded a score of 0.785, confirming the suitability of the data for factor analysis.
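For reference, the usual large-sample form of Bartlett's statistic is sketched below. The symbols n (number of observations) and p (number of variables) are notation introduced here, not taken from the study's output; only the degrees of freedom are evaluated, using the ten coded variables.

```latex
\chi^2 = -\left(n - 1 - \frac{2p + 5}{6}\right)\ln\lvert R\rvert,
\qquad
df = \frac{p(p-1)}{2} = \frac{10 \times 9}{2} = 45 \quad (p = 10)
```

The resulting 45 degrees of freedom agree with the value reported in Table 1.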
The Kaiser criterion led to the retention of two principal components with eigenvalues greater than 1. These two components accounted for 42.876% and 15.629% of the total variance, respectively, resulting in a cumulative variance of 58.505%, as shown in Table 2.
Table 1. KMO and Bartlett's test.

| Test | Statistic | Value |
|---|---|---|
| KMO measure of sampling adequacy | | 0.785 |
| Bartlett's test of sphericity | Approx. Chi-Square | 211.863 |
| | df | 45 |
| | Sig. | 0.000 |
Table 2. Total variance explained (initial eigenvalues, extraction sums of squared loadings, and rotation sums of squared loadings).

| Component | Initial: Total | Initial: % of variance | Initial: Cumulative % | Extraction: Total | Extraction: % of variance | Extraction: Cumulative % | Rotation: Total | Rotation: % of variance | Rotation: Cumulative % |
|---|---|---|---|---|---|---|---|---|---|
| 1 | 4.288 | 42.876 | 42.876 | 4.288 | 42.876 | 42.876 | 4.110 | 41.096 | 41.096 |
| 2 | 1.563 | 15.629 | 58.505 | 1.563 | 15.629 | 58.505 | 1.741 | 17.409 | 58.505 |
| 3 | 0.988 | 9.876 | 68.381 | | | | | | |
| 4 | 0.800 | 8.005 | 76.386 | | | | | | |
| 5 | 0.632 | 6.318 | 82.704 | | | | | | |
| 6 | 0.551 | 5.506 | 88.209 | | | | | | |
| 7 | 0.484 | 4.836 | 93.046 | | | | | | |
| 8 | 0.346 | 3.464 | 96.510 | | | | | | |
| 9 | 0.216 | 2.161 | 98.671 | | | | | | |
| 10 | 0.133 | 1.329 | 100.000 | | | | | | |
As shown in Figure 2, the scree plot of eigenvalues confirms this choice, showing a sharp drop in eigenvalues after the second component.

The quality of the variable representation is satisfactory, with communality values exceeding 0.4 for all variables. For instance, the variance in the frequency of operational or maintenance errors (62.2%), machine age (53.4%), and the use of advanced technologies (65.3%) is well explained by the selected components, as shown in Table 3.
The interpretation of the components was facilitated by a Varimax factor rotation, as shown in Table 4.
• Component 1: Maintenance and human factors
This component primarily aggregates variables related to maintenance strategies, personnel training, and human error management. The significant factor loadings of these variables indicate a strong correlation between quality maintenance practices and the reduction of equipment failure. This component highlights the critical role of robust maintenance processes and operational competence in ensuring equipment reliability.
• Component 2: Equipment characteristics and operating conditions
This component combines variables associated with technical machine specifications (e.g., age and technologies) and operating conditions (e.g., environmental factors and usage intensity). Their high loadings emphasize the direct influence of these factors on long-term machine performance and durability.
Table 3. Communalities.

| Variable | Initial | Extraction |
|---|---|---|
| Frequency of operational or maintenance errors | 1.000 | 0.622 |
| Age of the machines and their likelihood of failure | 1.000 | 0.534 |
| Level of maintenance performed on the machines | 1.000 | 0.538 |
| Equipment utilization intensity | 1.000 | 0.539 |
| Ambient conditions in which the equipment operates | 1.000 | 0.512 |
| Degree of misalignment in rotating equipment | 1.000 | 0.564 |
| Conducting diagnostics to detect anomalies | 1.000 | 0.411 |
| Root causes of observed failure or malfunction | 1.000 | 0.443 |
| Skill level of maintenance personnel | 1.000 | 0.440 |
| Use of advanced technologies for failure prediction | 1.000 | 0.653 |

Extraction method: PCA.
Table 4. Pattern matrix (a).

| Variable | Component 1 | Component 2 |
|---|---|---|
| Frequency of operational or maintenance errors | -0.782 | |
| Use of advanced technologies for failure prediction | 0.782 | |
| Degree of misalignment in rotating equipment | -0.749 | |
| Level of maintenance performed on the machines | 0.681 | |
| Skill level of maintenance personnel | 0.635 | |
| Conducting diagnostics to detect anomalies | 0.588 | |
| Equipment utilization intensity | | 0.706 |
| Ambient conditions in which the equipment operates | | 0.660 |
| Age of the machines and their likelihood of failure | | 0.652 |
| Root causes of observed failure or malfunction | | 0.614 |

Extraction method: PCA; rotation method: Varimax with Kaiser normalization.
a. Rotation converged in three iterations.
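In an orthogonal two-component solution, the communality of a variable equals the sum of its squared loadings on both components. The sketch below compares each variable's squared primary loading (Table 4) with its communality (Table 3); the difference is the share that would be attributable to the smaller loading not displayed in Table 4. This reading assumes both tables come from the same rotated solution and that the undisplayed loadings were merely suppressed in the output.

```python
import math

# (displayed loading from Table 4, communality from Table 3)
variables = {
    "Frequency of operational or maintenance errors": (-0.782, 0.622),
    "Use of advanced technologies for failure prediction": (0.782, 0.653),
    "Degree of misalignment in rotating equipment": (-0.749, 0.564),
    "Level of maintenance performed on the machines": (0.681, 0.538),
    "Skill level of maintenance personnel": (0.635, 0.440),
    "Conducting diagnostics to detect anomalies": (0.588, 0.411),
    "Equipment utilization intensity": (0.706, 0.539),
    "Ambient conditions in which the equipment operates": (0.660, 0.512),
    "Age of the machines and their likelihood of failure": (0.652, 0.534),
    "Root causes of observed failure or malfunction": (0.614, 0.443),
}

remainders = {}
for name, (loading, communality) in variables.items():
    primary_share = loading ** 2            # variance tied to the main component
    remainders[name] = communality - primary_share

for name, remainder in remainders.items():
    implied_cross_loading = math.sqrt(max(remainder, 0.0))
    print(f"{name}: remainder={remainder:.3f}, "
          f"implied |secondary loading|={implied_cross_loading:.2f}")
```

For every variable the squared displayed loading stays below the communality, which is consistent with each variable also carrying a smaller loading on the other component.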
4. Discussion
The present study, through PCA applied to a systematic literature review, enhanced the understanding of the determinants of equipment failure.
The results confirmed the existence of two main dimensions related to maintenance practices and human factors as well as the technical characteristics and operating conditions of the machines. The findings of this analysis align with the best practices identified in the paper industry benchmarking literature. The dimensions identified by PCA, such as the need for proactive strategies and personnel training, are fundamental elements of high-performance maintenance models. For example, the strong contribution of predictive strategies and continuous training to reliability, highlighted by PCA, aligns with the recommendations of Shabane et al. [59] and is a common characteristic of the most successful companies in the sector [60].
The study, however, innovates by quantifying the impact of equipment age and the integration of advanced technologies within the second component. These factors, often underestimated in the literature, proved to be decisive in the analysis, reinforcing the importance of considering natural wear and technological obsolescence as critical performance levers [61]. This conclusion is corroborated by the work of Xie et al. [62] in intensive production environments, which validates the importance of these aspects in modern maintenance strategies.
The implications of this research for managers and practitioners are vast. The results advocate for an integrated approach that is not limited to technical aspects but also emphasizes human factors. This holistic vision is essential for Industry 4.0, where investment in predictive technologies must be complemented by operational training of personnel capable of fully leveraging these systems. Based on the two identified components, specific strategic recommendations can be formulated:
• Priority 1: Improve maintenance practices and human skills. Component 1, linked to human and maintenance factors, explains a significant portion of the variance. Managers should prioritize investments in the continuous training of maintenance personnel, the development of clear procedures to minimize operational errors, and the implementation of proactive rather than reactive maintenance programs.
• Priority 2: Modernize equipment and optimize operating conditions. Component 2 highlights the impact of obsolescence and the work environment. Equipment renewal strategies must be planned based on performance and age data. The use of sensors and the Internet of Things (IoT) to monitor ambient conditions (temperature, humidity, and vibrations) and usage intensity in real time would allow for anticipating equipment failure related to these factors, as suggested by the concept of predictive maintenance.
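As a purely illustrative sketch of the monitoring idea in Priority 2, the snippet below flags sensor readings that deviate strongly from a rolling baseline. The vibration values, the window size, and the z-score threshold are all hypothetical; the study does not prescribe a specific detection rule, and a production system would need thresholds tuned to each machine.

```python
from statistics import mean, stdev


def flag_anomalies(readings, window=5, z_threshold=3.0):
    """Return indices whose reading deviates from the preceding window
    by more than z_threshold standard deviations."""
    flagged = []
    for i in range(window, len(readings)):
        baseline = readings[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma > 0 and abs(readings[i] - mu) / sigma > z_threshold:
            flagged.append(i)
    return flagged


# Hypothetical vibration velocity readings in mm/s (illustration only).
vibration_mm_s = [2.1, 2.0, 2.2, 2.1, 2.0, 2.1, 2.2, 4.8, 2.1, 2.0]

anomalies = flag_anomalies(vibration_mm_s)
print("Anomalous reading indices:", anomalies)
```

The same pattern could be applied to temperature or humidity streams, with the flagged indices feeding a maintenance alert.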
This balanced approach ensures that maintenance optimization efforts are both technically robust and operationally viable, a prerequisite for achieving the digital transformation of production systems.
5. Conclusions
This study elucidated, through a hybrid methodological approach, the key determinants of equipment failure in the paper industry. The integration of the FAST method and Lean 5S to structure the analysis, followed by PCA, revealed two major dimensions: (a) maintenance techniques and operational errors, and (b) equipment characteristics and usage conditions. These two components collectively explain 58.5% of the failure variance, supporting the hypothesis of an interdependence between technical and human factors. The adopted methodology, based on a systematic literature review and quantitative analysis, identified structural correlations often overlooked in prior research. The use of an extraction matrix followed by PCA strengthened the robustness of the conclusions while providing a foundation for action prioritization. However, the generalizability of the results is limited by the theoretical nature of the study. Future work should focus on empirical validation using field data to enhance the model’s reproducibility. It would also be relevant to extend this methodology to other manufacturing sectors and integrate artificial intelligence techniques to refine predictive capabilities.
In summary, this research contributes to the optimization of maintenance strategies in the context of Industry 4.0 by demonstrating the effectiveness of hybrid approaches combining functional modeling and data-driven analysis. It particularly highlights that the performance of smart factories relies on a balanced integration of material constraints and human factors, a prerequisite for achieving the digital transformation of production systems.
Not applicable.
The data used to support the research findings are available from the corresponding author upon request.
We would like to extend our heartfelt gratitude to the University of Antananarivo, particularly to the Doctoral School of Engineering and Geosciences, for their academic support. Our sincere thanks also go to the SPAH (paper production plant) for their valuable collaboration in this study.
The authors declare they have no conflicts of interest.
