A Model for Energy Consumption Estimation in Computer Systems: The CMP Approach
Abstract:
To address the issue of estimating energy consumption in computer systems, this study investigates the contribution of various hardware parameters to energy fluctuations, as well as the correlations between these parameters. Based on this analysis, the CMP model was proposed, selecting the most representative and readily monitorable parameters that reflect changes in system energy consumption. The CMP model adapts to the different task states of a computer system by identifying the primary components driving energy consumption under varying conditions; energy consumption is then estimated by monitoring these dominant parameters. Experiments across various task states demonstrate that the CMP model outperforms the traditional FAN and Cubic models, particularly when the computer system engages in data-intensive tasks.
1. Introduction
Traditional data centres, often limited in scale, historically received insufficient attention with regard to energy consumption and efficiency. However, the advent of cloud computing and large-scale data centres has introduced vast networks of service nodes, making energy consumption a critical aspect of cost management; the growth of cloud computing is significantly constrained by electricity costs [1]. Studies indicate that the energy consumed by a service node over four years costs approximately as much as its hardware. Although cloud platforms can employ energy-saving strategies, such as virtual machine migration and energy-efficient replica management, a crucial issue remains: the development of a precise model for estimating the energy consumption of computer systems.
In smaller computer systems, energy usage in computing clusters can be measured directly using monitoring devices. However, in cloud computing platforms and extensive data centres, exclusive reliance on such devices is challenging, both economically and in terms of deployment feasibility.
2. Methodology
Current models for estimating the energy consumption of individual computer systems can broadly be classified into two types: proportional models and two-tier models. Both types assess energy consumption based on resource utilisation rates. Proportional models adopt a straightforward approach, assuming that a computer’s energy consumption is directly proportional to the utilisation of its hardware components. In contrast, two-tier models divide energy consumption into fixed and variable components. Fixed energy consumption is attributed to baseline functions, such as the initial power required to activate diodes, while variable energy consumption is related to system load, including Central Processing Unit (CPU) usage and disk speed. Two-tier models are generally more accurate, given that computers consume power continuously once powered on. The study by Qandeel et al. [2] indicates that an idle eight-core Xeon processor consumes approximately 60% of the energy it would under full utilisation. Nevertheless, both proportional and two-tier models establish a direct relationship between energy consumption and workload.
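The contrast between the two model families can be sketched in a few lines. The following is a minimal illustration, with hypothetical power figures (a 300 W full-load server idling at 60% of that value, in line with the Xeon observation above); the function names and constants are assumptions for demonstration, not part of any published model.

```python
# Illustrative sketch of the two model families. The power figures are
# assumed values: a server drawing 300 W at full load and ~60% of that
# when idle, as the eight-core Xeon study cited above suggests.

P_MAX = 300.0   # assumed full-load power (watts)
P_IDLE = 180.0  # assumed idle power (~60% of full load)

def proportional_power(cpu_util: float) -> float:
    """Proportional model: power scales directly with utilisation."""
    return P_MAX * cpu_util

def two_tier_power(cpu_util: float) -> float:
    """Two-tier model: fixed baseline plus a load-dependent component."""
    return P_IDLE + (P_MAX - P_IDLE) * cpu_util
```

The two-tier form is generally closer to reality precisely because `two_tier_power(0.0)` is non-zero: the machine draws substantial power the moment it is on.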
Several energy consumption models based on resource utilisation have been proposed. For instance, Velasquez and Filian-Gomez [3] utilised hardware counters to monitor CPU usage, thereby enabling a model for predicting overall energy consumption. Yadav et al. [4] introduced a linear model grounded in hardware performance counters, with the model by Zhang et al. [5] being among the most widely adopted. This linear model estimates energy consumption by monitoring CPU utilisation, with its fundamental expression as follows:

$$P = P_{idle} + (P_{cpumax} - P_{idle}) \cdot u_{cpu}$$

where $P_{idle}$ denotes the power of the idle system, $P_{cpumax}$ the power at full CPU load, and $u_{cpu}$ the CPU utilisation.
Based on this foundation, Baker et al. [6] introduced the Cubic model, which also centres on CPU utilisation. Google has similarly employed a CPU-utilisation-based linear model to estimate the energy consumption of each server node in its data centres [7]. In practice, however, CPU energy consumption accounts for only about 25% of the total energy consumption of a computer system [8], with this proportion being even lower in distributed data centres [9]. Therefore, as data-intensive tasks become increasingly common, energy consumption models that rely solely on CPU utilisation as the primary variable prove insufficiently accurate in estimating total system energy consumption.
Figure 1 illustrates the energy consumption distribution across various components for three typical servers—IBM P670, Sun Ultra Sparc T2000, and a custom-designed server by Google [10]. It is evident that no single component entirely dominates energy consumption.

In response to this distribution, the CMP model was proposed in this study. The model is based on the premise that, in a computer system, the dominant contributors to energy consumption, namely the CPU, disk, and memory, vary according to the system’s task state. Thus, the model was designed by selecting a minimal set of parameters that adequately represent the load on these three components. The goal of the model is not only to enable accurate energy estimation for compute-intensive tasks but also to maintain a high level of accuracy for World Wide Web (WEB) transaction-based and data-intensive tasks.
3. Model Parameter Selection
The selection of measurement parameters is crucial to the accuracy of any computer energy consumption model. In a computer system, the three most critical components with the largest range of energy consumption variation are the CPU, disk, and memory. Other components, such as fans and power supplies, exhibit limited power variation, while elements like network cards show minimal fluctuation in power usage. Therefore, the model parameters for energy consumption were selected from these three primary components.
The criteria for parameter selection include the following: a) Representativeness: Parameters should sufficiently capture energy consumption variations in the primary components; b) Ease of monitoring: Selected parameters must be easily accessible on both Windows and Linux systems; c) Minimised parameter count: Reducing the number of parameters supports model portability and prevents additional computational burdens on the system during energy estimation.
Let $P$ represent the power of the computer system; then:

$$P = P_{cpu} + P_{memory} + P_{disk} + \sigma$$

where $\sigma$ denotes the power of system components other than the CPU, disk, and memory. The power consumption of the CPU is related to its utilisation rate $u_{cpu}$, which yields:

$$P_{cpu} = P_{idle} + (P_{cpumax} - P_{idle}) \cdot u_{cpu}$$
In selecting a monitoring parameter that represents changes in CPU energy consumption, % Processor Time was used. This parameter represents the percentage of time the processor spends executing non-idle threads. The determination of whether the processor is idle is performed during internal sampling intervals of the system clock (10 ms). As one of the most widely used counters for the CPU, % Processor Time effectively indicates CPU utilisation.
Memory energy consumption is primarily influenced by its utilisation rate, with each additional 512 MB of memory capacity generally adding 1-3 W to power consumption. The energy consumption of memory can be expressed as follows [11]:

$$P_{memory} = P_{read} + P_{write} + P_{active} + P_{ref} + P_{pre}$$

where $P_{read}$ denotes power in the read state, $P_{write}$ power in the write state, $P_{active}$ power in the active state, $P_{ref}$ power in the refresh state, and $P_{pre}$ power in the precharge state. Given that memory refresh operates automatically at a fixed frequency, both the refresh and precharge terms can be treated as constants. Thus, memory energy consumption is driven by utilisation and read-write frequency, and % Memory Used and Page Faults/sec were therefore chosen as the representative parameters for memory energy consumption.
% Memory Used denotes actual memory utilisation, reflecting the portion of memory in use. Page Faults/sec is the average number of page faults per second; a page fault occurs when a program accesses a page that is not currently resident in physical memory. A higher Page Faults/sec value typically indicates increased memory activity and a higher memory read-write frequency. Page faults encompass both soft and hard faults, and hard faults require access to the disk. This parameter therefore reflects not only memory read-write frequency but also, indirectly, disk read-write activity.
In linear models, variations in disk energy consumption receive relatively little consideration. Such models are fundamentally designed for compute-intensive applications, where energy fluctuations are attributed primarily to CPU activity, under the assumption that energy changes in other components either have limited impact on overall system energy consumption or vary in step with CPU energy usage. This assumption does not hold for data-intensive tasks, in which CPU utilisation is limited, varies little, and has only a minor influence on system-wide energy consumption, while disk load and energy usage vary significantly. Disk energy consumption is analysed based on disk states and the corresponding head operations. Typical disk states include seek, read, write, and idle, with energy consumption generally calculated as follows:

$$E_{disk} = P_{seek} t_{seek} + P_{read} t_{read} + P_{write} t_{write} + P_{idle} t_{idle}$$

where each $P$ and $t$ denote, respectively, the power drawn in and the time spent in the corresponding state.
For this study, the disk counters % Idle Time and Disk Bytes/sec were collected as key parameters. % Idle Time represents the percentage of time the disk is idle, performing no read, write, or seek operations; the remaining time corresponds to active read, write, and seek activity. Disk Bytes/sec measures the data read or written per second, i.e., the actual throughput during read/write operations.
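To make the candidate counters concrete, the sketch below shows one way cumulative counter snapshots could be turned into the per-second parameters used later. The `CounterSnapshot` structure and `rates` helper are illustrative assumptions; how each raw value is actually read (Windows performance counters, `/proc` statistics on Linux) is platform-specific and omitted here.

```python
# A minimal sketch of turning two cumulative counter snapshots into the
# five monitored parameters. The field names mirror the counters named
# in the text; the sampling mechanism itself is platform-specific.

from dataclasses import dataclass

@dataclass
class CounterSnapshot:
    timestamp: float        # seconds
    processor_time: float   # % Processor Time, 0-100
    memory_used: float      # % Memory Used, 0-100
    page_faults: int        # cumulative page-fault count
    disk_bytes: int         # cumulative bytes read + written
    disk_idle_time: float   # % Idle Time, 0-100

def rates(prev: CounterSnapshot, cur: CounterSnapshot) -> dict:
    """Convert two cumulative snapshots into per-second parameters."""
    dt = cur.timestamp - prev.timestamp
    return {
        "% Processor Time": cur.processor_time,
        "% Memory Used": cur.memory_used,
        "Page Faults/sec": (cur.page_faults - prev.page_faults) / dt,
        "Disk Bytes/sec": (cur.disk_bytes - prev.disk_bytes) / dt,
        "% Disk Idle Time": cur.disk_idle_time,
    }
```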
Following the analysis of system energy consumption, five parameters were identified in this study. The next step involves analysing these parameters further to select an optimal subset for energy estimation.
Given the unique role of the CPU in computer systems, % Processor Time is an essential parameter in nearly all energy consumption models, and it was therefore adopted as the first parameter in this study. The focus then shifts to the relationships among % Memory Used, Page Faults/sec, % Disk Idle Time, and Disk Bytes/sec. To illustrate the selection and analysis of parameters in detail, a DELL 2950 server with the configuration shown in Table 1 was used as an example.
| Component | Model |
|---|---|
| Processor | 2 × Quad-Core Intel Xeon E5420 |
| Motherboard | Dell PowerEdge 2950 |
| Chipset | Intel Greencreek 5000X |
| System memory | 8 186 MB (DDR2-667) |
| Disk | 3 × 300 GB SAS drives |
| Network adapter | 2 × Broadcom BCM5708C NetXtreme II GigE |
Table 2 presents the monitored parameter changes on the DELL 2950 server under progressively increasing loads, as assessed using the HP LoadRunner tool.
| % Memory Used | Page Faults/sec | % Disk Idle Time | Disk Bytes/sec |
|---|---|---|---|
| 0.06 | 674 | 1.00 | 54 046 |
| 0.15 | 2 415 | 0.96 | 709 105 |
| 0.23 | 3 932 | 0.86 | 1 652 008 |
| 0.30 | 6 091 | 0.68 | 3 410 577 |
| 0.37 | 8 840 | 0.59 | 4 544 997 |
| 0.43 | 7 735 | 0.46 | 4 563 334 |
| 0.07 | 54 244 | 0.20 | 93 183 319 |
Since the data units in Table 2 are not uniform, the data must be standardised. Let $Y_{ij}$ denote the raw value of the $j$-th indicator in the $i$-th sample, and $X_{ij}$ its standardised counterpart. The mean of the $j$-th indicator sample is $\bar{Y}_j=\frac{1}{m} \sum_{i=1}^m Y_{i j}\;(j=1,2, \cdots, n)$, and its sample variance is $S_j^2=\frac{1}{m-1} \sum_{i=1}^m\left(Y_{i j}-\bar{Y}_j\right)^2\;(j=1,2, \cdots, n)$. Standardisation is then applied as $X_{i j}=\frac{Y_{i j}-\bar{Y}_j}{S_j}$, yielding the standardised matrix $X=\left(X_{i j}\right)_{m \times n}$.
Using the standardised data, the correlation coefficient matrix $R=\left(r_{i j}\right)_{n \times n}$ can then be calculated, where:

$$r_{i j}=\frac{1}{m-1} \sum_{k=1}^m X_{k i} X_{k j} \quad(i, j=1,2, \cdots, n)$$
Following these steps, the correlation coefficient matrix of the four parameters was derived.
This matrix reveals the correlation between each pair of parameters.
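The standardisation and correlation steps can be reproduced directly from the Table 2 samples. The following sketch (using NumPy) computes the standardised matrix and the correlation coefficient matrix for the four candidate parameters.

```python
import numpy as np

# Columns: % Memory Used, Page Faults/sec, % Disk Idle Time, Disk Bytes/sec.
# Rows are the load samples from Table 2.
Y = np.array([
    [0.06,   674, 1.00,    54046],
    [0.15,  2415, 0.96,   709105],
    [0.23,  3932, 0.86,  1652008],
    [0.30,  6091, 0.68,  3410577],
    [0.37,  8840, 0.59,  4544997],
    [0.43,  7735, 0.46,  4563334],
    [0.07, 54244, 0.20, 93183319],
], dtype=float)

m = Y.shape[0]
# Standardise each column: X_ij = (Y_ij - mean_j) / S_j, with sample std.
X = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
# Correlation coefficient matrix: r_ij = (1/(m-1)) * sum_k X_ki * X_kj.
R = (X.T @ X) / (m - 1)
```

Since each standardised column has zero mean and unit sample variance, `R` coincides with the ordinary Pearson correlation matrix of the raw columns.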


In this analysis, $x_1, x_2, x_3$ and $x_4$ represent % Memory Used, Page Faults/sec, % Disk Idle Time, and Disk Bytes/sec, respectively. The correlation matrix shows that $x_1$ has the lowest correlation with the other three parameters, indicating that it cannot be adequately represented by them; % Memory Used was therefore selected as the second parameter. Among the remaining three, $x_2$ shows the highest average correlation with the other parameters, effectively bridging information on both the memory and disk components, so Page Faults/sec was chosen as the third parameter. Theoretical analysis supports this choice: whenever a hard page fault occurs, data must be read from or written to the disk, so disk read/write frequency correlates with page faults in memory. Figure 2 illustrates the trends in Page Faults/sec and disk load (read, write, and seek operations) over one minute, where the two curves follow a consistent trend. This correlation suggests that Page Faults/sec adequately captures the information carried by % Disk Idle Time and Disk Bytes/sec, reflecting disk load and read/write activity both theoretically and empirically.
Accordingly, % Processor Time, % Memory Used, and Page Faults/sec were selected as the fundamental parameters for modelling computer system energy consumption in this study. The next step involves developing an energy estimation model based on these parameters.
4. Development of the CMP Model
Based on the observations and analysis above, a three-parameter energy estimation model—the CMP model—was proposed in this study. The modelling process considers two scenarios: compute-intensive and non-compute-intensive states of the computer system. In compute-intensive states, CPU activity predominantly drives energy fluctuations, while memory and disk remain lightly loaded and exert minimal impact on energy consumption. Non-compute-intensive states arise mainly during WEB transactional tasks and data-intensive operations, where the CPU, memory, and disk are all moderately loaded and energy consumption is determined collectively by these components. Historically, energy consumption models have tended to focus on compute-intensive scenarios, where CPU variation is substantial and memory and disk fluctuations are minimal. In data-intensive scenarios, however, CPU variation is relatively minor and exerts limited influence on total system energy consumption, whereas disk energy consumption changes far more significantly.
The rationale for distinguishing between these two states lies in the fact that, under low memory and disk loads, their impact on overall system energy consumption is comparable to that of Input/Output (I/O), motherboard, and other devices; they resemble background noise that obscures their influence. Moreover, the CPU maintains a baseline level of energy consumption even at low utilisation, indicating that, in compute-intensive states, the CPU primarily dictates system energy variations. In non-compute-intensive states, where memory and disk utilisation reaches a certain threshold, or when energy consumption exceeds a specific range, the energy consumption of the computer system is determined collectively by the CPU, memory, and disk.
To establish thresholds for compute-intensive and non-compute-intensive states, $\Delta P_{cpu}$ was defined as $P_{cpumax}-P_{idle}$ in Eq. (3). In this example, after the operating system starts its essential services (excluding test tasks), CPU load variations exceed 5%; for this server, 5% CPU utilisation is therefore taken as the basic fluctuation threshold. Furthermore, 5% of $\Delta P_{cpu}$ represents a relatively small proportion of the system’s overall energy consumption. If the energy consumption attributable to disk and memory load is below 5% of $\Delta P_{cpu}$, these components cease to be primary contributors to system energy variation. The energy variation threshold for this server was therefore set at 5% of $\Delta P_{cpu}$ (the higher the CPU performance, the lower this threshold). Consequently, this threshold can be expressed as follows:
$$\Delta P_{memory} \cdot Memory_{utilization}+\Delta P_{disk} \cdot \frac{Pf}{Pf_{max}} \leq 5\% \cdot \Delta P_{cpu}$$

where $\Delta P_{memory}$ and $\Delta P_{disk}$ represent the difference in energy consumption between the full-load and idle states of memory and disk, respectively; $Memory_{utilization}$ denotes memory usage; $Pf$ denotes Page Faults/sec; and $Pf_{max}$ represents the average Page Faults/sec under full disk load. For this example, the compute-intensive range is defined by $Memory_{utilization} \leq 40\%$ and $Pf \leq 1094$. Both conditions must be satisfied for the CPU to dominate energy variation; if either threshold is exceeded, system energy variation is influenced collectively by the CPU, memory, and disk.
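For a given server, the resulting state test reduces to two comparisons. A minimal sketch follows; the 40% and 1094 cut-offs are the DELL 2950 values derived above and would need re-deriving for other machines.

```python
# State classifier implied by the thresholds in the text. The cut-off
# values are specific to the DELL 2950 example and are assumptions for
# any other hardware.

MEM_THRESHOLD = 0.40   # Memory_utilization threshold (fraction)
PF_THRESHOLD = 1094    # Page Faults/sec threshold

def is_compute_intensive(memory_utilization: float,
                         page_faults_per_sec: float) -> bool:
    """Both conditions must hold for the CPU to dominate energy variation."""
    return (memory_utilization <= MEM_THRESHOLD
            and page_faults_per_sec <= PF_THRESHOLD)
```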
| % Processor Time | % Memory Used | Page Faults/sec | Energy (W) |
|---|---|---|---|
| 0 | 6 | 674 | 209.12 |
| 19 | 23 | 3 932 | 242.83 |
| 64 | 37 | 5 205 | 275.53 |
| 75 | 43 | 7 735 | 281.92 |
| 93 | 8 | 406 | 294.76 |
Table 3 presents the energy consumption values of the server under different loads. Based on this data, the threshold values were used to define states, and a plot of CPU utilisation versus energy consumption in compute-intensive states is shown in Figure 3, where y represents energy consumption, and x represents CPU utilisation. The FAN model posits that CPU variations primarily drive overall system changes and therefore employs a linear model. However, according to Figure 3, and corroborated by publicly available CPU-to-energy data from Standard Performance Evaluation Corporation (SPEC) [12], it becomes apparent that even in compute-intensive tasks, the relationship between CPU utilisation and energy consumption is not strictly linear. As CPU load increases, energy consumption converges towards a fixed value, displaying a degree of curvature. Thus, the modelling approach may benefit from considering linear regression, exponential regression, and power regression models.
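The comparison among the three candidate regressions can be carried out by fitting each form and comparing $R^2$. The sketch below uses illustrative (utilisation, power) samples loosely shaped like the server's behaviour, not the paper's measurements; the exponential and power forms are linearised through logarithms before fitting.

```python
import numpy as np

# Illustrative samples: CPU utilisation (fraction) vs. power (watts).
# These are assumed values for demonstration, not the paper's data.
x = np.array([0.05, 0.19, 0.40, 0.64, 0.75, 0.93])
y = np.array([202.0, 239.0, 262.0, 278.0, 284.0, 292.0])

def r_squared(y, y_hat):
    """Coefficient of determination for predictions y_hat."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

# Linear: y = a*x + b
a, b = np.polyfit(x, y, 1)
r2_linear = r_squared(y, a * x + b)

# Exponential: ln y = ln c + d*x  =>  y = c * e^(d*x)
d, lnc = np.polyfit(x, np.log(y), 1)
r2_exp = r_squared(y, np.exp(lnc) * np.exp(d * x))

# Power: ln y = ln c + d*ln x  =>  y = c * x^d
dp, lncp = np.polyfit(np.log(x), np.log(y), 1)
r2_pow = r_squared(y, np.exp(lncp) * x ** dp)
```

With samples that flatten out at high utilisation, the power form fits best, mirroring the selection made in the text.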

Using the existing data, linear regression, exponential regression, and power regression models were each applied, and their goodness of fit $R^2$ was calculated. The highest value, $R^2 = 0.9917$, was achieved with the power regression model. Consequently, the energy consumption model for typical compute-intensive states adopts the power regression approach, yielding Eq. (10):

$$y=\mathrm{e}^{5.6851} x^{0.1257}$$

where $x$ represents the percentage of time the CPU spends on non-idle threads. For non-compute-intensive states, a multiple linear regression equation can be constructed as follows:

$$y=\beta_0+\beta_1 x_1+\beta_2 x_2+\beta_3 x_3+\varepsilon$$
where $y$, the dependent (explained) variable, is the computer's energy consumption; the explanatory variables $x_1$, $x_2$, and $x_3$ are the percentage of CPU time spent on non-idle threads, memory utilisation, and Page Faults/sec, respectively; $\beta_0$ denotes the constant term; $\beta_1$–$\beta_3$ are the regression coefficients; and $\varepsilon$ is the random error term.
For $n$ observed values, the system of equations can be represented as follows:

$y_i=\beta_0+\beta_1 x_{1 i}+\beta_2 x_{2 i}+\beta_3 x_{3 i}+\varepsilon_i \quad(i=1,2, \cdots, n)$

That is, in matrix form, $\boldsymbol{y}=\boldsymbol{X} \boldsymbol{\beta}+\boldsymbol{\varepsilon}$.
By substituting the data from the table into this equation, a system of equations can be established. Using the least squares method to minimise the sum of squared errors, the parameters were estimated to derive the model. The resulting equation is as follows:
$y=198.20737-6.23549 x_1+200.28839 x_2+0.00033479 x_3$
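The least-squares step itself is standard; a sketch using NumPy's `lstsq` on a design matrix with a leading column of ones is shown below. The rows mix Table 3 values (utilisations as fractions) with hypothetical extra samples, labelled as such, so that the system is overdetermined; the paper's actual fit used its full measurement set, so the coefficients obtained here will differ.

```python
import numpy as np

# Columns: x1 (CPU util, fraction), x2 (memory util, fraction),
# x3 (Page Faults/sec), y (watts). The last two rows are hypothetical
# extra samples added only so the system is overdetermined.
samples = np.array([
    [0.19, 0.23, 3932, 242.83],
    [0.64, 0.37, 5205, 275.53],
    [0.75, 0.43, 7735, 281.92],
    [0.50, 0.30, 4500, 260.00],   # hypothetical
    [0.30, 0.45, 9000, 285.00],   # hypothetical
])

# Design matrix with an intercept column; solve min ||y - X beta||^2.
X = np.column_stack([np.ones(len(samples)), samples[:, :3]])
y = samples[:, 3]
beta, residuals, rank, _ = np.linalg.lstsq(X, y, rcond=None)
# beta = [beta0, beta1, beta2, beta3]
```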
In summary, the CMP energy consumption model for the DELL 2950 server is as follows:
$$y= \begin{cases}\mathrm{e}^{5.6851}\, x^{0.1257}, & Memory_{utilization} \leq 40\% \text{ and } Pf \leq 1094 \\ 198.20737-6.23549 x_1+200.28839 x_2+0.00033479 x_3, & Memory_{utilization}>40\% \text{ or } Pf>1094\end{cases}$$
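Written as a function, the piecewise model becomes the sketch below. The coefficients and thresholds are exactly those derived in the text; note that, to match the fitted magnitudes, utilisation values are interpreted here as fractions in [0, 1], which is an assumption of this sketch.

```python
import math

# The fitted CMP model for the DELL 2950. Utilisations are fractions
# (0.93 = 93%); this interpretation is an assumption made so that the
# predicted values land in the measured 200-300 W range.

def cmp_power(cpu_util: float, mem_util: float, page_faults: float) -> float:
    """Estimate system power (watts) from the three monitored parameters."""
    if mem_util <= 0.40 and page_faults <= 1094:
        # Compute-intensive branch: power regression on CPU utilisation.
        return math.exp(5.6851) * cpu_util ** 0.1257
    # Non-compute-intensive branch: multiple linear regression.
    return (198.20737 - 6.23549 * cpu_util
            + 200.28839 * mem_util + 0.00033479 * page_faults)
```

Evaluated on the Table 3 rows, the function lands within a few watts of the measured values, e.g. roughly 292 W for the (93%, 8%, 406) row measured at 294.76 W.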
5. Experiments
The CMP model was compared in accuracy with the commonly used FAN and Cubic models [12]. The energy consumption model for the DELL 2950 server, developed in Section 4, was employed to evaluate the three models. For the experimental tests, SPEC's JVM2008 suite [13], [14] was selected as the compute-intensive benchmark, as it includes a variety of algorithmic tests and serves as a representative compute-intensive workload. WEB transactional tasks were simulated using HP LoadRunner [14], [15] with 4000 virtual users, and data-intensive tasks were tested using IOzone [16], [17]. Energy was measured with Northmeter's Power Bay-SSM [18].
SPEC is a U.S.-based third-party standards organisation [19]. The SPEC JVM suite includes a wide range of compute-intensive algorithm tests, while IOzone is an internationally recognised file-system read/write benchmark commonly employed in energy consumption studies of data-intensive operations.
As shown in Figure 4, the energy estimation accuracy of the FAN and CMP models in the WEB transactional LoadRunner tests is nearly identical, with CMP holding a slight edge. In the compute-intensive SPEC JVM2008 tests, the CMP model slightly outperformed the FAN model. In the data-intensive IOzone tests, however, the CMP model demonstrated a marked improvement over the FAN model. These results indicate that while the FAN model maintains a certain level of accuracy, it does not account for the energy-dominant components as comprehensively as the CMP model, resulting in noticeably lower accuracy in certain scenarios.

The CMP approach was also applied to an IBM X3650 and an HP ProLiant DL380 G5 for similar modelling. Test results indicate that for compute-intensive tasks, the CMP model improves accuracy over the FAN model by approximately 1% on average, and for data-intensive tasks by 2-3%. Furthermore, as the number of disks configured in a server increases, the advantage of the CMP model becomes increasingly evident. Given that storage systems now account for 27%-40% of data centre energy consumption [20], this approach to energy consumption estimation for service nodes holds significant potential.
The experimental findings demonstrate that, in computer systems, the contributions of various components to energy consumption fluctuations vary according to the task type and system state, indicating that no single component consistently dominates energy consumption changes.
6. Conclusion
The experiments indicate that the CMP model achieves a higher accuracy in estimating computer system energy consumption than the FAN and Cubic models. Once the CMP model is established, overall energy consumption can be swiftly estimated by monitoring only three parameters within the system, demonstrating strong operational feasibility and portability.
The analysis conducted during CMP modelling reveals that managing CPU utilisation and controlling disk I/O speed (disk rotation speed) are both effective methods for reducing energy consumption and improving computer system efficiency. These methods align with current research trends in cloud computing focused on enhancing computer energy efficiency. Thus, the CMP energy model provides a foundational framework for improving power utilisation efficiency in computer systems, increasing the number of tasks completed per unit of energy consumed. Accordingly, future research should focus on leveraging these models to further explore the relationship between computer tasks and energy efficiency. Additionally, the design of operational modes to improve computer efficiency and task scheduling methods for enhancing energy efficiency on cloud platforms are recommended as areas for continued investigation.
The data used to support the research findings are available from the corresponding author upon request.
The authors declare no conflict of interest.
