Development of an Artificial Neural Network Model for Predicting Standard Penetration Test N-Values from Cone Penetration Test Data
Abstract:
Accurate prediction of Standard Penetration Test (SPT) blow counts from Cone Penetration Test (CPT) data is critical for reliable geotechnical characterization, particularly when SPT data are scarce or difficult to obtain. This study presents a data-driven framework that employs an Artificial Neural Network (ANN) to estimate the corrected SPT blow number ($N_{60}$) using key CPT parameters. The database was compiled from two construction sites in Nasiriyah, Iraq, comprising cone tip resistance ($q_c$), sleeve friction ($fs$), and effective overburden pressure ($\sigma_{vo}^{\prime}$) as input variables. Multiple ANN architectures were trained and validated, and optimal performance was achieved using one hidden layer with eight neurons, yielding a coefficient of determination ($R^2$) of 0.9967, and two hidden layers with six and sixteen neurons, achieving $R^2$ = 0.9976. Relative importance analysis indicated that cone tip resistance contributed 44% to the model’s predictive strength, followed by sleeve friction and effective overburden pressure, each accounting for approximately 26%. Sensitivity analysis confirmed that $N_{60}$ increases with higher input parameters, consistent with soil behavior principles. The ANN model demonstrated high accuracy and generalization capability across both sandy and clayey soils. Design charts derived from the trained model enable practical estimation of $SPT-N$ from CPT results, providing geotechnical engineers with a rapid and reliable tool for site characterization and preliminary design.
1. Introduction
Standard penetration tests (SPT) and cone penetration tests (CPT) are among the most widely used field tests in site investigation. The SPT is used worldwide because of its simplicity and the many available correlations with soil parameters. The CPT, on the other hand, is widely used because it provides continuous readings as the cone advances. In Iraq, the CPT is used in a limited number of projects, mainly significant ones. The test yields two main measurements: cone tip resistance ($q_c$) and sleeve friction resistance ($fs$). The number of blows is obtained directly from the SPT and indirectly from the CPT.
Since the CPT is performed by continuously advancing the cone into the soil, the SPT cannot be conducted at exactly the same location. In addition, many correlations between soil parameters and SPT results have been developed over decades and can be used effectively in design. The correlation between SPT and CPT is therefore crucial in geotechnical engineering: the SPT is used around the world, and a vast dataset of SPT results is available together with correlations for soil parameters, such as density and shear strength, and for design models, such as bearing capacity and liquefaction. Besides, the $SPT-N$ can complement the CPT to give a reliable design, especially when one of the two tests is limited in depth and there is a gap between them. Many empirical equations have been developed [1], [2]. Most are presented as linear relationships between $q_c$ and $SPT-N$, written as the ratio $q_c$/$N_{60}$. A wide range of ratios has been found depending on the soil type. In some relations the ratio increases as the mean grain size increases. These grain-size-based empirical equations rely on different correlation methods, which may give large discrepancies against actual data. The correlations are specific to certain conditions and may not apply to others. Therefore, new correlations have been developed [3], but more rigorous correlation is still necessary. Another line of research relies on machine learning approaches such as Artificial Neural Networks (ANN) and Genetic Algorithms (GA).
ANN is an intelligent technique used increasingly in different applications, especially as computing speed has improved. It tries to simulate the biological structure of the human brain and nervous system [4]. An ANN can capture nonlinear relationships between inputs and outputs. During training, the error between the target and the predicted value is minimized by adjusting the connection weights between the inputs and the hidden layers and between the hidden layers and the output. An ANN is an adaptive model, whereas other approaches use predefined equations. It learns the pattern directly from the data without prior knowledge of the governing equation.
Considerable research has used ANN models in geotechnical engineering. Related studies address pile raft foundations [5], shaft resistance [6], and the prediction of soil physical and mechanical properties such as CBR value [7], uniaxial compressive strength [8], unit weight [9], and the compression coefficient and electrical resistivity of soil [10].
Artificial neural networks have also been widely used to estimate CPT and SPT correlations. However, some studies did not restrict the correlation to a specific soil, which may have affected the results. The crucial requirements are suitable data, retrieved from measurements on approximately similar soil, and a neural network accurate enough to predict the output.
Back-propagation ANN [11] has been used to obtain the SPT from the CPT based on cone tip resistance ($q_c$), sleeve friction ($fs$) and effective overburden pressure ($\sigma_{vo}^{\prime}$). The study considered 109 pairs of SPT and CPT. However, the study was restricted to sand, silty sand and sandy silt soil and the accuracy of the predicted $SPT-N$ was 7–16% and 7–20% over-predicted. Several attempts have been made to correlate SPT and CPT using other techniques, such as ANN and four optimized machine learning models [12] and gene expression programming [13]. Alam et al. [13] predicted the $SPT-N$ value from CPT data across various soil types such as silty sand, sandy silt, silty clay, and lean clay, using gene expression programming (GEP). The input parameters used in the GEP models are $q_c$, $fs$, and $\sigma_{vo}^{\prime}$. The results showed that the proposed GEP models either under-predict the target value by 3–9% or over-predict it by 3–12%. Therefore, good correlations for estimating $SPT-N$ from CPT are still required for a wider range of soils and regions.
This paper aims to develop an ANN to estimate the $SPT-N$ from CPT and to investigate the feasibility of this approach. The input parameters are $q_c$, $fs$, and $\sigma_{vo}^{\prime}$, while the single target parameter is the $SPT-N$. The data were retrieved from SPT and CPT results obtained during site investigations for two projects in Nasiriyah. The study also examines the relative importance of the input parameters and the sensitivity of the $SPT-N$ to their variation. The final objective is to prepare charts based on the ANN model for finding $SPT-N$ from the CPT. This paper contributes charts for obtaining the $SPT-N$ from CPT results based on ANN.
2. Preliminary Phase
Data from site investigation reports were collected and divided into inputs and outputs as a preparation step for developing the ANN. The output, which is the subject of this research, is the $SPT-N$; the inputs are the measured cone tip resistance and sleeve friction in addition to the effective overburden pressure. The available data were divided according to their statistical parameters. Each SPT blow count was then matched with the CPT readings at the same depth. The first case considered the CPT-SPT data in sand, silty sand and sandy silt soil. These data are part of a project located in Nasiriyah, south-east of Baghdad City, Iraq. The site was explored to build an oil depot [14], [15]. The site investigation included four CPTs in addition to boreholes. Figure 1 shows part of the data obtained from this project. The data considered are for soil with a soil behavior type (SBT) of 5 and 6 according to Robertson’s charts, which represents clean sand with or without fine soil passing the 200 micron sieve, and sandy silt and silty sand, which represent a transition state between silt and sand.
The second case considered the CPT-SPT data in clay soil. The $SPT-N$ values are modelled against the CPT data. The dataset consists of 106 records retrieved from 24 boreholes. The data were analyzed, and their normal distribution and histogram are presented. The project was an oil refinery site in Nasiriyah [16], [17]. Figure 2 shows the soil profile and other soil parameters for one CPT only, due to lack of space.


Selecting the soil parameters used to estimate the $SPT-N$ depends on prior knowledge of how $SPT-N$ is estimated from CPT. The main components measured in the CPT are $q_c$ and $fs$. These parameters have been used as inputs in many correlations with engineering soil parameters.
The strength of the soil depends mainly on these two parameters. Therefore, the number of blows from the SPT is also governed by them. Some researchers used the three parameters in addition to the soil behavior index and the soil fines content [18]. However, these two additional parameters are related to the type of soil.
The soil is specific in this research, so these two additional parameters are unnecessary in this approach, as they make the analysis more complex. Based on previous correlations between CPT and SPT, the main variables to include in the analysis are $q_c$, $fs$, and $\sigma_{vo}^{\prime}$. Table 1 and Table 2 show the statistical indices for the inputs and output used in the models of case 1 and case 2, respectively. The ranges of the data, in addition to the minimum and maximum values, are summarized in these tables.
Stat. | $\mathbf{q_c}\left(\mathbf{k N} / \mathbf{m}^2\right)$ | $\mathbf{fs}\left(\mathbf{k N} / \mathbf{m}^{\mathbf{2}}\right)$ | $\sigma_{\mathrm{vo}}^{\prime}\left(\mathrm{kN} / \mathrm{m}^2\right)$ | $\mathbf{N}_{\bf{60}}$ |
|---|---|---|---|---|
Mean | 7061.3 | 144.8 | 85.1 | 20 |
STDEV | 4109.5 | 163.7 | 48.2 | 10 |
COV | 58.2 | 113.1 | 56.6 | 50.4 |
Kurt | -1.6 | 1.8 | -1 | -1 |
Skew | 0.2 | 1.8 | -0.7 | 0.2 |
Min | 600 | 0 | 0.1 | 2 |
Max | 14420 | 588 | 138.2 | 43 |
Range | 13820 | 588 | 138.1 | 41 |
Stat. | $\mathbf{q}_{\mathbf{c}}\left(\mathbf{k N} / \mathbf{m}^{\mathbf{2}}\right)$ | $\mathbf{f s}\left(\mathbf{k N} / \mathbf{m}^{\mathbf{2}}\right)$ | $\sigma_{\mathrm{vo}}^{\prime}\left(\mathrm{kN} / \mathrm{m}^2\right)$ | $\mathbf{N}_{\bf{60}}$ |
|---|---|---|---|---|
Mean | 389.627 | 10.648 | 69.732 | 13 |
SD | 162.436 | 3.9587 | 27.978 | 5 |
COV | 41.7 | 37.2 | 40.1 | 39 |
Kurt | 0.897339 | -0.61699 | -1.26162 | -1 |
Skew | 0.789 | 0.22 | -0.076 | 0 |
Min | 64.359 | 3.774 | 29.285 | 3 |
Max | 986.605 | 20.293 | 127.565 | 26 |
Range | 922.246 | 16.518 | 98.28 | 23 |
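As a quick reading aid for the two tables above, the COV rows appear to be the standard deviation expressed as a percentage of the mean. The minimal sketch below checks this for the three input columns of the first table; the function name and dictionary keys are illustrative only.

```python
def coefficient_of_variation(mean, std):
    """Return the coefficient of variation in percent."""
    return 100.0 * std / mean

# (mean, standard deviation) pairs copied from the first statistics table
TABLE_1 = {
    "q_c": (7061.3, 4109.5),
    "fs": (144.8, 163.7),
    "sigma_vo": (85.1, 48.2),
}

cov_case1 = {
    name: round(coefficient_of_variation(mean, std), 1)
    for name, (mean, std) in TABLE_1.items()
}
print(cov_case1)
```

The rounded results agree with the COV row of that table (58.2, 113.1 and 56.6).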
The dataset used in this study is divided into three subsets: one for training and the other two for testing and validation. Training on one subset while evaluating on the complementary subsets is known as cross-validation, which helps to overcome overfitting. As a first step, the data are randomly shuffled and then passed through the procedure shown in Figure 3. The method of dividing the data into training, testing, and validation subsets significantly affects the model’s performance, so the statistical properties of the three subsets were taken into consideration. The F test was used to check the subsets so that optimum model performance is achieved [19], [20].
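The shuffling and three-way division described above can be sketched as follows. This is a minimal illustration, not the procedure of Figure 3: the 70/15/15 fractions, the fixed seed and the synthetic records are all assumptions made for the example. The variance ratio at the end is the statistic that an F test compares against a critical value; the critical-value lookup is omitted.

```python
import random

def split_dataset(records, fractions=(0.70, 0.15, 0.15), seed=0):
    """Shuffle the records, then cut them into training, testing and validation lists."""
    if abs(sum(fractions) - 1.0) > 1e-9:
        raise ValueError("fractions must sum to 1")
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)      # random reordering of the records
    n_train = round(fractions[0] * len(shuffled))
    n_test = round(fractions[1] * len(shuffled))
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_test],
            shuffled[n_train + n_test:])

def sample_variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / (len(values) - 1)

def variance_ratio(a, b):
    """Larger sample variance divided by the smaller one (always >= 1)."""
    va, vb = sample_variance(a), sample_variance(b)
    return max(va, vb) / min(va, vb)

# Synthetic stand-in records (q_c, fs, sigma'_vo, N60); the values are made up
rng = random.Random(1)
records = [(rng.uniform(600, 14420), rng.uniform(0, 588),
            rng.uniform(0.1, 138.2), rng.randint(2, 43)) for _ in range(100)]

train, test, val = split_dataset(records)
ratio_qc = variance_ratio([r[0] for r in train], [r[0] for r in test])
print(len(train), len(test), len(val), round(ratio_qc, 3))
```

A ratio close to 1 suggests that the two subsets have a similar spread for that variable.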

Figure 4 shows the histograms of the three subsets of data used in the ANN. The data are approximately normally distributed.

The network has three input parameters and one output. According to previous studies, the $SPT-N$ estimated from CPT is a function of three input parameters. Two types of networks have been examined in this study: the first type used one hidden layer [11], [12], and the second used two hidden layers. The number of neurons, i.e., the number of hidden nodes, was selected by trial and error, increasing it until no significant improvement on the testing data was obtained.
3. Application of the ANN Model for SPT
This study used traincgb, a MATLAB training function for neural networks using the Conjugate Gradient Backpropagation (CGB) algorithm. This method is an optimization technique that combines Conjugate Gradient optimization with back-propagation to adjust the weights of a neural network [21], [22]. Weights are updated to minimize the loss function by moving in the direction that the Conjugate Gradient method determines to be most effective.
In this stage, the weights between input and output are adjusted and optimized to achieve a model with good performance and suitable generalization. Conjugate gradient back-propagation is used, in which a step size and a weight update are computed at each iteration [23]. A line search is performed to find the optimal step size ($\alpha$) along the current search direction, so that the loss function $E$ is minimized along this direction:
$\alpha_k=\arg \min _\alpha E\left(w_k+\alpha d_k\right)$
where $w$ represents the weights and $d_k$ is the current search direction. Then, the weights are updated using the calculated step size and the current search direction:
$w_{k+1}=w_k+\alpha_k d_k$
The new gradient $g_{k+1}$ is obtained after updating the weights. The search direction is updated using the new gradient and the previous direction:
$d_{k+1}=-g_{k+1}+\beta_{k+1} d_k$
where $\beta_{k+1}$ is the conjugate gradient parameter that ensures the new direction is conjugate to the previous ones.
The last step of the algorithm was to check the convergence. It can be examined by checking if the weight change or the gradient magnitude is below a predefined threshold. If not, the process is repeated from step 1. Figure 5 shows a detailed procedure to construct the ANN.
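The loop of line search, weight update, direction update and convergence check can be sketched in a few lines. This is an illustration under stated assumptions, not the MATLAB routine: traincgb is documented as using Powell-Beale restarts, whereas the sketch uses the simpler Polak-Ribiere formula for $\beta$ with a steepest-descent restart. It also assumes the loss is unimodal along the search direction for $\alpha$ in $[0, 1]$, and it minimizes a toy quadratic rather than a network loss. All function names are hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def golden_section(phi, lo=0.0, hi=1.0, iters=60):
    """Return the alpha in [lo, hi] that minimises a unimodal phi(alpha)."""
    ratio = (5 ** 0.5 - 1) / 2
    a, b = lo, hi
    for _ in range(iters):
        c, d = b - ratio * (b - a), a + ratio * (b - a)
        if phi(c) < phi(d):
            b = d                                  # minimum lies in [a, d]
        else:
            a = c                                  # minimum lies in [c, b]
    return (a + b) / 2

def conjugate_gradient(loss, grad, w, tol=1e-8, max_iter=200):
    g = grad(w)
    d = [-gi for gi in g]                          # first direction: steepest descent
    for _ in range(max_iter):
        if dot(g, g) ** 0.5 < tol:                 # convergence check on gradient magnitude
            break
        if dot(g, d) >= 0.0:                       # restart if d is not a descent direction
            d = [-gi for gi in g]
        # Line search for the step size alpha along the current direction
        alpha = golden_section(lambda a: loss([wi + a * di for wi, di in zip(w, d)]))
        w = [wi + alpha * di for wi, di in zip(w, d)]          # weight update
        g_new = grad(w)
        diff = [gn - go for gn, go in zip(g_new, g)]
        beta = max(0.0, dot(g_new, diff) / dot(g, g))          # Polak-Ribiere, clipped at zero
        d = [-gn + beta * di for gn, di in zip(g_new, d)]      # new search direction
        g = g_new
    return w

def toy_loss(w):
    return (w[0] - 1.0) ** 2 + 10.0 * (w[1] + 2.0) ** 2

def toy_grad(w):
    return [2.0 * (w[0] - 1.0), 20.0 * (w[1] + 2.0)]

w_opt = conjugate_gradient(toy_loss, toy_grad, [0.0, 0.0])
print([round(v, 4) for v in w_opt])
```

On this two-parameter quadratic the iterate approaches the minimizer (1, -2) after very few direction updates, which is the behaviour conjugate directions are meant to provide.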

4. Results and Discussion
The best performance neural network models (BPNNM) were found to be a network with one hidden layer of eight neurons and a network with two hidden layers of 6 and 16 neurons in the first and second hidden layers, respectively, both using the tanh function.
Figure 6 shows the architecture of the best neural network model proposed for this dataset. It consists of three inputs and one output. Two hidden layers were found to perform best: the first comprises six neurons and the second 16 neurons. The number of neurons in each hidden layer was obtained by the trial-and-error procedure described earlier. Ten trials were performed to find the best number of neurons in each layer. The best performing network was selected based on the smallest error and the highest correlation between predicted and original values. The connections between input neurons and hidden layer neurons, and between hidden layer neurons and the output, are all represented by weights and biases.

Based on the results presented in Table 3 for the architecture with one hidden layer and six cases with 4 to 20 neurons, the model with eight neurons gives the best performance. It gave the lowest error and a regression correlation close to unity: for the testing data, $R^2$ = 0.99671, MSE = 0.00148, and MAE = 0.02707. Other values are shown in Table 3.
Md No. | IN | ON | HN | Training MAE | Training MSE | Training $\text{R}^2$ | Testing MAE | Testing MSE | Testing $\text{R}^2$ |
|---|---|---|---|---|---|---|---|---|---|
1 | 3 | 1 | 4 | 0.0386 | 0.0031 | 0.992 | 0.0343 | 0.0021 | 0.995 |
2 | 3 | 1 | 6 | 0.0350 | 0.0023 | 0.994 | 0.0299 | 0.0021 | 0.995 |
3 | 3 | 1 | 8 | 0.03 | 0.0016 | 0.995 | 0.0270 | 0.0014 | 0.996 |
4 | 3 | 1 | 10 | 0.035 | 0.002 | 0.994 | 0.0318 | 0.0016 | 0.996 |
5 | 3 | 1 | 16 | 0.036 | 0.002 | 0.993 | 0.0390 | 0.0025 | 0.994 |
6 | 3 | 1 | 20 | 0.0340 | 0.0023 | 0.994 | 0.0342 | 0.0020 | 0.995 |
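For reference, the three indicators reported in the table can be computed as in the sketch below. The definition of $R^2$ as one minus the ratio of residual to total sum of squares is an assumption, since the paper may instead use the squared correlation coefficient. The five sample pairs are blow counts taken from the validation rows listed later in the paper, so the resulting MAE and MSE are in blows and are not on the same scale as the table above, whose magnitudes suggest normalized values.

```python
def regression_metrics(actual, predicted):
    """Return (MAE, MSE, R^2) for paired actual and predicted values."""
    n = len(actual)
    errors = [p - a for a, p in zip(actual, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    mean_actual = sum(actual) / n
    ss_res = sum(e * e for e in errors)                      # residual sum of squares
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)     # total sum of squares
    return mae, mse, 1.0 - ss_res / ss_tot

actual_n = [26, 26, 18, 8, 29]
predicted_n = [26.18, 26.03, 18.30, 7.86, 29.14]
mae, mse, r2 = regression_metrics(actual_n, predicted_n)
print(round(mae, 4), round(mse, 4), round(r2, 4))
```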
Table 4 shows the results of network training and testing for architectures with two hidden layers and different numbers of neurons ranging from 4 to 20. The best performance was obtained with six neurons in the first hidden layer and 16 neurons in the second hidden layer, which gave a regression correlation of 0.9955 for the training data and 0.9976 for the testing data. Other networks can also give excellent results.
Md No. | HN1 | HN2 | Training MAE | Training MSE | Training $\text{R}^2$ | Testing MAE | Testing MSE | Testing $\text{R}^2$ |
|---|---|---|---|---|---|---|---|---|
7 | 4 | 4 | 0.0285 | 0.0015 | 0.996 | 0.0317 | 0.0019 | 0.994 |
8 | 4 | 6 | 0.0412 | 0.0033 | 0.992 | 0.0361 | 0.0027 | 0.992 |
9 | 4 | 8 | 0.0556 | 0.0062 | 0.984 | 0.0509 | 0.0054 | 0.987 |
10 | 4 | 10 | 0.0368 | 0.0025 | 0.994 | 0.0396 | 0.0023 | 0.995 |
11 | 4 | 16 | 0.0356 | 0.0025 | 0.994 | 0.0437 | 0.0032 | 0.993 |
12 | 4 | 20 | 0.0388 | 0.0026 | 0.993 | 0.0386 | 0.0024 | 0.995 |
13 | 6 | 4 | 0.0410 | 0.0036 | 0.991 | 0.0390 | 0.0039 | 0.9883 |
14 | 6 | 6 | 0.0386 | 0.0029 | 0.992 | 0.0329 | 0.0021 | 0.996 |
15 | 6 | 8 | 0.0357 | 0.0023 | 0.994 | 0.0308 | 0.0017 | 0.996 |
16 | 6 | 10 | 0.0337 | 0.0023 | 0.994 | 0.0366 | 0.0030 | 0.993 |
17 | 6 | 16 | 0.0293 | 0.0016 | 0.996 | 0.0265 | 0.0012 | 0.997 |
18 | 6 | 20 | 0.0506 | 0.0045 | 0.99 | 0.0429 | 0.0031 | 0.994 |
19 | 8 | 4 | 0.0300 | 0.0016 | 0.996 | 0.0286 | 0.0012 | 0.997 |
20 | 8 | 6 | 0.0361 | 0.0023 | 0.994 | 0.0408 | 0.0025 | 0.994 |
$\cdots$ | $\cdots$ | $\cdots$ | $\cdots$ | $\cdots$ | $\cdots$ | $\cdots$ | $\cdots$ | $\cdots$ |
42 | 20 | 20 | 0.0276 | 0.0013 | 0.9963 | 0.0415 | 0.0029 | 0.993 |
Figure 7 compares the predicted and actual values of $SPT-N$ across 70 samples. The two series move in the same direction, with positive and negative values following similar trends, which indicates that the model generalizes across the samples.

Figure 8 shows the correlation between predicted $SPT-N$ and original $SPT-N$ for the training and testing data. The relation is robust and the correlation is close to unity, with $R^2$ = 0.996 for both datasets.

Table 5 shows the weights from the input neurons to the neurons in the first hidden layer. The weights from the first hidden layer to the second hidden layer are presented in Table 6. Table 7 shows the remaining weights and bias from the second hidden layer to the output.
| Weight of connection ($w_{nm}$) | | | | | | Bias ($bh_m$) | |
|---|---|---|---|---|---|---|---|
| $W_{11}$ | 1.843 | $W_{21}$ | 0.538 | $W_{31}$ | 1.238 | $bh_1$ | -2.825 |
| $W_{12}$ | 1.357 | $W_{22}$ | -0.489 | $W_{32}$ | -1.758 | $bh_2$ | -2.023 |
| $W_{13}$ | 0.297 | $W_{23}$ | -1.863 | $W_{33}$ | -1.259 | $bh_3$ | -0.883 |
| $W_{14}$ | -2.434 | $W_{24}$ | 0.218 | $W_{34}$ | 0.424 | $bh_4$ | -0.426 |
| $W_{15}$ | 2.092 | $W_{25}$ | -1.097 | $W_{35}$ | 0.091 | $bh_5$ | 1.928 |
| $W_{16}$ | -1.429 | $W_{26}$ | 1.693 | $W_{36}$ | 1.046 | $bh_6$ | -2.608 |
Weight connection of hidden layer 1 to hidden layer 2 ($W_{mr}$) | | | | | | | | | | | | Bias weight ($bh_r$) | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
W11 | -0.789 | W21 | -0.021 | W31 | -1.157 | W41 | 0.052 | W51 | 0.95 | W61 | -1.437 | bh1 | 2.221 |
W12 | 0.257 | W22 | -1.46 | W32 | 0.815 | W42 | -0.778 | W52 | 0.8 | W62 | 0.641 | bh2 | -2.03 |
W13 | 0.259 | W23 | -0.57 | W33 | 1.159 | W43 | -0.78 | W53 | -1.351 | W63 | -0.732 | bh3 | -1.708 |
W14 | -0.59 | W24 | -1.413 | W34 | -0.997 | W44 | 0.98 | W54 | 0.569 | W64 | -0.444 | bh4 | 1.416 |
W15 | 1.421 | W25 | -1.219 | W35 | 0.022 | W45 | -0.681 | W55 | -0.165 | W65 | -0.546 | bh5 | -1.25 |
W16 | -0.76 | W26 | -1.207 | W36 | 1.082 | W46 | 0.568 | W56 | -0.699 | W66 | -0.928 | bh6 | 0.836 |
W17 | 0.479 | W27 | 0.112 | W37 | -0.029 | W47 | -1.379 | W57 | -1.351 | W67 | -1.058 | bh7 | -0.535 |
W18 | -1.357 | W28 | 0.08 | W38 | 0.047 | W48 | 0.744 | W58 | -0.341 | W68 | -1.568 | bh8 | 0.184 |
W19 | 0.759 | W29 | 0.292 | W39 | 0.816 | W49 | 1.045 | W59 | -1.283 | W69 | 0.959 | bh9 | 0.082 |
W110 | -1.054 | W210 | -1.054 | W310 | 1.58 | W410 | -0.374 | W510 | 0.263 | W610 | -0.058 | bh10 | -0.449 |
W111 | 0.42 | W211 | 0.944 | W311 | -0.779 | W411 | 1.26 | W511 | -1.549 | W611 | 0.528 | bh11 | 0.54 |
W112 | -1.571 | W212 | 1.004 | W312 | 0.86 | W412 | 0.31 | W512 | 0.476 | W612 | -0.456 | bh12 | -1.154 |
W113 | -1.23 | W213 | 0.069 | W313 | -0.707 | W413 | -0.394 | W513 | -1.357 | W613 | -0.876 | bh13 | -1.401 |
W114 | -0.886 | W214 | 1.128 | W314 | -0.833 | W414 | 0.504 | W514 | -1.3 | W614 | 0.198 | bh14 | -1.708 |
W115 | 0.093 | W215 | -0.675 | W315 | -1.292 | W415 | -0.818 | W515 | 1.152 | W615 | 0.927 | bh15 | 1.952 |
W116 | 0.072 | W216 | -0.341 | W316 | -1.839 | W416 | -1.189 | W516 | -0.034 | W616 | -0.223 | bh16 | 2.296 |
$q_c$ (kPa) | $fs$ (kPa) | $\sigma_{vo}^{\prime}$ (kPa) | ANN Output | Original Output | Difference |
|---|---|---|---|---|---|
11520 | 66 | 104.4 | 26.18 | 26 | 0.18 |
12480 | 44 | 102.56 | 26.03 | 26 | 0.03 |
7880 | 31 | 101.64 | 18.30 | 18 | 0.3 |
1080 | 97 | 137.76 | 7.86 | 8 | 0.14 |
12430 | 93 | 106.05 | 29.14 | 29 | 0.14 |
4170 | 27 | 2.67 | 7.52 | 8 | 0.48 |
10560 | 102 | 109.91 | 26.92 | 27 | 0.08 |
12690 | 97 | 107.8 | 29.70 | 30 | 0.3 |
3040 | 77 | 3.86 | 8.09 | 8 | 0.09 |
12900 | 99 | 107.89 | 29.94 | 30 | 0.06 |
11280 | 57 | 104.12 | 25.20 | 25 | 0.2 |
4190 | 83 | 43.01 | 12.65 | 13 | 0.35 |
3050 | 51 | 64.61 | 10.99 | 11 | 0.01 |
To check the validity of the ANN, a set of data was passed through the model. Table 7 shows the results of the validation. Based on the developed ANN, the following procedure can be used to estimate the $SPT-N$ from the three input variables. Let the hidden layer 1 nodes be ($h_1$, $h_2$, $h_3$, $h_4$, $h_5$, $h_6$), the hidden layer 2 nodes be ($h_1$, $h_2$, $h_3$, $h_4$, $h_5$, $\cdots$, $h_{16}$), and the input layer nodes be ($I_1$, $I_2$, $I_3$).
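The following equations are used to calculate the inputs and outputs at each node in the hidden layers [24]. In hidden layer 1, the input at each node is:
$Ih_m=\sum_{n=1}^3 w_{nm} I_n+bh_m, \quad m=1, \ldots, 6 \quad (5)$
The tanh transfer function gives the output at each node in the range $[-1, 1]$ [25]:
$Oh_m=\tanh \left(Ih_m\right)=\frac{e^{Ih_m}-e^{-Ih_m}}{e^{Ih_m}+e^{-Ih_m}} \quad (6)$
The input at each node in hidden layer 2 is computed by:
$Ih_r=\sum_{m=1}^6 W_{mr} Oh_m+bh_r, \quad r=1, \ldots, 16 \quad (7)$
The transfer function of this model is tanh, so the output at each node in hidden layer 2 is in the range $[-1, 1]$:
$Oh_r=\tanh \left(Ih_r\right) \quad (8)$
Then the input at the output layer ($IO$) is:
$IO=\sum_{r=1}^{16} w_{or} Oh_r+b_O \quad (9)$
The last step is finding the normalized output of the ANN:
$O_n=\tanh (IO) \quad (10)$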
The following steps can be used for validation:
1. Normalize the input and output vectors to the range $[-1, 1]$ using Eq. (11):
$x_n=\frac{2\left(x-x_{\min }\right)}{x_{\max }-x_{\min }}-1 \quad (11)$
where $x$ is the original value, $x_n$ is the normalized value, and the limits are:
$\begin{aligned} & I_{\min }=\left\{q_c, fs, \sigma_{vo}^{\prime}\right\}=\{600, 0, 0.09\} ; \\ & I_{\max }=\{13150, 185, 138.22\} ; \\ & N_{60,\min }=\{2\} ; N_{60,\max }=\{31\} ;\end{aligned}$
2. Calculate inputs and outputs at each node in the hidden layer 1:
Step 1: Calculate input at each node in the hidden layer 1 using Eq. (5).
Step 2: Calculate output at each node in the hidden layer 1 using Eq. (6).
3. Calculate inputs and outputs at each node in the hidden layer 2:
Step 3: Calculate input at each node in the hidden layer 2 using Eq. (7).
Step 4: Calculate output at each node in the hidden layer 2 using Eq. (8).
4. Calculate input and output at the node of the output layer:
Step 5: Calculate input at the node of the output layer using Eq. (9).
Step 6: Calculate output at the node of the output layer (output of the ANN) using Eq. (10).
5. Denormalize and return data to its normal values using Eq. (12):
$x=\frac{\left(x_n+1\right)\left(x_{\max }-x_{\min }\right)}{2}+x_{\min } \quad (12)$
Example: Find the $N_{60}$ from the CPT results given below:
$\left\{q_c, fs, \sigma_{vo}^{\prime}\right\}=\{2850, 73, 42\}$, with a measured $N_{60}=\{10\}$.
Normalizing with Eq. (11) gives $I_n=\{-0.6415, -0.2109, -0.3932\}$ and $O_n=\{-0.4483\}$.
From Eq. (5), Eq. (6), Eq. (7) and Eq. (8):
$Ih_m=\left\{\begin{array}{c}-4.6073 \\ -2.0992 \\ -0.1856 \\ 0.9228 \\ 0.7816 \\ -2.4595\end{array}\right\}, \quad Oh_m=\left\{\begin{array}{c}-0.9998 \\ -0.9704 \\ -0.1834 \\ 0.7272 \\ 0.6536 \\ -0.9855\end{array}\right\}$
$Ih_r=\left\{\begin{array}{c}5.3173 \\ -1.6942 \\ -2.3552 \\ 5.0820 \\ -1.5568 \\ 3.4394 \\ -1.9604 \\ 3.3179 \\ -2.1336 \\ 1.2949 \\ -1.2696 \\ 0.2706 \\ -0.4187 \\ -2.4423 \\ 1.9955 \\ 2.2251\end{array}\right\}, \quad Oh_r=\left\{\begin{array}{c}1.0000 \\ -0.9347 \\ -0.9822 \\ 0.9999 \\ -0.9149 \\ 0.9979 \\ -0.9611 \\ 0.9974 \\ -0.9723 \\ 0.8604 \\ -0.8537 \\ 0.2642 \\ -0.3958 \\ -0.9850 \\ 0.9637 \\ 0.9769\end{array}\right\}$
From Eq. (9) and Eq. (10):
$IO$ = {-0.4425}
Final output = {-0.4157}
De-normalizing the output to its original scale using Eq. (12):
$N_{60}$ = {10.4723}
The denormalized $SPT-N$ predictions for randomly selected records are shown in Table 7 alongside the original $SPT-N$ and the input data. The results are very close, and the difference is less than 0.5. This difference can be ignored since the number of blows is always reported as an integer; it is not reasonable to express $SPT-N$ as a decimal fraction. Therefore, the network simulates the data successfully.
With respect to case 2, which represents the clay soil, the BPNNM was obtained using the same approach and resulted in two hidden layers with six neurons in each layer. The procedure presented in section 4.2 and demonstrated by the example was applied in the same manner as for case 1. The denormalization was performed and the results are presented in Table 8, which lists the original and predicted $SPT-N$ values for the validation samples. The results show good agreement; however, for some values the accuracy is lower than in case 1. The differences reflect the accuracy of the data, which depends on the accuracy of the measurements in the two cases. The pairs of $q_c$ and $fs$ in the first case were measured every 2 cm, while the data of case 2 were measured every 20 cm. In addition, the $SPT-N$ in sand is more reliable than in clay. This is compatible with the commonly held view that the SPT is less reliable in clay than in sand, which is attributed to the high pore pressure that clay may develop.
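The first and last stages of the worked example can be retraced with a short script. The sketch assumes a min-max scaling to $[-1, 1]$, which reproduces the normalized values quoted in the example, and it reads each row of the first-hidden-layer weight table as the three input weights and the bias of one hidden node. The second hidden layer and the output weights are not retraced here; the quoted output-layer input of -0.4425 is used directly. Function and variable names are illustrative.

```python
import math

I_MIN = (600.0, 0.0, 0.09)        # lower limits for q_c, fs, sigma'_vo
I_MAX = (13150.0, 185.0, 138.22)  # upper limits for q_c, fs, sigma'_vo
N_MIN, N_MAX = 2.0, 31.0          # limits for N60

# One row per hidden node m: (W_1m, W_2m, W_3m, bh_m), copied from the weight table
HIDDEN1 = [
    (1.843, 0.538, 1.238, -2.825),
    (1.357, -0.489, -1.758, -2.023),
    (0.297, -1.863, -1.259, -0.883),
    (-2.434, 0.218, 0.424, -0.426),
    (2.092, -1.097, 0.091, 1.928),
    (-1.429, 1.693, 1.046, -2.608),
]

def normalize(x, lo, hi):
    """Scale x from [lo, hi] to [-1, 1]."""
    return 2.0 * (x - lo) / (hi - lo) - 1.0

def denormalize(xn, lo, hi):
    """Inverse of normalize: scale xn from [-1, 1] back to [lo, hi]."""
    return (xn + 1.0) * (hi - lo) / 2.0 + lo

def hidden_layer1(inputs_n):
    """Return (net inputs, tanh outputs) of the six nodes in hidden layer 1."""
    net = [sum(w * i for w, i in zip(row[:3], inputs_n)) + row[3] for row in HIDDEN1]
    return net, [math.tanh(v) for v in net]

raw = (2850.0, 73.0, 42.0)
inputs_n = [normalize(x, lo, hi) for x, lo, hi in zip(raw, I_MIN, I_MAX)]
net1, out1 = hidden_layer1(inputs_n)

io = -0.4425                                   # output-layer input quoted in the example
n60 = denormalize(math.tanh(io), N_MIN, N_MAX)
print([round(v, 4) for v in inputs_n])
print([round(v, 4) for v in net1])
print(round(n60, 2))
```

Since tanh keeps the sign of its argument, the sixth hidden-layer output is negative because its net input is negative. The denormalized result rounds to 10 blows, matching the measured value.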
$q_c$ (kN/m$^2$) | $fs$ (kN/m$^2$) | $\sigma_{vo}^{\prime}$ (kPa) | ANN Output | Original Output (SPT-N) | Difference |
|---|---|---|---|---|---|
411.26 | 12.03 | 82.52 | 11.51 | 12 | 0.481 |
263.40 | 6.30 | 33.38 | 11.07 | 9 | 2.073 |
547.97 | 15.74 | 102.99 | 13.51 | 14 | 0.487 |
311.26 | 9.814 | 82.52 | 11.64 | 9 | 2.643 |
488.57 | 12.17 | 57.95 | 14.63 | 16 | 1.366 |
414.63 | 13.19 | 98.9 | 13.56 | 13 | 0.564 |
314.63 | 11.19 | 98.9 | 12.60 | 12 | 0.600 |
371.90 | 9.838 | 57.95 | 12.64 | 12 | 0.640 |
271.90 | 7.838 | 57.95 | 11.40 | 12 | 0.595 |
414.95 | 13.29 | 98.9 | 13.57 | 14 | 0.430 |
330.06 | 7.501 | 33.38 | 10.81 | 10 | 0.819 |
362.35 | 9.747 | 57.95 | 12.56 | 14 | 1.432 |
238.57 | 6.933 | 57.95 | 10.85 | 9 | 1.852 |
This section presents the importance of the essential variables that affect the relation between $SPT-N$ and the CPT components used in this study. The connection weights from the inputs to the hidden layers and from the hidden layers to the output are converted to relative importance using the connection-weight equation given in [26].
In that equation, $RI_i$ is the relative importance of independent variable $i$, and $\sum W_{ij} \cdot W_{jk}$ is the sum of the products of connection weights from input to hidden to output neurons. This analysis cannot be performed before training, since it depends on the connection weights, which are studied as part of interpreting the behavior of the ANN.
Figure 9 shows that the relative importance of $q_c$ is greater than that of the other inputs, and that the sleeve friction and effective overburden pressure have approximately similar importance for the $SPT-N$ values. The relative importance of $q_c$ is about 44%, which ranks it first as the most significant input variable affecting the $SPT-N$. The other two input variables have relatively similar effects and rank second and third.
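One common way to turn connection weights into relative importance is sketched below for a single hidden layer. It is an assumption for illustration and not necessarily the exact equation of [26]: for each input the products of input-to-hidden and hidden-to-output weights are summed, and the absolute sums are normalized so that they add up to one. The weights are made-up numbers, not those of the trained network.

```python
def relative_importance(w_input_hidden, w_hidden_output):
    """Connection-weight importance for one hidden layer.

    w_input_hidden[i][j] is the weight from input i to hidden node j and
    w_hidden_output[j] is the weight from hidden node j to the output.
    """
    raw = [abs(sum(w_ij * w_jk for w_ij, w_jk in zip(row, w_hidden_output)))
           for row in w_input_hidden]
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical weights: three inputs, two hidden nodes, one output
w_ih = [[1.0, -0.5],
        [0.2, 0.4],
        [-0.3, 0.1]]
w_ho = [0.8, -0.6]

ri = relative_importance(w_ih, w_ho)
print([round(v, 3) for v in ri])
```

For a network with two hidden layers, the same idea extends by summing the weight products along every path from an input to the output.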

The robustness of the ANN predictions has been tested by performing a sensitivity analysis to check the response [27]. The procedure involves fixing two input parameters at their average values and varying the third between its minimum and maximum values. Figure 10a, Figure 10b and Figure 10c show the relation between the $SPT-N$ and the input parameters $q_c$, $fs$, and $\sigma^{\prime}_{vo}$. This test is vital to check the ability of the proposed ANN to estimate the $SPT-N$ from the results of the CPT.
The $SPT-N$ estimated by the proposed ANN is plotted against every input variable in Figure 10. Figure 10a shows the relation between predicted $SPT-N$ and the cone tip resistance. It indicates that the $SPT-N$ increases as $q_c$ increases. The curves of estimated $SPT-N$ versus $\sigma^{\prime}_{vo}$ and $fs$ behave similarly (Figure 10b and Figure 10c). The measured $q_c$ reflects the strength of the soil as the cone advances. Similarly, an increasing $SPT-N$ indicates an increase in the shear strength parameters. Therefore, the relation between $SPT-N$ and the CPT parameters is positive.
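The one-at-a-time procedure can be sketched as a small loop. The means and ranges are taken from the case 1 statistics table; the linear `placeholder_model` is a hypothetical stand-in for the trained ANN, so its outputs carry no physical meaning and only the mechanics of the sweep are shown.

```python
def one_at_a_time_sweep(model, means, ranges, index, steps=5):
    """Vary input `index` from its minimum to its maximum; hold the others at their means."""
    lo, hi = ranges[index]
    results = []
    for k in range(steps):
        x = list(means)
        x[index] = lo + (hi - lo) * k / (steps - 1)
        results.append((x[index], model(x)))
    return results

def placeholder_model(x):
    # Hypothetical stand-in for the trained network (not the paper's ANN)
    q_c, fs, sigma_vo = x
    return 2.0 + 0.002 * q_c + 0.01 * fs + 0.03 * sigma_vo

MEANS = (7061.3, 144.8, 85.1)                             # q_c, fs, sigma'_vo
RANGES = ((600.0, 14420.0), (0.0, 588.0), (0.1, 138.2))   # (min, max) per input

sweep_qc = one_at_a_time_sweep(placeholder_model, MEANS, RANGES, index=0)
for q_c, n_pred in sweep_qc:
    print(round(q_c, 1), round(n_pred, 2))
```

Replacing `placeholder_model` with a function that normalizes the inputs, runs the forward pass and denormalizes the output would give curves of the kind plotted in Figure 10.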



The developed ANN can be utilized to produce design charts to predict the $SPT-N$ at different values of $fs$, $q_c$ and $\sigma^{\prime}_{vo}$. Figure 11 shows the ability of the proposed ANN to estimate the $SPT-N$ from the $q_c$. It shows the design chart for the $SPT-N$ versus $q_c$ at different values of $fs$.
The charts are repeated at various values of overburden pressure. They show that the ANN can construct a relation between $SPT-N$ and CPT and produce design charts that engineers can use easily. The curves reasonably simulate the relation. In all three figures, the $SPT-N$ increases as the three variables increase. It is also shown that at low values of $q_c$, below about 6 MPa, and high values of $fs$, the $SPT-N$ decreases for $fs$ between 120 and 180 kPa.








5. Conclusion
An Artificial Neural Network (ANN) has been used to model the correlation between $SPT-N$ and CPT results. A dedicated procedure was used to divide the data into three groups: training, testing and validation. Three input parameters were used in this study: $q_c$, $fs$, and $\sigma^{\prime}_{vo}$. The following points can be concluded:
1. The ANN effectively estimates the $SPT-N$ based on the data obtained by CPT by normalizing the data for silty clay soil. The ANN is constrained to the range of the $q_c$ data (600–14400 kPa), $fs$ (200–580 kPa) and average overburden pressure (2–185 kN/m$^2$).
2. The best ANN with three inputs, one output, and one hidden layer of eight neurons gave results with $R^2$ = 0.99671. Another model has two hidden layers, where the first hidden layer contains 6 neurons and the second hidden layer contains 16 neurons. It provided results with $R^2$ = 0.99760. Comparing the networks with one and two hidden layers reveals that the two-hidden-layer network gave the best performance.
3. The predicted results have an excellent $R^2$ using a simple architecture of one hidden layer and 8 neurons. With 6 neurons in the first hidden layer and 16 neurons in the second hidden layer, the ANN also gave excellent results. Both models can be used; however, the model with one hidden layer is simpler than the model with two hidden layers.
4. The importance analysis of the input parameters showed that $q_c$ is the most important variable, with a relative importance of 0.44 compared to the other two input parameters, $fs$ and $\sigma^{\prime}_{vo}$. The sensitivity analysis was used to check the most important variable.
5. The charts proposed for use by geotechnical engineers to find $SPT-N$ from CPT results give values corrected to 60% energy ($N_{60}$).
6. The SPT predicted from CPT for clay and for sand, silty sand or sandy silt required different ANN architectures, since the behavior of these soils is different. Future studies are recommended to include SBT as an additional input variable to incorporate the impact of soil type in the ANN model and obtain a generalized model.
7. Finally, this study contributes by utilizing soil data from Nasiriyah city, Iraq, which had not been studied before. In addition, the study was not restricted to one type of soil: it applies a developed framework for ANN modelling and extracts the data according to the specific soil behavior type (SBT). Moreover, good accuracy of the predicted $SPT-N$ was demonstrated manually by an example, and a comprehensive study was performed on the sensitivity and importance of the parameters, in addition to a parametric study. The current framework is robust and comprehensive and presents a good solution for predicting SPT from CPT.
The author is grateful to the College of Engineering and the consultant for making the data available for this research.
The authors declare that they have no conflicts of interest.
| AI | Artificial intelligence |
| ANN | Artificial neural network |
| $\beta_{k+1}$ | Conjugate gradient parameter |
| $bh_m$ | Bias of hidden layer 1 |
| $bh_r$ | Bias of hidden layer 2 |
| $b_O$ | Bias of output layer |
| BPNNM | Best performance neural network model |
| CPT | Cone penetration test |
| $d_k$ | current search direction |
| $fs$ | Sleeve friction component |
| $g_k$ | Gradient |
| $g_{k+1}$ | New gradient |
| $h$ | Hidden layer node |
| $I$ | Input layer node |
| $Ih_m$ | The input at each node of hidden layer 1 |
| $Ih_r$ | The input at each node of hidden layer 2 |
| $IO$ | The input at the output layer |
| $N_{60}$ | The number of blows |
| $Oh_m$ | The output at each node in hidden layer 1 |
| $Oh_r$ | The output at each node in hidden layer 2 |
| $q_c$ | Cone tip resistance |
| $RI_{i}$ | The relative importance of independent variable $i$ |
| SPT | Standard penetration test |
| $SPT-N$ | No of blows |
| traincgb | training function using the Conjugate Gradient algorithm |
| $w$ | The weight |
| $W_{ij}\cdot W_{jk}$ | The product of connection weights from input to hidden to output neurons |
| $W_{mr}$ | The weights between hidden layer 1 and 2 |
| $w_{nm}$ | The weights between input and hidden layer 1 |
| $w_{or}$ | Weights between hidden 2 and output layer |
| $\sigma^{\prime}_{vo}$ | Effective overburden pressure |
| $\alpha$ | The optimal step size |
| SBT | Soil Behavior Type |
