This paper forecasts the out-of-sample volatility of gold price changes in Turkey. According to both the symmetric and the asymmetric evaluation criteria, the GJR-GARCH model is the best-fitting model for forecasting gold price volatility in Turkey. The GJR-GARCH estimates reveal a negative asymmetry coefficient for gold prices, which indicates that positive news in the market affects the volatility of gold prices in the next period more than negative news.
Researchers rely on volatility estimates to track fluctuations in international financial markets and to support hedging and speculative trading. Volatility estimation also has a significant place in the application of asset pricing models and in foreign exchange rate risk, policymaking and regulation, hedging, financial risk management, option pricing, and international portfolio diversification. Although there is ample evidence on volatility forecasting performance for international stock exchanges and foreign exchange markets, there is little evidence on volatility forecasting for commodity prices (Kroner et al., 1995). The statistical characteristics of financial time series play a key role in the development of volatility forecasts. The studies of Mandelbrot (1963) and Fama (1965) indicate that financial returns are largely uncorrelated over time but are not independent of each other. They also point out that large price changes of financial assets tend to be followed by large changes and small changes by small changes; in other words, volatility clusters form. It is also known that financial return series are not normally distributed and show features such as excess kurtosis, volatility clustering, asymmetric response, and leverage.
The basic assumptions of the simple volatility model, namely that returns are independent and identically distributed with zero mean and constant variance, do not hold for financial return series. With the Autoregressive Conditional Heteroscedasticity (ARCH) model published in 1982, Robert F. Engle revealed the existence of heteroscedasticity in financial time series and argued that it should be modeled. This model was later generalized by Bollerslev (1986) as the Generalized ARCH (GARCH) model. Since these studies, ARCH-type models have been used frequently for volatility modeling in the finance literature.
Because financial return series differ in their statistical properties, the success of ARCH models has led to more complex ARCH variants. For this reason, choosing a volatility forecasting model by comparing the in-sample and out-of-sample performance of several candidate models gives more accurate results in practice. The focus of this study is on forecasting volatility in gold prices in Turkey. In this context, the aim is to find the best performing model for gold prices among a range of volatility models (random walk, simple moving average models, exponential smoothing model, regression model, ARCH, GARCH, GJR-GARCH and EGARCH) and then to discuss the findings of that model. The remainder of this paper is organized as follows. The second section presents the existing literature on gold price volatility forecasting. The third section describes the symmetric and asymmetric volatility forecasting methodology and the data, and discusses the forecast evaluation methods. The empirical results are presented in the fourth section. Finally, the fifth section concludes the paper.
Kutan and Aksoy (2004) directly used the GARCH(1,1) model to examine the effect of the consumer price index on gold market returns and volatility, without investigating which model is most suitable. They conclude that gold does not react significantly to consumer price index news and is not a good hedge against inflation. Capie et al. (2005) examine how gold behaves as a hedging instrument for exchange rate risk. GARCH, threshold GARCH, and exponential GARCH methods are used in the study, and the GARCH(1,2) model is found to be the best model for the volatility structure. Erer (2011) examined volatility in the gold market using weekly data for the sale price of gold (TL/gr) for the period 2001-2011. Symmetric and asymmetric conditional volatility models were fitted to the logarithmic return series of the gold bullion sale price, and the most successful result was obtained with the TARCH(2,2) model. Cihangir and Ugurlu (2018) examined the volatility of gold prices in Turkey using daily data for the period 2004-2012. GARCH, GJR-GARCH, and EGARCH models were used, and the GJR-GARCH model was selected as the best-fitting model according to the model selection criteria. According to the GJR-GARCH results, there is no leverage effect in the Istanbul Gold Market. Aksoy (2013) investigated the day-of-week effect on returns and volatility using Istanbul Gold Exchange gold and silver prices for the period 2008-2011. Using GARCH models, the study finds a day-of-week effect in both returns and volatility for gold. It also concludes that gold prices are more volatile than silver prices.
Observed gold price volatility is measured monthly for use in forecasting and estimation. The gold price data cover the period 1985:01-2018:01. Observed volatility is defined as the standard deviation of logarithmic returns, similar to Balaban (2004). The logarithmic return series is calculated as follows:

$R_t=\ln \left(P_t / P_{t-1}\right)$
where $P_t$ and $R_t$ are the price and the return in month t. Monthly volatility is defined as the within-month standard deviation of returns, which provides 397 volatility estimates ($\sigma_{a,t}$). Of these, the period 1985:01-2001:12 is the estimation period (a) and the period 2002:01-2018:01 is the forecast period (f). The periods were chosen so that the estimation and forecast windows are of roughly equal length, about half of the sample each. There exists a broad range of potentially useful models for forecasting volatility, and it is impossible to employ all of them in a single study. This study uses a wide range of time series forecasting techniques, from the naive random walk benchmark to the more sophisticated conditional heteroscedasticity models, as in Brailsford and Faff (1996) and Balaban, Bayar, and Faff (2006). Models with regime-switching specifications are excluded. While a regime-switching model is a good one for in-sample modeling, it is not readily amenable to an out-of-sample volatility forecasting exercise (Balaban, Bayar and Faff, 2006).
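The return and observed-volatility definitions above can be illustrated with a minimal Python sketch. It assumes that each month contains several return observations and that the sample standard deviation is used; the function names, the month label, and the numbers are hypothetical and are not taken from the study's data.

```python
import math
from statistics import stdev


def log_returns(prices):
    """Return R_t = ln(P_t / P_{t-1}) for each pair of consecutive prices."""
    return [math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


def monthly_volatility(returns_by_month):
    """Map each month label to the sample standard deviation of its returns.

    Using the sample (n - 1) standard deviation is an assumption of this sketch.
    """
    return {month: stdev(values) for month, values in returns_by_month.items()}


# Hypothetical prices and within-month returns, for illustration only.
returns = log_returns([100.0, 110.0, 99.0])
sigma_a = monthly_volatility({"2002-01": [0.01, -0.01, 0.01, -0.01]})
```

Each value in `sigma_a` plays the role of one observed volatility $\sigma_{a,t}$.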
This study's models include a random walk, simple moving average models, an exponential smoothing model, a regression model, and symmetric and asymmetric conditional volatility models.
Random walk (RW) model:
The RW model assumes that the best forecast of this month's volatility ($\sigma_{f,m}$) is last month's realised volatility:

$\sigma_{f, m}=\sigma_{a, m-1}$
Moving average (MA$\alpha$) models:
The MA$\alpha$ model states that the best forecast is an equally weighted average of the realized values in the last $\alpha$ months:

$\sigma_{f, m}=\frac{1}{\alpha} \sum_{j=1}^\alpha \sigma_{a, m-j}$

where $\alpha$ = 3, 12, 30.
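As a minimal sketch of the two naive benchmarks described above, the following Python functions take the history of realized monthly volatilities (oldest first) and return the forecast for the next month. The function names and sample values are illustrative assumptions.

```python
def random_walk_forecast(sigma):
    """Forecast next month's volatility as the last realized value."""
    return sigma[-1]


def moving_average_forecast(sigma, window):
    """Forecast next month's volatility as the equally weighted mean
    of the last `window` realized values."""
    if len(sigma) < window:
        raise ValueError("not enough observations for this window")
    return sum(sigma[-window:]) / window


# Hypothetical realized volatilities, for illustration only.
history = [0.002, 0.004, 0.006]
rw = random_walk_forecast(history)
ma3 = moving_average_forecast(history, 3)
```

With a window of 3, 12, or 30 months the second function corresponds to the MA3, MA12, and MA30 competitors.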
Exponential smoothing (ES) model:
The forecast under the ES model is a function of the immediate past forecast and the immediate past observed volatility:
The smoothing parameter ($\theta$) is restricted to lie between zero and one. The optimal $\theta$ is estimated through minimizing the mean squared error, with an annual update.
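The exponential smoothing forecast and the choice of $\theta$ can be sketched as follows. The sketch assumes the common form in which the new forecast equals $\theta$ times the previous forecast plus $(1-\theta)$ times the previous observed volatility, starts the recursion from the first observation, and searches a simple grid for the $\theta$ that minimizes the mean squared error. These choices, and all names, are assumptions of the sketch rather than details confirmed by the study.

```python
def es_forecasts(sigma, theta):
    """Return forecasts f where f[m] = theta * f[m-1] + (1 - theta) * sigma[m-1].

    f[0] is set to sigma[0] (assumed start-up value); the last element
    is the forecast for the month after the sample ends.
    """
    forecasts = [sigma[0]]
    for observed in sigma:
        forecasts.append(theta * forecasts[-1] + (1.0 - theta) * observed)
    return forecasts


def es_mse(sigma, theta):
    """Mean squared one-step-ahead forecast error within the sample."""
    forecasts = es_forecasts(sigma, theta)
    errors = [forecasts[m] - sigma[m] for m in range(1, len(sigma))]
    return sum(e * e for e in errors) / len(errors)


def best_theta(sigma, steps=100):
    """Grid search for the theta in [0, 1] that minimizes the in-sample MSE."""
    grid = [i / steps for i in range(steps + 1)]
    return min(grid, key=lambda theta: es_mse(sigma, theta))
```

An annual update, as described above, would amount to calling `best_theta` again once a year on the data available at that point.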
Regression model:
In the regression model, I use parameter estimates of c and $\beta$ from the monthly rolling autoregressions
to forecast next month’s volatility.
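A minimal sketch of the regression forecast, assuming a first-order autoregression of realized volatility on its own lag, estimated by ordinary least squares on the current estimation window. The AR(1) form and the function names are assumptions of the sketch.

```python
def ar1_ols(sigma):
    """Estimate c and beta in sigma[m] = c + beta * sigma[m-1] + error by OLS."""
    x, y = sigma[:-1], sigma[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    beta = sxy / sxx
    c = mean_y - beta * mean_x
    return c, beta


def regression_forecast(sigma):
    """Forecast next month's volatility from the fitted autoregression."""
    c, beta = ar1_ols(sigma)
    return c + beta * sigma[-1]


# Hypothetical series that follows sigma[m] = 0.5 + 0.5 * sigma[m-1] exactly.
window = [0.0, 0.5, 0.75, 0.875]
c_hat, beta_hat = ar1_ols(window)
next_month = regression_forecast(window)
```

In a rolling scheme the window would be moved forward by one month and the two parameters re-estimated before each new forecast.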
Because this study investigates out-of-sample forecasts, all parameter estimates for all competing models use data from the estimation windows only.
The use of conditional heteroscedastic models has been a common tool for modelling and forecasting volatility of financial asset returns following the introduction of the ARCH model and its generalized version, the GARCH model.
Note that the previous models use the monthly volatility series. With the conditional volatility models, however, monthly price changes are first modelled as a p-order autoregression:
$R_t=c+\delta_1 R_{t-1}+\ldots+\delta_p R_{t-p}+u_t$
The autoregressive terms account for the economically minor but statistically significant autocorrelation in price changes. The monthly prediction errors ($u_t$) are assumed to be conditionally normally distributed with zero mean and variance $\sigma_t^2$, based on the information set $\Psi$ available at time t-1.
The following conditional variance specifications are then estimated using the quasi-maximum likelihood technique with Bollerslev and Wooldridge (1992) standard errors. Since $\operatorname{var}\left(u_t \mid u_{t-1}\right)=\sigma_t^2$, the conditional variance can be modeled as an AR(q) process in the lagged squared estimated residuals:
$\sigma_t^2=c+\phi_1 u_{t-1}^2+\phi_2 u_{t-2}^2+\ldots+\phi_q u_{t-q}^2+v_t$
where $v_t$ is a white noise process. If $\phi_1=\phi_2=\ldots=\phi_q=0$, the variance is homoscedastic.
ARCH(1) model:
The autoregressive conditional heteroskedasticity ARCH(1) process can be written as:

$\sigma_t^2=c+\phi_1 u_{t-1}^2$

As can be seen, the conditional variance of $u_t$ depends on the realized value of $u_{t-1}^2$. The higher the realized value of $u_{t-1}^2$, the higher the conditional variance in period t.
GARCH(1,1) model:
Bollerslev (1986) extended Engle's ARCH model to allow the conditional variance to be modeled as an ARMA(p, q) process. In this model, the conditional variance is defined as a function of autoregressive and moving average terms. The advantage of this model over the ARCH model is that it can capture volatility persistence without the need for a large number of parameters. The most commonly used GARCH model in the finance literature is the GARCH(1,1) model. In a GARCH(1,1) model, the current period's conditional volatility depends on the previous period's conditional volatility and the previous period's squared prediction error:
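The GARCH(1,1) dependence described above can be sketched as a simple recursion. The parameter names `omega`, `alpha`, and `beta`, the start-up variance, and the numbers are assumptions chosen for illustration; in practice the parameters would be estimated by quasi-maximum likelihood.

```python
def garch11_variances(residuals, omega, alpha, beta, initial_variance):
    """Return conditional variances v where
    v[t+1] = omega + alpha * residuals[t]**2 + beta * v[t],
    starting from v[0] = initial_variance."""
    variances = [initial_variance]
    for u in residuals:
        variances.append(omega + alpha * u * u + beta * variances[-1])
    return variances


# Hypothetical residuals and parameters, for illustration only.
path = garch11_variances([0.1, -0.2], omega=0.01, alpha=0.1, beta=0.8,
                         initial_variance=0.05)
```

Setting `beta` to zero reduces the recursion to an ARCH(1) process, and because the residual enters only through its square, shocks of equal size and opposite sign produce the same variance.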
In the ARCH and GARCH models, the signs of the shocks disappear because the errors are squared, so only their magnitude can be interpreted. In other words, positive and negative shocks of the same magnitude are assumed to have the same effect on volatility. This does not fully reflect a feature of financial asset series, namely that a negative shock (bad news) has a greater impact on volatility than a positive shock (good news) of the same magnitude. Such asymmetries in stock returns are called the leverage effect: a fall in a firm's stock price increases its debt-to-equity ratio, which makes the stock riskier. According to Dijk and Franses (2000), the conditional variance of financial asset series generally responds asymmetrically to the previous return. In addition, the volatility of financial assets is high during recessions. In short, asymmetric volatility is a characteristic feature of financial time series (Li and Li, 1996). The most widely used asymmetric GARCH models are the Threshold ARCH (TARCH) model, the closely related GJR-GARCH model, and the Exponential GARCH (EGARCH) model. The TARCH and GJR-GARCH models were introduced by Zakoian (1994) and Glosten, Jagannathan, and Runkle (1993), respectively, and the EGARCH model was developed by Nelson (1991).
EGARCH(1,1) model:
The leptokurtic structure and volatility clustering found in financial time series can be captured effectively with the GARCH model. However, GARCH models fail to capture the asymmetry that distinguishes between negative and positive shocks in the variance structure. The exponential GARCH (EGARCH) model was developed by Nelson (1991) to overcome this weakness by taking the asymmetry in the volatility structure into account. The EGARCH model allows for the possibility that upward and downward movements in financial markets do not have the same effect on the future volatility of financial assets, with downward movements having a stronger effect than upward movements. This effect, called the "Leverage Effect", was first put forward by Black (1976). The case in which negative news coming to the market has more impact on the volatility of financial assets than positive news is modeled as follows:
As seen in Equation 9, the conditional variance of a time series in the EGARCH model is a nonlinear function of the magnitude and sign of its historical values and lagged residuals. The terms $\frac{u_{t-1}}{\sigma_{t-1}}$ in the conditional variance equation are standardized error terms. Using standardized error terms instead of the historical values of the error terms provides information about the magnitude and persistence of the shock. Through the $\gamma$ parameter, the $\frac{u_{t-1}}{\sigma_{t-1}}$ variable gives the EGARCH model its asymmetric character. The $\gamma$ parameter is the asymmetric leverage coefficient that defines the "Leverage Effect" in volatility. The most important sign that this model works is that the $\gamma$ parameter is statistically significant.
Accordingly, the statistically significant negative $\gamma$ parameter indicates that positive return shocks generate less volatility than negative return shocks. For example, the volatility of gold prices tends to increase after negative returns and to decrease after positive returns. As a result, the presence of asymmetric volatility in the EGARCH model depends on the statistical significance of the $\gamma$ parameter.
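The role of the sign of $\gamma$ can be illustrated with a minimal sketch of one EGARCH(1,1) step. The sketch assumes one common parameterization, in which the log of the conditional variance depends on its own lag, on the absolute standardized shock, and on the signed standardized shock; the exact form used in the study may differ, and all names and values here are hypothetical.

```python
import math


def egarch11_next_variance(u_prev, var_prev, omega, alpha, gamma, beta):
    """One step of an assumed EGARCH(1,1) form:
    ln(var_t) = omega + beta * ln(var_prev) + alpha * |z| + gamma * z,
    where z = u_prev / sqrt(var_prev) is the standardized shock."""
    z = u_prev / math.sqrt(var_prev)
    log_var = omega + beta * math.log(var_prev) + alpha * abs(z) + gamma * z
    return math.exp(log_var)


# Hypothetical parameters with a negative asymmetry coefficient.
params = dict(omega=-0.5, alpha=0.1, gamma=-0.05, beta=0.9)
after_bad_news = egarch11_next_variance(-0.02, 0.0004, **params)
after_good_news = egarch11_next_variance(0.02, 0.0004, **params)
```

With a negative `gamma`, the variance after the negative shock exceeds the variance after a positive shock of the same size; with `gamma` equal to zero the two coincide.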
GJR-GARCH(1,1) model:
Glosten, Jagannathan, and Runkle (1993) developed a GARCH model that takes into account the different effects of good and bad news on volatility. For this reason, the threshold GARCH model is also called GJR-GARCH. The GJR-GARCH or threshold GARCH model is an asymmetric ARCH process used in modeling volatility. In this model, $u_{t-1}=0$ acts as a threshold, and shocks above and below this threshold have different effects on volatility. The threshold GARCH model can be written as:
The $u_t$ in Equation 10 represents the shocks that occur in the markets. $u_{t-1}<0$ represents negative shocks (news), and $u_{t-1} \geq 0$ represents positive shocks. $D_{t-1}^{-}$ is a dummy variable that takes the value 1 when the shock is negative and 0 otherwise. While the effect of positive news on the conditional variance is $\alpha_1$, the effect of negative news is $\alpha_1+\gamma$. The leverage effect is related to the $\gamma$ parameter, and $\gamma \neq 0$ indicates asymmetry. If $\gamma=0$, the model reduces to the GARCH model. The most important sign that this model works is that the $\gamma$ parameter is statistically significant. Accordingly, if $\gamma>0$ and statistically significant, there is a leverage effect. Finally, it should be noted that all conditional volatility models fulfil the standard requirements for non-negativity of the conditional variance and the parameter restrictions.
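The threshold mechanism described above can be sketched as a single GJR-GARCH(1,1) step. The symbols $\alpha_1$ and $\gamma$ follow the text; the names `alpha0` for the constant and `beta1` for the lagged-variance coefficient, and all numerical values, are assumptions made for this illustration.

```python
def gjr_garch11_next_variance(u_prev, var_prev, alpha0, alpha1, gamma, beta1):
    """One step of a GJR-GARCH(1,1) recursion:
    var_t = alpha0 + (alpha1 + gamma * D) * u_prev**2 + beta1 * var_prev,
    where D = 1 if the previous shock is negative and 0 otherwise."""
    d_neg = 1.0 if u_prev < 0 else 0.0
    return alpha0 + (alpha1 + gamma * d_neg) * u_prev ** 2 + beta1 * var_prev


# Hypothetical parameters with gamma > 0 (leverage effect).
params = dict(alpha0=0.00001, alpha1=0.05, gamma=0.10, beta1=0.85)
after_good_news = gjr_garch11_next_variance(0.02, 0.0004, **params)
after_bad_news = gjr_garch11_next_variance(-0.02, 0.0004, **params)
```

Here the positive shock contributes $\alpha_1 u_{t-1}^2$ while the negative shock contributes $(\alpha_1+\gamma) u_{t-1}^2$, so the sign of `gamma` decides which kind of news moves the variance more.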
This study employs four commonly used symmetric error statistics: the mean error (ME), the mean absolute error (MAE), the mean squared error (MSE), and the mean absolute percentage error (MAPE). The monthly forecast error is the forecast volatility $\left(\sigma_{f, t}\right)$ minus the realized volatility $\left(\sigma_{a, t}\right)$.
$M E=\left(\frac{1}{193}\right) \sum_{t=204}^{397}\left(\sigma_{f, t}-\sigma_{a, t}\right)$
$M A E=\left(\frac{1}{193}\right) \sum_{t=204}^{397}\left|\sigma_{f, t}-\sigma_{a, t}\right|$
$M S E=\left(\frac{1}{193}\right) \sum_{t=204}^{397}\left(\sigma_{f, t}-\sigma_{a, t}\right)^2$
$M A P E=\left(\frac{1}{193}\right) \sum_{t=204}^{397}\left|\frac{\sigma_{f, t}-\sigma_{a, t}}{\sigma_{a, t}}\right|$
The symmetric criteria give equal weight to under- and over-predictions of volatility of similar magnitude. However, the distinction between under- and over-prediction of volatility matters to traders with long and short positions as well as to option buyers and sellers. Although Poon and Granger (2003) suggest that using asymmetric evaluation criteria is advisable, there are only a few papers with this feature in the literature (Brailsford and Faff, 1996; Balaban, 2004; and Balaban, Bayar and Faff, 2006).
In addition, this study employs an asymmetric error statistic, the mean logarithmic error (LE) metric (Pagan and Schwert, 1990), to discriminate between under- and over-predictions.
The LE statistic reads as follows:
$L E=\left(\frac{1}{193}\right) \sum_{t=204}^{397}\left(\ln \sigma_{f, t}-\ln \sigma_{a, t}\right)^2$
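The five error statistics can be computed together, as in the following minimal sketch. It averages over whatever forecast window is passed in; the function name and the two-month example are hypothetical.

```python
import math


def forecast_error_statistics(forecast, actual):
    """Return ME, MAE, MSE, MAPE and LE for paired forecast and realized
    volatilities, with the error defined as forecast minus realized."""
    n = len(actual)
    diffs = [f - a for f, a in zip(forecast, actual)]
    return {
        "ME": sum(diffs) / n,
        "MAE": sum(abs(d) for d in diffs) / n,
        "MSE": sum(d * d for d in diffs) / n,
        "MAPE": sum(abs(d / a) for d, a in zip(diffs, actual)) / n,
        "LE": sum((math.log(f) - math.log(a)) ** 2
                  for f, a in zip(forecast, actual)) / n,
    }


# Hypothetical example: one over-prediction and one under-prediction.
stats = forecast_error_statistics([0.002, 0.001], [0.001, 0.002])
```

In this example the two errors cancel in the ME, which is why the ME is read only as an indicator of average under- or over-prediction, while the other statistics remain positive.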
Descriptive statistics for the gold price, return, and volatility data over the full period are given in Table 1, and the graphs of the series are given in Figure 1.
| | Gold Price ($P_t$) | Return ($R_t$) | Sigma ($\sigma_{a, t}$) |
|---|---|---|---|
| Mean | 31.57452 | 0.026461 | 0.00172 |
| Median | 11.985 | 0.025771 | 0.00132 |
| Maximum | 162.09 | 0.298575 | 0.01366 |
| Minimum | 0 | -0.12645 | 3.77E-06 |
| Std. Dev. | 42.36527 | 0.048526 | 0.00172 |
| Skewness | 1.297838 | 1.178447 | 2.63796 |
| Kurtosis | 3.499842 | 7.831218 | 13.8363 |
| Jarque-Bera | 115.2917 | 476.7777 | 2396.8 |
| Probability | 0 | 0 | 0 |
| Observations | 396 | 396 | 396 |
The return series is derived as $R_t=\ln \left(P_t / P_{t-1}\right)$. The kurtosis and Jarque-Bera statistics of the return series show that it is leptokurtic and not normally distributed.
| | ARCH-LM(1) | ARCH-LM(3) | ARCH-LM(6) | ARCH-LM(9) | ARCH-LM(12) |
|---|---|---|---|---|---|
| Stats. | 5.841652 | 7.243126 | 7.572382 | 15.04835 | 16.52032 |
| Prob. | 0.0157 | 0.0645 | 0.2711 | 0.0896 | 0.1685 |
Accordingly, at the 10% significance level, there is an ARCH effect at lags 1, 3, and 9 of the gold return series, but no ARCH effect is found at lags 6 and 12.
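As a rough illustration of how a first-order ARCH-LM test can be computed, the sketch below regresses squared residuals on their own first lag and multiplies the resulting $R^2$ by the number of observations. It assumes the usual chi-square reference distribution with one degree of freedom; the reported lag-1 statistic of 5.841652 and probability of 0.0157 appear consistent with that assumption. The function names and the sample residuals are hypothetical.

```python
import math


def arch_lm1_statistic(residuals):
    """Obs * R^2 from regressing u_t^2 on a constant and u_{t-1}^2."""
    squared = [u * u for u in residuals]
    x, y = squared[:-1], squared[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    r_squared = (sxy * sxy) / (sxx * syy)
    return n * r_squared


def chi2_one_dof_pvalue(stat):
    """Upper-tail probability of a chi-square variable with one degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical residuals, for illustration only.
lm_stat = arch_lm1_statistic([1.0, -1.0, 2.0, -2.0, 1.0, -1.0])
p_lag1 = chi2_one_dof_pvalue(5.841652)
```

Tests at higher lags add further lagged squared residuals to the regression and use a chi-square distribution with as many degrees of freedom as lags, which this one-lag sketch does not cover.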
Table 3 presents the comparative results of the symmetric evaluation criteria and the summary statistics.
| Forecast Competitors | ME Actual | MAE Actual | MAE Rank | MSE Actual | MSE Rank | MAPE Actual | MAPE Rank |
|---|---|---|---|---|---|---|---|
| MA3 | 3.42E-05 | 0.001319 | 9 | 3.05E-06 | 9 | 5.499082 | 9 |
| MA12 | 9.92E-05 | 0.00122 | 8 | 2.63E-06 | 8 | 4.782915 | 8 |
| MA30 | 0.000136 | 0.001195 | 7 | 2.42E-06 | 7 | 4.742886 | 7 |
| ES | 0.000119 | 0.00118 | 6 | 2.4E-06 | 6 | 4.595469 | 6 |
| Random Walk | 1.86E-05 | 0.001556 | 10 | 4.59E-06 | 10 | 5.852174 | 10 |
| Regression | -8.8E-05 | 0.001143 | 5 | 2.39E-06 | 5 | 4.162858 | 5 |
| ARCH | -0.00011 | 0.001137 | 3 | 2.35E-06 | 3 | 4.02876 | 3 |
| GARCH | -0.00011 | 0.001136 | 2 | 2.35E-06 | 4 | 4.020307 | 2 |
| GJR-GARCH | -0.00011 | 0.001136 | 1 | 2.35E-06 | 2 | 4.013842 | 1 |
| EGARCH | -9.1E-05 | 0.001139 | 4 | 2.35E-06 | 1 | 4.075958 | 4 |
| Mean | -1.1E-05 | 0.001216 | | 2.69E-06 | | 4.577425 | |
| Median | -3.5E-05 | 0.001162 | | 2.4E-06 | | 4.379164 | |
| Std | 0.000104 | 0.000133 | | 7.03E-07 | | 0.658005 | |
| Std/Mean | -9.59454 | 0.108977 | | 0.261405 | | 0.14375 | |
| Std/Median | -2.9904 | 0.114086 | | 0.293058 | | 0.150258 | |
Mean error (ME), mean absolute error (MAE), mean squared error (MSE), mean absolute percentage error (MAPE).
The ME statistic shows whether a model under- or over-predicts volatility on average. All models over-predict volatility except the regression model and the conditional volatility models (ARCH, GARCH, GJR-GARCH, and EGARCH), which under-predict it. According to the ME statistic, the MA30 model has the highest over-prediction figure, while the GJR-GARCH model has the lowest under-prediction figure. However, too much weight should not be given to the ME, as negative and positive forecast errors can cancel each other out. Setting the ME results aside, the mean-adjusted and median-adjusted standard deviations of the error statistics show that the MSE statistic produces the most variable performance results among the models.
Looking at the other symmetric criteria, the GJR-GARCH model has the best performance according to the MAE and MAPE criteria, followed by the GARCH and ARCH models, respectively. According to the MSE criterion, the EGARCH model has the best performance, followed by the GJR-GARCH, ARCH, and GARCH models, respectively. When all symmetric criteria are considered, the random walk model is consistently the worst performer. It is followed by the MA3, MA12, and MA30 models, respectively.
It should be noted that, irrespective of the error statistic, the performance of the MA$\alpha$ models is almost indistinguishable for any $\alpha$. Thus, the weighting approach does not seem to add much value. Table 4 shows the results of the asymmetric evaluation criterion, in which positive and negative forecast errors are treated differently.
| Forecast Competitors | LE Actual | LE Rank |
|---|---|---|
| MA3 | 1.928428 | 9 |
| MA12 | 1.765173 | 8 |
| MA30 | 1.760148 | 7 |
| ES | 1.745899 | 6 |
| Random Walk | 2.957185 | 10 |
| Regression | 1.637487 | 5 |
| ARCH | 1.616858 | 3 |
| GARCH | 1.615265 | 2 |
| GJR-GARCH | 1.614226 | 1 |
| EGARCH | 1.625991 | 4 |
| Mean | 1.826666 | |
| Median | 1.691693 | |
| Std | 0.410021 | |
| Std/Mean | 0.224464 | |
| Std/Median | 0.242373 | |
The asymmetric criterion, the LE statistic, also favours the GJR-GARCH model over its competitors, including the EGARCH model, the other asymmetric conditional volatility specification. The GARCH, ARCH, EGARCH, and regression models follow it, respectively.
According to Tables 3 and 4, the best model for forecasting gold price volatility is the GJR-GARCH model. This finding is consistent with Cihangir and Ugurlu (2018). Erer (2011) also stated that the best performing model for gold price prediction is TARCH. Apart from the model name, the results agree, since the TARCH and GJR-GARCH models are very similar. It is also important to interpret the GJR-GARCH estimation results, since they contain leverage (asymmetry) information for gold prices. The full-period estimation results of the GJR-GARCH model are given in Table 5.
Dependent Variable: $\sigma_t^2$

| Variables | Parameters | Std Error | z stat. | Prob Value |
|---|---|---|---|---|
| Cons. | 0.001457*** | 0.000183 | 7.954262 | 0.0000 |
| $\sigma_{t-1}^2$ | 0.143639 | 0.092246 | 1.557129 | 0.1194 |

Variance Equation

| Variables | Parameters | Std Error | T stat. | Prob Value |
|---|---|---|---|---|
| Cons. | 1.73E-06*** | 5.03E-07 | 3.437046 | 0.0006 |
| $u_{t-1}^2$ | 0.158442** | 0.065270 | 2.427488 | 0.0152 |
| $u_{t-1}^2 D_{t-1}^{-}$ | -0.315230** | 0.143027 | -2.203986 | 0.0275 |
| $\sigma_{t-1}^2$ | 0.306338 | 0.208270 | 1.470865 | 0.1413 |

| | |
|---|---|
| R-squared | 0.037099 |
| Log likelihood | 1976.046 |
| Durbin-Watson stat | 1.895192 |
| Included observations | 395 |
According to Table 5, the $\gamma$ parameter is estimated as -0.315230 and is statistically significant, so the asymmetric specification is supported. The $\alpha_1$ parameter, which expresses the effect of positive news on the conditional variance, is estimated as 0.158442 and is also statistically significant. In asymmetric models, the effect of good news is captured by the $\alpha_1$ parameter and the effect of bad news by $\alpha_1+\gamma$. In models with a leverage effect (i.e., $\gamma>0$ and statistically significant), there is a negative shock asymmetry: bad (negative) news affects the volatility of gold prices in the next period more than positive news. In this study, however, the asymmetry coefficient is estimated as -0.315230, so $\gamma<0$. In models with a statistically significant $\gamma<0$, there is a positive shock asymmetry: good (positive) news affects the volatility of gold prices in the next period more than bad (negative) news (Brooks, 2008: 408).
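The comparison of good and bad news effects is simple arithmetic on the two reported coefficients, sketched below. The sketch takes the asymmetry coefficient as negative, as the conclusion $\gamma<0$ in the text implies; the function name is hypothetical.

```python
def news_effects(alpha_1, gamma):
    """Return the coefficients on the squared shock for good news (alpha_1)
    and for bad news (alpha_1 + gamma) in a GJR-GARCH variance equation."""
    return alpha_1, alpha_1 + gamma


# Coefficients reported for the full-period GJR-GARCH model,
# with gamma taken as negative as stated in the text.
good_news_effect, bad_news_effect = news_effects(0.158442, -0.315230)
```

Under these values the coefficient on a squared positive shock is 0.158442, while the coefficient on a squared negative shock is about -0.156788, so a positive shock raises next period's conditional variance whereas a negative shock of the same size does not.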
In this paper, the author analyses a wide range of volatility forecasting techniques, using both symmetric and asymmetric evaluation criteria, for gold prices in Turkey. To the best of the author's knowledge, there has been no evidence on the out-of-sample predictive accuracy of a broad range of time series models of volatility using gold price (TL/gr) data. The following points are worth emphasizing.
The overall rankings of the error statistics indicate that the GJR-GARCH model ranks first among the competitors, while both the symmetric and the asymmetric conditional volatility models outperform the simpler models. The GJR-GARCH estimates reveal a negative asymmetry coefficient for gold prices, which shows that positive news in the market affects the volatility of gold prices in the next period more than negative news. These results are of importance for gold price forecasting, spot and derivatives pricing, and risk management.