1. Introduction
In the 1950s, Markowitz [1] introduced modern portfolio theory, according to which portfolio risk can be reduced to the level of systematic risk through diversification, i.e., by including different assets in the portfolio. Until the early 2000s, commodities, commodity derivatives and commodity indices were major components of adequately diversified portfolios because of their negative or weak positive correlation with traditional financial assets such as stocks and bonds [2,3,4,5]. Already in the early 1980s, Bodie and Rosansky [6] concluded from their research that a 40% share of commodity futures considerably reduces portfolio risk while increasing the expected rate of return. Similar benefits of diversifying an investment portfolio with commodities were reported by Jaffe [7], who investigated diversification by adding gold; Satyanarayan and Varangis [2], by adding commodity futures comprising the Goldman Sachs Commodity Index (GSCI); Froot [8], by adding oil and production-weighted indices of commodity futures; and Jensen et al. [9], by adding commodity futures.
Commodity and financial markets are considerably affected by many factors, including the development of e-commerce, the establishment of exchange-traded funds (ETFs) and passively managed index funds focused on commodities, the inflow of capital from financial investors into commodity markets, and financial and economic crises. These factors contributed to a greater integration of certain commodity markets and promoted the convergence of commodity and financial markets [10,11,12]. Putnam [13] identified the determinants influencing dependencies between the rates of return for the S&P 500 index and for portfolios composed of commodity futures (relating to commodity sectors, i.e., energy, foods and fibers, grains and oilseeds, livestock and precious metals) in the period October 1992–October 2013. He concluded that the dynamic correlations between the stock index and the portfolios composed of futures in energy, grains and oilseeds, precious metals and (to a lesser extent) foods and fibers may be explained with increasing accuracy (especially after May 2003) using macroeconomic indices and financial market indices. Also, as noted by Zaremba [14], after 2003 the rates of return for commodity indices have become increasingly dependent upon the general economic condition. The increasingly strong dependencies between markets resulted in greater difficulties in managing investment portfolios. Hence, especially since the beginning of the financial crisis in late 2007, both researchers and practitioners have focused on the dependence structure between commodity futures [15]. Correlation and volatility are central to many applied issues in finance, ranging from asset pricing through asset allocation to risk management [16].
The literature includes many studies addressing relationships between commodity prices, or rates of return on commodity prices, in the spot and futures markets, and the stability of these relationships over time. The studies differ in terms of data selection (spot prices, futures prices, data frequency, period of analysis); additionally, they either considered or disregarded the variability of these dependencies over time. Moreover, the analyses often focus on relationships between pairs of commodities only. Many studies analyzed dependencies between commodities and macroeconomic variables (exchange rate, interest rate and index price), or between crude oil and other commodities, rather than investigating relationships among other commodities with a multivariate method. An extensive review of such research was presented by Attaf et al. [12].
For many years, the Pearson linear correlation coefficient has been the most widely used measure of dependence between the rates of return on asset prices. However, it fails to account for temporal changes in the correlations, which are particularly noticeable in periods of declining prices of many assets, when investors sell off their holdings. In these periods, the Pearson linear correlation coefficient underestimates the dependence between assets [17]. An increased dependence between assets means that in periods of falling asset prices, well-diversified portfolios become riskier. Another downside of the Pearson linear correlation coefficient is that it provides a reliable dependence measure only for elliptically distributed assets [18]. When non-elliptical distributions are analyzed, the problem of non-subadditive risk measures appears (the risk of the investment portfolio measured by Value-at-Risk (VaR) may be higher than the sum of the VaRs of the individual components of the portfolio). Moreover, the Pearson correlation coefficient only measures linear dependence between two assets. An $N$-asset portfolio has $N(N-1)/2$ correlation coefficients (one for each pair of assets), which makes it difficult to control all of them.
The time-varying conditional correlations between the rates of return on assets may be modeled using multivariate volatility models, such as Multivariate Generalized Autoregressive Conditional Heteroskedasticity (MGARCH) models [19]. Nevertheless, for portfolios consisting of more than six assets it is difficult to estimate the large number of parameters in these models. The Dynamic Conditional Correlation (DCC) model [20] is a parsimonious parametric model that facilitates estimation for portfolios comprising a large number of assets. A drawback of this model lies in the restrictions imposed on the multivariate joint distribution describing the structure of conditional dependencies between rates of return on assets, and on the marginal distributions of these rates of return. This drawback does not apply to copula Generalized Autoregressive Conditional Heteroskedasticity (copula-GARCH) models with dynamic (DCC) estimation of the correlation [21]. Copula models enable the modeling of the multivariate joint distribution defining the structure of the dependence between the rates of return on assets independently of the marginal distributions of these rates of return [21]. Static copula models are not flexible enough to describe the dynamics of relationships between the returns on assets in the markets. In contrast, dynamic copula models make it possible to capture the moments when the relationships change in strength and nature [22]. Jondeau and Rockinger [23] proposed the copula-GARCH approach to model the dependency between stock market rates of return. This approach, based on copula functions, comprises two steps: in the first step the univariate distributions are estimated, and in the second the joint distribution is estimated. In such an approach, the dependency parameter may simply be rendered conditional and time-varying. In turn, Engle and Kelly [24] introduced the Dynamic Equicorrelation (DECO) model, which eliminates the computational and presentational difficulties of the dependence structure of high-dimensional systems. The DECO model considers systems in which all pairs of rates of return on assets have the same correlation at a given moment, but the correlation varies over time. Nevertheless, the DECO model may be a poor tool for describing raw rates of return on assets; it should be applied to the GARCH standardized residuals [24]. A different approach is represented by the Implied Correlation Index (ICJ), which was introduced by the Chicago Board Options Exchange (CBOE) [25]. The ICJ measures the expected average correlation between rates of return on the S&P 500 index components. The Implied Correlation Index is based on options written on the 50 largest companies in the S&P 500 index. Cambell et al. [26] proposed the Implied Correlation Index based on volatility estimation instead of option-implied volatility. Echaust and Just [27] used GARCH and GARCH-Filtered Historical Simulation (GARCH-FHS) approaches to estimate volatility and VaR in the implied correlation formula. They examined the dynamics and properties of the implied correlation estimates within various economic sectors of the commodity futures and stock markets in the period 2006–2017. Assets in commodity sectors were on average much less correlated than assets in stock sectors. The implied correlation in the analyzed sectors showed clustering properties, long memory, asymmetry, and co-movement with volatility. There are also works focusing on the long-run relationship or causality between markets (financial markets, commodity markets, financial and commodity markets) using linear and nonlinear cointegration and causality methods [28,29,30,31,32].
Due to the properties (heteroskedasticity, i.e., volatility clustering, asymmetry and fat tails) of time series of rates of return on commodities (commodity futures, commodity indices) [12,33,34,35,36,37,38,39,40,41], copula-GARCH models provide a useful tool for analyzing the dependencies between the time series. The GARCH family models (e.g., GARCH [42]; Taylor-Schwert GARCH (TS-GARCH) [43,44]; Exponential GARCH (EGARCH) [45,46]; Glosten-Jagannathan-Runkle GARCH (GJR-GARCH) [47]; Asymmetric Power Autoregressive Conditional Heteroskedasticity (APARCH) [48]; Threshold Autoregressive Conditional Heteroskedasticity (TARCH) [49]; and Fractionally Integrated GARCH (FIGARCH) [50]) capture and describe properties of univariate time series of rates of return. In turn, copulas allow the combination of any distributions of univariate series of rates of return into a multivariate distribution. The advantage of copula models stems from the fact that they separate the structure of dependencies from the marginal distributions [22]. The literature includes studies which rely on dynamic copula models to analyze time-varying dependencies within commodity futures portfolios or between commodity futures and traditional financial instruments [40,51]. The application of clustering methods to group the conditional correlation coefficients derived from copula-GARCH models indicates moments or periods of changes in the structure of conditional dependencies in the market for the assets under consideration. The approach which relies on the copula-GARCH model and disjoint clustering methods to identify changes in the structure of conditional dependencies in the spot market for precious metals was used by Wanat et al. [39]. They assumed that a specific conditional dependence structure pattern may be assigned to a given state of the market, while changes in the state of the market are related to drastic changes in the structure of conditional dependencies. However, their assumptions and the disjoint clustering method adopted in their studies made it possible to identify only the moments when changes occurred in the structure of dependencies. Just et al. [52] expanded that approach by proposing the use of the fuzzy clustering method to identify changes in the structure of conditional dependencies in the precious metals futures market. However, the empirical study focused only on momentous changes in the structure of conditional dependencies.
This paper is a continuation of the authors’ previous research [52]. The authors propose the application of fuzzy clustering methods to identify changes in the structure of conditional dependencies and the market states in different commodity futures markets in the period 2000–2018, and to determine the time of transition from one state to another. This is a long-term study as the authors intend to identify the patterns of the conditional dependence structure and their changes in selected large and liquid markets in different sub-periods (stability and crises).
The main aim of this paper is to assess the conditional dependence structure in different commodity futures markets (energy, metals, grains and oilseeds, soft commodities, agricultural commodities) in the years 2000–2018. The specific purpose is to identify the moments or periods of change in the structure of conditional dependencies. The analyzed period is connected with the dynamic development of commodity derivatives trading markets and it was marked by economic and financial crises. In the years 2005–2018 the volume of exchange-traded futures and options for agricultural commodities and precious metals quadrupled; in the case of futures and options in energy and non-precious metals the volume of trade increased 8- and 15-fold, respectively [53,54]. In the late 1990s, financial markets were severely affected by crises coming from the emerging economies, which culminated in the Argentinean crisis in early 2002 [22]. In subsequent years this was followed by the food crisis (2006–2007), the global financial and economic crisis (late 2007–early 2013), including the subprime crisis (2007–2009), and the European debt crisis (2008–early 2013). These phenomena may have changed the structure of dependencies in the markets considered. The analysis focused on the dependencies between rates of return on prices of commodity futures, applying the copula-GARCH models and fuzzy clustering methods.
The authors’ contribution to the literature on the subject includes, firstly, the assessment of the conditional dependency structure in different commodity futures markets based on copula-GARCH models and fuzzy clustering methods. This approach applies various GARCH family models and copula models in order to determine dynamic conditional correlations (Kendall’s tau coefficients). Secondly, by employing the fuzzy c-means method, the authors extend knowledge of the dependence structure in markets by identifying the states of markets corresponding to specific patterns of conditional dependency structures assuming that the markets’ transition from one state to another may vary in intensity and may occur in different time frames. Therefore, this paper considerably supplements and broadens previous research on the structure of dependencies between commodity futures. The obtained results indicate that the structures of conditional dependencies in the commodity futures markets (agricultural commodities, soft commodities, grains and oilseeds, metals, energy) were changing in the period from 2000 to 2018. Two states were identified in the markets for agricultural commodities, soft commodities, grains and oilseeds and metals, while three states were found in the energy market. The strongest and relatively stable conditional dependencies existed between the rates of return on futures for commodities which are related, either being substitutes or raw materials in the production of other commodities. Findings from this study provide information on the structure of conditional dependencies in commodity futures markets. This data is required to gain insight into prevailing market mechanisms and to ensure valid risk aggregation, valuation and effective management of the investment portfolio.
The remaining part of this paper is structured as follows: Section 2 presents the data and methods used in the empirical study, Section 3 comprises the findings and conclusions along with their discussion, while Section 4 sums up the study.
2. Materials and Methods
The study used continuous series of daily closing prices for commodity futures from the period 2000–2018. Commodity futures are included in the analysis if they are covered by the Thomson Reuters Equal Weight Commodity Index (except for live cattle and lean hogs). Five classes of commodities (energy, metals, grains and oilseeds, soft commodities, agricultural commodities) were considered. The dataset was retrieved from a financial stock news website, stooq.pl [55]. The numbers of observations are: 4819 for energy, 4820 for metals, 4782 for grains and oilseeds, 4768 for soft commodities and 4783 for agricultural commodities. The components of the commodity classes (markets) are as follows: energy – crude oil (CL.F), heating oil (HO.F), natural gas (NG.F); metals – gold (GC.F), silver (SI.F), platinum (PL.F), copper (HG.F); grains and oilseeds – corn (ZC.F), wheat (ZW.F) and soybeans (ZS.F); soft commodities – cotton (CT.F), sugar (SB.F), cocoa (CC.F) and coffee (KC.F); agricultural commodities – corn (ZC.F), wheat (ZW.F), soybeans (ZS.F), soybean oil (ZL.F), cotton (CT.F), sugar (SB.F), cocoa (CC.F) and coffee (KC.F).
The calculations were based on daily percentage log-returns. The rates of return were calculated as $r_{i,t} = 100 \ln(P_{i,t}/P_{i,t-1})$, with $P_{i,t}$ denoting the closing price for contract $i$ at day $t$. The distributions of rates of return on the futures under consideration were leptokurtic and demonstrated very weak or moderate (negative or positive) asymmetry.
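As a minimal illustration of the return definition, the sketch below computes percentage log-returns from a short list of closing prices. The function name `pct_log_returns` and the price values are hypothetical; the study itself worked in R on the stooq.pl series.

```python
import math


def pct_log_returns(prices):
    """Daily percentage log-returns: 100 * ln(P_t / P_{t-1})."""
    if any(p <= 0 for p in prices):
        raise ValueError("prices must be strictly positive")
    # Pair each price with its predecessor; one observation is lost.
    return [100.0 * math.log(curr / prev) for prev, curr in zip(prices, prices[1:])]


closes = [50.0, 51.0, 49.5, 49.5]  # hypothetical closing prices
returns = pct_log_returns(closes)
```

A series of T + 1 prices yields T returns, which is consistent with each market having one fewer return than price observations.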
The relationships between the rates of return on quoted prices of commodity futures were assessed using the copula-GARCH models estimated in two stages. In the first stage the ARMA-GARCH models were fitted to one-dimensional series of returns, while in the second stage two-dimensional conditional copula models were fitted.
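A copula is a function that allows the component describing only the dependence structure to be extracted from the joint distribution of a random vector. The application of the conditional copula enables modeling of the joint distribution of an $N$-dimensional vector $r_t = (r_{1,t}, \ldots, r_{N,t})$ conditional on the set of information $\Omega_{t-1}$ available until and including $t-1$. The general conditional copula model has the following form [21]:

$$F_t(r_t \mid \Omega_{t-1}) = C_t\big(F_{1,t}(r_{1,t} \mid \Omega_{t-1}), \ldots, F_{N,t}(r_{N,t} \mid \Omega_{t-1}) \mid \Omega_{t-1}\big)$$

where: $C_t$ is the copula; $F_t$ is the joint distribution of $r_t$ at moment $t$; and $F_{i,t}$ are the marginal distributions of $r_{i,t}$ at moment $t$.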
This study assumes that the rates of return $r_{i,t}$ on prices of commodity futures are described with Autoregressive Moving Average (ARMA)-Generalized Autoregressive Conditional Heteroskedasticity (GARCH) models. The following designations were used for the rates of return modeled [42,56]:

$$r_{i,t} = \mu_{i,t} + \varepsilon_{i,t}, \qquad \varepsilon_{i,t} = \sigma_{i,t} z_{i,t}$$

where $\mu_{i,t}$ and $\sigma_{i,t}^2$ are the conditional mean and the conditional variance of $r_{i,t}$ given $\Omega_{t-1}$, and $z_{i,t}$ are the standardized innovations. The series index $i$ is omitted in the univariate models below.

In view of autocorrelation in some series of rates of return on commodity futures, respective ARMA $(p, q)$ models were used to model the conditional mean of returns:

$$\mu_t = \varphi_0 + \sum_{j=1}^{p} \varphi_j r_{t-j} + \sum_{j=1}^{q} \theta_j \varepsilon_{t-j}$$

Because a strong ARCH (Autoregressive Conditional Heteroskedasticity) effect was observed even with a one-day lag for the series of returns on commodity futures, a GARCH (1,1) model [42] with various distributions of innovations was used in order to model the conditional volatility of rates of return:

$$\sigma_t^2 = \omega + \alpha \varepsilon_{t-1}^2 + \beta \sigma_{t-1}^2$$

where: $\omega > 0$, $\alpha \ge 0$, $\beta \ge 0$. This model takes into account the volatility clustering phenomenon. Although simple, the GARCH (1,1) model delivers relatively good estimates of conditional volatility compared to more complex models. However, for some series of rates of return the models that take account of the leverage effect (i.e., a property such that negative shocks at $t-1$ have a stronger impact on the volatility of the return at $t$ than positive shocks) are better fitted than the GARCH (1,1) model [57]. The EGARCH (1,1) [45], GJR-GARCH (1,1) [47] and APARCH (1,1) [48] models with various distributions of innovations were also estimated in order to capture the asymmetric impact of positive and negative returns on conditional volatility. The EGARCH (1,1) model of Nelson takes into account the asymmetry effect:

$$\ln \sigma_t^2 = \omega + \alpha z_{t-1} + \gamma\left(|z_{t-1}| - E|z_{t-1}|\right) + \beta \ln \sigma_{t-1}^2$$

where: $\alpha$ captures the sign effect and $\gamma$ the size effect. The GJR-GARCH (1,1) model of Glosten, Jagannathan and Runkle is another model which also allows the measuring of the asymmetry effect:

$$\sigma_t^2 = \omega + \alpha \varepsilon_{t-1}^2 + \gamma I_{t-1} \varepsilon_{t-1}^2 + \beta \sigma_{t-1}^2$$

where: $\gamma$ now represents the leverage term, and $I_{t-1}$ takes value 1 for $\varepsilon_{t-1} \le 0$ and 0 otherwise. The APARCH (1,1) model of Ding, Granger and Engle models both the leverage effect and the Taylor effect (i.e., the sample autocorrelation of absolute returns is usually larger than that of squared returns):

$$\sigma_t^{\delta} = \omega + \alpha\left(|\varepsilon_{t-1}| - \gamma \varepsilon_{t-1}\right)^{\delta} + \beta \sigma_{t-1}^{\delta}$$

where: $\delta > 0$ plays the role of a Box-Cox transformation of the conditional standard deviation $\sigma_t$ and $\gamma$ reflects the leverage effect. The choice of the best ARMA-GARCH model from the group of models considered was based on information criteria and properties of the residuals.
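To make the difference between the symmetric and the asymmetric variance recursions concrete, the sketch below filters a short series of shocks through the textbook GARCH (1,1) recursion, sigma2_t = omega + alpha * eps_{t-1}^2 + beta * sigma2_{t-1}, and through its GJR extension, which adds gamma * eps_{t-1}^2 whenever the previous shock is non-positive. The function name, the parameter values and the choice of the sample variance as the starting value are assumptions made for illustration only; they are not the estimates or settings used in the study, which relied on the rugarch package.

```python
def garch_variance(eps, omega, alpha, beta, gamma=0.0):
    """Conditional variance path of GARCH(1,1); gamma > 0 gives GJR-GARCH(1,1)."""
    if omega <= 0 or alpha < 0 or beta < 0:
        raise ValueError("need omega > 0, alpha >= 0, beta >= 0")
    # Assumption: start the recursion at the sample variance of the shocks.
    sigma2 = [sum(e * e for e in eps) / len(eps)]
    for e in eps[:-1]:
        # Leverage term: only non-positive shocks add gamma * e^2.
        leverage = gamma * e * e if e <= 0 else 0.0
        sigma2.append(omega + alpha * e * e + leverage + beta * sigma2[-1])
    return sigma2


shocks = [1.0, -1.0, 0.5]  # hypothetical mean-corrected returns
symmetric = garch_variance(shocks, omega=0.1, alpha=0.1, beta=0.8)
asymmetric = garch_variance(shocks, omega=0.1, alpha=0.1, beta=0.8, gamma=0.05)
```

Both paths coincide after the positive shock, while the negative shock raises the next-day variance only in the asymmetric version, which is the leverage effect described above.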
This study assumes that $F_{i,t}$ is the distribution of standardized residuals $z_{i,t}$ from the model fitted to $r_{i,t}$. The copula-GARCH model assumes that the joint conditional distribution of the $N$-dimensional vector $r_t$ is modeled using a conditional copula with conditional correlations $R_t$. The copula correlation matrix is obtained from the DCC model [20]:

$$H_t = D_t R_t D_t, \qquad D_t = \mathrm{diag}(\sigma_{1,t}, \ldots, \sigma_{N,t})$$

$$R_t = \mathrm{diag}(Q_t)^{-1/2} \, Q_t \, \mathrm{diag}(Q_t)^{-1/2}$$

$$Q_t = (1 - a - b)\,\bar{Q} + a\, \tilde{z}_{t-1} \tilde{z}_{t-1}^{\top} + b\, Q_{t-1}$$

where: conditional variance $\sigma_{i,t}^2$ is modeled using a GARCH model; $\bar{Q}$ is the unconditional covariance matrix of $\tilde{z}_t$, where $\tilde{z}_{i,t} = t_{\nu}^{-1}(u_{i,t})$, with $u_{i,t} = F_{i,t}(z_{i,t})$, for Student’s $t$ copula ($\tilde{z}_{i,t} = \Phi^{-1}(u_{i,t})$ for Gaussian copula); $a$, $b$ are non-negative parameters such that $a + b < 1$. If $a$ and $b$ are equal to zero, the DCC model is reduced to the Constant Conditional Correlation (CCC) model [58]. In this study two-dimensional copula-ARMA-GARCH models were estimated using the maximum likelihood method, while the semi-parametric transformation method was applied for marginal innovations of the ARMA-GARCH fitted models. This study considered models with a Gaussian copula or a Student’s $t$ copula. The best copula model was selected based on information criteria. The calculations were performed in the R programming environment with the rmgarch [59] and rugarch [60] packages.
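The following sketch shows how a DCC(1,1)-type recursion turns two series of transformed residuals into a path of conditional correlations. It assumes the standard form Q_t = (1 - a - b) * Q_bar + a * z_{t-1} z_{t-1}' + b * Q_{t-1}, followed by rescaling Q_t to a correlation; the function name, the input values and the initialization Q_1 = Q_bar are illustrative assumptions, and no parameter estimation is attempted here (in the study this was done by maximum likelihood in rmgarch).

```python
import math


def dcc_correlations(z1, z2, a, b):
    """Conditional correlation path of a bivariate DCC(1,1) recursion."""
    if a < 0 or b < 0 or a + b >= 1:
        raise ValueError("need a >= 0, b >= 0 and a + b < 1")
    n = len(z1)
    # Q_bar: sample second moments of the two residual series.
    q11_bar = sum(x * x for x in z1) / n
    q22_bar = sum(y * y for y in z2) / n
    q12_bar = sum(x * y for x, y in zip(z1, z2)) / n
    # Assumption: the recursion starts at Q_bar.
    q11, q22, q12 = q11_bar, q22_bar, q12_bar
    rho = []
    for x, y in zip(z1, z2):
        # Rescale the current Q_t to a correlation coefficient.
        rho.append(q12 / math.sqrt(q11 * q22))
        # Update Q for the next period using today's residuals.
        q11 = (1 - a - b) * q11_bar + a * x * x + b * q11
        q22 = (1 - a - b) * q22_bar + a * y * y + b * q22
        q12 = (1 - a - b) * q12_bar + a * x * y + b * q12
    return rho


z_a = [1.0, -1.0, 0.5, -0.5]  # hypothetical transformed residuals
z_b = [0.8, -1.2, 0.4, -0.1]
rho_dcc = dcc_correlations(z_a, z_b, a=0.05, b=0.90)
rho_ccc = dcc_correlations(z_a, z_b, a=0.0, b=0.0)
```

Setting both parameters to zero leaves Q_t at Q_bar in every period, so the correlation path is constant, which mirrors the reduction of DCC to the CCC model.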
The best-known dependence measures, which are invariants of the copula, include Kendall’s tau coefficient and Spearman’s rho coefficient. In this study Kendall’s tau coefficient was used to assess the strength of the relationship between returns on prices of commodity futures. If $(X_1, X_2)$ is a vector of a pair of random variables and $(\tilde{X}_1, \tilde{X}_2)$ is an independent copy of $(X_1, X_2)$, Kendall’s tau coefficient is expressed as follows [22]:

$$\tau(X_1, X_2) = P\big[(X_1 - \tilde{X}_1)(X_2 - \tilde{X}_2) > 0\big] - P\big[(X_1 - \tilde{X}_1)(X_2 - \tilde{X}_2) < 0\big]$$

If variables $X_1$ and $X_2$ are correlated with an elliptical copula (e.g., a Gaussian or a Student’s $t$ copula) and the correlation coefficient is $\rho$, then Kendall’s tau coefficient is expressed as follows:

$$\tau(X_1, X_2) = \frac{2}{\pi} \arcsin \rho$$
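For elliptical copulas the mapping from the copula correlation to Kendall’s tau is the closed form tau = (2 / pi) * arcsin(rho). The short sketch below applies it to a few hypothetical correlation values; the function name is illustrative.

```python
import math


def kendall_tau_from_rho(rho):
    """Kendall's tau implied by correlation rho under an elliptical copula."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    return 2.0 / math.pi * math.asin(rho)


# Hypothetical conditional correlations on three days.
taus = [kendall_tau_from_rho(r) for r in (0.0, 0.5, 1.0)]
```

Because arcsin is monotone, a path of conditional correlations maps to a path of Kendall’s tau values with the same ordering, and for correlations strictly between 0 and 1 in absolute value tau is smaller in magnitude than rho.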
Changes in the conditional dependence structure in the commodity futures markets were identified using the fuzzy c-means clustering method.
In a general sense, clustering is a process of making groups of similar objects [61]. The purpose of clustering is to separate object clusters (groups) which are relatively homogeneous in terms of their characteristic properties. Each cluster includes objects similar to one another in terms of the criterion considered, while they differ from objects in other clusters. The disjoint clustering methods are the most widely used. They typically make it possible to assign properties of only one structure type to an object. Such a definite identification does not reflect true structure, because in practice most objects have properties of many types. The application of classical clustering methods is burdened with some restrictions, which often result in an oversimplification of the actual course of investigated phenomena. It is often very difficult to discover the actual structure of clusters due to data imperfections, such as uncertainty, incompleteness, etc. Since ancient times the terms “uncertainty” and “incompleteness” have had pejorative connotations and have been considered to reflect a lack of knowledge. This changed greatly following the publication of a paper titled “Fuzzy Sets” by Zadeh [62] in the journal ‘Information and Control’, also presenting the foundations of a new infinite-valued logic which uses values from the [0, 1] interval. That theory emerged because of the need to describe highly complex phenomena or poorly defined terms, which could not be precisely described with classical mathematical methods.
For these reasons it is more appropriate to define the degree to which particular objects belong to each of the states identified. Such a clustering method is made possible by the application of clustering methods based on fuzzy sets. In the fuzzy approach an object may belong to more than one cluster. Intuitively, a fuzzy cluster consists of objects which belong to it fully or partially. Transition from membership to non-membership is a gradual process, in contrast to a traditional clustering procedure where an object either is or is not a cluster member.
Clustering is a complex process which includes the following stages:
Stage 1. Defining the main clustering criterion.
Stage 2. Selecting objects.
Stage 3. Selecting variables and normalizing their values.
Stage 4. Setting a system of weights for the variables.
Stage 5. Selecting a measure of proximity (similarity, distance).
Stage 6. Selecting a clustering method.
Stage 7. Determining the number of clusters.
Stage 8. Performing the appropriate clustering procedure.
Stage 9. Identifying and describing structure types (states).
Prior to the clustering procedure the main clustering criterion (e.g., the states of the markets corresponding to typical patterns of the conditional dependency structure) (Stage 1) and the objects (e.g., moments in time) to be clustered (Stage 2) need to be defined. The correct selection of variables is a key stage in the clustering procedure (Stage 3). This is done based on substantive and statistical analyses. Once determined, the values of $K$ variables for $T$ moments are set together into a $T \times K$ data matrix:

$$X = [x_{tk}] = \begin{bmatrix} x_{11} & \cdots & x_{1K} \\ \vdots & \ddots & \vdots \\ x_{T1} & \cdots & x_{TK} \end{bmatrix} = \begin{bmatrix} x_1 \\ \vdots \\ x_T \end{bmatrix}$$

where: $x_{tk}$ is the value of variable $k$ at time $t$, and $x_t = (x_{t1}, \ldots, x_{tK})$ is a row vector composed of values of $K$ variables at time $t$.
The variables describing the objects in the study may be expressed in different units. Furthermore, they also differ in the ranges of variation between their maximum and minimum values. To eliminate differences between the values of variables they need to be normalized, e.g., through standardization. If all variables are expressed in the same units, their values do not need to be normalized. This is the case in this study.
Sometimes the variables differ in how valid they are for the clustering process. Their validity may be determined using respective weight coefficients (Stage 4). Weights may be established using three methods: statistical, substantive and integrated. There is no universally accepted approach concerning the weight coefficient system for the variables. As a result, empirical studies frequently use identical weights for all variables and such a system was also adopted in this paper.
The clustering procedure is based on distances between pairs of multi-variable objects $x_t$ and $x_s$ [63,64,65] (Stage 5). The Minkowski distance is one of the most widely used distance metrics [63]:

$$d_p(x_t, x_s) = \left(\sum_{k=1}^{K} |x_{tk} - x_{sk}|^{p}\right)^{1/p}$$

If $p = 1$, it is referred to as the Manhattan distance; at $p = 2$, it becomes the Euclidean distance; as $p \to \infty$, it is the Chebyshev distance. The Minkowski distance is a general distance formula and is used to calculate the similarity of objects described by $K$ variables. The application of the Manhattan distance results in a cubic clustering; in turn, the Euclidean distance is used for spherical clustering. In cubic clustering the clusters take the form of hypercubes, and in spherical clustering, hyperspheres [63]. Another type of distance used for clustering purposes is the Mahalanobis distance, adopted in the case of spherical clustering when objects are assessed for similarity in terms of linear relationships between variables [63]. It should be noted that no universal method exists. Moreover, all methods are subject to restrictions related to their “legibility”, which deteriorates with an increase in the number of objects. In this paper, the study was conducted applying an algorithm which requires determining the number of clusters and a preliminary clustering of the data set. In the next steps of the clustering process, the objects are moved from their clusters to others so that within a given cluster they differ as little as possible from certain cluster representatives (prototypes). The iterative process is repeated until the clustering attains a certain predefined stability level [65,66]. The most popular methods used for that purpose include the k-means method and its fuzzy version, the fuzzy c-means method (Stage 6).
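The three special cases of the Minkowski distance can be checked with a few lines of code. This is a minimal sketch with a hypothetical function name and toy vectors; the Chebyshev case is handled separately as the limit of the general formula.

```python
import math


def minkowski(x, y, p):
    """Minkowski distance of order p between two equal-length vectors."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same length")
    diffs = [abs(a - b) for a, b in zip(x, y)]
    if p == math.inf:
        # Limit p -> infinity: the largest coordinate difference (Chebyshev).
        return max(diffs)
    if p < 1:
        raise ValueError("p must be at least 1")
    return sum(d ** p for d in diffs) ** (1.0 / p)


u, v = (0.0, 0.0), (3.0, 4.0)  # hypothetical objects with K = 2 variables
manhattan = minkowski(u, v, 1)
euclidean = minkowski(u, v, 2)
chebyshev = minkowski(u, v, math.inf)
```

For these two points the distance shrinks as the order grows, from the sum of the coordinate differences to the largest single difference.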
The application of fuzzy clustering methods often requires predetermination of the initial clustering of objects. The simplest way of doing so is to randomly assign the objects to clusters. However, the outcomes of the clustering procedure are then not always satisfactory. In the statistical literature, most authors favor taking the initial clustering from the outcomes of another clustering method [65,67,68,69,70,71].
Clustering of objects requires determination of the number of clusters (Stage 7). This can be done in various ways [72,73]. In this paper, the number of clusters was specified in two steps. In the first step disjoint clusters were generated using the k-means method and they were evaluated using clustering quality indices. The most commonly used are indices or functions enabling the selection of the best partition of a population into clusters, i.e., the Caliński-Harabasz index [74], the concordance index [75], the Hubert-Levin index [76], the Krzanowski-Lai index [77], the Hartigan index [61], the silhouette index [78], and the gap index [79]. It was decided in this study to apply the Caliński-Harabasz index [74] and the Krzanowski-Lai index [77].
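The Caliński-Harabasz clustering quality index can be written as:

$$CH(C) = \frac{\mathrm{tr}(B_C)/(C-1)}{\mathrm{tr}(W_C)/(T-C)}$$

where: $\mathrm{tr}(B_C)$ is the trace of the matrix of inter-class variance $B_C$; $\mathrm{tr}(W_C)$ is the trace of the matrix of intra-class variance $W_C$; $T$ is the number of objects; $C$ is the number of clusters. When $CH(C)$ reaches the (global or local) maximum for the number of clusters $C = C^{*}$, the best partition of the data is the partition into $C^{*}$ clusters.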
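A minimal sketch of the Caliński-Harabasz computation for a given disjoint partition is shown below. It assumes the usual sum-of-squares form of the between-cluster and within-cluster traces; the function name and the toy data are hypothetical, and the study itself used the clusterSim package in R.

```python
def calinski_harabasz(data, labels):
    """CH index: [tr(B)/(C-1)] / [tr(W)/(T-C)] for a disjoint partition."""
    T, K = len(data), len(data[0])
    clusters = sorted(set(labels))
    C = len(clusters)
    if not 1 < C < T:
        raise ValueError("need 1 < number of clusters < number of objects")
    grand = [sum(row[k] for row in data) / T for k in range(K)]
    tr_b = tr_w = 0.0
    for c in clusters:
        members = [row for row, lab in zip(data, labels) if lab == c]
        centroid = [sum(r[k] for r in members) / len(members) for k in range(K)]
        # Between-cluster scatter: size-weighted squared distance to the grand mean.
        tr_b += len(members) * sum((centroid[k] - grand[k]) ** 2 for k in range(K))
        # Within-cluster scatter: squared distances of members to their centroid.
        tr_w += sum((r[k] - centroid[k]) ** 2 for r in members for k in range(K))
    return (tr_b / (C - 1)) / (tr_w / (T - C))


points = [[0.0], [1.0], [10.0], [11.0]]  # hypothetical objects, K = 1
ch_good = calinski_harabasz(points, [0, 0, 1, 1])
ch_bad = calinski_harabasz(points, [0, 1, 0, 1])
```

The partition that separates the two visibly distinct groups scores far higher than the one that mixes them, which is the behavior the index is meant to reward. A partition with zero within-cluster scatter would cause a division by zero in this sketch.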
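The Krzanowski-Lai clustering quality index is defined as:

$$KL(C) = \left|\frac{DIFF(C)}{DIFF(C+1)}\right|$$

where

$$DIFF(C) = (C-1)^{2/K}\, \mathrm{tr}(W_{C-1}) - C^{2/K}\, \mathrm{tr}(W_C)$$

When $KL(C)$ reaches the first local maximum for the number of clusters $C = C^{*}$, the best partition of the population is the partition into $C^{*}$ clusters.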
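The Krzanowski-Lai index only needs the within-cluster traces for consecutive numbers of clusters. The sketch below assumes the standard definition, DIFF(C) = (C-1)^(2/K) * tr(W_{C-1}) - C^(2/K) * tr(W_C) and KL(C) = |DIFF(C) / DIFF(C+1)|; the function name and the trace values are hypothetical.

```python
def krzanowski_lai(tr_w, n_vars):
    """Map each admissible cluster count C to KL(C), given tr(W_C) per C."""

    def diff(c):
        scale = 2.0 / n_vars
        return (c - 1) ** scale * tr_w[c - 1] - c ** scale * tr_w[c]

    # KL(C) needs tr(W) for C - 1, C and C + 1.
    return {
        c: abs(diff(c) / diff(c + 1))
        for c in sorted(tr_w)
        if c - 1 in tr_w and c + 1 in tr_w
    }


# Hypothetical within-cluster traces for C = 1..4 with K = 2 variables.
traces = {1: 100.0, 2: 20.0, 3: 18.0, 4: 17.0}
kl = krzanowski_lai(traces, n_vars=2)
best_c = max(kl, key=kl.get)
```

In this toy case the within-cluster scatter drops sharply when moving from one to two clusters and barely afterwards, so the index peaks at two clusters. A zero value of DIFF(C+1) would cause a division by zero in this sketch.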
In the second step of clustering, the number of clusters was identical to that determined using the disjoint clustering procedure for the same data matrix and the clustering was performed applying the fuzzy c-means method [80,81,82,83] (Stage 8).
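The fuzzy clustering problem was presented as a nonlinear mathematical programming problem [80,82,83]:

$$\min_{U, V} \; J_m(U, V) = \sum_{c=1}^{C} \sum_{t=1}^{T} u_{ct}^{m} \sum_{k=1}^{K} (x_{tk} - v_{ck})^2 \tag{21}$$

with the following conditions:

$$u_{ct} \in [0, 1], \quad c = 1, \ldots, C; \; t = 1, \ldots, T \tag{22}$$

$$\sum_{c=1}^{C} u_{ct} = 1, \quad t = 1, \ldots, T \tag{23}$$

$$0 < \sum_{t=1}^{T} u_{ct} < T, \quad c = 1, \ldots, C \tag{24}$$

where:
$T$: number of moments in time (days);
$C$ ($1 < C < T$): number of fuzzy clusters;
$K$: number of variables;
$m$: parameter used to adjust the degree of fuzziness for the clustering process;
$U = [u_{ct}]$: a $C \times T$ matrix of membership degrees of the objects in fuzzy clusters;
$V = [v_{ck}]$: a $C \times K$ matrix of cluster centroids;
$X = [x_{tk}]$: a $T \times K$ data matrix, with $x_{tk}$ representing the normalized value of variable $k$ in object $t$.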
The mathematical programming problem shown above (21)–(24) was presented for the first time by Dunn [84] with $m = 2$, and its generalized form was provided by Bezdek [80] for $m > 1$. That parameter can be referred to as a parameter of fuzziness, because if $m \to \infty$, the resulting clusters are completely fuzzy ($u_{ct} = 1/C$), whereas if $m \to 1$, the clustering process becomes quasi-deterministic and $u_{ct}$ values are close to 0 or 1. To date no theoretical foundations have been presented for the selection of an optimal value of parameter $m$. This parameter is selected based on empirical research, which indicates that it should take values from the [1.3, 1.4] interval [64,85].
As a result of the fuzzy clustering process, each object is assigned to each cluster with a certain degree of membership, being a number from the [0, 1] interval, while for each object the sum of all degrees of membership is 1. The degree of membership specifies the degree to which an object belongs to a specific cluster. The higher the degree of membership, the more specifically the object is characterized by variables of that cluster. The solution to the fuzzy clustering problem takes the form of a table composed of the degrees of membership in individual clusters. The resulting partition may easily be converted into disjoint clusters by applying the principle according to which a given object is assigned to the cluster for which its degree of membership is the highest. The clustering methods based on fuzzy sets provide a much greater amount of information on the clustering of objects than classical methods, which only allow the unambiguous assignment of each element to one of the clusters.
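The alternating scheme commonly used to solve the fuzzy c-means problem can be sketched as follows: memberships are updated from squared Euclidean distances to the current centroids, centroids are then recomputed as membership-weighted means, and the two steps repeat until the centroids stop moving. The function names, the toy data, the starting centroids, the value m = 1.5 and the stopping rule are all assumptions for illustration; this is not the fclust implementation used in the study.

```python
def fuzzy_c_means(data, centroids, m=1.5, tol=1e-8, max_iter=200):
    """Alternate membership and centroid updates; returns (U, V), U[c][t]."""
    if m <= 1:
        raise ValueError("fuzziness parameter m must exceed 1")
    C, T, K = len(centroids), len(data), len(data[0])
    V = [list(v) for v in centroids]
    for _ in range(max_iter):
        U = [[0.0] * T for _ in range(C)]
        for t, x in enumerate(data):
            d2 = [sum((x[k] - V[c][k]) ** 2 for k in range(K)) for c in range(C)]
            hits = [c for c in range(C) if d2[c] == 0.0]
            if hits:
                # Object coincides with a centroid: full membership there.
                for c in hits:
                    U[c][t] = 1.0 / len(hits)
            else:
                w = [d ** (-1.0 / (m - 1.0)) for d in d2]
                total = sum(w)
                for c in range(C):
                    U[c][t] = w[c] / total
        new_V = []
        for c in range(C):
            weights = [u ** m for u in U[c]]
            total = sum(weights)
            new_V.append([sum(w * x[k] for w, x in zip(weights, data)) / total
                          for k in range(K)])
        shift = max(abs(new_V[c][k] - V[c][k]) for c in range(C) for k in range(K))
        V = new_V
        if shift < tol:
            break
    return U, V


def harden(U):
    """Assign each object to the cluster with its highest membership degree."""
    return [max(range(len(U)), key=lambda c: U[c][t]) for t in range(len(U[0]))]


points = [[0.0], [1.0], [10.0], [11.0]]  # hypothetical objects, K = 1
start = [[0.0], [11.0]]                  # e.g., centroids from a k-means run
U, V = fuzzy_c_means(points, start, m=1.5)
hard_labels = harden(U)
```

Each column of the membership matrix sums to one, and the hardening step recovers a disjoint partition while the membership degrees themselves retain the information on how firmly each object belongs to its cluster.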
The next stage of the procedure consists in identifying the types, i.e., the states of the markets (Stage 9). The identification has a formal and a substantive part: formal identification assigns a name to each state, whereas substantive identification characterizes the state using descriptive statistics.
It was assumed that a typical conditional dependence structure pattern may be assigned to a given state of the market and that considerable changes in the conditional dependence structure mark the transition from one state of the market to another. For each market i (i = 1, 2, 3, 4, 5), the Ki Kendall’s tau coefficients (K1 = 3 for energy, K2 = 6 for metals, K3 = 3 for grains and oilseeds, K4 = 6 for soft commodities, K5 = 28 for agricultural commodities) for the pairs of rates of return on prices of commodity futures, considered at Ti moments in time (T1 = 4818 for energy, T2 = 4819 for metals, T3 = 4781 for grains and oilseeds, T4 = 4767 for soft commodities, T5 = 4782 for agricultural commodities), were arranged into a Ti × Ki data matrix (the variable is Kendall’s tau coefficient; the object is the point in time (day)). For the number of clusters ranging from 2 to 10, sequences of disjoint clusters were generated using the k-means algorithm. The calculations were performed in the R programming environment using the clusterSim package [86]. The partitions were evaluated with the Caliński-Harabasz and Krzanowski-Lai clustering quality indices. The initial clustering result obtained using the k-means method provided a starting point for the fuzzy clustering procedure based on the fuzzy c-means method. These calculations were performed in the R environment using the fclust package [87].
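The step from an initial hard partition to a fuzzy one can be sketched in plain Python. This is a simplified illustration of the textbook fuzzy c-means iteration and not a reproduction of the fclust implementation; the function name, the toy data (Kendall’s tau values for two hypothetical pairs of futures on seven days) and the starting centroids, which stand in for a k-means result, are all assumptions.

```python
def fuzzy_c_means(data, centroids, m=1.35, n_iter=100, tol=1e-8):
    """data: T rows of K values; centroids: C starting rows of K values.
    Returns the T x C table of membership degrees and the final centroids."""
    power = 2.0 / (m - 1.0)
    n_vars = len(data[0])
    for _ in range(n_iter):
        # Membership update from Euclidean distances to the centroids.
        u = []
        for x in data:
            d = [max(sum((a - b) ** 2 for a, b in zip(x, v)) ** 0.5, 1e-12)
                 for v in centroids]
            u.append([1.0 / sum((dc / dj) ** power for dj in d) for dc in d])
        # Centroid update: means weighted by membership raised to m.
        updated = []
        for c in range(len(centroids)):
            w = [row[c] ** m for row in u]
            total = sum(w)
            updated.append([sum(wi * x[k] for wi, x in zip(w, data)) / total
                            for k in range(n_vars)])
        shift = max(abs(a - b)
                    for v, nv in zip(centroids, updated)
                    for a, b in zip(v, nv))
        centroids = updated
        if shift < tol:
            break
    return u, centroids


# Hypothetical tau values: weak dependence, one transitional day, strong dependence.
data = [(0.10, 0.12), (0.12, 0.10), (0.11, 0.11), (0.30, 0.31),
        (0.50, 0.52), (0.52, 0.50), (0.51, 0.51)]
start = [(0.11, 0.11), (0.51, 0.51)]  # assumed centroids of a k-means partition
u, centroids = fuzzy_c_means(data, start)
```

In this toy example the fourth day lies between the two states and receives a clearly split membership, whereas the remaining days belong almost entirely to one state.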
3. Results and Discussion
The types of ARMA-GARCH models suitable for the one-dimensional time series of returns on commodity futures prices are shown in Table 1.
The series of rates of return covered by this study extend over a long period marked by crises and perturbations in the markets analyzed. Hence, the fitted conditional variance models are GARCH(1,1) or asymmetric GARCH(1,1) with a skewed Student’s t distribution (except for heating oil and cotton futures). In accordance with the procedure described in Section 2, the conditional variance models were fitted first, followed by the two-dimensional conditional copula models (Table 2). In most cases, these were conditional Student’s t copula models with DCC dynamics or conditional Gaussian copula models with DCC dynamics.
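For readers unfamiliar with the conditional variance step, the symmetric GARCH(1,1) recursion can be sketched as below. The parameter values and residuals are hypothetical and are not taken from Table 1; the asymmetric variants and the skewed Student’s t innovations used in the study are not covered by this fragment.

```python
import math


def garch11_variance(residuals, omega, alpha, beta):
    """Conditional variances from the GARCH(1,1) recursion
    sigma2[t] = omega + alpha * eps[t-1]**2 + beta * sigma2[t-1],
    started at the unconditional variance omega / (1 - alpha - beta)."""
    if not (omega > 0 and alpha >= 0 and beta >= 0 and alpha + beta < 1):
        raise ValueError("parameters must give a positive, stationary variance")
    variance = [omega / (1.0 - alpha - beta)]
    for eps in residuals[:-1]:
        variance.append(omega + alpha * eps ** 2 + beta * variance[-1])
    return variance


# Hypothetical mean-equation residuals and parameters.
residuals = [0.01, -0.03, 0.002, 0.015]
variance = garch11_variance(residuals, omega=1e-6, alpha=0.05, beta=0.90)

# Standardized residuals, the usual input to a copula fitted in a second step.
standardized = [e / math.sqrt(v) for e, v in zip(residuals, variance)]
```

The two-step order mirrors the text: the marginal variance models come first, and the dependence model is fitted to what remains after standardization.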
The dynamic Kendall’s tau coefficients were used to assess the strength of conditional dependencies between the returns on prices of commodity futures in the markets for energy, metals, grains and oilseeds, soft commodities and agricultural commodities. The resulting Kendall’s tau coefficients (Figures 1–5) were clustered using the fuzzy c-means method (Figures 6–10).
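For Gaussian and Student’s t copulas, Kendall’s tau is linked to the copula correlation parameter ρ by the known relationship τ = (2/π)·arcsin(ρ). The sketch below applies it to a time-varying correlation path; that the dynamic coefficients in this study were obtained exactly this way is an assumption, and the correlation values are hypothetical.

```python
import math


def kendall_tau_from_correlation(rho):
    """Kendall's tau implied by the correlation parameter of an
    elliptical (Gaussian or Student's t) copula."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    return 2.0 / math.pi * math.asin(rho)


# Hypothetical DCC correlation path for one pair of futures.
rho_path = [0.20, 0.35, 0.50, 0.80]
tau_path = [kendall_tau_from_correlation(r) for r in rho_path]
```

Because the mapping is increasing in ρ, a rise in the dynamic correlation always translates into a rise in the dynamic Kendall’s tau.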
The structure of conditional dependencies in commodity futures markets varied between 2000 and 2018, as confirmed by the different states of the markets identified (Figures 6–10, Tables 3–7). In the period from late 2007 to early 2013, marked by the global economic and financial crisis (the 2007–2009 subprime crisis and the 2008–early 2013 European debt crisis), changes in the structure of these dependencies were particularly noticeable in those commodity futures markets in which rates of return on quoted prices were generally weakly correlated (soft commodities; agricultural commodities other than grains and oilseeds). As regards futures contracts for agricultural commodities, the strongest dependence between the rates of return was recorded for soybeans and soybean oil futures (Figure 1, Table 3). Soybeans are used in the production of many foods because of their high protein content, with approximately 2/3 of produced soybeans being processed into soybean oil and soybean meal. In turn, soybean oil is used to produce cooking oils, margarine, mayonnaise and salad dressings, as well as in the chemical industry and in biodiesel production [88]. An increase in dependencies between soybeans and soybean oil futures contracts was observed in 2004, with stronger relationships recorded over the next nine years. One of the reasons for the increased dependencies may have been the development of the biofuel market. The requirement to add ethanol to fuels was introduced in 2001 by the European Union and in 2005 by the U.S. Congress [89]. Two states of the market were identified in the agricultural commodity futures market (Figure 6, Table 3). The correlation between the rates of return on futures prices in the agricultural commodity market became stronger in early 2008 and remained at a similar level until mid-October 2012. In approximately the same period, stronger dependencies were also recorded in the futures market for soft commodities, where two states of the market were also identified (Figure 7, Table 4).
The changes described were not observed in the structure of conditional dependencies between the rates of return on prices of grains and oilseeds futures. In this market, two patterns of conditional dependency structure were identified (Figure 8, Table 5). The dependencies between the rates of return on prices of grains and oilseeds futures were moderate or weak. Longer periods of stronger dependencies were recorded from October 2006 to the end of January 2008 and from the end of September 2009 to the end of 2018. In contrast, 2009 saw a decline in the correlation between the rates of return on the analyzed futures (Figure 3), as their prices peaked at record levels at different times and then dropped dramatically. The stronger correlations between the rates of return on wheat, corn and soybeans futures in October 2006 were caused by a considerable increase in prices of these futures. During the boom in the market for raw materials, wheat futures prices quadrupled, while corn and soybeans futures prices approximately tripled. The literature usually lists the following determinants of price growth in the grains and oilseeds market: the development of the biofuel market and the related increase in the correlation between cereals and oilseeds prices and crude oil prices; increased demand caused by economic growth in Asian countries, particularly China; trade liberalization; low stocks of raw materials; underinvestment in agriculture; an increase in fertilizer prices; adverse weather conditions; a loose monetary policy (particularly in the US before 2007), which stimulated physical and speculative demand for raw materials; and speculation in the financial markets [5, 89, 90, 91, 92, 93, 94]. The strengthening of dependencies in the futures market for grains and oilseeds is related to an increase in trade volumes in that market. This increase was related to the transition of the Chicago Board of Trade (CBOT) from the open outcry trading system to a new, more efficient online transaction-matching system, as well as to the introduction of ETFs and the inflow of capital from financial investors [10]. Opening the market to a broader group of investors resulted in a considerable increase in the number of transactions executed using the online transaction-matching system in the years 2006–2008. According to Irwin and Sanders [10], the enlargement of the group of market players contributed to a decrease in the risk premium and thus reduced hedging costs for raw material producers and processors. In turn, this could also have led to reduced price volatility in commodity markets and to the growing integration between commodity and financial markets.
The economic and financial crises (late 2007–early 2013) had no marked impact on the dependency structure in the metal futures market, in which two states of the market were identified (Figure 9, Table 6). The structure of conditional dependencies between the rates of return on prices of metal futures changed in 1Q 2004. Afterwards, the rates of return on metal futures were correlated more strongly (Kendall’s tau coefficients increased from approx. 0.1–0.3 to 0.4–0.6), which was particularly evident for the platinum–gold and platinum–silver pairs. A relatively strong and stable relationship existed throughout the study period between the rates of return on gold and silver futures (Figure 4, Table 6). These conclusions are consistent with the findings reported by Sensoy [95], who, when analyzing the dynamic conditional correlations in the market of precious metals (gold, silver, platinum and palladium) in the years 1999–2013, found a strong correlation between precious metals in the past decade. This makes diversification less beneficial and indicates a convergence towards one class of assets. When investigating the conditional dependence structure between precious metal rates of return, Wanat et al. [39] also recorded a change on April 29, 2004. Since this study applied the fuzzy c-means method, the degrees of membership of moments (days) in the identified states of the market clearly show that the transition from one state to another in the metal futures market took longer and lasted two years (2004–2005). In those years, the correlation between the rates of return on metal futures was increasing. It was also a period marked by a considerable rise in metal futures prices. These findings do not corroborate the general observation that in the financial markets the dependencies between assets tend to become stronger in periods of declining prices [96]. Conversely, our results are consistent with those reported by Attaf et al. [12], who, when studying the period 1960–April 2014, found stronger relationships between most analyzed spot markets for non-energy commodities (metals and minerals, fats and oils, grains, other foods, beverages, agricultural raw materials) during periods of price increases in those markets.
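How a gradual transition between states can be read from the degrees of membership may be sketched as follows. The 0.9 threshold, the function name and the membership table are hypothetical choices made for illustration only; the source does not state a numerical criterion.

```python
def transition_days(membership_table, threshold=0.9):
    """Indices of days whose highest degree of membership is below the
    threshold, i.e., days not clearly assigned to any single state."""
    return [t for t, row in enumerate(membership_table)
            if max(row) < threshold]


# Hypothetical degrees of membership of six consecutive days in two states.
membership_table = [
    [0.98, 0.02],
    [0.95, 0.05],
    [0.70, 0.30],
    [0.45, 0.55],
    [0.08, 0.92],
    [0.03, 0.97],
]
ambiguous = transition_days(membership_table)
```

A long run of such ambiguous days indicates a slow change of regime, whereas a hard partition would report only a single switching date.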
Three patterns of conditional dependency structures were identified in the energy futures market (Figure 10, Table 7). The structure of conditional dependencies underwent a major change in early 2Q 2002 and again in April 2003. Between then and mid-2006, the correlation between the rates of return on energy futures prices was greater than in the other periods. During most of that period, contract prices followed an upward trend. This was caused by a sharp rise in crude oil prices, which began in March 2003 following the invasion of Iraq. Global economic growth is believed to be the main cause of the increase in crude oil prices in the years 2003–2008 [97]. During the subprime crisis (February 2008–January 2009), the energy futures market was also in the state characterized by the highest level of correlation. A strong and relatively stable relationship existed between the rates of return on crude oil and heating oil futures (Figure 5, Table 7). That relationship was caused by the links between these basic commodities (heating oil is produced by the distillation of crude oil).
Based on these findings, it may be concluded that very weak correlations existed between the rates of return on prices of most futures in the markets for agricultural commodities and soft commodities. Conversely, moderate or strong and relatively stable relationships were found between rates of return on futures on fundamentally interrelated commodities (i.e., commodities which are substitutes or raw materials used to produce other commodities) (Figures 1–5, Tables 3–7).