Svoboda | Graniru | BBC Russia | Golosameriki | Facebook
Next Article in Journal
Cultural Heritage Tourism and Sustainability: A Bibliometric Analysis
Previous Article in Journal
No Stakeholder Is an Island in the Drive to This Transition: Circular Economy in the Built Environment
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Enhancing Forest Site Classification in Northwest Portugal: A Geostatistical Approach Employing Cokriging

1
Department of Geosciences, Environment and Land Planning, University of Porto, Rua Campo Alegre, 687, 4169-007 Porto, Portugal
2
Earth Sciences Institute (ICT), Pole of the FCUP, University of Porto, 4169-007 Porto, Portugal
3
Forest Research Centre, School of Agriculture, University of Lisbon, Tapada da Ajuda, 1349-017 Lisbon, Portugal
4
Forest Research Centre, Associate Laboratory TERRA, School of Agriculture, University of Lisbon, Tapada da Ajuda, 1349-017 Lisbon, Portugal
*
Author to whom correspondence should be addressed.
Sustainability 2024, 16(15), 6423; https://doi.org/10.3390/su16156423
Submission received: 6 June 2024 / Revised: 7 July 2024 / Accepted: 24 July 2024 / Published: 26 July 2024
(This article belongs to the Section Sustainability in Geographic Science)

Abstract

:
Forest managers need inventory data and information to address sustainability concerns over extended temporal horizons. In situ information is usually derived from field data and computed using appropriate equations. Nonetheless, fieldwork is time-consuming and costly. Thus, new technologies like Light Detection and Ranging (LiDAR) have emerged as an alternative method for forest assessment. In this study, we evaluated the accuracy of geostatistical methods in predicting the Site Index (SI) using LiDAR metrics as auxiliary variables. Since primary variables, which were obtained from forestry inventory data, were used to calculate the SI, secondary variables obtained from LiDAR surveying were considered and multivariate kriging techniques were tested. The ordinary cokriging (CK) method outperformed the simple cokriging (SK) and Inverse Distance Weighted (IDW) methods, which was interpolated using only the primary variable. Aside from having fewer SI sample points, CK was proven to be a trustworthy interpolation method, minimizing interpolation errors due to the highly correlated auxiliary variables, highlighting the significance of the data’s spatial structure and autocorrelation in predicting forest stand attributes, such as the SI. CK increased the SI prediction accuracy by 36.6% for eucalyptus, 62% for maritime pine, 72% for pedunculate oak, and 43% for cork oak compared to IDW, outperforming this interpolation approach. Although cokriging modeling is challenging, it is an appealing alternative to non-spatial statistics for improving forest management sustainability since the results are unbiased and trustworthy, making the effort worthwhile when dense secondary variables are available.

1. Introduction

The Site Index (SI), or average dominant height of trees at a given age, is an ecological indicator and an essential tool for forest management [1]. This parameter is obtained from various forest metrics acquired during forest inventories and calculated using species-specific equations. Although field inventories provide useful information, they are time-consuming, costly, and prone to measurement inaccuracies [2]. To address these challenges, passive remote sensing has been employed as a method for gathering spatial information on forested areas [3]. However, it has limitations since it is sensitive to atmospheric conditions and depends on radiation emitted and reflected from the Earth’s surface. Although passive remote sensing remains highly useful for vegetation categorization, it is critical to overcome its constraints. In this regard, an active remote sensing approach known as Light Detection and Ranging (LiDAR) has recently been employed to capture relevant and reliable information about forest structures [4]. LiDAR employs laser pulses to measure the distance between objects, with a sensor detector coupled to an aircraft or unmanned aerial vehicle (UAV) detecting the reflected pulses from those objects [5] and it is not affected by weather conditions. Most LiDAR data are collected as discrete returns and stored as point cloud data; however, it may also be recorded in full-waveform or single-photon form [6].
Derived products from LiDAR point clouds can be analyzed and interpolated using techniques like the Inverse Distance Weighted (IDW) method, which is a function of the distance between the observed and predicted points and is often used in field inventories to estimate values in unsampled regions [7,8,9]. Despite its ease of application, IDW has several disadvantages, such as not considering the spatial autocorrelation and variations at each anticipated site [8]. On the other hand, kriging and its variants, such as cokriging (CK), are stochastic methods that use geostatistical techniques to model the spatial correlation between regionalized variables through fitting theoretical model to experimental semivariograms to predict values at unsampled locations [10]. Furthermore, the kriging technique minimizes the variance estimate, making the best use of the available information, attempting to provide the best prediction of unobserved values and prevent systematic errors, thereby providing the best estimation [10].
Kriging methods have been used to predict many aspects of forest structures, such as above-ground biomass (AGB) [11], leaf area density [12] and site classification [13], which is correlated with dominant height (Hdom), indicating that kriging could perform better than alternative interpolation techniques. In the last example, the SI was interpolated using ordinary kriging because the number of sample plots (845 points) was significant. According to the authors [13], this makes kriging approaches useful and reliable for delineating SIs and considers geostatistical techniques an indispensable tool for forest inventory.
It is worth noting that insufficient sampling points due to technical or economic constraints or biased sampling designs can impact the estimation of the main variable of interest [14]. CK is a multivariate geostatistical technique extension of kriging that takes into account densely sampled secondary variables when main variables are sparsely sampled. Furthermore, for CK to be more accurate than kriging, the spatial correlation between the main and secondary variables must be significant [15]. There are different CK methods, such as simple CK (SCK), which requires the mean values to be known, and ordinary CK (OCK), which relies on the direct and cross-covariance or direct and cross-variograms [14,15,16,17].
The most common type of forest inventory data is characterized by limited and sparsely sampled points, which may not accurately reflect the SI information. This situation is compounded by the fact that most forest studies rely on non-spatial statistics, neglecting the spatial relationships among forest structure attributes. Therefore, the objective of this research was to showcase the potential of CK approaches in improving SI predictions. This involves considering the spatial correlation between undersampled forest inventory data and densely sampled LiDAR information, as well as leveraging Geographic Information System (GIS) tools for spatial data editing and analysis.
Furthermore, we evaluated the accuracy of CK techniques and compared the results with those of IDW approaches to validate the reliability of the selected method. Additionally, we aimed to offer valuable insights for forest management through the application of geostatistical interpolation techniques across the different species of the Vale do Sousa Forest, located in northern Portugal.

2. Materials and Methods

2.1. Study Site

The study area, Vale do Sousa, is situated in the northwest region of Portugal, and comprises two Forest Intervention Zones (ZIFs): Entre-Douro-e-Sousa and Castelo de Paiva. Established in Portugal in 2005, ZIFs are defined as delimited and continuous areas mostly covered by forest, managed by a single organization in accordance with forest management and firefighting plans [18]. Additionally, the study area includes eight municipalities: Arouca, Castelo de Paiva, Cinfães, Gondomar, Marco de Canaveses, Paredes, Penafiel, and Santa Maria da Feira.
The landscape is predominantly covered by eucalyptus (Eucalyptus globulus Labill), maritime pine (Pinus pinaster Aiton) [19], and sparse vegetation. A small portion of the study area is included in Natura 2000, a network of protected areas dedicated to preserving Europe’s most important species and habitats, spanning the 27 European Union member states [20]. Moreover, in recent years, the region has experienced recurring wildfires, with four of its municipalities ranking among the twenty with the highest number of fires in 2022 [21]. Additionally, the forest lands of Vale do Sousa are privately owned, fragmented, and generally small in size, with most areas covering less than 5 hectares [22].
The characterization of the Vale do Sousa landscape (Figure 1) was carried out in a previous study [23]. A buffer zone was created around both ZIFs, and a Sentinel-2 image from 2022 was preprocessed and classified using a machine learning (ML) approach. This information was essential for identifying the dominant species and various land uses, including for agriculture, and built areas. The buffer zone was established due to nearby forested regions being susceptible to wildfires, with building and agriculture being the most common categories within the buffer.

2.2. Inventory Data Collection

In this study, two forest inventory datasets were used to calculate the Site Index (SI) for the different forest species, specifically, eucalyptus (EC), maritime pine (MP), cork oak (CO), and pedunculate oak (PO). EC and MP are the most representative species due to their commercial value, while CO and PO, though less common, offer land reconversion opportunities because of their resilience to wildfires and other positive environmental impacts. Since the SI estimate relates to potential land productivity rather than forest parameters that can easily change over time, LiDAR data were utilized. The survey followed forest inventory field guidelines [24], defining sample plots as an area of 500 m² to collect general data on trees with a diameter at breast height (DBH) greater than 7 cm, and sub-plots of 12 m² to collect data on trees with smaller diameters, known as regeneration.
The fieldwork was conducted by the forest technician of the Vale do Sousa Forestry Association. The first survey carried out between August and September 2019 was limited to areas within the ZIF borders. A total of 158 plots were inventoried, with 17 parcels considered non-forested or inaccessible. Due to clear-cutting, fires, and other circumstances, 45 plots lacked information on forestland. Of the 158 plots, only 69 contained tree and plot information, while 89 were classified as regenerative.
The second survey took place between July and September 2022. Fieldwork was suspended in August due to the high risk of forest fires to ensure the safety of the forest technicians. A total of 113 plots were inventoried, with 11 classified as built areas and 47 as regenerative. The plots were established using a stratified random sampling strategy based on land cover classes and their proportion in the study site. Land cover classification was conducted using multispectral satellite images and forest inventory sampling plots. For this study, only the plots with tree parameters were used to calculate the Hdom, which enabled the estimation of the SI for the study area.

2.3. Aerial Data Collection and Variable Selection

To conduct the topographic survey of the study area, a private company was hired due to the need to complete the task within a short period. The LiDAR system was mounted on a PATERNAVIA P68C-TC aircraft model, known for its large coverage capability due to its flight autonomy. The system includes an aerial RGB digital camera with a resolution of 100 MP and a RIEGL VQ-1560i laser scanner.
The laser scanner employs waveform-LiDAR technology with dual laser channels, intersecting each other by 28° parallel lines (14° each), with a forward and backward scan angle in a non-nadir direction of ±8° at the edge, and a minimum range of 100 m. It boasts an accuracy and precision of 20 mm, with a scan angle range of 60° per channel, resulting in an effective FOV of 58°. It achieves a high laser pulse repetition rate of up to 2 MHz, resulting in more than 1.33 million measurements per second on the ground.
A wall-to-wall LiDAR aerial data acquisition of the study area was performed in September 2022, within the scope of the FIRE-RES project [25], during the leaf-on season. The acquisition took place during a single flight lasting 4 h, flying at 1600 m above ground at a speed of 220 km/h. This resulted in 600 lines per second surveyed, with a scan width of 1793 m, a point cloud density of 5 points per square meter, and an average ground point distance of 0.45 m.
Given the large area of the study site, which covers 28,942 hectares, tiles measuring 2 km × 2 km were created to store the point cloud data. The LiDAR points were classified as follows: (1) unclassified; (2) ground; (3) low vegetation height (0 m to 2 m); (4) medium vegetation height (2 to 25 m); (5) high vegetation height (>25 m); (6) building; (7) low point (noise); and (12) overlap points.
Raster data such as the Canopy Height Model (CHM) and Digital Elevation Model (DEM) were generated using points classified as ground (last and last of many) and non-ground (first and first of many) (Figure 2). Other raster files such as point density and intensity were generated considering all returns. Only classes 3, 4, and 5, corresponding to tree heights, were considered in generating the CHM.

2.4. Inventory and LiDAR Exploratory Data Analysis

Instead of processing metrics directly from the LiDAR point clouds, metrics were generated by interpolating continuous raster surfaces in ArcGIS Pro® 3.3. This approach was chosen to avoid violating spatial resolution invariance and to eliminate the spatial resolution dependence of the area-based approach. The resulting raster data have high spatial resolution, and the loss of spatial information is minimized. Moreover, ArcGIS Pro® is well-suited for spatial analysis, as it can manage, visualize, analyze, and share large point cloud datasets generated by LiDAR.
However, to compute metrics and statistics from raster files, we must ensure consistency between tree height values in both the spatial and field data. Initially, the CHM raster was normalized using the DEM. Subsequently, a tree height threshold was defined, using tree height information from the field data as a reference to estimate the height interval. The smallest trees had a height of around 2 m, while the tallest reached up to 50 m; considering that some species can exceed 40 m in height, the limit defined to generate CHM information ranged from 2 m to 60 m.
It is important to note that certain factors, such as diameter at breast height (DBH), height, and age, are required to compute Hdom. Consequently, plots lacking these values cannot be used to calculate the SI. Therefore, only plots from the forest inventory dataset that satisfied all these requirements were selected for SI estimation. Given that the PO and CO are often undersampled compared to the dominant species in the study area, CK will be advantageous in predicting the SI for these species as well.
Since each plot covers an area of 500 m², the first five trees with the largest DBH were utilized to calculate the Hdom. Subsequently, the SI was calculated for EC and MP using Equation (1), for PO using Equation (2), and for CO using Equation (3), as outlined below [18,26,27]:
S I = A H d o m A t t p n
S I = 129.0321 H d o m 129.0321 ( 60 t ) 0.301881
S I = A 1 1 A H d o m t t p n
where SI is the Site Index (m); Hdom is the dominant height (m); t is the desired age (years); tp is the standardized age (years); and A, n, and tp are standardized values for each species, detailed in [18,26,27].
Moreover, SI class curves can be divided into 5 distinct quality classes (Table 1), with 1 denoting the lowest and 5 representing the highest quality class [18,26,27,28]. This implies that once we determine the classification of one species, we can categorize other species in the same class.
Various metrics, including height, intensity, and density (Table 2), derived from the raster files generated from the LiDAR point clouds, were subsequently transferred to RStudio using the R-ArcGIS Bridge. This enabled us to access and export spatial data from ArcGIS Pro® into RStudio. Consequently, the data from the plots and LiDAR were treated, processed, and statistically analyzed using both ArcGIS Pro® and RStudio software version 2024.04.2 [29]. The Shapiro–Wilk normality test was then applied to verify the normal distribution of the field and LiDAR data.
Following data distribution verification, a correlation analysis was performed to determine the most strongly correlated variable between the SI and LiDAR-derived data. This information is crucial as it helps identify variables that are highly related; thus, only variables with correlation coefficient values higher than 0.40 were considered [15]. Furthermore, only one secondary variable was chosen for each species, as using more than two auxiliary variables with CK might lead to unreliability [31]. In this manner, the LiDAR metrics employed as predictors were estimated with resolution invariance and high spatial resolution [32]. Similarly, inventory data modeling was conducted to obtain statistics for each inventory sample point, and the SI was computed for each forest inventory sample point.

2.5. Cokriging Predictor

The SI was then imported from RStudio to ArcGIS Pro® for preprocessing, modeling of the spatial structure, and interpolation in conjunction with the LiDAR metrics. Although kriging may be used to estimate the SI, due to the restricted number of sample points, CK was adopted as the geostatistical approach for improving the spatial prediction of SI, especially for those species that have been insufficiently sampled, such as CO and PO.
CK is a multivariate kriging approach that incorporates other geographical variables that are highly correlated [14,33]. From the LiDAR point clouds, the raster files of the relevant metrics with a resolution of 0.40 m were produced. The raster metrics were then converted to points, as CK is a point-based estimation method. Furthermore, the spatial distribution of the sample plot points across the study area was evaluated to determine if there was any sampling aggregation. This is necessary because high-value samples may influence those obtained nearby and might affect the entire research region, thereby skewing the frequency distribution of the variable under consideration.
When modeling several variables, a sequence of matrices for covariance and semivariogram are generated; next, the coregionalization must be modeled. To fit a linear model of coregionalization (LMC), the direct and cross empirical semivariograms must be modeled simultaneously, considering the sill, range, and nugget effects (presuming that covariance and semivariogram have the same basic structure, such as normality and stationarity) [16,31].
Furthermore, to select a suitable model, it is necessary to verify the spatial autocorrelation and dissimilarity between each observed pair of primary and secondary variables [34]. Then, using Equation (4) the empirical semivariograms were computed [15].
γ 1,2 h = 1 2 N ( h ) i = 1 N ( h ) { Z 1 x i Z 1 x i + h Z 2 x i Z 2 x i + h ] }
where the random functions Z 1 x i and Z 2 x i represent the dependent and independent coregionalized variables, and N h is the number of pairs for the variables and increments Z 1 x i + h and Z 2 x i + h of the random functions, separated by the vector’s lag distance h .
Theoretical models were fitted to experimental semivariograms to elucidate the spatial correlation and variability between variables. These parameters were adjusted using the most reliable theoretical variogram models, including Circular (C), Spherical (Sp), Exponential (Exp), and Gaussian (G) models. These models are crucial for implementing the CK estimator approach [10,15,35,36].
As previously mentioned, CK incorporates secondary information and considers the correlation between variables. This means that the dependent and independent variables should exhibit a strong relationship to be employed in the CK Ẑ_1 (x_0) approach for estimating the SI at unsampled locations [10,16].
1 ( x 0 ) = i = 1 n λ 1 i Z 1 ( x 1 ) + j = 1 p λ 2 j Z 2 ( x j )
where 1 ( x 0 ) is interpolated by linear combination of n surrounding the Z 1 variables’ positions and p neighboring locations of the second covariable Z 2 ; λ 1 i     a n d   λ 2 j are the weight matrices under the unbiased condition [10,31,37,38], λ 1 i = 1   and λ 2 j = 0 .
Afterward, cross-validation was employed to assess CK interpolation and determine which models were best suited for each species. The selection criteria were based on several factors: the mean error (ME) should be close to zero, the root mean squared error (RMSE) should be minimized, the mean standardized error (ME-STD) should approach zero, the root mean squared standardized error (RMSE-STD) should ideally be close to one, and the average standard error (ASE) should be minimized and approach the RMSE [39].
All these procedures were conducted using ArcGIS Pro® Geostatistical Analyst. Subsequently, the cross-validation outcomes from the selected CK techniques were compared to the IDW results to assess the CK method’s accuracy. This accuracy assessment involved comparing the root mean squared error (RMSE), mean error (ME), and standard deviation (Sd) of both interpolation methods. In this case, Sd was utilized to evaluate the preciseness of CK in comparison to IDW [40].

3. Results and Analysis

In environmental sciences, data often deviate from a normal distribution, presenting a challenge for geostatistical predictor methods which typically require normally distributed variables. Upon conducting Shapiro–Wilk’s tests on the SI and LiDAR metrics, it was found that the SIs of EC and MP were not normally distributed, whereas those of PO and CO were (p-value > 0.05). However, upon examining Figure 3, it became apparent that the SI distribution for CO, PO, and EC was positively skewed, with EC showing a slight negative skew. Regarding the LiDAR metrics, only Imean (Table 3) was not normally distributed (p-value < 0.05), with a right skew (mean higher than median), while the others exhibited a slight positive skew. To address this, ArcGIS Pro® tools were employed for the descriptive spatial analysis, and the data were normalized using a normal score transformation for SCK and a logarithmic transformation for OCK.
The correlation test results between the LiDAR data and SI revealed a significant association between the two variables. Moreover, the LiDAR measurements exhibited strong interrelations among themselves. Specifically, EC, MP, and PO were significantly correlated with percentile height position. However, CO demonstrated a notable positive association with intensity (Figure 4), representing the electromagnetic reflection from objects intercepted by the laser pulse [41].
Despite the high correlation observed between the various LiDAR metrics and the SI, percentiles 80, 90, and 99 were chosen for EC, MP, and PO. As for CO, Imean intensity (dimensionless) was selected as the LiDAR metric (Table 4). These metrics were employed as auxiliary variables to predict the SI for each species throughout the entire study area.

Cokriging Results

Regardless of the species, the interpolation cross-validation results (Table 5) showed that the RMSE values were quite similar across the board, as were the ME values. Therefore, the theoretical models that best represented the spatial autocorrelation structure of the SI were evaluated based on the smallest difference between the RMSE and ASE, RMSE-STD, ME-STD, and ME statistics.
As previously mentioned, predictor models were chosen considering how closely the RMSE and ASE values align. For EC, MP, and PO, this difference was zero, and for CO, it was negligible, very close to 0 (0.01 m), indicating accurate estimations of the prediction variability. The RMSE-STD statistics were somewhat positive for EC (1.01), MP (1.01), and CO (1.01), while for CO, it was equal to one, demonstrating the high accuracy of the models. The ME-STD statistics were zero for CO, PO, and MP, and slightly negative for EC (−0.01), which was the lowest among all models, indicating minimal bias.
Finally, the ME statistics were not the lowest for EC (0.02 m), MP (0.02 m), and CO (0.04 m), suggesting that the models may be predicting higher values, whereas for PO, it was 0. However, these differences were insignificant, and when compared to the values of other models, the discrepancy was extremely small. This underscores the importance of not solely relying on RMSE and ME as criteria when identifying the most suitable CK model for assessing CK interpolation.
SI interpolation was conducted considering isotropy, the Linear Model of Coregionalization (LMC), and the fitting of theoretical semivariograms and cross-covariance models [39] for EC primary and secondary variables (Figure 5). Similar procedures were applied for MP, PO, and CO, as detailed in Appendix A. The semivariogram models of the primary and secondary variables exhibited high nugget values (Table 6), resulting in very low Sill values, although the ranges were not insignificant, with the EC model having the shortest range at 5500 m. This implies that data beyond this threshold lack spatial autocorrelation.
High nugget values are typically associated not only with random and measurement errors, but also with insufficiently sampled data information [42]. However, these values should not be disregarded as they contain valuable information [43] and should be taken into account in the decision-making process. A significant nugget effect can influence other semivariogram parameters, such as the range and sill, leading to considerable interpolation prediction errors [11]. However, as observed in Table 5, this was not the case, as the interpolation errors were minimized due to the SI prediction being performed using CK techniques, which incorporate extensively sampled secondary variables.
As indicated in Paz-Ferreiro et al. [44], CK improves the spatial continuity of SI predictions at small distances by reducing model variance, thereby offering accurate and reliable outputs. Importantly, if only the main variables were considered, the information would have been insufficient to generate SI predictions, leading to significant cross-validation errors. It is noteworthy that the CO cross-covariance was negative, suggesting a spatial negative correlation between the primary and secondary variable data, indicating that in certain locations, one variable is large while the other is small. However, CK can utilize negative cross-covariance values to enhance the accuracy of the predicted values.
Visual analysis was conducted alongside quantitative analysis. As previously mentioned, the large nugget effect impacts various factors, including the smoothness of prediction maps due to the high kriging weights for distant samples [42]. However, as shown in Figure 4, this was not the case for the CK interpolated surfaces, which were relatively smooth, but exhibited clear transitions between SI classes. This effect is related to the smooth search neighborhood of ArcGIS Pro® Geostatistical Analyst, which is applied to reduce sharp edges [39] rather than the nugget effect.
On the thematic maps (Figure 6), blue regions represent class 1, the lowest SI class, while light yellow, dark yellow, and orange denote classes 2, 3, and 4, respectively, with the highest class 5 depicted in red. The EC prediction map (Figure 6a) displays a smooth distribution, with classes 3 and above concentrated in the south, southeast, and east regions, delineated by well-defined boundaries. Class 1 dominated in the north and southwest, with SI classes evenly dispersed throughout the study area.
For MP (Figure 6b), the prediction map exhibited less smoothness, with classes 3 and 5 predominantly found in the southeast and east regions, while classes 2 and 4 were smaller and less frequent. Class 1 was prominent in the southwest, similar to EC. In the PO prediction map for the SI (Figure 6c), class 4 prevailed in the south and east, with class 1 centered in the northeast, and classes 2 and 3 scattered across the area. In the CO maps (Figure 6d), the SI classes are clearly differentiated, with classes 4 and 5 mainly occupying the southeast and east, while class 1 was concentrated in the southwest and west, and classes 2 and 3 exhibited a more dispersed distribution across the study area. Identifiable SI classes facilitated the decision-making process.
It is noteworthy that one of the outputs of CK interpolation is the standard error surface of predicted values (Figure 6), where smaller errors indicate higher precision and vice versa. Despite low error rates in interpolation cross-validation, the SI prediction error surfaces showed minimal differences among species. The largest errors, ranging from 2.20 m to 2.31 m, were associated with unsampled sites located in the northwest of the study area, where there were fewer samples beyond the ZIF area, resulting in higher variance. Conversely, the lowest error values, ranging from 0.80 m to 0.91 m, were concentrated within the ZIF and around the primary variable sample locations. There was variance within each species’ SI class ranges (see Table 1).
While cross-validation suffices for evaluating CK interpolation, a comparison with the IDW technique was conducted to ensure the reliability of the chosen approach. The IDW interpolation only utilized the main variable. The interpolation output (Figure 7) demonstrated that the CK prediction surfaces were smoother than the IDW surfaces. This discrepancy arises from spatial interruptions in IDW interpolation caused by the high dispersion of the sample points.
Despite this, Amaral and Justina [43] have shown that IDW interpolation outperformed ordinary kriging due to the scattered spatial data information. The authors only analyzed the RMSE and R² values. However, OK had more cross-validation errors and output to assess (e.g., prediction standard error maps) than IDW; thus, these should be addressed when comparing both approaches. However, given the limited number of sample points, deterministic interpolation approaches are likely to outperform kriging. In this manner, CK methods are presented as a better alternative, principally when compared to IDW outcomes (Table 7).
As can be noticed, the RMSE of CK interpolation was lower than that of IDW. The same occurred with the ME. The IDW-RMSE was sometimes triple that of CK-RMSE (e.g., the RMSE statistic of PO). The same trend was observed with the ME, while most CK-ME statistics were zero or negligible, and IDW-ME was always higher. Likewise, the minimum, maximum, and mean of CK interpolation are more reliable because they are not affected by extreme values. The standard deviation (SD) of the CK and IDW interpolation outputs was used to verify the accuracy of the SI predictions [40].
When compared to IDW, CK increased the SI prediction accuracy by 36.6% for EC, 62% for MP, 72% for PO, and 43% for CO, indicating that CK is a trustworthy approach since it integrates a densely sampled secondary variable, regardless of the large nugget effect due to insufficient field data information. These findings suggest that CK is a reliable geostatistical interpolation method for predicting SIs. Moreover, the variance of CK can be assessed using the prediction error map, helping to minimize uncertainty. Consequently, decision-makers can make better decisions about forest management, productivity improvement, silviculture practices for resilient landscapes, and also indicate where field inventory sampling should be prioritized.
When compared to other research works, as can be seen (Table 8), forest field data and different geostatistic approaches are commonly methods applied to SI assessment and interpolation. Among the geostatistical methods, SK and OK are the most commonly applied techniques with this objective.
It is worth mentioning that there are common gaps among these three studies, such as (I) only one kriging method was evaluated, when others should be examined with or without considering secondary variables; (II) key statistical error values to determine whether the model adopted was the most suitable were missing; and (III) none of these studies took into account the standard error surface predicted value (kriging variance error map), which is one of the most significant outcomes of this method to help in the decision-making process. It is worth mentioning that instead of adopting IDW, authors [45] could have used CK to interpolate the SIs with highly correlated climate variables.
The main differences between this study and the others mentioned previously are that two variables were considered, and the necessary exploratory statistical analysis was performed. The spatial autocorrelation modeling and theoretical model fitting outputs were verified prior to being used in CK interpolation in order to enhance the SI prediction accuracy. We aimed to demonstrate that CK could be also reliable and an advance in spatializing dendrometric variables at unsampled locations, when few samples of the primary variable and dense secondary variable sampling are available.

4. Conclusions

Undersampled forest information in vast areas like Vale do Sousa can significantly impact SI prediction when employing interpolation methods, particularly when considering only one variable. In this work, if only the main variable was considered, it would not be sufficient to predict the SI when applying CK. Geostatistical techniques are sensitive to the distance between points, making metrics such as range, sill, and nugget highly influenced by a small number of samples. Depending on the size of the dataset, kriging techniques should be effective for estimating SIs under most scenarios. In this context, it is important to examine different interpolation approaches for determining SIs in unsampled locations before selecting a method. You should consider the results of semivariograms, cross-covariance, statistical errors, and accuracy evaluations when considering which interpolation method is the most appropriate for your data.
However, the main purpose of this research was to demonstrate the advantages of CK for estimating SIs when there is a densely ancillary variable, since it can reduce the effect of large nugget values in the semivariograms and cross-validation errors when working with undersampled primary data. However, this information should not be disregarded, as it provides decision-makers with valuable insights for better forest management decisions. The impact on nugget values is associated with measurement errors and low numbers of sample points. Therefore, geostatistical approaches considering more than one variable, like CK, can be a valid strategy for estimating SIs at unsampled sites. The secondary variables reduced cross-validation errors and improved smoothness, ensuring the proper delineation of classes in the prediction surface maps of EC, MP, PO, and CO.
The study findings indicated that OCK outperformed SCK by reducing estimation errors, and both geostatistical techniques surpassed IDW interpolation, highlighting improved SI prediction when incorporating LiDAR metrics that were significantly correlated to site quality as secondary variables. However, the CK approach is more complex, requiring simultaneous modeling of two semivariograms and the cross-covariance. Despite its complexity, the geostatistical analyst in ArcGIS Pro® facilitates automating the fitting procedure.
In future research, incorporating a third variable in the SI prediction could be explored. Environmental factors such as slope, aspect, soil type, and remote sensing indices like NDVI, along with additional LiDAR metrics, should be evaluated for their correlation with SIs and potential to enhance the prediction accuracy. This approach could mitigate the large nugget effect and estimation errors and improve interpolation accuracy. It is noteworthy that CK can become unstable when more than three secondary variables are considered, highlighting the importance of exploratory statistical analyses to verify the correlation between the primary and secondary variables before selecting ancillary data to be implemented in the CK method. Furthermore, in order to effectively employ the CK approach, the selected auxiliary variables must be intensively sampled. However, with sufficient attention, this system can be modified and applied to different locations, particularly when considering remote sensing satellite image-derived information, for instance, vegetation indices.
Additionally, utilizing variance error predictor thematic maps can guide fieldwork sampling efforts to areas with higher errors to reduce the distance between samples and mitigate large nugget values resulting from limited sampling points. When wall-to-wall LiDAR data collection is not feasible due to technological or economic constraints, geostatistical interpolation methods in combination with LiDAR metrics as ancillary variables could be reliable tools for estimating productivity indices. This assists in decision-making by indicating where efforts should be focused to improve SI estimates and identify areas of higher forest productivity for improving forest management sustainability.

Author Contributions

Conceptualization, B.P.-B. and A.C.T.; Methodology, B.P.-B., J.G.B., S.M. and A.C.T.; Software, B.P.-B.; Formal analysis, B.P.-B.; Writing—original draft, B.P.-B.; Writing—review and editing, B.P.-B., J.G.B., S.M. and A.C.T. All authors have read and agreed to the published version of the manuscript.

Funding

Bárbara Pavani-Biju was financially supported by Portuguese national funds through the Foundation for Science and Technology I.P. (grant reference number: 2022.13033.BD) and Decision Support for the Supply of Ecosystem Services under Global Change (DecisionES) (grant agreement number: 101007950—H2020-MSCA-RISE-2020). José G. Borges and Susete Marques were financially supported by Portuguese national funds through the Foundation for Science and Technology I.P. (project reference UIDB/00239/2020) of the Forest Research Centre and DOI identifier 10.54499/UIDB/00239/2020.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets presented in this article are not readily available because the data are part of an ongoing study. Requests to access the datasets should be directed to [email protected].

Conflicts of Interest

The authors declare no conflicts of interest.

Appendix A. Cokriging Theorical Models and Cross-Covariance

Figure A1. Cokriging theoretical models and cross-covariance. Variograms and cross-correlations for MP, PO, and CO primary and secondary variables. The OCK spherical model was the top-ranked model for MP, followed by Exponential for PO and Circular for CO. The primary variable is SI, the secondary variable is PCT80 for MP, PCT99 for PO, and Imean for CO. The CHM raster data yielded the height metrics PCT80 and PCT 99, whereas the intensity raster data yielded Imean. The cross-covariance demonstrates the positive similarity between the SI and PCT80 and PCT99. PO primary and secondary variables exhibited less similarity.
Figure A1. Cokriging theoretical models and cross-covariance. Variograms and cross-correlations for MP, PO, and CO primary and secondary variables. The OCK spherical model was the top-ranked model for MP, followed by Exponential for PO and Circular for CO. The primary variable is SI, the secondary variable is PCT80 for MP, PCT99 for PO, and Imean for CO. The CHM raster data yielded the height metrics PCT80 and PCT 99, whereas the intensity raster data yielded Imean. The cross-covariance demonstrates the positive similarity between the SI and PCT80 and PCT99. PO primary and secondary variables exhibited less similarity.
Sustainability 16 06423 g0a1

References

  1. Carmean, W.H. Forest Site Quality Evaluation in The United States. Adv. Agron. 1975, 27, 209–269. [Google Scholar] [CrossRef]
  2. Cunha Neto, E.M.D.; Veras, H.F.P.; Moura, M.M.; Berti, A.L.; Sanquetta, C.R.; Pelissari, A.L.; Corte, A.P.D. Combining ALS and UAV to Derive the Height of Araucaria angustifolia. Acad. Bras. Cienc. 2023, 95, e20201503. [Google Scholar] [CrossRef] [PubMed]
  3. Teodoro, A.; Amaral, A. A Statistical and Spatial Analysis of Portuguese Forest Fires in Summer 2016 Considering Landsat 8 and Sentinel 2A Data. Environments 2019, 6, 36. [Google Scholar] [CrossRef]
  4. Corte, A.P.D.; Souza, D.V.; Rex, F.E.; Sanquetta, C.R.; Mohan, M.; Silva, C.A.; Zambrano, A.M.A.; Prata, G.; Alves de Almeida, D.R.; Trautenmüller, J.W.; et al. Forest Inventory with High-Density UAV-Lidar: Machine Learning Approaches for Predicting Individual Tree Attributes. Comput. Electron. Agric. 2020, 179, 105815. [Google Scholar] [CrossRef]
  5. Coops, N.C.; Tompalski, P.; Goodbody, T.R.H.; Queinnec, M.; Luther, J.E.; Bolton, D.K.; White, J.C.; Wulder, M.A.; van Lier, O.R.; Hermosilla, T. Modelling Lidar-Derived Estimates of Forest Attributes over Space and Time: A Review of Approaches and Future Trends. Remote Sens. Environ. 2021, 260, 112477. [Google Scholar] [CrossRef]
  6. 6Toivonen, J.; Kangas, A.; Maltamo, M.; Kukkonen, M.; Packalen, P. Assessing Biodiversity Using Forest Structure Indicators Based on Airborne Laser Scanning Data. Ecol. Manag. 2023, 546, 121376. [Google Scholar] [CrossRef]
  7. Xie, Y.; Hirabayashi, S.; Hashimoto, S.; Shibata, S.; Kang, J. Exploring the Spatial Pattern of Urban Forest Ecosystem Services Based on I-Tree Eco and Spatial Interpolation: A Case Study of Kyoto City, Japan. Environ. Manag. 2023, 72, 991–1005. [Google Scholar] [CrossRef] [PubMed]
  8. Kalivas, D.P.; Kollias, V.J.; Apostolidis, E.H. Evaluation of Three Spatial Interpolation Methods to Estimate Forest Volume in the Municipal Forest of the Greek Island Skyros. Geo-Spat. Inf. Sci. 2013, 16, 100–112. [Google Scholar] [CrossRef]
  9. Shukla, K.; Kumar, P.; Mann, G.S.; Khare, M. Mapping Spatial Distribution of Particulate Matter Using Kriging and Inverse Distance Weighting at Supersites of Megacity Delhi. Sustain. Cities Soc. 2020, 54, 101997. [Google Scholar] [CrossRef]
  10. Yamamoto, J.K. Estatística, Análise e Interpolação de Dados Geoespaciais, 1st ed.; Gráfica Paulos: São Paulo, Brazil, 2020; ISBN 978-65-990727-2-7. [Google Scholar]
  11. Li, W.; Niu, Z.; Liang, X.; Li, Z.; Huang, N.; Gao, S.; Wang, C.; Muhammad, S. Geostatistical Modeling Using LiDAR-Derived Prior Knowledge with SPOT-6 Data to Estimate Temperate Forest Canopy Cover and above-Ground Biomass via Stratified Random Sampling. Int. J. Appl. Earth Obs. Geoinf. 2015, 41, 88–98. [Google Scholar] [CrossRef]
  12. Soma, M.; Pimont, F.; Allard, D.; Fournier, R.; Dupuy, J.L. Mitigating Occlusion Effects in Leaf Area Density Estimates from Terrestrial LiDAR through a Specific Kriging Method. Remote Sens. Environ. 2020, 245, 111836. [Google Scholar] [CrossRef]
  13. Filho, A.M.; Netto, S.P.; Machado, S.A.; Corte, A.P.D.; Behling, A. Site Classification for Eucalyptus Sp. in a Tropical Region of Brazil. Acad. Bras. Cienc. 2023, 95. [Google Scholar] [CrossRef]
  14. Adeli, A.; Dowd, P.; Emery, X.; Xu, C. Using Cokriging to Predict Metal Recovery Accounting for Non-Additivity and Preferential Sampling Designs. Min. Eng. 2021, 170, 106923. [Google Scholar] [CrossRef]
  15. Goovaerts, P. Geostatistical Approaches for Incorporating Elevation into the Spatial Interpolation of Rainfall. J. Hydrol. 2000, 228, 113–129. [Google Scholar] [CrossRef]
  16. Goovaerts, P. Ordinary Cokriging Revisited. Math. Geol. 1998, 30, 21–42. [Google Scholar] [CrossRef]
  17. Payares-Garcia, D.; Osei, F.; Mateu, J.; Stein, A. A Poisson Cokriging Method for Bivariate Count Data. Spat. Stat. 2023, 57, 100769. [Google Scholar] [CrossRef]
  18. Instituto da Conservação da Natureza e das Florestas. IFN6—Anexo Técnico; Instituto da Conservação da Natureza e das Florestas: Lisboa, Portugal, 2019. [Google Scholar]
  19. Marques, M.; Juerges, N.; Borges, J.G. Appraisal Framework for Actor Interest and Power Analysis in Forest Management—Insights from Northern Portugal. Policy Econ. 2020, 111, 102049. [Google Scholar] [CrossRef]
  20. EUR-Lex. DIRECTIVA 92/43/CEE DO CONSELHO Relativa à Preservação Dos Habitats Naturais e Da Fauna e Da Flora Selvagens; Comunidade Económica Europeia: Bruxelas, Belgium, 1992; Available online: https://eur-lex.europa.eu/legal-content/PT/TXT/?uri=celex%3A31992L0043 (accessed on 5 March 2024).
  21. Florestas. ICNF-8o RELATÓRIO PROVISÓRIO DE INCÊNDIOS RURAIS—Direção Nac. De Gestão Do Programa De Fogos Rurais. 2023. Available online: https://icnf.pt/florestas/flestudosdocumentosestatisticasindicadores (accessed on 18 January 2024).
  22. Marques, M.; Reynolds, K.M.; Marto, M.; Lakicevic, M.; Caldas, C.; Murphy, P.J.; Borges, J.G. Multicriteria Decision Analysis and Group Decision-Making to Select Stand-Level Forest Management Models and Support Landscape-Level Collaborative Planning. Forests 2021, 12, 399. [Google Scholar] [CrossRef]
  23. Pavani-Biju, B.; Borges, J.G.; Marques, S.; Teodoro, A.C. Fire Risk Analysis Based on a Neural Network Framework and Remote Sensing Data; Manuscript in Preparation; Faculty of Science: Porto, Portugal, 2024. [Google Scholar]
  24. Marques, S. Guia de Campo—Inventário Florestal; Unpublished Work; Forest Research Centre and Associate Laboratory Terra, School of Agriculture, University of Lisbon: Lisboa, Portugal, 2022. [Google Scholar]
  25. FIRE-RES—Innovative Solutions for Fire Resilient Territories in Europe—FIRE RES. Available online: https://fire-res.eu/ (accessed on 5 June 2024).
  26. Sánchez-González, M.; Tomé, M.; Montero, G. Modelling Height, and Diameter Growth of Dominant Cork Oak Trees in Spain. Ann. Sci. 2005, 62, 633–643. [Google Scholar] [CrossRef]
  27. Barrio Anta, M.; Diéguez-Aranda, U. Site Quality of Pedunculate Oak (Quercus robur L.) Stands in Galicia (Northwest Spain). Eur. J. Res. 2005, 124, 19–28. [Google Scholar] [CrossRef]
  28. Chen, Y.; Zhu, X. Site Quality Assessment of a Pinus Radiata Plantation in Victoria, Australia, Using LiDAR Technology. South. For. J. For. Sci. 2012, 74, 217–227. [Google Scholar] [CrossRef]
  29. RStudio Team. RStudio: Integrated Development for R; RStudio, PBC: Boston, MA, USA, 2020; Available online: http://www.rstudio.com/ (accessed on 10 January 2024).
  30. Woods, M.; Lim, K.; Treitz, P. Predicting Forest Stand Variables from LiDAR Data in the Great Lakes—St. Lawrence Forest of Ontario. For. Chron. 2008, 84, 827–839. [Google Scholar] [CrossRef]
  31. Wackernagel, H. Cokriging versus Kriging in Regionalized Multivariate Data Analysis. Geoderma 1994, 62, 83–92. [Google Scholar] [CrossRef]
  32. Packalen, P.; Strunk, J.; Packalen, T.; Maltamo, M.; Mehtätalo, L. Resolution Dependence in an Area-Based Approach to Forest Inventory with Airborne Laser Scanning. Remote Sens. Environ. 2019, 224, 192–201. [Google Scholar] [CrossRef]
  33. Stein, A.; Corsten, L.C.A. Universal Kriging and Cokriging as a Regression Procedure. Biometrics 1991, 47, 575. [Google Scholar] [CrossRef]
  34. Camargo, E.; Druck, S.; Câmara, G. Análise de Superfícies Por Geoestatística Linear. In Análise Espacial de Dados Geográficos; Druck, S., Carvalho, M., Câmara, G., Monteiro, A.M.V., Eds.; EMBRAPA: Brasília, Brazil, 2004; ISBN 85-7383-260-6. [Google Scholar]
  35. Webster, R.; Oliver, M.A. Geostatistics for Environmental Scientists; Wiley: Hoboken, NJ, USA, 2007; ISBN 9780470028582. [Google Scholar]
  36. McKinley, J.M.; Atkinson, P.M. A Special Issue on the Importance of Geostatistics in the Era of Data Science. Math. Geosci. 2020, 52, 311–315. [Google Scholar] [CrossRef]
  37. Helterbrand, J.D.; Cressie, N. Universal Cokriging under Intrinsic Coregionalization. Math. Geol. 1994, 26, 205–226. [Google Scholar] [CrossRef]
  38. Stein, A. Spatial Interpolation of Soil Moisture Data with Universal Cokriging. In Geostatistics Tróia ’92; Soares, A., Ed.; Quantitative Geology and Geostatistics; Springer: Dordrecht, The Netherlands, 1993; Volume 5, pp. 841–852. ISBN 978-94-011-1739-5. [Google Scholar]
  39. Johnston, K.; Ver Hoef, J.M.; Krivoruchko, K.; Lucas, N. ArcGIS ® 9 Using ArcGIS ® Geostatistical Analyst; Environmental Systems Research Institute, Ed.; ESRI Press: Redlands, CA, USA, 2001; ISBN 978-1589481060. [Google Scholar]
  40. Felgueiras, C.; Celso, E.; Felgueiras, C.A.; Ortiz, J.O.; Camargo, E.C.G. Application of Geostatistical Conflation Techniques to Improve the Accuracy of Digital Elevation Models. GeoInfo 2014, 2014, 149–155. [Google Scholar]
  41. Gorgens, E.B.; Silva, A.G.P.; Rodriguez, L.C.E. LiDAR: Aplicações Florestais, 1st ed.; CRV: Curitiba, Brazil, 2014; ISBN 978-85-4440-105-7. [Google Scholar]
  42. Guedes, L.P.C.; Bach, R.T.; Uribe-Opazo, M.A. Nugget Effect Influence on Spatial Variability of Agricultural Data. Eng. Agric. 2020, 40, 96–104. [Google Scholar] [CrossRef]
  43. Amaral, L.R.; Justina, D.D.D. Spatial Dependence Degree and Sampling Neighborhood Influence on Interpolation Process for Fertilizer Prescription Maps. Eng. Agric. 2019, 39, 85–95. [Google Scholar] [CrossRef]
  44. Paz-Ferreiro, J.; Vázquez, E.V.; Vieira, S.R. Geostatistical Analysis of a Geochemical Dataset. Bragantia 2010, 69, 121–129. [Google Scholar] [CrossRef]
  45. Alegria, C.; Roque, N.; Albuquerque, T.; Gerassis, S.; Fernandez, P.; Ribeiro, M.M. Species Ecological Envelopes under Climate Change Scenarios: A Case Study for the Main Two Wood-Production Forest Species in Portugal. Forests 2020, 11, 880. [Google Scholar] [CrossRef]
  46. Vaz da Rocha, P.; José Gomes de Araújo, E.; Augusto Morais, V.; Antonio Monte, M.; Henrique dos Santos Ataíde, D.; Candido Silva, L. Site index in eucalyptus stands applying ordinary kriging: An approach with different models and methods of classification. Floresta 2021, 51, 1000–1009. [Google Scholar] [CrossRef]
  47. Raimundo, M.R.; Scolforo, H.F.; de Mello, J.M.; Scolforo, J.R.S.; McTague, J.P.; dos Reis, A.A. Geostatistics Applied to Growth Estimates in Continuous Forest Inventories. For. Sci. 2017, 63, 29–38. [Google Scholar] [CrossRef]
Figure 1. Land cover classes of the study area (Vale do Sousa).
Figure 1. Land cover classes of the study area (Vale do Sousa).
Sustainability 16 06423 g001
Figure 2. Canopy Height Model (CHM) and Digital Elevation Model (DEM) Raster data generated from the LiDAR cloud points with a resolution of 0.4 m.
Figure 2. Canopy Height Model (CHM) and Digital Elevation Model (DEM) Raster data generated from the LiDAR cloud points with a resolution of 0.4 m.
Sustainability 16 06423 g002
Figure 3. SI (m) frequency distribution (histogram) of the chosen species.
Figure 3. SI (m) frequency distribution (histogram) of the chosen species.
Sustainability 16 06423 g003
Figure 4. SI Pearson correlation matrices for EC, MP, PO, CO, and their specific LiDAR metrics. The moderately and highly correlated metrics are in red (0.4 above), the metrics with the lowest correlation are in blue (−0.4 to −0.8). Grey represents metrics with low positive, negative, or zero correlation (−0.2 to 0.2).
Figure 4. SI Pearson correlation matrices for EC, MP, PO, CO, and their specific LiDAR metrics. The moderately and highly correlated metrics are in red (0.4 above), the metrics with the lowest correlation are in blue (−0.4 to −0.8). Grey represents metrics with low positive, negative, or zero correlation (−0.2 to 0.2).
Sustainability 16 06423 g004
Figure 5. CK spherical method was the best ranked model. The primary variable is the SI of EC, and the secondary variable is the PCT80 extracted from the CHM raster data. Cross-covariance shows the positive similarity between the SI and PCT90.
Figure 5. CK spherical method was the best ranked model. The primary variable is the SI of EC, and the secondary variable is the PCT80 extracted from the CHM raster data. Cross-covariance shows the positive similarity between the SI and PCT90.
Sustainability 16 06423 g005
Figure 6. CK prediction (SI classes 1 to 5) and error (CK variance) maps for SI of EC (a), MP (b), PO (c), and CO (d). Points represent sample locations distributed over the study area.
Figure 6. CK prediction (SI classes 1 to 5) and error (CK variance) maps for SI of EC (a), MP (b), PO (c), and CO (d). Points represent sample locations distributed over the study area.
Sustainability 16 06423 g006
Figure 7. Visual comparison of CK and IDW interpolation maps. CK maps provide a continuous and smoother transition across SI classes (1–5), whereas IDW maps have an abrupt transition due to the discontinued surface.
Figure 7. Visual comparison of CK and IDW interpolation maps. CK maps provide a continuous and smoother transition across SI classes (1–5), whereas IDW maps have an abrupt transition due to the discontinued surface.
Sustainability 16 06423 g007
Table 1. Site Index (SI) class interval for each selected specie.
Table 1. Site Index (SI) class interval for each selected specie.
SI ClassEucalyptus (EC) (m)Maritime Pine (MP) (m)Pedunculate Oak (PO) (m)Cork Oak (CO) (m)
1<13<14<7<6
21618128
319221710
421242212
523262714
Table 2. LiDAR metrics derived from CHM and intensity raster data generated from LiDAR point clouds.
Table 2. LiDAR metrics derived from CHM and intensity raster data generated from LiDAR point clouds.
LiDAR ParameterMetricsDescription
HeightZmin, Zmax, Zmean, Zmedian, Zmode, Zsd, Zcv, Zskew, Zkurt, Z10, Z20, Z25, Z30, Z40, Z50, Z60, Z70, Z75, Z80, Z90, Z95Minimum height, maximum height, average height, median height, mode height, standard deviation height, kurtosis height, 10 to 95th percentile height
Intensity of returnsImin, Imax, Imean, Imedian, Imode, Isd, Zcv, Iskew, Ikurt, I10, I20, I25, I30, I40, I50, I60, I70, I75, I80, I90, I95Minimum intensity, maximum intensity, average intensity, median intensity, mode intensity, standard deviation intensity, kurtosis intensity, 10 to 99th percentile intensity
DensityZpcum (x = 1…9) (%)Cumulative percentage of return in the xth layer, according to Woods et al. [30]
CCCanopy density (%)
OverAllD (x = 1...9) (%)
Height (2 to 60 m)
Overall relative density, the fraction of points in each height stratum compared to the total number of points in each cell
NormRD (x = 1…9) (%)
Height (2 to 60 m)
The normalized relative density is the number of points in each stratum divided by the total number of points within and below the stratum of interest
Table 3. Summary statistics of Hdom for EC, MP, PO, CO, and LiDAR metrics.
Table 3. Summary statistics of Hdom for EC, MP, PO, CO, and LiDAR metrics.
VariablesMinMaxQ25Q75MeanMedianVarSd
EC (m)8.2830.0412.7420.2216.715.2926.515.14
MP (m)5.75238.6212.3411.4810.820.264.50
PO (m)318.16711.69.257.8523.304.83
CO (m)218.154.629.087.296.3817.964.24
EC-PCT805.1925.148.2914.6511.4810.2525.125.01
MP-PCT905.7927.60916.1012.8211.5328.415.33
PO-PCT996.9635.7313.7423.2818.9019.3236.946.09
CO-Imean32,634.1244,824.8734,627.9636,930.6235,944.8235,664.233,934,0701984
Table 4. LiDAR metrics and SI correlation coefficients. Values were calculated by applying the Pearson Correlation test in Rstudio [29], which also resulted in Figure 4.
Table 4. LiDAR metrics and SI correlation coefficients. Values were calculated by applying the Pearson Correlation test in Rstudio [29], which also resulted in Figure 4.
Species MetricCorrelation Coefficient (CC)
EucalyptusPercentile 90 (PCT90)
(m)
CC 0.64
(p < 0.05; 95%; Confidence Interval 0.33 to 0.82)
Maritime PinePercentile 80 (PCT80) (m)CC 0.69
(p < 0.05; 95%; Confidence Interval 0.40 to 0.85)
Pedunculate OakPercentile 99 (PCT99) (m)CC 0.66
(p < 0.05; 95%; Confidence Interval 0.01 to 0.92)
Cork OakMEAN (Imean)CC 0.69
(p < 0.05; 95%; Confidence Interval 0.26 to 0.89)
Table 5. The ranked cokriging method and theoretical models according to the cross-validation results. Ordered based on the lowest difference between RMSE and ASE, the lowest RMSE-STD, ME-STD, and ME.
Table 5. The ranked cokriging method and theoretical models according to the cross-validation results. Ordered based on the lowest difference between RMSE and ASE, the lowest RMSE-STD, ME-STD, and ME.
Geostatistic LayerRankCokriging
Method
RMSEMEME STDRMSE STDASE
Eucalyptus1Ordinary Kriging–Spherical2.75 m0.02 m−0.011.012.75 m
2Ordinary Kriging–Exponential2.78 m0.10 m0.021.002.80 m
3Ordinary Kriging–Circular2.75 m0.00 m−0.021.052.66 m
4Simple Kriging–Exponential2.70 m0.06 m0.031.072.57 m
5Ordinary Kriging–Gaussian2.77 m−0.03 m−0.031.082.61 m
6Simple Kriging–Circular2.75 m0.07 m0.051.182.45 m
7Simple Kriging–Gaussian2.70 m−0.01 m0.021.172.37 m
8Simple Kriging–Spherical2.72 m0.06 m0.051.192.37 m
Maritime Pine1Ordinary Kriging–Spherical2.83 m0.02 m0.001.012.83 m
2Ordinary Kriging–Exponential2.91 m0.13 m0.031.012.92 m
3Ordinary Kriging–Gaussian2.80 m−0.06 m−0.031.032.77 m
4Simple Kriging–Spherical2.71 m0.05 m0.031.022.66 m
5Ordinary Kriging–Exponential2.79 m0.01 m0.001.082.61 m
6Simple Kriging–Circular2.78 m0.05 m0.051.132.50 m
7Simple Kriging–Gaussian2.79 m0.04 m0.051.182.39 m
8Simple Kriging–Exponential2.84 m0.13 m0.101.222.42 m
Pedunculate Oak1Ordinary Kriging–Exponential3.13 m0.04 m0.001.003.13 m
2Ordinary Kriging–Spherical3.14 m0.01 m−0.011.003.16 m
3Ordinary Kriging–Gaussian3.12 m−0.03 m−0.020.993.17 m
4Simple Kriging–Gaussian3.10 m−0.07 m−0.011.033.03 m
5Simple Kriging–Exponential3.08 m0.01 m0.011.042.98 m
6Simple Kriging–Spherical3.08 m−0.02 m0.001.042.99 m
7Ordinary Kriging–Circular3.14 m0.02 m−0.010.983.20 m
8Simple Kriging–Circular3.09 m−0.02 m0.001.052.96 m
Cork Oak1Ordinary Kriging–Circular2.87 m0.00 m0.001.012.86 m
2Simple Kriging–Circular2.89 m0.09 m0.021.002.93 m
3Simple Kriging–Spherical2.90 m0.10 m0.021.002.94 m
4Simple Kriging–Gaussian2.91 m0.05 m0.011.012.92 m
5Simple Kriging–Exponential2.96 m0.15 m0.031.003.01 m
6Ordinary Kriging–Gaussian3.79 m−0.02 m−0.011.053.61 m
7Ordinary Kriging–Spherical3.83 m−0.15 m−0.041.083.56 m
8Ordinary Kriging–Exponential3.85 m−0.18 m−0.051.073.60 m
Table 6. Semivariogram and cross-covariance variable outputs for each selected CK interpolation method, ranked according to the cross-validation outputs.
Table 6. Semivariogram and cross-covariance variable outputs for each selected CK interpolation method, ranked according to the cross-validation outputs.
SpeciesModelVariableSillPartial SillNugget EffectRange (m)
ECSp-OCKVariable 10.0220.010.0125500
Covariance-0.004-
Variable 20.290.090.2
MPSp-OCKVariable 10.0260.010.0169300
Covariance-0.01-
Variable 20.290.230.06
POExp-OCKVariable 10.030.010.029000
Covariance-0.07-
Variable 20.810.220.59
COCir-OCKVariable 19.152.46.756000
Covariance-−0.02-
Variable 20.0140.0040.01
Table 7. Summary and error statistics, and accuracy comparison between CK and IDW. IDW cross-validation solely provided the RMSE and ME statistics.
Table 7. Summary and error statistics, and accuracy comparison between CK and IDW. IDW cross-validation solely provided the RMSE and ME statistics.
SpeciesInterpolation MethodRMSEMEMinMaxMeanSdAccuracy Improvement
EucalyptusSimple Kriging–Spherical2.80.012.924.3201.0936.6%
IDW4.90.3228.418.581.72
Maritime PineSimple Kriging–Spherical2.80.013.823.522.80.962%
IDW6.50.51350202.37
Pedunculate OakOrdinary Kriging–Exponential3.10.16.925.919.90.8372%
IDW7.80.84.36019.232.96
Cork OakOrdinary Kriging–Circular2.90.05.9214.811.420.8543%
IDW4.20.2328191.49
Table 8. Data collection and statistical approaches applied for SI assessment in previous studies.
Table 8. Data collection and statistical approaches applied for SI assessment in previous studies.
StudyField Work Data CollectionRemote Sensing DataMain Statistical Approach Used
Species Ecological Envelopes under Climate Change Scenarios: A Case Study for the Main Two Wood-Production Forest Species in Portugal [45]The SIs for maritime pine (n = 744) and eucalyptus (n = 612) were calculated using measurements from National Forest Inventory Plots in pure standsClimate variables and scenarios, elevation, and soil data (1 km × 1 km grid).
COS is a thematic map of Portuguese land cover and land use from 2015
Empirical models for SI assessment, SK, IDW, and Bayesian machine learning methods. Climate parameters were interpolated with IDW.
Site Index in Eucalyptus Stands Applying Ordinary Kriging: An Approach with Different Models and Methods of Classification [46]Forest inventory plots (n = 170) for 2119 hectaresRS data were not used in this workAlgebraic difference methods (regression models) to estimate SI and ordinary kriging for spatialization.
Geostatistics Applied to Growth Estimates in Continuous Forest Inventories [47]Continuous forest inventories (n = 89 plots)RS data were not used in this workAlgebraic difference methods used for SI classification.
Geostatistical approach used for Hdom stratification.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Pavani-Biju, B.; Borges, J.G.; Marques, S.; Teodoro, A.C. Enhancing Forest Site Classification in Northwest Portugal: A Geostatistical Approach Employing Cokriging. Sustainability 2024, 16, 6423. https://doi.org/10.3390/su16156423

AMA Style

Pavani-Biju B, Borges JG, Marques S, Teodoro AC. Enhancing Forest Site Classification in Northwest Portugal: A Geostatistical Approach Employing Cokriging. Sustainability. 2024; 16(15):6423. https://doi.org/10.3390/su16156423

Chicago/Turabian Style

Pavani-Biju, Barbara, José G. Borges, Susete Marques, and Ana C. Teodoro. 2024. "Enhancing Forest Site Classification in Northwest Portugal: A Geostatistical Approach Employing Cokriging" Sustainability 16, no. 15: 6423. https://doi.org/10.3390/su16156423

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop