Article

Estimation of Picea Schrenkiana Canopy Density at Sub-Compartment Scale by Integration of Optical and Radar Satellite Images

1 School of Geography and Remote Sensing, Guangzhou University, Guangzhou 510006, China
2 Aerospace Information Research Institute, Chinese Academy of Sciences, Beijing 100101, China
3 Tianjin Centre of Geological Survey, China Geological Survey, Tianjin 300170, China
4 North China Center for Geoscience Innovation, China Geological Survey, Tianjin 300170, China
5 Institute of Remote Sensing Satellite, China Academy of Space Technology, Beijing 100095, China
* Author to whom correspondence should be addressed.
Forests 2024, 15(7), 1145; https://doi.org/10.3390/f15071145
Submission received: 10 May 2024 / Revised: 18 June 2024 / Accepted: 28 June 2024 / Published: 1 July 2024

Abstract:
This study proposes a novel approach to estimating canopy density in Picea Schrenkiana var. Tianschanica forest sub-compartments by integrating optical and radar satellite data. This effort is aimed at enhancing methodologies for forest resource surveys and monitoring, which are particularly vital for the sustainable development of semi-arid mountainous areas with fragile ecological environments. The study area is the West Tianshan Mountain Nature Reserve in Xinjiang, characterized by its unique dominant tree species, Picea Schrenkiana. A total of 411 characteristic factors were extracted from Gaofen-2 (GF-2) sub-meter optical satellite imagery, Gaofen-3 (GF-3) multi-polarization synthetic aperture radar satellite imagery, and digital elevation model (DEM) data. From these, 17 characteristic parameters were selected based on their correlation with canopy density data to construct an estimation model. Three distinct models were developed: a multiple stepwise regression model (a linear approach), a Back Propagation (BP) neural network model (a neural network-based method), and a Cubist model (a decision tree-based technique). The results indicate that combining optical and radar image characteristics significantly enhances accuracy, improving the Average Absolute Percentage Precision (AAPP) from 76.50% (optical only) and 78.50% (radar only) to 78.66% (both). Of the three models, the BP neural network model achieved the highest overall accuracy (79.19%). At the sub-compartment scale, the BP neural network model demonstrated superior accuracy in low canopy density estimation (75.37%), whereas the Cubist model, leveraging radar image characteristics, excelled in medium density estimations (87.46%). Notably, the integrated Cubist model combining optical and radar data achieved the highest accuracy for high canopy density estimation (89.17%).
This study highlights the effectiveness of integrating optical and radar data for precise canopy density assessment, contributing significantly to ecological resource monitoring methodologies and environmental assessments.

1. Introduction

Canopy density, an essential metric in forest management, is the ratio of the crowns’ projected area to the total area of a sub-compartment, i.e., the canopy cover percentage. This measure is crucial for understanding the distribution of ecological factors such as water and heat within the forest ecosystem [1,2,3,4] and is pivotal in assessing forest tree crown density, spatial tree utilization, and the ecosystem’s spatial structure [5,6]. In forest management, canopy density is a basic parameter in forest resource surveys and a principal index for evaluating whether forest stands are managed reasonably and for understanding their growth status [7,8]. Accurate investigation and evaluation of stand canopy density is essential for effective forest management, especially in semi-arid mountainous regions where the ecological environment is notably fragile. Forest resources in these areas play a crucial role in ensuring the health and sustainable development of the local ecosystem. However, because semi-arid mountainous regions are typically difficult to access, conducting timely and accurate assessments of forest resources is a significant scientific challenge that urgently needs addressing. This challenge underscores the necessity for innovative approaches and methodologies in forest resource surveying and monitoring to maintain the ecological balance and promote sustainable environmental practices in these sensitive areas.
Advancements in remote sensing technology have broadened the scope for estimating sub-compartment canopy density [9,10,11,12]. Historically, such estimations have primarily relied on optical image data combined with ground-measured sample plot data, with little use of radar data. Widely used optical remote sensing data sources include Landsat TM/ETM+/OLI, IKONOS, SPOT, Hyperion hyperspectral data, and the domestic CBERS, ZY-3, and GF-1 data. These sources provide various features such as band gray value, reflectivity, band ratio, vegetation indexes, and terrain elevation [13,14,15,16,17,18,19]. In contrast, radar-based estimation typically involves regression models correlating ground measurements with radar backscatter coefficients and texture characteristics [20]. However, optical sensors are vulnerable to solar irradiation, cloud cover, and atmospheric conditions, limiting data effectiveness, and the corrective methods needed to counteract these influences add complexity and potential errors to canopy density estimates [21]. Radar sensors, though less affected by weather, present challenges in image interpretation due to geometric distortions such as foreshortening and shadow effects [20]. Data collaboration strategies that combine optical and radar imagery therefore have the potential to overcome the limitations of single-source inversion. Evaluating how well such strategies fit forest canopy density inversion algorithms, and using them to develop a more accurate remote sensing inversion workflow for forest canopy density, has become an urgent scientific challenge.
In terms of remote sensing estimation models, methods for estimating forest canopy density mainly include regression models, pixel decomposition models, and physical models, each with its own advantages. Regression models require few input parameters and are simple to operate, but nonparametric estimation at the pixel scale can produce large errors in regional quantitative results [22,23,24,25]. Pixel decomposition models require less ground data and suit hyperspectral data but perform poorly in complex forest types [17,18,26,27,28,29]. Physical models offer a robust mechanism but are complex and less commonly used [30]. In summary, the performance of the various model algorithms varies significantly with the data used, indicating a distinct adaptability between the form of the data and the underlying principles of the algorithms. This variability underscores the importance of carefully selecting the optimal model for forest canopy density assessment, which requires extensive evaluation and analysis of the characteristics and nature of the input data. Such an approach ensures the most effective and accurate application of these models in forest resource management and ecological studies [18,19,22,31,32,33,34].
Existing forest canopy density estimation studies face challenges due to their reliance on single-source data, such as optical or radar images, which are susceptible to atmospheric conditions or geometric distortions. Furthermore, current estimation models exhibit variable performance and may struggle in complex forest environments. In contrast, this study innovatively integrates optical and radar satellite data to overcome the limitations of a single data source and improve the accuracy of canopy density assessment. By developing and evaluating linear and nonlinear estimation models, the characteristics of the input data are taken into account, and the effectiveness of the proposed method at the sub-compartment scale is verified. This enhanced capability will lead to better-informed decisions, more effective conservation strategies, and improved overall management of forest resources.
Regarding the issue of Picea Schrenkiana canopy density estimation, on the one hand, although there are many estimation models, there is still no detailed verification and demonstration of the quantitative comparison between traditional linear regression models and nonlinear regression models. This is one of the questions that this article attempts to answer. On the other hand, remote sensing images have gradually transitioned from low-spatial-resolution multispectral data and hyperspectral data to medium- to high-spatial-resolution data. High-spatial-resolution images provide richer spectral and texture information. However, for estimating forest canopy closure in mountainous or large areas, terrain data and other image data need to be added to achieve complementary advantages of multiple data sources. The second challenge that this article aims to overcome is to determine to what extent the computational trend of combining multiple data sources can improve model accuracy. Therefore, this article attempts to construct an estimation model with strong adaptability and high accuracy using active and passive satellite image data, digital elevation model (DEM) data, and forest resource data.
The novel contributions are as follows:
  • A quantitative comparison is given between linear and nonlinear regression models for Picea Schrenkiana canopy density estimation.
  • Multiple data sources are integrated, including active and passive satellite imagery and DEM data, to improve model accuracy.
  • A discussion is provided of the estimation potential of different models and different features for canopy density estimation at the sub-compartment scale.

2. Materials and Methods

2.1. Study Region

The study region for this research is located at the northern base of Nalati Mountain, within the western part of the Tianshan Mountains in China. This area is defined by coordinates ranging from 43°01′ to 43°15′ N latitude and 82°36′ to 82°56′ E longitude. It encompasses the Mohe River Forest Farm and parts of the Western Tianshan Mountain Nature Reserve. Characterized by a rugged mountainous landscape, the terrain rises high in the southeast and lowers in the northwest, with elevations varying between 1099 and 3718 m.
The climate of this region is classified as a temperate continental semi-arid type, experiencing annual precipitation levels of 500 to 700 mm. The study area covers a total of 617.92 km2, with a forest coverage rate of approximately 32.04%. Forests are predominantly situated on the shady and semi-shady slopes and are mainly found at altitudes ranging from 1500 to 2800 m. This leads to a distinctive patchwork pattern of forests and pasture lands interwoven across the landscape.
There is an uneven age distribution within these forests. Mature forests form a considerable part of the area, whereas younger and middle-aged forests are less prevalent. The forest types in this region are relatively uniform, dominated by pure stands of Picea Schrenkiana. At lower altitudes and on semi-sunny slopes, there is a smaller presence of mixed coniferous and broad-leaved forests, as well as areas with only broad-leaved forests. In terms of composition, the Picea Schrenkiana species is predominant, occupying about 98.70% of the forest area and accounting for 99.60% of the forest stock. The exact location of the study area is illustrated in Figure 1.

2.2. Dataset

In this study, canopy density was estimated using active remote sensing images, passive remote sensing images, and digital elevation model (DEM) data. Active sensors emit energy towards the Earth’s surface and measure the reflected or backscattered radiation, and passive sensors detect natural radiation emitted or reflected by the Earth’s surface without emitting energy themselves. Specifically, Gaofen-2 (GF-2) optical satellite data are passive, capturing sunlight reflected off the Earth’s surface. In contrast, Gaofen-3 (GF-3) multipolar synthetic aperture radar (SAR) data are active, emitting microwave pulses and measuring the backscattered signal. These data are integrated into the construction and assessment of the proposed models, in conjunction with secondary data from forest resource surveys.

2.2.1. Image Data

The image data used in this study include high-spatial-resolution imagery from the GF-2 and GF-3 satellites and topographical data from ALOS PALSAR DEM, each offering unique insights into the forest canopy and terrain of the study area. Launched in August 2014, the GF-2 satellite represents China’s first civilian optical satellite with sub-meter spatial resolution. It provides four multispectral bands with a resolution of 3.2 m (covering wavelengths from 0.45 to 0.52 μm, 0.52 to 0.59 μm, 0.63 to 0.69 μm, 0.77 to 0.89 μm) and one panchromatic band with a resolution of 0.8 m (covering 0.45–0.9 μm). After considering factors such as coverage of the study area, clarity of data, and temporal consistency between image data and forest resource survey data, two GF-2 images dated 2018.08.29 were selected. These images underwent radiometric calibration, atmospheric correction, orthorectification, image fusion, and mosaicking, resulting in the data depicted in Figure 1b. These data are utilized for extracting optical reflectance features, vegetation indices, and texture features.
The GF-3 satellite, launched in October 2016, is China’s first C-band multi-polarization high-resolution Synthetic Aperture Radar (SAR) satellite. It offers 12 imaging modes, including stripmap and ScanSAR, with swath widths ranging from 10 to 650 km and spatial resolutions from 1 to 500 m. For this study, a full-polarization stripmap (QPSI) image dated 2018.09.24 with an 8 m spatial resolution was selected after considering various factors. The processing of these data involved multi-look processing, filtering, geocoding, and radiometric calibration, leading to the results shown in Figure 1c. These data are used to extract backscatter coefficient features and radar texture features.
The DEM data are derived from the Japan Aerospace Exploration Agency’s ALOS PALSAR with a resolution of 12.5 m, accessible at Alaska Satellite Facility’s website https://asf.alaska.edu/datasets/daac/alos-palsar-radiometric-terrain-correction/ (accessed on 10 May 2023). To align these data with the study area, the downloaded DEM data were first spliced and clipped. This was followed by a detailed examination and adjustment of the DEM within the study region, involving the removal and interpolation of any anomalies to ensure the accuracy and reliability of the topographical analysis. The outcome, as shown in Figure 1d, is employed for extracting terrain features.

2.2.2. Forest Resource Data

In this study, secondary forest resource survey data were utilized as the ground truth for model construction and evaluation, as depicted in Figure 1a. These data were provided by the West Tianshan Forestry Bureau and standardized under its unified planning; forest farm managers updated them from the 2015 Forest Resource Inventory Data for Management, with the latest update in October 2018, offering good temporal consistency with the image data used in this study. The dataset includes parameters commonly used in forest management, such as land type, dominant tree species, forest category, origin, classification, species composition, age group, average tree height, diameter at breast height (DBH), canopy density, and stock volume. The canopy density values of concern here were measured on site by forest technicians using tools such as forest density meters.
The initial phase of the research involved rigorous verification and standardization of the topological relationships of the forest resource data to ensure data integrity and usability. Subsequently, the focus was on identifying sub-compartments predominantly classified as “Forestland”. A critical aspect of the study was the selection of the sample area. Specifically, sub-compartments where Picea Schrenkiana was the dominant species were chosen for detailed analysis. Within the study area, 500 sub-compartments prominently featuring Picea Schrenkiana were identified.
For a comprehensive and representative analysis, a stratified sampling method was employed. These sub-compartments were divided into two categories for analytical purposes: 350 sub-compartments, constituting approximately 70% of the study sample, were selected as modeling samples. The remaining 150 sub-compartments, making up the other 30%, were reserved for model validation. This methodical approach in sample selection was designed to ensure robust and reliable analysis of the forest resources in the study area.

2.3. Methods

The algorithmic workflow of this study, as illustrated in Figure 2, encompasses several key steps. Initially, the process begins with the collection and organization of input data, followed by a thorough preprocessing of this data. The next phase involves the extraction of feature factors.
Using GF-2 satellite data, a total of 204 feature parameters are extracted, including reflectance features (4), vegetation index features (8), and optical texture features (192). Similarly, the GF-3 satellite data are utilized to extract 204 feature parameters, comprising backscatter features (12) and radar texture features (192). In addition, 3 terrain feature parameters, including elevation, slope, and aspect, are extracted using the DEM data. These parameters are then used alongside the forest resource survey data to construct the model training and testing sets.
Subsequent to feature extraction, the process focuses on feature selection based on correlation. The selection criteria aim to maximize the correlation between the feature parameters and canopy closure, while minimizing the inter-correlation among the feature parameters themselves. From this process, 17 feature parameters are chosen for model construction.
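This two-sided criterion can be sketched as a greedy filter: rank features by their absolute Pearson correlation with canopy density, then keep a feature only if it is not strongly correlated with one already selected. The thresholds below are hypothetical, not the paper's exact procedure:

```python
import numpy as np

def select_features(X, y, names, target_thresh=0.3, inter_thresh=0.8):
    """Greedy correlation-based selection: keep features highly
    correlated with the target, drop those highly correlated with
    an already-selected feature. Thresholds are illustrative."""
    # absolute Pearson correlation of each feature with the target
    target_corr = np.array([abs(np.corrcoef(X[:, i], y)[0, 1])
                            for i in range(X.shape[1])])
    order = np.argsort(-target_corr)          # strongest first
    selected = []
    for i in order:
        if target_corr[i] < target_thresh:
            break                              # remaining features too weak
        # reject if too correlated with any feature already kept
        if all(abs(np.corrcoef(X[:, i], X[:, j])[0, 1]) < inter_thresh
               for j in selected):
            selected.append(i)
    return [names[i] for i in selected]
```

A duplicate of an already-selected band ratio, for example, would be rejected at the inter-correlation check even though its correlation with canopy density is high.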
The final step involves accuracy assessment of the nine recognition results generated by three different models, using the model test sample set. This assessment is critical for constructing the optimal algorithmic model. The efficacy of each model is evaluated in terms of how well it predicts the canopy density, with the aim of identifying the model that provides the most accurate and reliable results for canopy density estimation in the study area.
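The paper reports accuracy as Average Absolute Percentage Precision (AAPP) without spelling out the formula; one plausible reading, sketched below purely as an assumption, is 100% minus the mean absolute percentage error between estimated and measured canopy density:

```python
import numpy as np

def aapp(predicted, observed):
    """Average Absolute Percentage Precision, assumed here to be
    100% minus the mean absolute percentage error; the paper does
    not give the formula, so this is an illustrative guess."""
    predicted = np.asarray(predicted, dtype=float)
    observed = np.asarray(observed, dtype=float)
    mape = np.mean(np.abs(predicted - observed) / observed) * 100.0
    return 100.0 - mape
```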

2.3.1. Feature Factor Extraction

  • Optical Image Processing
(1)
Vegetation Index
The Vegetation Index provides an indication of vegetation vitality, significantly bolstering the interpretative capabilities of remote sensing imagery. When estimating forest canopy density, the vegetation index method exhibits greater sensitivity compared to methods relying on single-band images. Eight vegetation indexes were calculated based on the pre-processed GF-2 image, including the Normalized Differential Vegetation Index (NDVI), Difference Vegetation Index (DVI), Renormalized Difference Vegetation Index (RDVI), Perpendicular Vegetation Index (PVI), Soil-Adjusted Vegetation Index (SAVI), Modified Soil-Adjusted Vegetation Index (MSAVI), Simple Ratio Vegetation Index (SRVI), and Ratio Vegetation Index (RVI). The computational formulas for these indexes are presented in Table 1. These vegetation indexes offer a wealth of information, enabling us to gain deeper insights into the vegetation conditions within the studied regions.
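Several of these indices reduce to simple arithmetic on the red and near-infrared bands. The sketch below uses the standard formulations; Table 1 gives the exact forms used in the study, which may differ in detail:

```python
import numpy as np

def vegetation_indices(nir, red, L=0.5):
    """Standard formulations of a few of the indices in Table 1.
    L is the soil-adjustment factor conventionally used in SAVI."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    ndvi = (nir - red) / (nir + red)                 # Normalized Difference VI
    dvi = nir - red                                  # Difference VI
    rvi = nir / red                                  # Ratio VI
    savi = (nir - red) * (1 + L) / (nir + red + L)   # Soil-Adjusted VI
    return {"NDVI": ndvi, "DVI": dvi, "RVI": rvi, "SAVI": savi}
```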
(2)
Texture Characteristics
Extracting and analyzing texture characteristics from images can provide valuable qualitative and quantitative information about surface features [31]. This process involves specific image processing techniques that discern textural details, a critical aspect in various fields of remote sensing analysis.
The Gray-Level Co-occurrence Matrix (GLCM), introduced by Haralick [37], stands as a prominent method for texture feature extraction. GLCM is particularly adept at describing the texture of an image area, offering a multi-dimensional view of textural characteristics. However, GLCM alone does not directly extract texture features from images. To address this, researchers have developed several characteristic parameters based on GLCM. These commonly include eight statistical measures: mean (ME), variance (VAR), homogeneity (HOM), contrast (CON), dissimilarity (DIS), second moment (ASM), entropy (ENT), and correlation (COR). Each of these statistics offers a unique lens through which the texture of an image can be understood and quantified. It should be emphasized that the texture characteristic parameters mentioned here, such as ME, VAR, and COR, are different from the traditional mean, variance, and correlation. The feature parameters mentioned here are based on the spatial relationship between image pixel grayscale pairs for local probability density distribution statistics. The detailed formulas and descriptions are shown in Table 2, and more detailed principles can be found in references [37,38,39].
Building upon previous studies [40,41], we conducted a Principal Component Analysis (PCA) of the GF-2 satellite images. This analysis revealed that the first principal component encapsulates a significant 94.26% of information across the four bands, effectively representing the collective data of each band. Consequently, GLCM texture feature extraction was specifically applied to this principal component, optimizing our analysis for comprehensive texture assessment.
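The PC1 extraction can be sketched with a plain eigendecomposition of the band covariance matrix, a minimal stand-in for the PCA tooling actually used:

```python
import numpy as np

def first_principal_component(bands):
    """PCA of an (n_bands, rows, cols) image stack via eigendecomposition
    of the band covariance matrix; returns the PC1 image and the share of
    total variance it explains."""
    n_bands = bands.shape[0]
    flat = bands.reshape(n_bands, -1).astype(float)
    flat -= flat.mean(axis=1, keepdims=True)     # center each band
    cov = np.cov(flat)                           # n_bands x n_bands
    eigvals, eigvecs = np.linalg.eigh(cov)       # ascending eigenvalues
    pc1_axis = eigvecs[:, -1]                    # axis of largest variance
    share = eigvals[-1] / eigvals.sum()
    pc1 = (pc1_axis @ flat).reshape(bands.shape[1:])
    return pc1, share
```

For a stack whose bands are near-scaled copies of one scene, as reported for the GF-2 bands, the returned share is close to 1.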
When employing the GLCM for extracting image texture characteristics, two crucial parameters require careful setting: the lag distance between pixels and the moving window size [42]. In this study, we set the lag distance at 1 pixel, as smaller distances tend to strengthen spatial correlations. To mitigate the impact of directionality and moving window size on the extraction of texture characteristic statistics, we performed the analysis across multiple window sizes—3 × 3, 5 × 5, 7 × 7, 9 × 9, 11 × 11, and 13 × 13—and in four different orientations: 0°, 90°, 45°, and 135°. This approach ensured a comprehensive and unbiased extraction of textural features, providing a rich and detailed understanding of the surface characteristics within the satellite imagery. A total of 192 texture characteristics were extracted from GF-2 images.
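For a single offset, the GLCM and a few of the Table 2 statistics can be computed directly; the four orientations correspond to pixel offsets such as (1, 0) for 0° and (1, 1) for 45° at lag distance 1. This is a simplified sketch, not the production texture pipeline:

```python
import numpy as np

def glcm(image, dx, dy, levels):
    """Symmetric, normalized gray-level co-occurrence matrix for one
    offset (dx, dy); image values must lie in [0, levels)."""
    rows, cols = image.shape
    P = np.zeros((levels, levels), dtype=float)
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                P[image[r, c], image[r2, c2]] += 1
    P += P.T                       # make symmetric
    return P / P.sum()

def glcm_stats(P):
    """A subset of the Haralick statistics listed in Table 2."""
    i, j = np.indices(P.shape)
    nz = P[P > 0]
    return {
        "CON": float(np.sum(P * (i - j) ** 2)),        # contrast
        "HOM": float(np.sum(P / (1 + (i - j) ** 2))),  # homogeneity
        "ASM": float(np.sum(P ** 2)),                  # second moment
        "ENT": float(-np.sum(nz * np.log(nz))),        # entropy
    }
```

A perfectly uniform window gives zero contrast and entropy and unit homogeneity and second moment, which is the expected degenerate case.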
  • Radar Image Characteristics
(1)
Backscattering Coefficient
Through meticulous processing, we derived a total of 12 backscattering coefficients, three for each of the four polarization modes (HH, HV, VH, and VV): HHbeta, HHgamma, HHsigma, HVbeta, HVgamma, HVsigma, VHbeta, VHgamma, VHsigma, VVbeta, VVgamma, and VVsigma. The extraction encompassed several steps: data import, multi-look processing, filtering, geocoding, and radiometric calibration. The outcome was three distinct backscattering coefficients for each polarization mode: the radar brightness coefficient (beta), the normalized backscattering coefficient (gamma), and the radar intensity coefficient (sigma). These coefficients are pivotal in interpreting and analyzing radar data, offering insights into surface properties and textural elements. The detailed principles and processes of synthetic aperture radar data processing can be found in [43,44].
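The three conventions are linked by the local incidence angle through standard calibration relations, sketched below; the GF-3-specific constants for converting digital numbers to beta naught are not reproduced here:

```python
import numpy as np

def backscatter_triplet(beta0, incidence_deg):
    """Relate the three calibrated backscatter conventions for one pixel:
    radar brightness (beta0), sigma0 (normalized to ground area), and
    gamma0 (normalized to the plane perpendicular to the slant range).
    These are the standard relations, not GF-3-specific formulas."""
    theta = np.radians(incidence_deg)
    sigma0 = beta0 * np.sin(theta)    # project onto ground area
    gamma0 = sigma0 / np.cos(theta)   # normalize by incidence geometry
    to_db = lambda x: 10.0 * np.log10(x)
    return {"beta0_dB": to_db(beta0),
            "sigma0_dB": to_db(sigma0),
            "gamma0_dB": to_db(gamma0)}
```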
(2)
Texture Characteristics
In this study, correlation analysis revealed that the intensity scattering coefficient HVsigma has the strongest correlation with canopy closure and effectively represents other backscatter coefficients. Consequently, HVsigma was selected for extracting radar texture features. This approach parallels the method applied to optical image data, where the GLCM method was employed for texture feature extraction.
In the radar image analysis, the lag distance was set to one pixel. This specific setting was chosen to enhance the spatial correlation in texture analysis, a critical aspect of accurately characterizing surface features in radar imagery. To minimize the impact of directional bias and variations in moving window size on the extraction of texture characteristic statistics, a comprehensive analysis was undertaken. Under the settings of four different angles and six varying window sizes, a total of 192 radar texture features were extracted. This meticulous approach in analyzing texture characteristics allowed for a detailed and nuanced understanding of the surface features depicted in the radar images. By employing this method, the study was able to accurately capture the complex textural patterns present in the radar data, which are essential for understanding and modeling the canopy density of the forests under study.
  • Topographic Characteristics
Forests, predominantly located in mountainous regions, are profoundly influenced by topographical and climatic factors in their growth environments [45,46]. The vertical zonation phenomenon vividly illustrates how altitude impacts temperature and precipitation, essential elements for forest development. Additionally, slope and aspect are crucial in determining the intensity of sunlight, the degree of weathering, and soil moisture levels, all of which indirectly but significantly affect canopy density. This interplay suggests a strong correlation between forest canopy density and terrain factors.
In light of this, the consideration of terrain in the remote sensing estimation of canopy density is indispensable. For this study, we focused on extracting key terrain factors using DEM with 12.5 m spatial resolution. Utilizing the 3D Analysis Tool within ArcMap 10.6 Software, we meticulously extracted data on altitude, slope, and aspect within the study area. These factors provide a comprehensive understanding of the terrain’s influence on forest canopy density, allowing for a more nuanced and accurate remote sensing analysis. By integrating these topographic characteristics into our study, we aim to enhance the precision of our canopy density estimations, ensuring they reflect the complex interplay between forests and their mountainous environments.
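A minimal finite-difference version of the slope and aspect extraction is sketched below. It assumes row indices increase northward; real rasters usually run north-to-south, which flips the sign of the north derivative:

```python
import numpy as np

def slope_aspect(dem, cellsize):
    """Slope (degrees) and aspect (degrees clockwise from north, in the
    downslope direction) from a DEM grid via finite differences, mirroring
    in spirit what the ArcMap 3D Analyst tools compute."""
    # dz_dn: derivative toward north (rows assumed to increase northward)
    dz_dn, dz_de = np.gradient(dem.astype(float), cellsize)
    slope = np.degrees(np.arctan(np.hypot(dz_de, dz_dn)))
    # downslope vector is minus the gradient; aspect measured from north
    aspect = np.degrees(np.arctan2(-dz_de, -dz_dn)) % 360.0
    return slope, aspect
```

A plane dropping 1 m per cell toward the east, for instance, yields a 45° slope facing due east (aspect 90°).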

2.3.2. Modeling Fundamentals

Multiple Stepwise Regression

Multivariate stepwise regression is a highly effective technique, especially for the inversion of forest parameters [47]. Its core principle is to select independent variables that exhibit a strong correlation with the dependent variable, using significance tests based on the variance ratio to decide whether a variable enters the regression equation. The process begins with a set of variable factors x1, x2, …, xn, where n is the total number of variables. Variables enter the equation sequentially, starting with the one that has the greatest impact on the dependent variable and continuing in order of influence; factors with little impact may never be included at all. However, the inclusion of a new variable xi+1 requires re-evaluating the i variables already included: if introducing xi+1 renders an earlier variable xj (where j < i) statistically insignificant, then xj is removed from the equation. This alternation of adding and removing factors constitutes the steps of stepwise regression.
Each step is scrutinized under an F-test, ensuring that the regression equation at any point contains only those factors which significantly affect the dependent variable. Variables with minimal impact are systematically eliminated. This iterative process continues until no more variables can be feasibly added or removed, culminating in an optimal regression equation. This final equation will encompass all the independent variables that have a significant effect on the dependent variable. The general form of a multiple stepwise regression equation is defined by a mathematical expression that relates the dependent variable to the selected independent variables, factoring in their respective coefficients and statistical significance. This equation becomes a powerful tool for understanding and predicting the behavior of the dependent variable based on the interplay of various significant independent factors.
y = a0 + a1x1 + a2x2 + … + akxk
where a0 is a constant, x1, x2, …, xk are variable factors, and a1, a2, …, ak are the corresponding coefficients of each variable factor.
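The forward half of this procedure can be sketched with ordinary least squares and partial F statistics. The entry threshold f_in is hypothetical, and the full method would also re-test and remove variables at each step, which is omitted here for brevity:

```python
import numpy as np

def rss(X, y):
    """Residual sum of squares of an OLS fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return float(r @ r)

def stepwise_forward(X, y, f_in=4.0):
    """Forward stage of stepwise regression: at each step add the
    candidate with the largest partial F statistic, stopping when no
    candidate exceeds f_in. Returns the selected column indices."""
    n, k = X.shape
    selected = []
    ones = np.ones((n, 1))                 # intercept-only baseline
    current_rss = rss(ones, y)
    while True:
        best = None
        for j in range(k):
            if j in selected:
                continue
            Xj = np.hstack([ones, X[:, selected + [j]]])
            new_rss = rss(Xj, y)
            df = n - Xj.shape[1]
            F = (current_rss - new_rss) / (new_rss / df)
            if best is None or F > best[0]:
                best = (F, j, new_rss)
        if best is None or best[0] < f_in:
            return selected
        current_rss = best[2]
        selected.append(best[1])
```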

Back Propagation Neural Network

The neural network method is another effective way to handle the complicated relationships among remote sensing variables [48]. Among neural network methods, the Back Propagation (BP) neural network is the most widely applied, with good classification and prediction performance. The BP neural network model consists of an input layer, a hidden layer, and an output layer. Its parameters mainly include the number of nodes in each layer, the learning rate, target error, activation function, transfer function, and maximum number of iterations. The numbers of input and output nodes equal the numbers of independent and dependent variables in the training sample, and the activation function is usually an S-shaped (sigmoid) function; the learning rate, transfer function, target error, and maximum number of iterations are determined through repeated trials based on the correlation coefficient between the estimated and measured canopy density [49]. The number of hidden layer nodes is determined by the following empirical formula [48]:
m = √(n + l) + δ
where m represents the number of hidden layer nodes, n the number of input layer nodes, l the number of output layer nodes, and δ a constant between 1 and 10.
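Under this empirical rule, commonly written as m = √(n + l) + δ, the candidate hidden-layer sizes for, say, the 17 selected features and one output can be enumerated and then screened against validation accuracy. The helper below is trivial and shown only to make the search range concrete:

```python
import math

def hidden_node_candidates(n_inputs, n_outputs):
    """Candidate hidden-layer sizes from the empirical rule
    m = sqrt(n + l) + delta, with delta swept over 1..10; the final
    size is chosen by trial against validation performance."""
    base = math.sqrt(n_inputs + n_outputs)
    return [round(base + delta) for delta in range(1, 11)]
```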

Cubist Model

The Cubist model, an advanced rule-based modeling approach, serves as an extension of the M5 model tree and is particularly adept at predicting continuous values. This model distinguishes itself by dividing the data space into multiple subspaces, each modeled individually. This strategy enhances prediction accuracy by employing a piecewise linear model to fit the relationship between predictive values and influencing factors [50]. At its core, the Cubist model is grounded in machine learning principles, enabling efficient data mining within extensive datasets. It excels in identifying relevant factors from a multitude of variables and constructing a predictive model based on these findings [51]. The model operates on a set of rules that guide its predictions. Each rule is essentially a linear regression model that is applied based on specific conditions. For instance:
  • Rule 1:
    • If conditions such as “Altitude ≤ x” and “NDVI ≥ y” are met,
    • Then the prediction is given by the equation “FCC = a1 + a2 × Altitude…”, where a1, a2, etc., are coefficients of the linear regression model.
  • Rule m:
    • If conditions like “Altitude ≥ x” and “NDVI ≤ y” are met,
    • Then the prediction is given by “FCC = b1 + b2 × NDVI…”, with b1, b2, etc., being linear regression coefficients.
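The first-match, rule-then-regression structure above can be sketched as follows; the conditions and coefficients below are purely illustrative, not values fitted in this study:

```python
def cubist_style_predict(sample, rules):
    """Evaluate a Cubist-style rule list: the first rule whose conditions
    all hold supplies the linear model used for the prediction."""
    for conditions, (intercept, weights) in rules:
        if all(cond(sample) for cond in conditions):
            return intercept + sum(w * sample[name] for name, w in weights.items())
    raise ValueError("no rule matched")

# Hypothetical rules mirroring the structure above (coefficients are invented)
rules = [
    ([lambda s: s["Altitude"] <= 2500, lambda s: s["NDVI"] >= 0.5],
     (0.10, {"Altitude": 0.0001})),
    ([lambda s: s["Altitude"] > 2500, lambda s: s["NDVI"] < 0.5],
     (0.05, {"NDVI": 0.60})),
]
print(cubist_style_predict({"Altitude": 2400, "NDVI": 0.7}, rules))
```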
Two specific conditions trigger the cessation of this division process: (i) the node sample count falls below a predetermined threshold; (ii) the ratio of the node’s sample target attribute standard deviation to the overall sample target attribute standard deviation falls below a certain limit.
Post-division, the model tree undergoes pruning to eliminate superfluous branches. Following this, a smoothing process is applied to address discontinuities at leaf nodes. This smoothing incorporates considerations for the parent node of each leaf, achieved by merging the child and parent node fitting equations into a new linear equation. This comprehensive process ensures the Cubist model’s accuracy and efficiency in predicting outcomes based on a set of defined rules and conditions.
The Cubist model is well suited to modeling and predicting complex feature data and can support tasks such as land classification and resource investigation by analyzing and modeling the extracted remote sensing data. A decision tree model can be established by combining multidimensional feature data from remote sensing images, such as spectral information, texture features, and shape features, to classify and recognize land cover types. However, the Cubist model relies to some extent on the nearest neighbor algorithm for adjusting its predictions, which reduces its robustness to noisy data. The nearest neighbor algorithm itself is widely used in remote sensing image processing for tasks such as image classification, object detection, image denoising, and super-resolution reconstruction, especially in the classification and recognition of remote sensing data based on local similarity. Overall, the Cubist model can handle remote sensing data with multiple features and attributes when modeling and analyzing complex multidimensional data; during this process, the nearest neighbor adjustment mainly affects tasks based on similarity, such as pixel-level classification, object detection, and image restoration.
Before using the Cubist model, it is therefore important to perform feature selection and preprocessing on the remote sensing image data to limit the interference introduced by the nearest neighbor adjustment. The selected features must effectively describe the land cover type or target and match the task requirements. Preprocessing operations such as data normalization, dimensionality reduction, and outlier removal improve the stability and generalization ability of the model. In addition, the quality and representativeness of the training dataset should be ensured, avoiding noise, incorrect labeling, and class imbalance; the quality and diversity of the dataset can be improved through methods such as data augmentation and cross-validation. On this basis, it is crucial to select and optimize the model parameters reasonably when constructing the Cubist model; techniques such as cross-validation can be used to select the optimal parameter settings, improving generalization and robustness while avoiding both overfitting and underfitting, so that the model performs well on the training and testing sets. Finally, integrating the Cubist model with other algorithms, such as the nearest neighbor algorithm, can exploit the strengths of each algorithm and improve the overall performance and stability of the model, with the predictions of multiple models combined to reduce the error of any single model. Applied together, these strategies can effectively reduce the nearest neighbor interference in the Cubist model, improve its accuracy and stability, and thus allow it to be applied more effectively to the analysis and prediction of remote sensing image data.
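Two of the preprocessing steps mentioned above, min-max normalization and outlier removal, can be sketched minimally as follows (function names and the k-sigma criterion are our illustrative choices, not prescribed by the study):

```python
import statistics

def minmax_normalize(values):
    """Scale a feature to the [0, 1] range."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]

def drop_outliers(values, k=3.0):
    """Keep only points within k standard deviations of the mean."""
    mu = statistics.mean(values)
    sd = statistics.stdev(values)
    return [v for v in values if abs(v - mu) <= k * sd]

print(minmax_normalize([2, 4, 6]))
print(drop_outliers([1, 2, 3, 100], k=1.0))  # tiny sample, so a tight k is used
```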

2.3.3. Model Accuracy Evaluation

Using the sample reservation method, the predicted results are tested against the measured canopy density of the reserved sub-compartments by calculating the Root Mean Square Error (RMSE) and the Average Absolute Percentage Precision (AAPP) between the measured values of the reserved samples and the values predicted by the model. RMSE is the standard error between the measured and predicted values; the smaller the RMSE, the better the prediction effect of the model. AAPP is the estimation accuracy; the larger the AAPP, the higher the estimation accuracy of the model. The two indexes are calculated as follows:
RMSE = √( (1/n) Σᵢ₌₁ⁿ (yᵢ − xᵢ)² )
AAPPᵢ = (1 − |yᵢ − xᵢ| / yᵢ) × 100
AAPP = ( Σᵢ₌₁ⁿ AAPPᵢ ) / n
where AAPPᵢ is the estimation accuracy of the ith sample point, AAPP is the overall estimation accuracy, yᵢ is the measured value of the ith sample, xᵢ is the corresponding model estimate, and n is the number of samples.
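The two indexes can be implemented directly from the definitions above (a minimal sketch; the sample values are illustrative):

```python
import math

def rmse(measured, predicted):
    """Root Mean Square Error between measured and predicted values."""
    n = len(measured)
    return math.sqrt(sum((y - x) ** 2 for y, x in zip(measured, predicted)) / n)

def aapp(measured, predicted):
    """Average Absolute Percentage Precision, in percent."""
    scores = [(1 - abs(y - x) / y) * 100 for y, x in zip(measured, predicted)]
    return sum(scores) / len(scores)

y = [0.5, 0.6, 0.8]    # measured canopy density of reserved samples (illustrative)
x = [0.45, 0.66, 0.8]  # corresponding model estimates
print(rmse(y, x), aapp(y, x))
```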

3. Results

3.1. Correlation Analysis of Image Characteristic and Canopy Density

In this paper, a total of 411 remote sensing image feature factors were extracted. From the GF-2 imagery these encompass four-band grayscale values, eight vegetation indexes including NDVI, and a set of 192 texture characteristic statistics derived from the first principal component across four directions at six window sizes (such as 3 × 3). From the Gaofen-3 imagery there are 204 remote sensing factors, among which 12 are backscatter coefficients and 192 are texture characteristic statistics extracted from the intensity scattering coefficient HVsigma. Additionally, three topographic factors, namely elevation, slope, and aspect, were extracted from the DEM data.
A comprehensive analysis was carried out in this study, examining all 411 extracted feature factors for their correlation with canopy density. Given the collinearity among these factors, a carefully selected set of 17 feature factors was identified as the most suitable inputs for the model. This selection encompasses NDVI, a range of texture characteristics from the GF-2 satellite images such as PCA3_0VAR, PCA3_0ASM, PCA5_0ENT, PCA5_90COR, PCA9_0HOM, PCA13_0CON, and PCA13_0DIS, and backscatter coefficient texture characteristics from the Gaofen-3 satellite imagery, including HV3_0ASM, HV3_0ENT, HV3_90COR, HV3_135HOM, HV3_135DIS, HV3_135CON, and HV5_0VAR. Additionally, topographic factors like elevation and slope are also integrated. It is important to note that abbreviations like PCA3_0VAR represent the variance (VAR) feature of the PCA band in a 3 × 3 window at 0° direction, and similarly, the naming convention applies to other texture feature parameters. This approach ensures a detailed and nuanced understanding of the factors influencing canopy density, providing a solid foundation for accurate model construction and analysis.
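For reference, this naming convention can be unpacked mechanically (a small helper of our own, assuming all names follow the band, window size, direction, statistic pattern described above):

```python
import re

def parse_texture_name(name):
    """Split names like 'PCA3_0VAR' or 'HV3_135DIS' into
    (band, window size, direction in degrees, GLCM statistic)."""
    m = re.fullmatch(r"([A-Z]+)(\d+)_(\d+)([A-Z]+)", name)
    band, window, direction, stat = m.groups()
    return band, int(window), int(direction), stat

print(parse_texture_name("PCA3_0VAR"))
print(parse_texture_name("HV3_135DIS"))
```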
Table 3 presents the correlation analysis between the selected factors and canopy density. The integrated utilization of these feature factors is anticipated to enhance our ability to comprehensively and accurately evaluate and predict canopy density and vegetation conditions within the studied regions.

3.2. Models Construction and Parameter Optimization

After the feature selection process outlined in Section 3.1, a total of 17 characteristics were included in the model input parameters. Among these, eight characteristics were derived from optical imagery from the GF-2 satellite, seven characteristics were from radar imagery obtained from GF-3, and two characteristics were from digital elevation models (DEMs). Based on three different input schemes—optical image features + terrain features (optical image feature input scheme), radar image features + terrain features (radar image feature input scheme), and optical image features + radar image features + terrain features (combined feature input scheme)—we constructed a forest density estimation model using stepwise regression, a BP neural network, and a Cubist model.

3.2.1. Multiple Stepwise Regression Model

For the joint features (optical image features + radar image features + terrain features), the 17 features are used as independent variables and the canopy density data as the dependent variable, and the sequential forward method is used to build the model. Because the features differ greatly in magnitude and dimension, the data are first standardized to reduce the impact of dimension and magnitude. Referring to previous research [47], at a significance level of 0.05, the independent variables are screened through the probability of the F test: a variable is included in the model when its F probability is less than 0.05 and eliminated when it is greater than 0.1. The characteristic parameters included in the model are PCA5_0ENT, Altitude, HV3_0ASM, and NDVI. The changes in the model as characteristic parameters are added are shown in Table 4. As the parameters increase, the multiple correlation coefficient of the model reaches 0.776, the coefficient of determination R2 reaches 0.590, the standard error decreases to 0.097, and the significance sig = 0.000 < 0.05, so the canopy density is significantly related to the estimated value of the four-parameter model. The Durbin–Watson statistic of the residuals is 1.932, which is close to 2, indicating that the residuals are essentially independent of each other. It can therefore be concluded that a linear equation can be constructed through stepwise regression, and the multiple linear regression model established here can be used to estimate the canopy density of Picea Schrenkiana.
After conducting the experiments, the final multiple stepwise regression models for the different feature modeling schemes are determined as follows:
For the joint feature modeling scheme, the multiple stepwise regression model is:
FCC = 0.236 × PCA5_0ENT + 0.062 × Altitude + 5.166 × HV3_0ASM + 0.654 × NDVI − 1.157.
Similarly, for the optical image feature modeling scheme:
FCC = 0.352 × PCA5_0ENT + 0.058 × Altitude − 0.348.
For the radar image feature modeling scheme:
FCC = 0.085 × Altitude − 1.479 × HV3_0ENT + 3.480.
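For convenience, the three fitted equations can be written directly as functions (coefficients copied from the equations above; per the description of standardization earlier in this section, the inputs are assumed to be standardized feature values):

```python
def fcc_joint(pca5_0ent, altitude, hv3_0asm, ndvi):
    """Joint-feature stepwise regression equation."""
    return 0.236 * pca5_0ent + 0.062 * altitude + 5.166 * hv3_0asm + 0.654 * ndvi - 1.157

def fcc_optical(pca5_0ent, altitude):
    """Optical image feature scheme equation."""
    return 0.352 * pca5_0ent + 0.058 * altitude - 0.348

def fcc_radar(altitude, hv3_0ent):
    """Radar image feature scheme equation."""
    return 0.085 * altitude - 1.479 * hv3_0ent + 3.480

print(fcc_optical(1.0, 0.5))
```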

3.2.2. BP Neural Network Model

Establishing a BP neural network includes selecting the appropriate network structure and efficient network parameter settings for training algorithms. In this study, parameter optimization is conducted through the meta-learner CVParameterSelection. Taking the complex joint feature modeling input scheme as an example, the modeling is carried out.
Firstly, based on previous research [48], a four-layer initial BP neural network model is constructed with the following parameters: 10 nodes in the input layer, 1 node in the output layer, logsig transfer function for the first hidden layer, purelin transfer function for the second hidden layer, maximum iteration of 500, target error of 0.00001, learning rate of 0.1, training function as trainlm, and hidden layer nodes ranging from 4 to 13. Then, the learning rate is determined. The learning rate ranges from 0.01 to 0.1, and experiments show that the variation in the learning rate has little impact on estimation accuracy in this study. Therefore, the learning rate is set to 0.1 to ensure efficiency.
Next, the number of nodes in each hidden layer is set. The problem of setting hidden layer nodes is complex. Setting fewer nodes may result in insufficient information reflection of the data by the network, leading to unsolved problems. On the other hand, setting too many nodes not only prolongs learning time but also may not necessarily minimize errors, easily leading to overfitting and loss of generality in the neural network.
Afterward, the selection of transfer functions in the hidden layers of the BP neural network model is addressed. The commonly used transfer functions include tansig, logsig, and purelin. Six combinations of these three functions are tested pairwise (logsig and tansig, tansig and logsig, tansig and purelin, purelin and tansig, logsig and purelin, and purelin and logsig). Keeping other parameters constant, training is conducted with each combination of transfer functions and hidden layer node numbers. The optimal parameters for each transfer function combination are obtained through hundreds of experiments.
From Table 5, it can be observed that among the six combination methods, considering both R2 and RMSE, when the transfer function combination is tansig and logsig, although R2 is not the highest, both RMSEtrain and RMSEtest are relatively small; in particular, RMSEtest is the smallest among the six combinations. Therefore, the transfer function combination for constructing the neural network is selected as tansig and logsig, with two and four hidden layer nodes, respectively.
After conducting the experiments, the final parameters for the BP neural network models for different feature modeling schemes are determined as follows.
For the joint feature modeling scheme, the BP neural network is configured with an input layer comprising 17 nodes, followed by a first hidden layer with two nodes utilizing the tansig transfer function, and a second hidden layer with four nodes employing the logsig transfer function. The output layer consists of one node. The model is trained with a maximum of 500 iterations, aiming for a target error of 0.00001, using a learning rate of 0.1, and the training function trainlm. The R2train is 0.7278, and the RMSEtrain is 0.0792.
Similarly, for the optical image feature modeling scheme, the BP neural network is configured with an input layer of 10 nodes, a first hidden layer with 5 nodes utilizing the purelin transfer function, and a second hidden layer with 8 nodes employing the logsig transfer function. The output layer remains with one node. The training parameters remain consistent with the joint feature modeling scheme. The R2train is 0.7286, and the RMSEtrain is 0.0792.
For the radar image feature modeling scheme, the BP neural network parameters are set with an input layer of nine nodes, a first hidden layer with five nodes employing the logsig transfer function, and a second hidden layer with four nodes utilizing the purelin transfer function. The output layer remains with one node. Training parameters remain consistent across all schemes. The R2train is 0.4500, and the RMSEtrain is 0.1126.
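The described topologies can be illustrated with a plain NumPy forward pass; the weights below are random placeholders, since the actual training in this study uses trainlm (Levenberg–Marquardt), which is not reproduced here. The sketch shows the joint-feature scheme (17-2-4-1, tansig then logsig):

```python
import numpy as np

rng = np.random.default_rng(0)

def tansig(x):   # MATLAB's tansig is the hyperbolic tangent
    return np.tanh(x)

def logsig(x):   # MATLAB's logsig is the logistic sigmoid
    return 1.0 / (1.0 + np.exp(-x))

# Layer sizes for the joint-feature scheme: 17 inputs -> 2 -> 4 -> 1
sizes = [17, 2, 4, 1]
weights = [rng.standard_normal((a, b)) * 0.1 for a, b in zip(sizes, sizes[1:])]
biases = [np.zeros(b) for b in sizes[1:]]

def forward(x):
    """One forward pass through the described topology."""
    h1 = tansig(x @ weights[0] + biases[0])   # first hidden layer (2 nodes)
    h2 = logsig(h1 @ weights[1] + biases[1])  # second hidden layer (4 nodes)
    return h2 @ weights[2] + biases[2]        # linear output node

x = rng.standard_normal(17)
print(forward(x).shape)
```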

3.2.3. Cubist Model

The Cubist model is implemented in the R language using the Cubist package, which was constructed by Max Kuhn et al. based on Quinlan’s Cubist algorithm [52,53,54]. The Cubist package uses the technique of “hierarchical regression” to predict numerical target variables by constructing a series of regression trees and is widely used in data modeling and prediction fields. For further understanding and use of this package, the reader can access https://cran.r-project.org/web/packages/Cubist/index.html (accessed on 10 May 2023). Two important parameters that need to be set are the number of model trees (committees) and the number of nearest neighbor samples (neighbors) used for model adjustment. As there are no explicit rules for determining these parameters, they are typically chosen based on extensive experimentation with different parameter combinations on the sample set. The model’s accuracy is then used to determine the optimal parameter combination. Referring to previous research [52,53,54], the initial range of committees and neighbors is set between 2 to 10 and 1 to 9, respectively. Similarly, using a complex joint feature input scheme as an example, the usage rates of independent variables in branch conditions (Conds) and linear regression models (Model) of the constructed Cubist model are shown in Table 6.
Based on R2 and RMSE, the Cubist model parameters’ committees and neighbors are selected. After experimenting with different combinations of committees and neighbors, it was found that setting committees = 8 and neighbors = 4 resulted in the Cubist model achieving the maximum R2 and minimum RMSE. At these settings, the value of R2 is 0.6816, and the value of RMSE is 0.0854.
Similarly, following the aforementioned steps, the parameter settings for the Cubist model for the optical image feature modeling scheme and radar image feature modeling scheme are determined as follows.
For the optical image feature modeling scheme: After experimenting with different combinations of committees and neighbors, it was found that setting committees = 4 and neighbors = 3 resulted in the Cubist model achieving the maximum R2 and minimum RMSE. At these settings, the value of R2 is 0.6415, and the value of RMSE is 0.0955.
For the radar image feature modeling scheme: After experimenting with different combinations of committees and neighbors, it was found that setting committees = 3 and neighbors = 1 resulted in the Cubist model achieving the maximum R2 and minimum RMSE. At these settings, the value of R2 is 0.3818, and the value of RMSE is 0.1189.
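The parameter search described above amounts to an exhaustive grid over (committees, neighbors). A sketch of that search loop, written in Python rather than R for consistency with the other examples: `fit_eval` is a stand-in for training a Cubist model at one parameter pair and returning its validation R².

```python
from itertools import product

def grid_search(fit_eval, committees_range=range(2, 11), neighbors_range=range(1, 10)):
    """Exhaustive search over (committees, neighbors), keeping the pair
    with the highest score returned by fit_eval."""
    return max(product(committees_range, neighbors_range),
               key=lambda p: fit_eval(*p))

# Stand-in scorer for illustration only; a real run would call the R
# `Cubist` package for each pair and score on the validation samples.
demo_r2 = lambda c, n: 0.68 - 0.01 * (abs(c - 8) + abs(n - 4))
print(grid_search(demo_r2))
```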
It should be noted that all the calculations required for the above research were completed on an HP Z8 G5 graphics workstation, which is equipped with two 64-core Xeon processors, two NVIDIA RTX™ 6000 Ada Generation GPUs, and 1 TB of DDR5 memory.

3.3. Comparison of Modeling Results

3.3.1. Analysis of Overall Accuracy Evaluation

In this study, the canopy density of the Picea Schrenkiana forest sub-compartment was estimated using three sophisticated modeling approaches: the multiple stepwise regression model, the BP neural network model, and the Cubist model. To validate the accuracy of these models, 30% of the selected samples, amounting to 150 completely independent samples, were used.
The precision of each model was evaluated using the RMSE and AAPP, comparing the measured values with the predicted values. An extensive comparative analysis was conducted to assess the performance of each model across the three different data sources. The accuracy verification results for the three models are tabulated in Table 7. According to Table 7, the BP neural network model based on optical image characteristics emerged as the most accurate, achieving an AAPP value of 80.51%. The BP neural network model combining optical and radar images followed closely with the second-highest accuracy, with an AAPP value of 79.66%. It is noteworthy that, when considering the average accuracy across all three models, the combination of optical and radar image characteristics demonstrated superior performance, with the highest average AAPP value of 78.66% and the lowest average RMSE of 0.1059. Therefore, the utilization of the combined characteristics can improve the stability of the different types of estimation models, which demonstrates the effectiveness of combining optical and radar images to improve canopy density estimation.
Additionally, among all the models developed from these three data sources, the BP neural network model demonstrated superior estimation accuracy, recording accuracies of 80.51%, 77.39%, and 79.66% for the optical, radar, and combined feature schemes, respectively. It is therefore evident that the BP neural network model based on optical image characteristics stands out as the most effective model for estimating the canopy density of the Picea Schrenkiana forest sub-compartments in the study area, followed by the BP neural network model combining both characteristics and the Cubist model based on optical image characteristics. This hierarchy of model performance provides useful guidance for selecting the most appropriate modeling approach in similar forest canopy density estimation studies.

3.3.2. Accuracy Evaluation and Analysis of Different Canopy Density Grades

In line with the canopy density classification standards [47], canopy density is categorized into three distinct grades. To validate the accuracy of our models across these different canopy density grades, we utilized the same independent samples as in Section 3.3.1. The model accuracy evaluation results under different canopy density grades are shown in Table 8.
From the data presented in Table 8, it is evident that for low canopy density, models based on optical image characteristics perform exceptionally well. In this category, the BP neural network model stands out with the highest estimation accuracy, achieving a minimal RMSE of 0.0847 and an impressive accuracy rate of 75.37%. When it comes to medium canopy density, all nine models constructed from the three data sources exhibit estimation accuracies exceeding 80%. Notably, models derived from radar images slightly edge out others in accuracy, with both the BP neural network model and the Cubist model based on radar images demonstrating accuracies greater than 87%.
For high canopy density estimations, models integrating both optical and radar image characteristics show remarkable accuracy, with all three models achieving over 80%. In particular, the Cubist model leads in this category, reaching a maximum accuracy of 89.17%, closely followed by the BP neural network model at 88.77%. Overall, the models developed in this study for estimating the canopy density of Picea Schrenkiana forest on a sub-compartment scale show robust performance, especially in the medium canopy density category, where each model maintains an accuracy of over 80%. The highest accuracy is observed in the high canopy density estimation, with the combined optical and radar image characteristic-based Cubist model achieving 89.17%. However, the accuracy dips in the low canopy density estimations. Aside from the BP neural network model based on optical image characteristics, which achieves an accuracy of 75.37%, the accuracies of other models fall below 70%. This analysis underscores the relative strengths and areas for improvement in each modeling approach, offering valuable insights for future applications in forest canopy density estimation.
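Per-grade accuracy of this kind can be computed by grouping the per-sample AAPP by the grade of the measured value. In the sketch below, the grade thresholds are illustrative assumptions of ours; the study follows the classification standard cited as [47]:

```python
def grade(density):
    """Illustrative grade thresholds (assumed; not taken from standard [47])."""
    if density < 0.40:
        return "low"
    if density < 0.70:
        return "medium"
    return "high"

def aapp_by_grade(measured, predicted):
    """Average per-sample accuracy, grouped by the grade of the measured value."""
    sums, counts = {}, {}
    for y, x in zip(measured, predicted):
        g = grade(y)
        sums[g] = sums.get(g, 0.0) + (1 - abs(y - x) / y) * 100
        counts[g] = counts.get(g, 0) + 1
    return {g: sums[g] / counts[g] for g in sums}

y = [0.3, 0.5, 0.6, 0.8]     # measured densities (illustrative)
x = [0.24, 0.45, 0.6, 0.76]  # model estimates
print(aapp_by_grade(y, x))
```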
In our study, we constructed accuracy change curves to visually represent the variation in estimation accuracy across different canopy densities. These curves were plotted with canopy density on the horizontal axis, ranging from 0.2 to 0.9, and the average accuracy of each canopy density layer on the vertical axis. The accuracy change curves for the three models based on optical image characteristics, radar image characteristics, and a combination of both are depicted in Figure 3a, Figure 3b, and Figure 3c, respectively.
From Figure 3a, it is apparent that the models based on optical image characteristics display significant variations in estimation accuracy at canopy densities of 0.2, 0.8, and 0.9. Notably, at a canopy density of 0.2, the BP neural network model demonstrates the highest accuracy. Conversely, at canopy densities of 0.8 and 0.9, the Cubist model emerges as the most accurate. Figure 3b illustrates a fluctuating trend in the accuracy of models based on radar image characteristics, with accuracy initially increasing and then decreasing as canopy density rises. A notable observation is the low accuracy of these models at a canopy density of 0.2, where the discrepancy between estimated and measured values is significantly high, indicating a less reliable performance at lower canopy densities. In Figure 3c, the models combining optical and radar image characteristics exhibit a generally consistent trend in accuracy changes. All three models show the lowest accuracy at a canopy density of 0.2. Between canopy densities of 0.3 and 0.7, the accuracy differences among the models are minimal. However, at a canopy density of 0.9, the Cubist model distinctly outperforms the others, achieving the highest estimation accuracy.
These observations from the accuracy change curves provide insightful benchmarks for understanding the strengths and limitations of each model under varying canopy density conditions. The nuanced differences in performance across different models and data sources highlight the importance of selecting the appropriate modeling approach based on specific canopy density ranges for more accurate forest canopy density estimation.
Table 9 exhibits a comparison of related research in the same field. Experimental results show that the accuracy of using multi-source remote sensing data to construct feature sets for canopy density estimation (76.50%~89.17%) is comparable to previous related studies (65%~90%) [55,56,57,58,59,60]. In the same field, Xie et al. developed a new bounding envelope methodology based on vegetation indices, using Landsat 8 OLI and Sentinel-2 images to extract NDVI, MBSI, and BSI, combined with the dimidiate pixel model (DPM), to map forest canopy closure in the Chifeng area. Their results show extraction accuracies of 80.00% and 86.00% for Landsat 8 and Sentinel-2 images, respectively, which can support large-scale canopy density mapping [55]. Hua and Zhao used DEM and Sentinel-2 data to extract spectral features, a vegetation index, a red edge index, texture features, and terrain factors, combined with multivariate stepwise regression, a BP neural network, and a geometric optics model, to estimate canopy density [59]. Their results proved that the red edge can improve the accuracy of model estimation, but there is a fitness problem between the model and the feature set. Xu et al. used GF-2 images and UAV radar images to establish the UnetR canopy cover estimation model and compared it with three traditional models; all four models overestimated low canopy cover and underestimated high canopy cover [60]. This paper uses GF-2, GF-3, and DEM data to construct multi-source features and combines multiple models to estimate canopy density, reaching a maximum accuracy of 89.17%, slightly better than previous studies. There are two main reasons for this improvement in accuracy. On the one hand, this study combines multi-source data to build a large feature set and optimizes the model input through feature selection, improving the estimation accuracy. On the other hand, the forest types in the experimental areas of other studies are more complex, including a variety of coniferous and broad-leaved tree species, whereas the forest type in the study area of this article is single, mainly natural Picea Schrenkiana, which may explain the higher estimation accuracy of this study compared with the studies mentioned above.

3.3.3. Remote Sensing Inversion Mapping of Canopy Density

In our study, we meticulously extracted characteristic factor data from various sub-compartments within the study area. These data were then applied to three distinct models, each based on different data sources with finely tuned parameters. This process enabled us to estimate the canopy density of the Picea Schrenkiana forest sub-compartments and to create inverse mapping visualizations of these canopy densities.
The inversion results obtained from the models based on optical image features, radar image features, and a combination of both are presented in Figure 4, Figure 5, and Figure 6, respectively. These figures provide a visual representation of the canopy density distribution across the study area, as interpreted by each model. Particularly noteworthy is the inversion map from the BP neural network model based on optical image features, which exhibited the highest overall accuracy. This map reveals that the canopy density of Picea Schrenkiana forest predominantly falls within the 0.3 to 0.7 range. Areas with canopy densities below 0.3 or above 0.7 are less common, suggesting that the forest generally exhibits a moderate canopy density, with fewer occurrences of either very high or very low densities.
In terms of spatial distribution, the map indicates that areas with canopy densities below 0.3 are mainly situated in the northwestern part of the study area. The more moderate densities ranging from 0.3 to 0.7 are primarily found in the southwestern, eastern, and northeastern parts, with the 0.5 to 0.7 range being the most prevalent. In contrast, areas with canopy densities above 0.7 are chiefly located in the eastern and southern parts of the study area, often interspersed among zones of medium canopy density.
This nuanced understanding of canopy density distribution, enabled by advanced modeling and inversion mapping, provides valuable insights for forest management and conservation efforts in the study area. It highlights the diverse canopy density characteristics of the forest, aiding in the identification of areas that may require different management approaches based on their canopy density profiles.

4. Discussion

This research endeavors to enhance our understanding of forest canopy density, a critical parameter for the survey of forest resources and ecological environment monitoring and a vital index for effective forest management [61,62,63,64]. Though traditional methods for measuring canopy density such as visual assessment, hemispherical photography, and canopy projection have been widely used, they often require significant resources and yield point-scale data, which are less conducive to large-scale, rapid assessments of spatial distribution and changes in canopy density [55,65,66,67,68]. The advent and progression of remote sensing technology have revolutionized this field, providing a technical foundation for rapid and precise canopy density estimation.
The application of remote sensing models offers a far more efficient and effective means of obtaining regional or large-scale canopy density data, crucial for resource surveys and monitoring. The accuracy of these remote sensing models is fundamentally linked to the quality of the models themselves and the selection of remote sensing feature factors. In our study, we selected the Picea Schrenkiana forest area in the Western Tianshan Mountains as our focal area. We aimed to quantitatively estimate the canopy density of sub-compartments of the Picea Schrenkiana var. Tianschanica forest using remote sensing data from GF-2 and GF-3 satellites, combined with optical and radar image characteristics. By selecting optimal characteristic factors and constructing multiple stepwise regression, BP neural network, and Cubist models, we evaluated their accuracy to explore the feasibility of quantitative canopy density estimation at the sub-compartment level.
From the 411 feature factors extracted in our study, 17 were selected as model input parameters, including NDVI; the texture statistics VAR, HOM, CON, DIS, ENT, ASM, and COR; and topographic factors such as elevation and slope. This selection underscores the need to combine spectral and DEM feature information in alpine areas to enhance model estimation accuracy [18,69,70,71,72,73,74,75]. DEM feature information, including elevation and slope, affects forest canopy density by influencing the growth and distribution of vegetation [46,71]. For example, in areas with steeper slopes, there is less vegetation and lower canopy density due to factors such as soil erosion. In addition, the NDVI reflects the greenness and growth status of vegetation and is closely related to canopy density [72]. Our findings also align with previous studies emphasizing the importance of utilizing multi-source features to improve inversion accuracy in canopy density estimation; similar to existing research, our study demonstrates that integrating satellite data and DEM information leads to enhanced precision in canopy density assessment [70]. In terms of model accuracy, the estimation model based on the combination of optical and radar image characteristics exhibited the highest average accuracy (78.66%), followed by models based on optical image characteristics (78.50%) and radar image characteristics (76.90%). The results showed that the polarization characteristics provided by radar images can improve the stability of different types of estimation models. Notably, the BP neural network model demonstrated superior accuracy across these categories.
Our analysis revealed varying accuracy levels across different canopy density grades. Medium canopy density estimations generally showed high accuracy (average 84.33%), followed by high canopy density (average 81.86%) and low canopy density (average 55.48%). For low canopy density, the optical image characteristic-based BP neural network model achieved the highest accuracy (75.37%), and models based on radar image characteristics performed well for medium canopy density, with the Cubist model attaining the highest accuracy (87.46%). For high canopy density, models combining optical and radar image characteristics were highly accurate, with the Cubist model again leading (accuracy of 89.17%). These insights provide a novel perspective for canopy density estimation studies, suggesting the use of different models tailored to specific canopy density levels.
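The per-grade evaluation above can be reproduced with a short helper. Two hedges apply: the paper does not print the AAPP formula, so it is assumed here to be 100% minus the mean absolute percentage error, and the 0.4/0.7 grade breaks between low, medium, and high canopy density are illustrative placeholders, not the study's definitions.

```python
import numpy as np

def aapp(obs, est):
    """Average Absolute Percentage Precision, assumed here to be
    100% minus the mean absolute percentage error."""
    obs, est = np.asarray(obs, float), np.asarray(est, float)
    return 100.0 * (1.0 - float(np.mean(np.abs(est - obs) / obs)))

def grade_accuracy(obs, est, low=0.4, high=0.7):
    """RMSE and AAPP reported separately per canopy density grade.
    The 0.4/0.7 grade breaks are illustrative, not the paper's values."""
    obs, est = np.asarray(obs, float), np.asarray(est, float)
    grades = {"low": obs < low,
              "medium": (obs >= low) & (obs < high),
              "high": obs >= high}
    out = {}
    for name, mask in grades.items():
        rmse = float(np.sqrt(np.mean((est[mask] - obs[mask]) ** 2)))
        out[name] = (rmse, aapp(obs[mask], est[mask]))
    return out
```

Evaluating each model with such a per-grade breakdown is what exposes the pattern reported above, where a single overall accuracy figure would hide the weak low-density performance.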
In our study, we compared three models for canopy density estimation: multiple stepwise regression, BP neural network, and Cubist. Whereas the multiple stepwise regression model showed lower accuracy due to its reliance on linear relationships, the BP neural network model achieved the highest accuracy by capturing complex non-linear patterns. The Cubist model, leveraging radar image features, performed well for medium and high density estimations, particularly when combined with optical data. The limitations of the multiple stepwise regression model underscore the need for techniques capable of handling non-linear relationships, highlighting the value of neural networks and decision trees for improving canopy density estimation accuracy. Our results are consistent with existing studies demonstrating that machine learning algorithms, such as neural networks and decision trees, improve canopy density estimation accuracy compared with traditional linear regression models [15].
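The three-model comparison can be sketched with scikit-learn stand-ins on synthetic data. This is a rough illustration, not the paper's implementation: ordinary least squares replaces the stepwise procedure, `MLPRegressor` stands in for the BP network (its transfer functions differ from the logsig/tansig/purelin combinations tested in Table 5), and a pruned regression tree stands in for Cubist, which is distributed as an R package.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.neural_network import MLPRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)
X = rng.uniform(size=(300, 5))                    # stand-in for the 17 factors
y = (0.3 * X[:, 0] + 0.4 * np.sin(3 * X[:, 1])    # nonlinear canopy response
     + 0.2 * X[:, 2] ** 2 + rng.normal(0.0, 0.03, 300))

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
models = {
    "linear (stepwise stand-in)": LinearRegression(),
    "BP neural network (MLP stand-in)": MLPRegressor(
        hidden_layer_sizes=(5, 9), max_iter=5000, random_state=0),
    "regression tree (Cubist stand-in)": DecisionTreeRegressor(
        max_depth=5, random_state=0),
}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    rmse = float(np.sqrt(np.mean((model.predict(X_te) - y_te) ** 2)))
    print(f"{name}: RMSE = {rmse:.4f}")
```

On a nonlinear response like this one, the linear model is structurally limited in the same way the stepwise regression is in the study, which is why the nonlinear learners tend to fit better.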

5. Conclusions

In this study, we proposed a method of estimating canopy density synergistically from optical and radar satellite images together with topographic features and validated its effectiveness at the sub-compartment scale. Furthermore, we explored how characteristics from different sources, their combinations, and three different types of estimation models affect estimation accuracy.
We extracted 411 characteristic factors from GF-2 optical images, GF-3 radar satellite data, and the DEM. Subsequently, 17 features were chosen based on their correlation with the canopy density survey data. Three estimation models were then developed: multiple stepwise regression, BP neural network, and Cubist. The results indicate that the estimation accuracy of every model exceeded 75%, with the BP neural network model based on optical image features achieving the highest accuracy (80.51%). The results also indicate that leveraging complementary characteristics from optical and radar sources enhances the stability and accuracy of the estimation models: compared with using optical images (78.50%) or radar images (76.90%) alone, combining the two types of images increased the average estimation accuracy to 78.66%.
Across the three estimation models, the BP neural network outperformed the others in overall accuracy, averaging 79.19% across the three feature sets and peaking at 80.51% with optical image features. This model particularly excelled in estimating low canopy density, reaching an accuracy of 75.37%. For medium canopy density, the Cubist model based on radar image characteristics was the most accurate (87.46%). For high canopy density, the Cubist model integrating both optical and radar image characteristics achieved the highest accuracy (89.17%).
The results from this research underscore the complementary strengths of domestic optical and radar image data in the context of canopy density estimation. The use of these data sources, in tandem, enhances the precision and reliability of our estimations for the Picea Schrenkiana forest sub-compartments. This comprehensive approach offers valuable insights and serves as a reference point for similar studies in other regions, demonstrating the significant potential of integrated remote sensing techniques in forest canopy density assessment. This highlights the importance of harnessing complementary information from optical and radar sources to produce more reliable and precise estimates of canopy density.
Although this study achieved high estimation accuracy, it has limitations. The relatively small sample size, constrained by traffic accessibility and sampling costs, may limit the model's generality. Additionally, the linear feature selection method may overlook potential nonlinear correlations. Future research should aim to expand sample sizes across diverse regions (e.g., by adding random noise or bagging sampling) and explore alternative feature selection algorithms (e.g., recursive feature elimination or genetic algorithms) to improve model robustness and interpretability. By addressing these limitations, future studies can advance canopy density estimation for more effective forest resource monitoring and management.

Author Contributions

In this research paper, each author made significant contributions: Y.W. and J.W. led the conceptualization and design and were heavily involved in data analysis and manuscript drafting. X.L., as the corresponding author, oversaw the project coordination and supervision and played a key role in shaping the research framework. X.Y. and D.Z. were instrumental in data acquisition and fieldwork, providing valuable insights for data analysis. W.Q. brought expertise in remote sensing and GIS, crucial for the processing and interpretation of satellite data. Collectively, their diverse skills and expertise were essential for the successful completion of this study. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by National Key R&D Program of China (2022YFB3902000, 2021YFE0117300), the Major Project of the High-Resolution Earth Observation System (Grant No. 30-Y60B01-9003-22/23), and the National Natural Science Foundation of China (42201392).

Data Availability Statement

The data presented in this study are available on request from the corresponding author.

Acknowledgments

We are thankful to the anonymous reviewers for their constructive comments on the study.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Gschwantner, T.; Schadauer, K.; Vidal, C.; Lanz, A.; Tomppo, E.; di Cosmo, L.; Robert, N.; Duursma, D.E.; Lawrence, M. Common tree definitions for national forest inventories in Europe. Silva Fenn. 2009, 43, 303–321. [Google Scholar] [CrossRef]
  2. Jennings, S.B.; Brown, N.D.; Sheil, D. Assessing forest canopies and understorey illumination: Canopy density, canopy cover and other measures. Forestry 1999, 72, 59–74. [Google Scholar] [CrossRef]
  3. IPCC. Good Practice Guidance for Land Use, Land-Use Change and Forestry; Institute for Global Environmental Strategies (IGES): Hayama, Japan, 2003. [Google Scholar]
  4. Xu, B.; Gong, P.; Pu, R. Crown closure estimation of oak savannah in a dry season with Landsat TM imagery: Comparison of various indices through Correlation Analysis. Int. J. Remote Sens. 2003, 24, 1811–1822. [Google Scholar] [CrossRef]
  5. Chen, G.; Lou, T.; Jing, W.; Wang, Z. Sparkpr: An Efficient Parallel Inversion of Canopy density of the forest. IEEE Access 2019, 7, 135949–135956. [Google Scholar] [CrossRef]
  6. Smith, A.M.; Ramsay, P.M. A comparison of ground-based methods for estimating canopy density for use in phenology study. Agric. For. Meteorol. 2018, 252, 18–26. [Google Scholar] [CrossRef]
  7. Yang, C.; Ni, J.; Zhou, Q.; Cheng, W.; Han, S. Correlation between canopy density of different stands and remote sensing data. J. Ecol. 2015, 35, 2119–2125. [Google Scholar]
  8. Gu, C. Study on Estimation of Mountain Forest Parameters Using Geometric Optical Model Coupled with Multi-Source Remote Sensing Data. Ph.D. Thesis, China Academy of Forestry Sciences, Beijing, China, 2018. [Google Scholar]
  9. Chopping, M.; Moisen, G.G.; Su, L.; Laliberte, A.; Rango, A.; Martonchik, J.V.; Peters, D.P.C. Large area mapping of southwestern forest crown cover, canopy height, and biomass using the NASA Multiangle Imaging Spectro-Radiometer. Remote Sens. Environ. 2008, 112, 2051–2063. [Google Scholar] [CrossRef]
  10. Smith, A.M.S.; Falkowski, M.J.; Hudak, A.T.; Evans, J.S.; Robinson, A.P.; Steele, C.M. A cross-comparison of field, spectral, and lidar estimates of forest canopy cover. Can. J. Remote Sens. 2009, 35, 447–459. [Google Scholar] [CrossRef]
  11. Hill, M.J.; Román, M.O.; Schaaf, C.B.; Hutley, L.; Brannstrom, C.; Etter, A.; Hanan, N.P. Characterizing vegetation cover in global savannas with an annual foliage clumping index derived from the MODIS BRDF product. Remote Sens. Environ. 2011, 115, 2008–2024. [Google Scholar]
  12. Chopping, M.; North, M.; Chen, J.; Schaaf, C.B.; Blair, J.B.; Martonchik, J.V.; Bull, M.A. Forest Canopy Cover and Height from MISR in Topographically Complex Southwestern US Landscapes Assessed with High Quality Reference Data. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2012, 5, 44–58. [Google Scholar] [CrossRef]
  13. Ripple, W.J. Determining coniferous forest cover and forest fragmentation with NOAA-9 advanced very High-resolution radiometer data. Photogramm. Eng. Remote Sens. 1994, 60, 533–540. [Google Scholar]
  14. Boyd, D.S.; Foody, G.M.; Ripple, W.J. Evaluation of approaches for forest cover estimation in the Pacific Northwest, USA, using remote sensing. Appl. Geogr. 2002, 22, 375–392. [Google Scholar] [CrossRef]
  15. Wang, C.; Du, H.; Xu, X.; Han, N.; Zhou, G.; Sun, S.; Gao, G. Multi-scale crown closure retrieval for moso bamboo forest using multi-source remotely sensed imagery based on geometric-optical and Erf-BP Neural Network Models. Int. J. Remote Sens. 2015, 36, 5384–5402. [Google Scholar] [CrossRef]
  16. Morales, R.M.; Miura, T.; Idol, T. An assessment of Hawaiian dry forest condition with fine resolution remote sensing. For. Ecol. Manag. 2008, 255, 2524–2532. [Google Scholar] [CrossRef]
  17. Tan, B.; Li, Z.; Chen, E.; Pang, Y.; Lei, Y. Quantitative estimation of canopy density of the forest using Hyperion hyperspectral data. J. Beijing For. Univ. 2006, 3, 95–101. [Google Scholar]
  18. Dan, Y.; Shi, J.; Li, X.; Luo, S.; Li, M. Estimation of canopy closure based on improved dimidiate pixel model. J. Beijing For. Univ. 2019, 41, 35–43. [Google Scholar]
  19. Sun, S.; Tian, X.; Gu, C.; Han, Z.; Wang, C.; Zhang, Z. Remote sensing estimation of canopy density of the forest of Genhe River in Inner Mongolia based on KNN-FIFS. Remote Sens. Technol. Appl. 2019, 34, 959. [Google Scholar] [CrossRef]
  20. Ning, K. Study on the Method of Retrieving Canopy Density of the Forest in Mountain Area Based on SAR Image. Master’s Thesis, Southwest Jiaotong University, Chengdu, China, 2014. [Google Scholar]
  21. Chen, J.M.; Pavlic, G.; Brown, L.; Cihlar, J.; Leblanc, S.G.; White, H.P.; Hall, R.J.; Peddle, D.R.; King, D.J.; Trofymow, J.A.; et al. Derivation and validation of Canada-wide coarse-Resolution leaf area index maps using high-resolution satellite imagery and ground measurements. Remote Sens. Environ. 2002, 80, 165–184. [Google Scholar] [CrossRef]
  22. Li, C.; Tan, B. Study on the method of forest resources remote sensing survey based on GIS. For. Sci. 2004, 40, 40–45. [Google Scholar]
  23. Diemer, C.; Lucaschewski, I.; Spelsberg, G.; Tomppo, E.; Pekkarinen, A. Integration of terrestrial forest sample plot data, map information and Satellite data: An operational multisource-inventory concept. In Proceedings of the Third Conference “Fusion of Earth Data: Merging Point Measurements, Raster Maps and Remotely Sensed Images”, Sophia Antipolis, France, 26–28 January 2000. [Google Scholar]
  24. Montesano, P.M.; Neigh, C.S.R.; Sexton, J.; Feng, M.; Channan, S.; Ranson, K.J.; Townshend, J.R. Calibration and Validation of Landsat Tree Cover in the Taiga−Tundra Ecotone. Remote Sens. 2016, 8, 551. [Google Scholar] [CrossRef]
  25. Hadi; Korhonen, L.; Hovi, A.; Rönnholm, P.; Rautiainen, M. The accuracy of large-area forest canopy cover estimation using Landsat in boreal region. Int. J. Appl. Earth Obs. Geoinf. 2016, 53, 118–127. [Google Scholar] [CrossRef]
  26. Gong, P.; Pu, R. Hyperspectral Remote Sensing and Its Application; Higher Education Press: Beijing, China, 2000. [Google Scholar]
  27. Tong, S.; Zhang, J.; Ha, S.; Lai, Q.; Ma, Q. Dynamics of Fractional Vegetation Coverage and Its Relationship with Climate and Human Activities in Inner Mongolia, China. Remote Sens. 2016, 8, 776. [Google Scholar] [CrossRef]
  28. Xiao, Q.; Tao, J.; Xiao, Y.; Qian, F. Monitoring vegetation cover in Chongqing between 2001 and 2010 using remote sensing data. Environ. Monit. Assess. 2017, 189, 493. [Google Scholar] [CrossRef] [PubMed]
  29. Ding, Y.; Zheng, X.; Zhao, K.; Xin, X.; Liu, H. Quantifying the Impact of NDVIsoil Determination Methods and NDVIsoil Variability on the Estimation of Fractional Vegetation Cover in Northeast China. Remote Sens. 2016, 8, 29. [Google Scholar] [CrossRef]
  30. Yang, X.; He, P.; Yu, Y.; Fan, W. Stand Estimation of canopy density in Planted Forests Using a Geometric-Optical Model Based on Remote Sensing. Remote Sens. 2022, 14, 1983. [Google Scholar] [CrossRef]
  31. Wu, Y.; Zhang, D.; Zhang, H.; Wu, H. Remote sensing estimation of canopy density of the forest combined with image texture characteristic. For. Sci. 2012, 48, 48–53. [Google Scholar]
  32. Zheng, D.; Zeng, W.; Zhi, C.; Shi, P. Quantitative estimation of canopy density of the forest by remote sensing in the Three Gorges Reservoir Area. J. Cent. South For. Univ. 2013, 33, 1–4. [Google Scholar]
  33. Wang, R.; Xing, Y.; Wang, L.; You, H.; Qiu, S.; Wang, A. Estimation of canopy density of the forest by combining spaceborne ICESat-GLAS waveform and multispectral Landsat TM image. J. Appl. Ecol. 2015, 26, 1657–1664. [Google Scholar]
  34. Li, Q.; Wang, Z.; Wang, Y.; Liu, M.; Yang, Y. Study on estimating canopy density of Picea Schrenkiana. var. Tianschanica Forest based on GF-2 remote sensing image. J. Cent. South For. Univ. 2019, 39, 48–54. [Google Scholar]
  35. Gao, L.; Wang, X.; Johnson, B.A.; Tian, Q.; Wang, Y.; Verrelst, J.; Mu, X.; Gu, X. Remote Sensing Algorithms for Estimation of Fractional Vegetation Cover Using Pure Vegetation Index Values: A Review. ISPRS J. Photogramm. Remote Sens. 2020, 159, 364–377. [Google Scholar] [CrossRef]
  36. Bannari, A.; Morin, D.; Bonn, F.; Huete, A.R. A Review of Vegetation Indices. Remote Sens. Rev. 1995, 13, 95–120. [Google Scholar] [CrossRef]
  37. Haralick, R.M.; Shanmugam, K.; Dinstein, I. Textural Characteristic for Image Classification. IEEE Trans. Syst. Man Cybern. 1973, smc-3, 610–621. [Google Scholar] [CrossRef]
  38. Patel, D.; Stonham, T.J. Texture Image Classification and Segmentation Using RANK-Order Clustering. In Proceedings of the 11th IAPR International Conference on Pattern Recognition. Vol. III. Conference C: Image, Speech and Signal Analysis, The Hague, Netherlands, 30 August 1992; IEEE Computer Society: Piscataway, NJ, USA, 1992; Volume 1, pp. 92–95. [Google Scholar]
  39. Warner, T. Kernel-based Texture in Remote Sensing Image Classification. Geogr. Compass 2011, 5, 781–798. [Google Scholar] [CrossRef]
  40. Xu, H.; Pan, P.; Yang, W.; Ou, Y.; Ning, J.; Shao, J.; Li, Q. Classification and precision evaluation of forest resources based on multi-source remote sensing images. J. Jiangxi Agric. Univ. 2019, 5, 1–12. [Google Scholar]
  41. Hao, N. Forest Classification and Landscape Pattern Analysis Based on GF-1 Image—Taking Liangshui Nature Reserve as an Example. Master’s Thesis, Xi’an University of Science and Technology, Xi’an, China, 2016. [Google Scholar]
  42. Li, M.; Tan, Y.; Pan, J.; Peng, S. Study on forest biomass Modeling combining spectral, texture and topographic characteristic. Remote Sens. Inf. 2006, 6, 6–9+66. [Google Scholar]
  43. Moreira, A.; Prats-Iraola, P.; Younis, M.; Krieger, G.; Hajnsek, I.; Papathanassiou, K.P. A Tutorial on Synthetic Aperture Radar. IEEE Geosci. Remote Sens. Mag. 2013, 1, 6–43. [Google Scholar] [CrossRef]
  44. Yang, L.; Shi, L.; Sun, W.; Yang, J.; Li, P.; Li, D.; Liu, S.; Zhao, L. Radiometric and Polarimetric Quality Validation of Gaofen-3 over a Five-Year Operation Period. Remote Sens. 2023, 15, 1605. [Google Scholar] [CrossRef]
  45. Wu, Y.; Zhang, W.; Zhang, L.; Wu, J. Correlation Analysis of terrain and vegetation distribution based on DEM. Northeast For. Univ. News 2012, 40, 96–98. [Google Scholar]
  46. Abdollahnejad, A.; Panagiotidis, D.; Shataee Joybari, S.; Surový, P. Prediction of Dominant Forest Tree Species Using QuickBird and Environmental Data. Forests 2017, 8, 42. [Google Scholar] [CrossRef]
  47. Richardson, H.J.; Hill, D.J.; Denesiuk, D.R.; Fraser, L.H. A comparison of geographic datasets and field meas urements to model soil carbon using random forests and stepwise regressions (British Columbia, Canada). GIScience Remote Sens. 2017, 54, 573–591. [Google Scholar] [CrossRef]
  48. Liu, Z.; Chang, Y.; Chen, H. Estimation of forest volume in Huzhong forest region based on remote sensing, geographic information system and Artificial Neural Network. J. Appl. Ecol. 2008, 9, 1891–1896. [Google Scholar]
  49. Li, C.; Zhao, X.; Li, C. Theory and Implementation of Forest Volume Estimation by Remote Sensing; Science Press: Beijing, China, 2006. [Google Scholar]
  50. Witten, I.H.; Frank, E.; Hall, M.A. Data mining: Practical Machine Learning Tools and Techniques, 2nd ed.; Morgan Kaufmann: San Francisco, CA, USA, 2005. [Google Scholar]
  51. Kuhn, M.; Johnson, K. Applied Predictive Modeling; Springer: New York, NY, USA, 2013. [Google Scholar]
  52. Kuhn, M.; Weston, S.; Keefer, C.; Coulter, N. Cubist Models for Regression. R Package Vignette R Package Version 0.0 2012, 18, 480. [Google Scholar]
  53. Quinlan, J.R. Combining Instance-Based and Model-Based Learning. In Proceedings of the Tenth International Conference on Machine Learning, Amherst, MA, USA, 27–29 June 1993; Morgan Kaufmann: San Francisco, CA, USA, 1993; pp. 236–243. [Google Scholar]
  54. Quinlan, J.R. Learning with Continuous Classes. In Proceedings of the 5th Australian Joint Conference on Artificial Intelligence, Hobart, Australia, 16–18 November 1992; World Scientific: Singapore, 1992; Volume 92, pp. 343–348. [Google Scholar]
  55. Xie, B.; Cao, C.; Xu, M.; Yang, X.; Duerler, R.S.; Bashir, B.; Huang, Z.; Wang, K.; Chen, Y.; Guo, H. Improved Forest Canopy Closure Estimation Using Multispectral Satellite Imagery within Google Earth Engine. Remote Sens. 2022, 14, 2051. [Google Scholar] [CrossRef]
  56. Wachid, M.N.; Hapsara, R.P.; Cahyo, R.D.; Wahyu, G.N.; Syarif, A.M.; Umarhadi, D.A.; Fitriani, A.N.; Ramadhanningrum, D.P.; Widyatmanti, W. Mangrove canopy density analysis using Sentinel-2A imagery satellite data. In Proceedings of the 3rd International Conference of Planning in the Era of Uncertainty (ICPEU), Malang, Indonesia, 3–7 March 2017. [Google Scholar]
  57. Humagain, K.; Portillo-Quintero, C.; Cox, R.D.; Cain, J.W. Mapping Tree Density in Forests of the Southwestern USA Using Landsat 8 Data. Forests 2017, 8, 287. [Google Scholar] [CrossRef]
  58. Cilek, A.; Berberoglu, S.; Donmez, C.; Sahingoz, M. The use of regression tree method for Sentinel-2 satellite data to mapping percent tree cover in different forest types. Environ. Sci. Pollut. Res. 2021, 29, 23665–23676. [Google Scholar] [CrossRef] [PubMed]
  59. Hua, Y.Y.; Zhao, X.S. Multi-Model Estimation of Forest Canopy Closure by Using Red Edge Bands Based on Sentinel-2 Images. Forests 2021, 12, 1768. [Google Scholar] [CrossRef]
  60. Xu, E.; Guo, Y.; Chen, E.; Li, Z.; Zhao, L.; Liu, Q. An Estimation Model for Regional Forest Canopy Closure Combined with UAV LiDAR and High Spatial Resolution Satellite Remote Sensing Data. Geomat. Inf. Sci. Wuhan Univ. 2022, 47, 1298–1308. [Google Scholar]
  61. Wang, X.; Lu, Y. Modern Forest Measurement Method; China Forestry Publishing House: Beijing, China, 2013. [Google Scholar]
  62. Sankey, T.; Donager, J.; McVay, J.; Sankey, J.B. UAV lidar and hyperspectral fusion for forest monitoring in the southwestern USA. Remote Sens. Environ. 2017, 195, 30–43. [Google Scholar] [CrossRef]
  63. Xavier, A.C.; Vettorazzi, C.A. Monitoring leaf area index at watershed level through NDVI from Landsat-7/ETM+ data. Sci. Agric. 2004, 61, 243–252. [Google Scholar] [CrossRef]
  64. Xu, L.; Shi, Y.J.; Fang, H.Y.; Zhou, G.M.; Xu, X.J.; Zhou, Y.F.; Tao, J.X.; Ji, B.Y.; Xu, J.; Li, C.; et al. Vegetation carbon stocks driven by canopy density and forest age in subtropical forest ecosystems. Sci. Total Environ. 2018, 631–632, 619–626. [Google Scholar] [CrossRef]
  65. Fiala, A.C.S.; Garman, S.L.; Gray, A.N. Comparison of five canopy cover estimation techniques in the western Oregon Cascades. For. Ecol. Manag. 2006, 232, 188–197. [Google Scholar] [CrossRef]
  66. Korhonen, L.T.; Korhonen, K.; Rautiainen, M.; Stenberg, P. Estimation of forest canopy cover: A comparison of field measurement techniques. Silva Fenn. 2006, 40, 577–588. [Google Scholar] [CrossRef]
  67. Paletto, A.; Tosi, V. Forest canopy cover and canopy density: Comparison of assessment techniques. Eur. J. For. Res. 2009, 128, 265–272. [Google Scholar] [CrossRef]
  68. Brown, L.A.; Ogutu, B.O.; Dash, J. Tracking forest biophysical properties with automated digital repeat photography: A fisheye perspective using digital hemispherical photography from below the canopy. Agric. For. Meteorol. 2020, 287, 107944. [Google Scholar] [CrossRef]
  69. Li, J.; Mao, X. Comparison of Estimation of canopy density of Plantations Using Parametric, Semi-Parametric, and Non-Parametric Models Based on GF-1 Remote Sensing Images. Forests 2020, 11, 597. [Google Scholar] [CrossRef]
  70. Yang, C.J.; Huang, H.; Han, S.; Ni, J. Estimating forest canopy density using LANDSAT TM data based on sub-compartment objects. In Proceedings of the International Geoscience and Remote Sensing Symposium, Melbourne, Australia, 21–26 July 2013; Volume 10, pp. 999–1002. [Google Scholar]
  71. Franklin, S.E. Discrimination of subalpine forest species and canopy density using digital CASI, SPOT PLA, and Landsat TM data. Photogramm. Eng. Remote Sens. 1994, 60, 1233–1241. [Google Scholar]
  72. Su Mon, M.; Mizoue, N.; Htun, N.Z.; Kajisa, T.; Yoshida, S. Estimating canopy density of the forest of tropical mixed deciduous vegetation using Landsat data: A comparison of three classification approaches. Int. J. Remote Sens. 2012, 33, 1042–1057. [Google Scholar] [CrossRef]
  73. Pu, R.; Gong, P. Wavelet transform applied to EO-1 hyperspectral data for forest LAI and crown closure mapping. Remote Sens. Environ. 2004, 91, 212–224. [Google Scholar] [CrossRef]
  74. Pu, R.; Xu, B.; Gong, P. Oakwood crown closure estimation by unmixing Landsat TM data. Int. J. Remote Sens. 2003, 24, 4433–4445. [Google Scholar] [CrossRef]
  75. Joshi, C.; De Leeuw, J.; Skidmore, A.K.; Van Duren, I.C.; Van Oosten, H. Remotely sensed estimation of canopy density of the forest: A comparison of the performance of four methods. Int. J. Appl. Earth Obs. Geoinf. 2006, 8, 84–95. [Google Scholar] [CrossRef]
Figure 1. Overview of study area and data sources: (a) is the location and distribution of forest canopy density in the study area, (b) is an optical image (GF-2), (c) is a radar image (GF-3), and (d) is a digital elevation model of the study area.
Figure 2. Research roadmap.
Figure 3. Comparison diagram of model accuracy: (a) estimation accuracy of three models based on optical image characteristics, (b) estimation accuracy of three models based on radar image characteristics, (c) estimation accuracy of three models combining optical and radar image characteristics.
Figure 4. Inversion graph of three models based on optical image features: (a) inversion results of multiple stepwise regression model; (b) inversion results of BP neural network model; (c) inversion results of Cubist model.
Figure 5. Inversion graph of three models based on radar image features: (a) inversion results of multiple stepwise regression model; (b) inversion results of BP neural network model; (c) inversion results of Cubist model.
Figure 6. Inversion graph of three models of joint optical and radar image features: (a) inversion results of multiple stepwise regression model; (b) inversion results of BP neural network model; (c) inversion results of Cubist model.
Table 1. Calculation formulas of vegetation indexes [35,36].
Vegetation Index (VI) | Calculation Formula
Normalized Difference Vegetation Index (NDVI) | NDVI = (NIR − R) / (NIR + R)
Renormalized Difference Vegetation Index (RDVI) | RDVI = √(NDVI × DVI)
Perpendicular Vegetation Index (PVI) | PVI = 0.939·NIR − 0.344·R + 0.095
Soil-Adjusted Vegetation Index (SAVI) | SAVI = (NIR − R)(1 + L) / (NIR + R + L)
Modified Soil-Adjusted Vegetation Index (MSAVI) | MSAVI = [2·NIR + 1 − √((2·NIR + 1)² − 8(NIR − R))] / 2
Difference Vegetation Index (DVI) | DVI = NIR − R
Simple Ratio Vegetation Index (SRVI) | SRVI = R / NIR
Ratio Vegetation Index (RVI) | RVI = NIR / R
Note: In the formulas, NIR is the near-infrared band, and R is the red band; L is the background adjustment factor. In this paper, L is taken as 0.5.
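As a sketch, the indices in Table 1 can be computed from co-registered red and near-infrared reflectance arrays. The coefficients (including the PVI constants) follow the table as printed and are not independently verified here.

```python
import numpy as np

def vegetation_indices(nir, red, L=0.5):
    """Vegetation indices of Table 1 from NIR and red reflectance
    (floats or arrays of the same shape); L is the soil background
    adjustment factor used by SAVI."""
    nir, red = np.asarray(nir, float), np.asarray(red, float)
    dvi = nir - red
    ndvi = dvi / (nir + red)
    return {
        "NDVI": ndvi,
        "DVI": dvi,
        "RDVI": np.sqrt(ndvi * dvi),
        "PVI": 0.939 * nir - 0.344 * red + 0.095,
        "SAVI": dvi * (1 + L) / (nir + red + L),
        "MSAVI": (2 * nir + 1 - np.sqrt((2 * nir + 1) ** 2 - 8 * dvi)) / 2,
        "SRVI": red / nir,
        "RVI": nir / red,
    }

print(vegetation_indices(0.6, 0.2)["NDVI"])  # 0.5 for NIR = 0.6, R = 0.2
```

Applied band-wise to the GF-2 mosaic, each index yields one raster layer that can then be aggregated per sub-compartment.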
Table 2. Description and calculation formula of GLCM texture characteristics.
GLCM Characteristic Value | Calculation Formula | Description
Mean (ME) | ME = Σ_{i,j=0}^{N−1} i·P(i,j) | In the GLCM, the ME represents the average value of gray levels in a specific direction. By statistically analyzing the gray-level pairs in a particular direction and calculating their average, one obtains the overall brightness level of the image texture in that direction.
Variance (VAR) | VAR = Σ_{i,j=0}^{N−1} P(i,j)·(i − μᵢ)² | VAR reflects the degree of dispersion or variation in the distribution of gray levels in the image texture. It measures the difference between gray levels, indicating the degree of diversity in the texture.
Homogeneity (HOM) | HOM = Σ_{i,j=0}^{N−1} P(i,j) / (1 + (i − j)²) | HOM measures the consistency of gray-level pairs in the image texture. Higher homogeneity indicates less difference between gray-level pairs, resulting in a more uniform texture.
Contrast (CON) | CON = Σ_{i,j=0}^{N−1} (i − j)²·P(i,j) | CON describes the difference between gray levels in the image texture, quantifying the relative difference between pairs of gray levels across the texture.
Dissimilarity (DIS) | DIS = Σ_{i,j=0}^{N−1} P(i,j)·|i − j| | DIS reflects the degree of difference between gray levels in the image texture, measured as the absolute difference between pairs of gray levels.
Entropy (ENT) | ENT = Σ_{i,j=0}^{N−1} P(i,j)·(−ln P(i,j)) | ENT describes the randomness or uncertainty of the image texture. It measures the disorder in the distribution of gray levels, indicating the irregularity of the texture.
Second Moment (ASM) | ASM = Σ_{i,j=0}^{N−1} P(i,j)² | ASM reflects the uniformity of the image texture. It measures the frequency of occurrence of gray-level pairs, indicating the uniformity of the texture.
Correlation (COR) | COR = Σ_{i,j=0}^{N−1} P(i,j)·(i − μᵢ)(j − μⱼ) / √(σᵢ²·σⱼ²) | COR reflects the linear relationship between pixels in the image texture, measuring the degree of correlation between pairs of gray levels.
Note: i and j represent the grayscale values of the image and are also the row and column indices of the gray-level co-occurrence matrix. N is the number of gray levels: for example, an 8-bit image has gray levels 0 to 255. P(i, j) is the probability of the gray-level pair (i, j) appearing within the statistical range. μᵢ, μⱼ, σᵢ², and σⱼ² are the means and variances of the gray-level pairs in a specific direction: μᵢ = Σ_{i,j=0}^{N−1} i·P(i,j), μⱼ = Σ_{i,j=0}^{N−1} j·P(i,j), σᵢ² = Σ_{i,j=0}^{N−1} P(i,j)·(i − μᵢ)², σⱼ² = Σ_{i,j=0}^{N−1} P(i,j)·(j − μⱼ)².
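A minimal NumPy sketch of these measures for a single offset (1 pixel at 0°) follows; real use would repeat it over the window sizes and the 0°/45°/90°/135° directions from which the study's texture factors are drawn.

```python
import numpy as np

def glcm_features(img, levels=8):
    """GLCM texture measures of Table 2 for a horizontal offset of 1 pixel
    (0 degrees), on an image already quantized to `levels` gray levels."""
    # Build the symmetric, normalized co-occurrence matrix P(i, j).
    P = np.zeros((levels, levels))
    for a, b in zip(img[:, :-1].ravel(), img[:, 1:].ravel()):
        P[a, b] += 1
        P[b, a] += 1
    P /= P.sum()

    i, j = np.indices(P.shape)
    mu_i, mu_j = (i * P).sum(), (j * P).sum()
    var_i = ((i - mu_i) ** 2 * P).sum()
    var_j = ((j - mu_j) ** 2 * P).sum()
    nz = P > 0                                  # avoid log(0) in entropy
    return {
        "ME": mu_i,
        "VAR": var_i,
        "HOM": (P / (1 + (i - j) ** 2)).sum(),
        "CON": ((i - j) ** 2 * P).sum(),
        "DIS": (np.abs(i - j) * P).sum(),
        "ENT": -(P[nz] * np.log(P[nz])).sum(),
        "ASM": (P ** 2).sum(),
        "COR": ((i - mu_i) * (j - mu_j) * P).sum() / np.sqrt(var_i * var_j),
    }
```

In practice, `skimage.feature.graycomatrix` builds the co-occurrence matrices far more efficiently; the loop above is kept for transparency against the formulas in Table 2.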
Table 3. Correlation analysis between selected factors and canopy density.
Factor | Correlation Coefficient | Significance Level | Factor | Correlation Coefficient | Significance Level
NDVI | 0.332 ** | 0.000 | HV3_0ASM | 0.418 ** | 0.000
PCA3_0VAR | 0.388 ** | 0.000 | HV3_0ENT | −0.419 ** | 0.000
PCA3_0ASM | −0.611 ** | 0.000 | HV3_90COR | −0.335 ** | 0.000
PCA5_0ENT | 0.626 ** | 0.000 | HV3_135HOM | 0.368 ** | 0.000
PCA5_90COR | 0.426 ** | 0.000 | HV3_135DIS | −0.342 ** | 0.000
PCA9_0HOM | −0.617 ** | 0.000 | HV3_135CON | −0.314 ** | 0.000
PCA13_0CON | 0.441 ** | 0.000 | HV5_0VAR | −0.314 ** | 0.000
PCA13_0DIS | 0.550 ** | 0.000 | Altitude | 0.563 ** | 0.000
 | | | Slope | −0.312 ** | 0.000
Note: ** Significant correlation at 0.01 level.
Table 4. Accuracy analysis of multiple stepwise regression model.
Step | Correlation Coefficient | R² | Standard Error | Significance | Durbin–Watson
1 | 0.663 | 0.435 | 0.1139 | 0.000 |
2 | 0.742 | 0.543 | 0.1024 | 0.000 |
3 | 0.763 | 0.572 | 0.0991 | 0.000 |
4 | 0.776 | 0.590 | 0.0970 | 0.000 | 1.932
Table 5. Transfer function combination of the BP neural network corresponding to the optimal accuracy.
Transfer Functions | Nodes in Layer (I) | Nodes in Layer (II) | R²train | R²test | RMSEtrain | RMSEtest
Logsig and tansig | 5 | 9 | 0.6816 | 0.5889 | 0.0879 | 0.1064
Tansig and logsig | 2 | 4 | 0.7278 | 0.5739 | 0.0792 | 0.1043
Tansig and purelin | 5 | 4 | 0.7936 | 0.5694 | 0.0711 | 0.1201
Purelin and tansig | 2 | 6 | 0.6900 | 0.6194 | 0.0859 | 0.1070
Logsig and purelin | 6 | 5 | 0.6921 | 0.5962 | 0.0853 | 0.1048
Purelin and logsig | 12 | 1 | 0.6082 | 0.5786 | 0.0946 | 0.1058
Table 6. Evaluation of the use of independent variables in Cubist model.
Variables | Conds | Model
PCA3_0ASM | 15% | 74%
Altitude | 12% | 85%
HV3_135CON | 12% | 2%
PCA13_0DIS | | 74%
PCA9_0HOM | | 62%
PCA13_0CON | | 61%
PCA5_0ENT | | 60%
HV3_0ENT | | 36%
HV3_135DIS | | 35%
HV3_0ASM | | 24%
NDVI | | 24%
PCA3_0VAR | | 12%
HV3_90COR | | 11%
HV3_135HOM | | 11%
Table 7. Comparison of accuracy verification results of three models based on three data sources.
Data SourceModelRMSEAAPP (%)
Optical image
characteristics
Multiple stepwise regression0.119475.54%
BP neural network0.106680.51%
Cubist0.099779.45%
Average accuracy0.108678.50%
Radar image
characteristics
Multiple stepwise regression0.122976.70%
BP neural network0.116677.39%
Cubist0.124976.62%
Average accuracy0.121476.90%
Combining optical and radar
image characteristics
Multiple stepwise regression0.111577.86%
BP neural network0.104379.66%
Cubist0.101978.46%
Average accuracy0.105978.66%
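The two scores in Tables 7 and 8 can be reproduced as follows. RMSE is standard; AAPP ("Average Absolute Percentage Precision") is read here as 100% minus the mean absolute percentage error, which is an assumption, since the formula is not restated alongside these tables:

```python
import numpy as np

def rmse(y_true, y_pred):
    """Root mean square error of estimated vs. measured canopy density."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

def aapp(y_true, y_pred):
    """Average Absolute Percentage Precision, assumed here to be
    100% minus the mean absolute percentage error."""
    return float(100.0 * (1.0 - np.mean(np.abs(y_true - y_pred) / y_true)))
```

Because AAPP divides each error by the measured value, the same absolute error weighs more heavily in sparse stands, which is consistent with the markedly lower AAPP values for the low canopy density grade in Table 8.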
Table 8. Model accuracy evaluation results under different canopy density grades.

| Data Source | Model | RMSE (Low) | AAPP (Low, %) | RMSE (Middle) | AAPP (Middle, %) | RMSE (High) | AAPP (High, %) |
|---|---|---|---|---|---|---|---|
| Optical image characteristics | Multiple stepwise regression | 0.1247 | 53.19 | 0.1044 | 82.19 | 0.1854 | 78.54 |
| | BP neural network | 0.0847 | 75.37 | 0.1092 | 81.12 | 0.1289 | 87.47 |
| | Cubist | 0.0937 | 64.36 | 0.1003 | 83.06 | 0.1076 | 87.48 |
| Radar image characteristics | Multiple stepwise regression | 0.1538 | 50.39 | 0.0814 | 85.44 | 0.2335 | 73.95 |
| | BP neural network | 0.1530 | 45.59 | 0.0710 | 87.41 | 0.2255 | 77.74 |
| | Cubist | 0.1688 | 44.23 | 0.0738 | 87.46 | 0.2386 | 72.67 |
| Combined optical and radar image characteristics | Multiple stepwise regression | 0.1294 | 53.39 | 0.0955 | 83.19 | 0.1740 | 80.99 |
| | BP neural network | 0.1211 | 57.92 | 0.0966 | 85.21 | 0.1152 | 88.77 |
| | Cubist | 0.1235 | 54.87 | 0.0943 | 83.93 | 0.1009 | 89.17 |
Table 9. Comparison of related research in the same field.

| Order | Data | Object | Model | Accuracy | Document |
|---|---|---|---|---|---|
| 1 | Landsat 8, Sentinel-2 | Pinus tabulaeformis, Quercus mongolica, Betula spp., Populus spp., Larix spp., and Armeniaca sibirica | A novel bounding envelope methodology | 80.00% (Landsat 8); 86.00% (Sentinel-2) | [55] |
| 2 | Sentinel-2, DEM | Larix gmelinii, Pinus tabuliformis, Betula platyphylla, Populus davidiana, and Quercus mongolica | MSR | 79.24% | [59] |
| | | | BP | 83.03% | |
| | | | Li–Strahler geometric optical model | 75.17% | |
| 3 | UAV images, LiDAR, GF-2 | Pinus tabuliformis, Larix gmelinii, Betula platyphylla, Populus davidiana | UnetR | 75.60% | [60] |
| | | | FCN | 73.70% | |
| | | | RF | 70.20% | |
| | | | SVR | 68.90% | |
MDPI and ACS Style

Wang, Y.; Li, X.; Yang, X.; Qi, W.; Zhang, D.; Wang, J. Estimation of Picea Schrenkiana Canopy Density at Sub-Compartment Scale by Integration of Optical and Radar Satellite Images. Forests 2024, 15, 1145. https://doi.org/10.3390/f15071145
