Search Results (853)

Search Parameters:
Keywords = boosted tree algorithms

20 pages, 31892 KiB  
Article
Remote Sensing Classification and Mapping of Forest Dominant Tree Species in the Three Gorges Reservoir Area of China Based on Sample Migration and Machine Learning
by Wenbo Zhang, Xiaohuang Liu, Bin Xu, Jiufen Liu, Hongyu Li, Xiaofeng Zhao, Xinping Luo, Ran Wang, Liyuan Xing, Chao Wang and Honghui Zhao
Remote Sens. 2024, 16(14), 2547; https://doi.org/10.3390/rs16142547 (registering DOI) - 11 Jul 2024
Viewed by 147
Abstract
The distribution of forest-dominant tree species is crucial for ecosystem assessment. Remote sensing monitoring requires annual ground sample data, but consistent field surveys are challenging. This study addresses this by combining sample migration learning and machine learning for multi-year tree species classification in the Three Gorges Reservoir area in China. Using the continuous change detection and classification (CCDC) algorithm, sample data from 2023 were successfully migrated to 2018–2022, achieving high migration accuracy (R² = 0.8303, RMSE = 4.64). Based on the migrated samples, random forest (RF), support vector machine (SVM), and extreme gradient boosting (XGB) algorithms classified forest tree species with overall accuracies above 70% and Kappa coefficients above 0.6. XGB outperformed the other algorithms, with a classification accuracy of over 80% and Kappa above 0.75 in almost all years. The final map indicates a stable distribution from 2018 to 2023, with eucalyptus covering over 40% of the forest area, followed by horsetail pine, fir, cypress, and wetland pine.
20 pages, 6753 KiB  
Article
Rolling Bearing Fault Diagnosis Based on CNN-LSTM with FFT and SVD
by Muzi Xu, Qianqian Yu, Shichao Chen and Jianhui Lin
Information 2024, 15(7), 399; https://doi.org/10.3390/info15070399 (registering DOI) - 11 Jul 2024
Viewed by 125
Abstract
In the industrial sector, accurate fault identification is paramount for ensuring both safety and economic efficiency throughout the production process. However, due to constraints imposed by actual working conditions, the motor state features collected are often limited in number and singular in nature. Consequently, extending and extracting these features pose significant challenges in fault diagnosis. To address this issue and strike a balance between model complexity and diagnostic accuracy, this paper introduces a novel motor fault diagnostic model termed FSCL (Fourier Singular Value Decomposition combined with Long and Short-Term Memory networks). The FSCL model integrates traditional signal analysis algorithms with deep learning techniques to automate feature extraction. This hybrid approach innovatively enhances fault detection by describing, extracting, encoding, and mapping features during offline training. Empirical evaluations against various state-of-the-art techniques such as Bayesian Optimization and Extreme Gradient Boosting Tree (BOA-XGBoost), Whale Optimization Algorithm and Support Vector Machine (WOA-SVM), Short-Time Fourier Transform and Convolutional Neural Networks (STFT-CNNs), and Variational Modal Decomposition-Multi Scale Fuzzy Entropy-Probabilistic Neural Network (VMD-MFE-PNN) demonstrate the superior performance of the FSCL model. Validation using the Case Western Reserve University dataset (CWRU) confirms the efficacy of the proposed technique, achieving an impressive accuracy of 99.32%. Moreover, the model exhibits robustness against noise, maintaining an average precision of 98.88% and demonstrating recall and F1 scores ranging from 99.00% to 99.89%. Even under conditions of severe noise interference, the FSCL model consistently achieves high accuracy in recognizing the motor’s operational state. 
This study underscores the FSCL model as a promising approach for enhancing motor fault diagnosis in industrial settings, leveraging the synergistic benefits of traditional signal analysis and deep learning methodologies.
16 pages, 12688 KiB  
Article
Enhancing the Understanding of Subsurface Relations: Machine Learning Approaches for Well Data Analysis in the Drava Basin, Pannonian Super Basin
by Ana Brcković, Jasna Orešković, Marko Cvetković and Željka Marić-Đureković
Appl. Sci. 2024, 14(14), 6039; https://doi.org/10.3390/app14146039 (registering DOI) - 10 Jul 2024
Viewed by 262
Abstract
The aim of this study was to confirm whether predictive regression algorithms can provide reliable results for missing geophysical logging data in the western and eastern parts of the Drava Super Basin, especially Gola Field, and to apply unsupervised machine learning methods for a better understanding of lithological subsurface relations. Numerous regression models have been used for the estimation of prediction accuracy, along with some clustering algorithms to support the estimation of lithology distribution in well log datasets, consisting of 20 wells in total. Tree-based algorithms and the boosting algorithm have been optimized and proven valuable in predicting well log data when they are not measured or are unavailable at all depth intervals. For blind datasets, predictions become much less reliable. For this purpose, neural networks with at least one Long Short-Term Memory (LSTM) layer have significantly improved the accuracy and reliability of predictions, not in terms of absolute values but in terms of the trends in values that change with the depth and other well features, as well as their magnitudes. Trendlines can further be used for pattern recognition or as a newly engineered feature. Unsupervised learning has confirmed reliability in lithology recognition on validation sets and has proven to be a great asset in distinguishing variabilities in the petrophysical properties of sediments.
18 pages, 7108 KiB  
Article
Inversion of Soybean Net Photosynthetic Rate Based on UAV Multi-Source Remote Sensing and Machine Learning
by Zhen Lu, Wenbo Yao, Shuangkang Pei, Yuwei Lu, Heng Liang, Dong Xu, Haiyan Li, Lejun Yu, Yonggang Zhou and Qian Liu
Agronomy 2024, 14(7), 1493; https://doi.org/10.3390/agronomy14071493 - 10 Jul 2024
Viewed by 220
Abstract
Net photosynthetic rate (Pn) is a common indicator used to measure the efficiency of photosynthesis and growth conditions of plants. In this study, soybeans under different moisture gradients were selected as the research objects. Fourteen vegetation indices (VIS) and five canopy structure characteristics (CSC) (plant height (PH), volume (V), canopy cover (CC), canopy length (L), and canopy width (W)) were obtained using an unmanned aerial vehicle (UAV) equipped with three different sensors (visible, multispectral, and LiDAR) at five growth stages of soybeans. Soybean Pn was simultaneously measured manually in the field. The variability of soybean Pn under different conditions and the trends in CSC under different moisture gradients were analysed. VIS, CSC, and their combinations were used as input features, and four machine learning algorithms (multiple linear regression, random forest, extreme gradient-boosting tree regression, and ridge regression) were used to perform soybean Pn inversion. The results showed that, compared with the inversion model using VIS or CSC as features alone, the inversion model using the combination of VIS and CSC features showed a significant improvement in the inversion accuracy at all five stages. The highest accuracy (R² = 0.86, RMSE = 1.73 µmol m⁻² s⁻¹, RPD = 2.63) was achieved 63 days after sowing (DAS63).
16 pages, 19129 KiB  
Article
Ship Detection in SAR Images Based on Steady CFAR Detector and Knowledge-Oriented GBDT Classifier
by Shuqi Sun and Junfeng Wang
Electronics 2024, 13(14), 2692; https://doi.org/10.3390/electronics13142692 - 10 Jul 2024
Viewed by 266
Abstract
Ship detection is a significant issue in remote sensing based on Synthetic Aperture Radar (SAR). This paper combines the advantages of a steady constant false alarm rate (CFAR) detector and a knowledge-oriented Gradient Boosting Decision Tree (GBDT) classifier to achieve the location and the classification of ship candidates. The steady CFAR detector smooths the image by a moving-average filter and models the probability distribution of the smoothed clutter as a Gaussian distribution. The mean and the standard deviation of the Gaussian distribution are estimated according to the left half of the histogram to remove the effect of land, ships, and other targets. From the Gaussian distribution and a preset constant false alarm rate, a threshold is obtained to segment land, ships, and other targets from the clutter. Then, a series of morphological operations are introduced to eliminate land and extract ships and other targets, and an active contour algorithm is utilized to refine ships and other targets. Finally, ships are recognized from other targets by a knowledge-oriented GBDT classifier. Based on the brain-like ship-recognition process, we change the way the decision trees are generated and achieve a higher classification performance than the original GBDT. The results on the AIRSARShip-1.0 dataset demonstrate that this scheme has a competitive performance against deep learning, especially in the detection of offshore ships.
(This article belongs to the Special Issue Radar Signal Processing Technology)
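The thresholding step described in the abstract above (a Gaussian clutter model plus a preset false alarm rate) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `cfar_threshold` and `segment`, the flat list of pixel values, and the assumption that the clutter mean and standard deviation are already estimated are all hypothetical choices made for illustration.

```python
from statistics import NormalDist


def cfar_threshold(clutter_mean: float, clutter_std: float,
                   false_alarm_rate: float) -> float:
    """Return T such that P(clutter > T) equals the preset false alarm rate.

    Assumes the smoothed clutter follows a Gaussian distribution whose
    mean and standard deviation have already been estimated.
    """
    if not 0.0 < false_alarm_rate < 1.0:
        raise ValueError("false_alarm_rate must lie strictly between 0 and 1")
    if clutter_std <= 0.0:
        raise ValueError("clutter_std must be positive")
    # The upper-tail probability equals the false alarm rate, so the
    # threshold is the (1 - rate) quantile of the clutter distribution.
    return NormalDist(clutter_mean, clutter_std).inv_cdf(1.0 - false_alarm_rate)


def segment(pixels: list[float], threshold: float) -> list[bool]:
    """Mark each pixel True when it exceeds the threshold (candidate target)."""
    return [value > threshold for value in pixels]


if __name__ == "__main__":
    # Hypothetical clutter statistics, not taken from the paper.
    threshold = cfar_threshold(clutter_mean=10.0, clutter_std=2.0,
                               false_alarm_rate=0.001)
    print(threshold, segment([9.5, 12.0, 25.0], threshold))
```

A lower false alarm rate pushes the threshold further into the upper tail, which trades missed faint targets for fewer clutter pixels flagged as candidates.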
9 pages, 240 KiB  
Article
Impact of Hyperparameter Optimization to Enhance Machine Learning Performance: A Case Study on Breast Cancer Recurrence Prediction
by Lorena González-Castro, Marcela Chávez, Patrick Duflot, Valérie Bleret, Guilherme Del Fiol and Martín López-Nores
Appl. Sci. 2024, 14(13), 5909; https://doi.org/10.3390/app14135909 - 6 Jul 2024
Viewed by 373
Abstract
Accurate and early prediction of breast cancer recurrence is crucial to guide medical decisions and treatment success. Machine learning (ML) has shown promise in this domain. However, its effectiveness critically depends on proper hyperparameter setting, a step that is not always performed systematically in the development of ML models. In this study, we aimed to highlight the impact that this process has on the final performance of ML models through a real-world case study by predicting the five-year recurrence of breast cancer patients. We compared the performance of five ML algorithms (Logistic Regression, Decision Tree, Gradient Boosting, eXtreme Gradient Boost, and Deep Neural Network) before and after optimizing their hyperparameters. Simpler algorithms showed better performance using the default hyperparameters. However, after the optimization process, the more complex algorithms demonstrated superior performance. The AUCs obtained before and after adjustment were 0.7 vs. 0.84 for XGB, 0.64 vs. 0.75 for DNN, 0.7 vs. 0.8 for GB, 0.62 vs. 0.7 for DT, and 0.77 vs. 0.72 for LR. The results underscore the critical importance of hyperparameter selection in the development of ML algorithms for the prediction of cancer recurrence. Neglecting this step can undermine the potential of more powerful algorithms and lead to the choice of suboptimal models.
(This article belongs to the Special Issue Artificial Intelligence for Healthcare)
29 pages, 1265 KiB  
Article
Machine Learning Model Development to Predict Power Outage Duration (POD): A Case Study for Electric Utilities
by Bita Ghasemkhani, Recep Alp Kut, Reyat Yilmaz, Derya Birant, Yiğit Ahmet Arıkök, Tugay Eren Güzelyol and Tuna Kut
Sensors 2024, 24(13), 4313; https://doi.org/10.3390/s24134313 - 2 Jul 2024
Viewed by 435
Abstract
In the face of increasing climate variability and the complexities of modern power grids, managing power outages in electric utilities has emerged as a critical challenge. This paper introduces a novel predictive model employing machine learning algorithms, including decision tree (DT), random forest (RF), k-nearest neighbors (KNN), and extreme gradient boosting (XGBoost). Leveraging historical sensor-based and non-sensor-based outage data from a Turkish electric utility company, the model demonstrates adaptability to diverse grid structures, considers meteorological and non-meteorological outage causes, and provides real-time feedback to customers to effectively address the problem of power outage duration. Using the XGBoost algorithm with the minimum redundancy maximum relevance (MRMR) feature selection attained 98.433% accuracy in predicting outage durations, better than the state-of-the-art methods, which showed 85.511% accuracy on average over various datasets, an improvement of 12.922 percentage points. This paper contributes a practical solution to enhance outage management and customer communication, showcasing the potential of machine learning to transform electric utility responses and improve grid resilience and reliability.
(This article belongs to the Special Issue Engineering Applications of Artificial Intelligence for Sensors)
10 pages, 3495 KiB  
Technical Note
Machine Learning for Predicting Neutron Effective Dose
by Ali A. A. Alghamdi
Appl. Sci. 2024, 14(13), 5740; https://doi.org/10.3390/app14135740 - 1 Jul 2024
Viewed by 324
Abstract
The calculation of effective doses is crucial in many medical and radiation fields in order to ensure safety and compliance with regulatory limits. Traditionally, Monte Carlo codes using detailed human body computational phantoms have been used for such calculations. Monte Carlo dose calculations can be time-consuming and require expertise in different processes when building the computational phantom and dose calculations. This study employs various machine learning (ML) algorithms to predict the organ doses and effective dose conversion coefficients (DCCs) from different anthropomorphic phantoms. A comprehensive data set comprising neutron energy bins, organ labels, masses, and densities is compiled from Monte Carlo studies, and it is used to train and evaluate the supervised ML models. This study includes a broad range of phantoms, including those from the International Commission on Radiation Protection (ICRP-110, ICRP-116 phantom), the Visible-Human Project (VIP-man phantom), and the Medical Internal Radiation Dose Committee (MIRD-Phantom), with row data prepared using numerical data and organ categorical labeled data. Extreme gradient boosting (XGB), gradient boosting (GB), and the random forest-based Extra Trees regressor are employed to assess the performance of the ML models against published ICRP neutron DCC values using the mean square error, mean absolute error, and R2 metrics. The results demonstrate that the ML predictions significantly vary in lower energy ranges and vary less in higher neutron energy ranges while showing good agreement with ICRP values at mid-range energies. Moreover, the categorical data models align closely with the reference doses, suggesting the potential of ML in predicting effective doses for custom phantoms based on regional populations, such as the Saudi voxel-based model. 
This study paves the way for efficient dose prediction using ML, particularly in scenarios requiring rapid results without extensive computational resources or expertise. The findings also indicate potential improvements in data representation and the inclusion of larger data sets to refine model accuracy and prevent overfitting. Thus, ML methods can serve as valuable techniques for the continued development of personalized dosimetry.
26 pages, 11424 KiB  
Article
Susceptibility Modeling and Potential Risk Analysis of Thermokarst Hazard in Qinghai–Tibet Plateau Permafrost Landscapes Using a New Interpretable Ensemble Learning Method
by Yuting Yang, Jizhou Wang, Xi Mao, Wenjuan Lu, Rui Wang and Hao Zheng
Atmosphere 2024, 15(7), 788; https://doi.org/10.3390/atmos15070788 - 29 Jun 2024
Viewed by 467
Abstract
Climate change is causing permafrost in the Qinghai–Tibet Plateau to degrade, triggering thermokarst hazards and impacting the environment. Despite their ecological importance, the distribution and risks of thermokarst lakes are not well understood due to complex influencing factors. In this study, we introduced a new interpretable ensemble learning method designed to improve the global and local interpretation of susceptibility assessments for thermokarst lakes. Our primary aim was to offer scientific support for precisely evaluating areas prone to thermokarst lake formation. In the thermokarst lake susceptibility assessment, we identified ten conditioning factors related to the formation and distribution of thermokarst lakes. In this highly accurate stacking model, the primary learning units were the random forest (RF), extremely randomized trees (EXTs), extreme gradient boosting (XGBoost), and categorical boosting (CatBoost) algorithms. Meanwhile, gradient boosted decision trees (GBDTs) were employed as the secondary learning unit. Based on the stacking model, we assessed thermokarst lake susceptibility and validated accuracy through six evaluation indices. We examined the interpretability of the stacking model using three interpretation methods: accumulated local effects (ALE), local interpretable model-agnostic explanations (LIME), and Shapley additive explanations (SHAP). The results showed that the ensemble learning stacking model demonstrated superior performance and the highest prediction accuracy. Approximately 91.20% of the total thermokarst hazard points fell within the high and very high susceptible areas, encompassing 20.08% of the permafrost expanse in the QTP. The conclusive findings revealed that slope, elevation, the topographic wetness index (TWI), and precipitation were the primary factors influencing the assessment of thermokarst lake susceptibility. 
This comprehensive analysis extends to the broader impacts of thermokarst hazards, with the identified high and very high susceptibility zones affecting significant stretches of railway and highway infrastructure, substantial soil organic carbon reserves, and vast alpine grasslands. This interpretable ensemble learning model, which exhibits high accuracy, offers substantial practical significance for project route selection, construction, and operation in the QTP.
(This article belongs to the Special Issue Research about Permafrost–Atmosphere Interactions)
19 pages, 2787 KiB  
Article
Tongue Disease Prediction Based on Machine Learning Algorithms
by Ali Raad Hassoon, Ali Al-Naji, Ghaidaa A. Khalid and Javaan Chahl
Technologies 2024, 12(7), 97; https://doi.org/10.3390/technologies12070097 - 28 Jun 2024
Viewed by 313
Abstract
The diagnosis of tongue disease is based on the observation of various tongue characteristics, including color, shape, texture, and moisture, which indicate the patient’s health status. Tongue color is one such characteristic that plays a vital role in identifying diseases and their levels of progression. With the development of computer vision systems, especially in the field of artificial intelligence, there has been important progress in acquiring, processing, and classifying tongue images. This study proposes a new imaging system to analyze and extract tongue color features at different color saturations and under different light conditions from five color space models (RGB, YCbCr, HSV, LAB, and YIQ). The proposed imaging system was trained on 5260 images classified into seven classes (red, yellow, green, blue, gray, white, and pink) using six machine learning algorithms, namely, the naïve Bayes (NB), support vector machine (SVM), k-nearest neighbors (KNN), decision trees (DTs), random forest (RF), and Extreme Gradient Boost (XGBoost) methods, to predict tongue color under any lighting conditions. The results showed that XGBoost had the highest accuracy at 98.71%, while the NB algorithm had the lowest accuracy at 91.43%. Based on these results, the XGBoost algorithm was chosen as the classifier of the proposed imaging system and linked with a graphical user interface to predict tongue color and its related diseases in real time. Thus, this proposed imaging system opens the door for expanded tongue diagnosis within future point-of-care health systems.
(This article belongs to the Section Information and Communication Technologies)
12 pages, 1307 KiB  
Article
mRCat: A Novel CatBoost Predictor for the Binary Classification of mRNA Subcellular Localization by Fusing Large Language Model Representation and Sequence Features
by Xiao Wang, Lixiang Yang and Rong Wang
Biomolecules 2024, 14(7), 767; https://doi.org/10.3390/biom14070767 - 27 Jun 2024
Viewed by 368
Abstract
The subcellular localization of messenger RNAs (mRNAs) is a pivotal aspect of biomolecules, tightly linked to gene regulation and protein synthesis, and offers innovative insights into disease diagnosis and drug development in the field of biomedicine. Several computational methods have been proposed to predict the subcellular localization of mRNAs within cells. However, there remains a deficiency in the accuracy of these predictions. In this study, we propose an mRCat predictor based on the gradient boosting tree algorithm specifically to predict whether mRNAs are localized in the nucleus or in the cytoplasm. This predictor first uses large language models to thoroughly explore hidden information within sequences and then integrates traditional sequence features to collectively characterize mRNA gene sequences. Finally, it employs CatBoost as the base classifier for predicting the subcellular localization of mRNAs. The experimental validation on an independent test set demonstrates that mRCat achieved an accuracy of 0.761, an F1 score of 0.710, an MCC of 0.511, and an AUROC of 0.751. The results indicate that our method has higher accuracy and robustness compared to other state-of-the-art methods. It is anticipated to offer deep insights for biomolecular research.
(This article belongs to the Special Issue Artificial Intelligence (AI) in Biomedicine)
24 pages, 6969 KiB  
Article
Supervised Machine Learning-Based Models for Predicting Raised Blood Sugar
by Marwa Mustafa Owess, Amani Yousef Owda, Majdi Owda and Salwa Massad
Int. J. Environ. Res. Public Health 2024, 21(7), 840; https://doi.org/10.3390/ijerph21070840 - 27 Jun 2024
Viewed by 794
Abstract
Raised blood sugar (hyperglycemia) is considered a strong indicator of prediabetes or diabetes mellitus. Diabetes mellitus is one of the most common non-communicable diseases (NCDs) affecting the adult population. Recently, the prevalence of diabetes has been increasing at a faster rate, especially in developing countries. The primary concern associated with diabetes is the potential for serious health complications to occur if it is not diagnosed early. Therefore, timely detection and screening of diabetes is considered a crucial factor in treating and controlling the disease. Population screening for raised blood sugar aims to identify individuals at risk before symptoms appear, enabling timely intervention and potentially improved health outcomes. However, implementing large-scale screening programs can be expensive, requiring testing, follow-up, and management resources, potentially straining healthcare systems. Given the above facts, this paper presents supervised machine-learning models to detect and predict raised blood sugar. The proposed raised blood sugar models utilize diabetes-related risk factors including age, body mass index (BMI), eating habits, physical activity, prevalence of other diseases, and fasting blood sugar obtained from the dataset of the STEPwise approach to NCD risk factor study collected from adults in the Palestinian community. The diabetes risk factor obtained from the STEPS dataset was used as input for building the prediction model that was trained using various types of supervised learning classification algorithms including random forest, decision tree, Adaboost, XGBoost, bagging decision trees, and multi-layer perceptron (MLP). Based on the experimental results, the raised blood sugar models demonstrated optimal performance when implemented with a random forest classifier, yielding an accuracy of 98.4%. 
It was followed by bagging decision trees, XGBoost, MLP, AdaBoost, and decision tree, with accuracies of 97.4%, 96.4%, 96.3%, 95.2%, and 94.8%, respectively.
18 pages, 2201 KiB  
Article
Wheat Yield Prediction Using Machine Learning Method Based on UAV Remote Sensing Data
by Shurong Yang, Lei Li, Shuaipeng Fei, Mengjiao Yang, Zhiqiang Tao, Yaxiong Meng and Yonggui Xiao
Drones 2024, 8(7), 284; https://doi.org/10.3390/drones8070284 - 24 Jun 2024
Viewed by 426
Abstract
Accurate forecasting of crop yields holds paramount importance in guiding decision-making processes related to breeding efforts. Despite significant advancements in crop yield forecasting, existing methods often struggle with integrating diverse sensor data and achieving high prediction accuracy under varying environmental conditions. This study focused on the application of multi-sensor data fusion and machine learning algorithms based on unmanned aerial vehicles (UAVs) in wheat yield prediction. Five machine learning (ML) algorithms, namely random forest (RF), partial least squares (PLS), ridge regression (RR), k-nearest neighbor (KNN) and extreme gradient boosting decision tree (XGboost), were utilized for multi-sensor data fusion, together with three ensemble methods including the second-level ensemble methods (stacking and feature-weighted) and the third-level ensemble method (simple average), for wheat yield prediction. The 270 wheat hybrids were used as planting materials under full and limited irrigation treatments. A cost-effective multi-sensor UAV platform, equipped with red–green–blue (RGB), multispectral (MS), and thermal infrared (TIR) sensors, was utilized to gather remote sensing data. The results revealed that the XGboost algorithm exhibited outstanding performance in multi-sensor data fusion, with the RGB + MS + Texture + TIR combination demonstrating the highest fusion performance (R2 = 0.660, RMSE = 0.754). Compared with the single ML model, the employment of three ensemble methods significantly enhanced the accuracy of wheat yield prediction. Notably, the third-layer simple average ensemble method demonstrated superior performance (R2 = 0.733, RMSE = 0.668 t ha−1). It significantly outperformed both the second-layer ensemble methods of stacking (R2 = 0.668, RMSE = 0.673 t ha−1) and feature-weighted (R2 = 0.667, RMSE = 0.674 t ha−1), thereby exhibiting superior predictive capabilities. 
This finding highlighted the third-layer ensemble method’s ability to enhance predictive capabilities and to refine the accuracy of wheat yield prediction through simple average ensemble learning, offering a novel perspective for crop yield prediction and breeding selection.
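As a rough illustration of the simple average ensemble and the error metrics named in the abstract above, the following sketch averages per-sample predictions across models and scores them with RMSE and R². It is not the authors' pipeline: the function names and the toy yield values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from math import sqrt


def simple_average(predictions_by_model: list[list[float]]) -> list[float]:
    """Average the predictions of several models, sample by sample."""
    n_models = len(predictions_by_model)
    return [sum(column) / n_models for column in zip(*predictions_by_model)]


def rmse(y_true: list[float], y_pred: list[float]) -> float:
    """Root mean squared error between observed and predicted values."""
    squared = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sqrt(sum(squared) / len(squared))


def r_squared(y_true: list[float], y_pred: list[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


if __name__ == "__main__":
    # Made-up yields in t/ha; one model over-predicts, the other under-predicts.
    observed = [5.0, 6.0, 7.0]
    model_a = [5.5, 6.5, 7.5]
    model_b = [4.5, 5.5, 6.5]
    ensemble = simple_average([model_a, model_b])
    print(rmse(observed, model_a), rmse(observed, ensemble),
          r_squared(observed, ensemble))
```

In the toy data the two models err in opposite directions, so averaging cancels the errors; real base learners are rarely this complementary, and the gain from averaging depends on how correlated their errors are.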
24 pages, 949 KiB  
Article
Advancing Skin Cancer Prediction Using Ensemble Models
by Priya Natha and Pothuraju RajaRajeswari
Computers 2024, 13(7), 157; https://doi.org/10.3390/computers13070157 - 21 Jun 2024
Viewed by 448
Abstract
There are many different kinds of skin cancer, and early, precise diagnosis is crucial because skin cancer is both frequent and deadly. The key to effective treatment is accurately classifying the various skin cancers, each of which has unique traits. Dermoscopy and other advanced imaging techniques have enhanced early detection by providing detailed images of lesions. However, accurately interpreting these images to distinguish between benign and malignant tumors remains a difficult task. Improved predictive modeling techniques are necessary because current diagnostic processes frequently produce erroneous and inconsistent outcomes. Machine learning (ML) models have become essential in dermatology for the automated identification and categorization of skin cancer lesions from image data. The aim of this work is to improve skin cancer prediction by using ensemble models, which combine multiple machine learning approaches to maximize their combined strengths and reduce their individual shortcomings. This paper proposes a novel approach to ensemble model optimization for skin cancer classification: the Max Voting method. We trained and assessed five different ensemble models on the ISIC 2018 and HAM10000 datasets: AdaBoost, CatBoost, Random Forest, Gradient Boosting, and Extra Trees. The Max Voting method combines their predictions to enhance overall performance. Moreover, the ensemble models were fed feature vectors optimally generated from the image data by a genetic algorithm (GA). We show that, with an accuracy of 95.80%, the Max Voting approach significantly improves predictive performance compared with the five ensemble models individually. The Max Voting method also obtained the best F1-measure, recall, and precision, making it the most dependable and robust of the approaches tested. The novel aspect of this work is that skin cancer lesions are classified more robustly and reliably using the Max Voting technique, which combines the benefits of several pre-trained machine learning models.
(This article belongs to the Special Issue Machine and Deep Learning in the Health Domain 2024)
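As a rough illustration of the Max Voting idea described in the abstract above, the sketch below implements plain majority (hard) voting over class labels. It assumes each model outputs one label per sample; the tie-breaking rule, the function name, and the toy labels are assumptions for illustration, since the abstract does not describe those details.

```python
from collections import Counter


def max_vote(model_predictions):
    """Return the most frequent label per sample across models.

    model_predictions: list of per-model label lists, all of equal length.
    Ties go to the label that appears first among the models' votes
    (an assumed rule; the abstract does not specify one).
    """
    voted = []
    for sample_votes in zip(*model_predictions):
        counts = Counter(sample_votes)
        # most_common orders equal counts by first occurrence.
        voted.append(counts.most_common(1)[0][0])
    return voted


# Hypothetical labels from three hypothetical models for three lesions.
predictions = [
    ["benign", "malignant", "benign"],
    ["benign", "malignant", "malignant"],
    ["malignant", "malignant", "benign"],
]
print(max_vote(predictions))
```

With an odd number of binary classifiers a tie cannot occur; with more classes or an even number of models, the tie-breaking rule starts to matter and is worth choosing deliberately.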
23 pages, 1842 KiB  
Article
Multi-Objective Plum Tree Algorithm and Machine Learning for Heating and Cooling Load Prediction
by Adam Slowik and Dorin Moldovan
Energies 2024, 17(12), 3054; https://doi.org/10.3390/en17123054 - 20 Jun 2024
Viewed by 390
Abstract
The prediction of heating and cooling loads using machine learning algorithms has been considered frequently in the research literature. However, many studies used the default hyperparameter values. This manuscript addresses both the selection of the best regressor and the tuning of the hyperparameter values using a novel nature-inspired algorithm, namely, the Multi-Objective Plum Tree Algorithm. The two objectives that were optimized were the averages of the heating and cooling predictions. The three algorithms compared were the Extra Trees Regressor, the Gradient Boosting Regressor, and the Random Forest Regressor from the sklearn machine learning Python library. We considered five configurable hyperparameters for each of the three regressors. The solutions were ranked using the MOORA method. The Multi-Objective Plum Tree Algorithm returned a root mean square error of 0.035719 for heating and 0.076197 for cooling. The results are comparable to those obtained using standard multi-objective algorithms such as the Multi-Objective Grey Wolf Optimizer, Multi-Objective Particle Swarm Optimization, and NSGA-II. The results also compare favorably with those of previous studies that used the same experimental dataset.
(This article belongs to the Section J: Thermal Management)