Search Results (1,420)

Search Parameters:
Keywords = multi-head

18 pages, 537 KiB  
Article
ASD-GANNet: A Generative Adversarial Network-Inspired Deep Learning Approach for the Classification of Autism Brain Disorder
by Naseer Ahmed Khan and Xuequn Shang
Brain Sci. 2024, 14(8), 766; https://doi.org/10.3390/brainsci14080766 - 29 Jul 2024
Abstract
The classification of a pre-processed fMRI dataset using functional connectivity (FC)-based features is considered a challenging task because of the set of high-dimensional FC features and the small dataset size. To tackle this specific set of high-dimensional FC features and a small-sized dataset, we propose here a conditional Generative Adversarial Network (cGAN)-based dataset augmenter: we first train the cGAN on connectivity features computed from the NYU dataset and then use the trained cGAN to generate synthetic connectivity features per category. After obtaining a sufficient number of connectivity features per category, a Multi-Head attention mechanism is used as a head for the classification. We name our proposed approach “ASD-GANNet”, which is end-to-end and does not require hand-crafted features, as the Multi-Head attention mechanism focuses on the features that are more relevant. Moreover, we compare our results with the six available state-of-the-art techniques from the literature. Our results using the “NYU” site as a training set for generating a cGAN-based synthetic dataset are promising. We achieve an overall 10-fold cross-validation-based accuracy of 82%, sensitivity of 82%, and specificity of 81%, outperforming the available state-of-the-art approaches. A sitewise comparison also favors our approach, which achieves better results in 10 of the 17 sites. Full article
(This article belongs to the Section Computational Neuroscience and Neuroinformatics)
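Multi-Head attention, used above as the classification head, splits the model dimension into several subspaces and runs scaled dot-product attention in each. A minimal NumPy sketch of that mechanism (the dimensions, head count, and random inputs below are illustrative, not the authors' configuration):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(X, Wq, Wk, Wv, Wo, n_heads):
    """Scaled dot-product attention computed independently over n_heads subspaces.

    X: (seq_len, d_model); Wq/Wk/Wv/Wo: (d_model, d_model) projection matrices.
    """
    seq_len, d_model = X.shape
    d_head = d_model // n_heads
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    # Split the model dimension into heads: (n_heads, seq_len, d_head)
    split = lambda M: M.reshape(seq_len, n_heads, d_head).transpose(1, 0, 2)
    Qh, Kh, Vh = split(Q), split(K), split(V)
    scores = Qh @ Kh.transpose(0, 2, 1) / np.sqrt(d_head)  # (n_heads, seq, seq)
    heads = softmax(scores, axis=-1) @ Vh                  # (n_heads, seq, d_head)
    concat = heads.transpose(1, 0, 2).reshape(seq_len, d_model)
    return concat @ Wo

rng = np.random.default_rng(0)
d_model, seq_len, n_heads = 16, 8, 4
X = rng.normal(size=(seq_len, d_model))
W = [rng.normal(size=(d_model, d_model)) * 0.1 for _ in range(4)]
out = multi_head_attention(X, *W, n_heads=n_heads)
print(out.shape)  # (8, 16)
```

Each head attends over the sequence in its own low-dimensional subspace, which is what lets the mechanism weight the more relevant connectivity features per head.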

20 pages, 12214 KiB  
Article
MIMA: Multi-Feature Interaction Meta-Path Aggregation Heterogeneous Graph Neural Network for Recommendations
by Yang Li, Shichao Yan, Fangtao Zhao, Yi Jiang, Shuai Chen, Lei Wang and Li Ma
Future Internet 2024, 16(8), 270; https://doi.org/10.3390/fi16080270 - 29 Jul 2024
Abstract
Meta-path-based heterogeneous graph neural networks have received widespread attention for better mining the similarities between heterogeneous nodes and for discovering new recommendation rules. Most existing models depend solely on node IDs for learning node embeddings, failing to leverage attribute information fully and to clarify the reasons behind a user’s interest in specific items. A heterogeneous graph neural network for recommendation named MIMA (multi-feature interaction meta-path aggregation) is proposed to address these issues. Firstly, heterogeneous graphs consisting of user nodes, item nodes, and their feature nodes are constructed, and the meta-path containing users, items, and their attribute information is used to capture the correlations among different types of nodes. Secondly, MIMA integrates attention-based feature interaction and meta-path information aggregation to uncover structural and semantic information. Then, the constructed meta-path information is subjected to neighborhood aggregation through graph convolution to acquire the correlations between different types of nodes and to further facilitate high-order feature fusion. Furthermore, user and item embedding vector representations are obtained through multiple iterations. Finally, the effectiveness and interpretability of the proposed approach are validated on three publicly available datasets in terms of NDCG, precision, and recall and are compared to all baselines. Full article
(This article belongs to the Special Issue Deep Learning in Recommender Systems)

29 pages, 27671 KiB  
Article
Prediction of Feed Quantity for Wheat Combine Harvester Based on Improved YOLOv5s and Weight of Single Wheat Plant without Stubble
by Qian Zhang, Qingshan Chen, Wenjie Xu, Lizhang Xu and En Lu
Agriculture 2024, 14(8), 1251; https://doi.org/10.3390/agriculture14081251 - 29 Jul 2024
Abstract
In complex field environments, wheat grows densely, with overlapping organs and different plant weights. It is difficult to accurately predict feed quantity for a wheat combine harvester using the existing YOLOv5s and a uniform weight for a single wheat plant across a whole field. This paper proposes a feed quantity prediction method based on the improved YOLOv5s and the weight of a single wheat plant without stubble. The improved YOLOv5s optimizes the Backbone with compact bases to enhance wheat spike detection and reduce computational redundancy. The Neck incorporates a hierarchical residual module to enhance YOLOv5s’ representation of multi-scale features. The Head enhances the detection accuracy of small, dense wheat spikes in a large field of view. In addition, the height of a single wheat plant without stubble is estimated from the depth distribution of the wheat spike region and the stubble height. The relationship model between the height and weight of a single wheat plant without stubble is fitted by experiments. Then, feed quantity can be predicted using the weight of a single wheat plant without stubble estimated by the relationship model and the number of wheat plants detected by the improved YOLOv5s. The proposed method was verified through experiments with the 4LZ-6A combine harvester. Compared with the existing YOLOv5s, YOLOv7, SSD, Faster R-CNN, and the other enhancements in this paper, the mAP50 of wheat spike detection by the improved YOLOv5s increased by over 6.8%. The method achieved an average relative error of 4.19% with a prediction time of 1.34 s. The proposed method can accurately and rapidly predict feed quantity for wheat combine harvesters and further enable closed-loop control of intelligent harvesting operations. Full article
(This article belongs to the Special Issue Computer Vision and Artificial Intelligence in Agriculture)
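The prediction step described above (a fitted height–weight relationship model, with feed quantity taken as the modeled per-plant weight times the detected plant count) can be sketched with an ordinary least-squares fit. All calibration numbers below are hypothetical, not the paper's measurements:

```python
# Hypothetical calibration data: plant height without stubble (cm) -> weight (g).
heights = [45.0, 50.0, 55.0, 60.0, 65.0, 70.0]
weights = [9.2, 10.8, 12.1, 13.9, 15.2, 16.8]

# Ordinary least-squares fit of weight = a * height + b.
n = len(heights)
mx = sum(heights) / n
my = sum(weights) / n
a = sum((x - mx) * (y - my) for x, y in zip(heights, weights)) / \
    sum((x - mx) ** 2 for x in heights)
b = my - a * mx

def predict_feed_quantity(mean_height_cm, plant_count):
    """Feed quantity (g) = per-plant weight from the fitted model x detected count."""
    return (a * mean_height_cm + b) * plant_count

print(round(a, 4), round(b, 4))
print(round(predict_feed_quantity(58.0, 1200), 1))
```

In the paper the plant count would come from the improved YOLOv5s detector and the height from the depth distribution of the spike region; here both are passed in directly for illustration.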

16 pages, 1722 KiB  
Article
A TCN-BiGRU Density Logging Curve Reconstruction Method Based on Multi-Head Self-Attention Mechanism
by Wenlong Liao, Chuqiao Gao, Jiadi Fang, Bin Zhao and Zhihu Zhang
Processes 2024, 12(8), 1589; https://doi.org/10.3390/pr12081589 - 29 Jul 2024
Abstract
In the process of oil and natural gas exploration and development, density logging curves play a crucial role, providing essential evidence for identifying lithology, calculating reservoir parameters, and analyzing fluid properties. Due to factors such as instrument failure and wellbore enlargement, logging data for some well segments may become distorted or missing during the actual logging process. To address this issue, this paper proposes a density logging curve reconstruction model that integrates the multi-head self-attention mechanism (MSA) with temporal convolutional networks (TCN) and bidirectional gated recurrent units (BiGRU). This model uses the distance correlation coefficient to determine curves with a strong correlation to density as a model input parameter and incorporates stratigraphic lithology indicators as physical constraints to enhance the model’s reconstruction accuracy and stability. This method was applied to reconstruct density logging curves in the X depression area, compared with several traditional reconstruction methods, and verified through core calibration experiments. The results show that the reconstruction method proposed in this paper exhibits high accuracy and generalizability. Full article
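The distance correlation coefficient used above to select input curves can be computed directly from pairwise distance matrices. A small NumPy sketch with synthetic series (the variable names are illustrative, not actual logging channels):

```python
import numpy as np

def distance_correlation(x, y):
    """Sample distance correlation between two 1-D samples of equal length."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    a = np.abs(x[:, None] - x[None, :])  # pairwise distance matrices
    b = np.abs(y[:, None] - y[None, :])
    # Double-center each distance matrix (subtract row/column means, add grand mean).
    A = a - a.mean(0) - a.mean(1)[:, None] + a.mean()
    B = b - b.mean(0) - b.mean(1)[:, None] + b.mean()
    dcov2 = (A * B).mean()
    dvar_x, dvar_y = (A * A).mean(), (B * B).mean()
    if dvar_x == 0 or dvar_y == 0:
        return 0.0
    return float(np.sqrt(dcov2 / np.sqrt(dvar_x * dvar_y)))

rng = np.random.default_rng(1)
curve_a = rng.normal(size=200)                          # hypothetical log curve
related = 2.0 * curve_a + 0.1 * rng.normal(size=200)    # strongly related curve
noise = rng.normal(size=200)                            # unrelated curve
print(round(distance_correlation(curve_a, related), 3))  # close to 1
print(round(distance_correlation(curve_a, noise), 3))    # close to 0
```

Curves whose distance correlation with density exceeds a threshold would then be kept as model inputs, which is the selection role the coefficient plays in the abstract.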

22 pages, 989 KiB  
Article
Intra-Frame Graph Structure and Inter-Frame Bipartite Graph Matching with ReID-Based Occlusion Resilience for Point Cloud Multi-Object Tracking
by Shaoyu Sun, Chunhao Shi, Chunyang Wang, Qing Zhou, Rongliang Sun, Bo Xiao, Yueyang Ding and Guan Xi
Electronics 2024, 13(15), 2968; https://doi.org/10.3390/electronics13152968 - 27 Jul 2024
Abstract
Three-dimensional multi-object tracking (MOT) using lidar point cloud data is crucial for applications in autonomous driving, smart cities, and robotic navigation. It involves identifying objects in point cloud sequence data and consistently assigning unique identities to them throughout the sequence. Occlusions can lead to missed detections, resulting in incorrect data associations and ID switches. To address these challenges, we propose a novel point cloud multi-object tracker called GBRTracker. Our method integrates an intra-frame graph structure into the backbone to extract and aggregate spatial neighborhood node features, significantly reducing detection misses. We construct an inter-frame bipartite graph for data association and design a sophisticated cost matrix based on the center, box size, velocity, and heading angle. A minimum-cost flow algorithm is then used to achieve globally optimal matching, thereby reducing ID switches. For unmatched detections, we design a motion-based re-identification (ReID) feature embedding module, which uses velocity and the heading angle to calculate similarity and association probability, reconnecting them with their corresponding trajectory IDs or initializing new tracks. Our method maintains high accuracy and reliability, significantly reducing ID switches and trajectory fragmentation, even in challenging scenarios. We validate the effectiveness of GBRTracker through comparative and ablation experiments on the NuScenes and Waymo Open Datasets, demonstrating its superiority over state-of-the-art methods. Full article
(This article belongs to the Section Computer Science & Engineering)
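The data-association step above solves a global matching over a bipartite cost matrix; for a square matrix, min-cost flow reduces to optimal assignment. A stdlib-only sketch using exhaustive search (fine for tiny examples; the cost values are hypothetical):

```python
from itertools import permutations

# Hypothetical association costs: cost[i][j] between track i and detection j,
# e.g. a weighted sum of center distance, box-size, velocity, and heading terms.
cost = [
    [0.2, 1.5, 3.0],
    [1.4, 0.3, 2.6],
    [2.8, 2.5, 0.4],
]

def optimal_assignment(cost):
    """Globally optimal track->detection matching by exhaustive search.

    Equivalent to min-cost-flow / Hungarian matching for small square matrices.
    """
    n = len(cost)
    best, best_perm = float("inf"), None
    for perm in permutations(range(n)):
        total = sum(cost[i][perm[i]] for i in range(n))
        if total < best:
            best, best_perm = total, perm
    return best_perm, best

matching, total_cost = optimal_assignment(cost)
print(matching, round(total_cost, 2))  # (0, 1, 2) 0.9
```

A production tracker would use a polynomial algorithm (min-cost flow or the Hungarian method) rather than enumeration, but the objective being minimized is the same.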

13 pages, 749 KiB  
Article
Adverse Events in Anti-PD-1-Treated Adjuvant and First-Line Advanced Melanoma Patients
by Daan Jan Willem Rauwerdink, Olivier van Not, Melissa de Meza, Remco van Doorn, Jos van der Hage, A. J. M. van den Eertwegh, John B. Haanen, Maureen J. B. Aarts, Franchette W. P. J. van den Berkmortel, Christiaan U. Blank, Marye J. Boers-Sonderen, Jan Willem B. de Groot, Geke A. P. Hospers, Djura Piersma, Rozemarijn S. van Rijn, A. M. Stevense-den Boer, Astrid A. M. van der Veldt, Gerard Vreugdenhil, Michel W. J. M. Wouters, Karijn P. M. Suijkerbuijk and Ellen Kapiteijn
Cancers 2024, 16(15), 2656; https://doi.org/10.3390/cancers16152656 - 26 Jul 2024
Abstract
Introduction: The difference in incidence and severity of anti-PD-1 therapy-related adverse events (irAEs) between adjuvant and advanced treated melanoma patients remains unclear, as no head-to-head studies have compared these groups. Methods: This multi-center cohort study analyzed melanoma patients treated with anti-PD-1 in adjuvant or advanced settings between 2015 and 2021. Comorbidities and ECOG performance status were assessed before treatment, and grade III-IV irAEs were monitored during treatment. Univariate and multivariate regression analyses were conducted to identify factors associated with irAE development. Results: A total of 1465 advanced melanoma patients and 908 resected melanoma patients received anti-PD-1 therapy. Adjuvant-treated patients were younger, with a median age of 63 years compared to 69 years in the advanced group (p < 0.01), and had a better ECOG performance status (p < 0.01). Comorbidities were seen more frequently in advanced melanoma patients than in those receiving adjuvant treatment, 76% versus 68% (p < 0.01). Grade III-IV irAEs occurred in 214 (15%) advanced treated patients and in 119 (13%) adjuvant-treated patients. Multivariate analysis showed an increased risk of severe irAE development with the presence of any comorbidity (adjusted OR 1.22, 95% CI 1.02–1.44) and ECOG status greater than 1 (adjusted OR 2.00, 95% CI 1.20–3.32). Adjuvant therapy was not associated with an increased risk of irAE development compared to advanced treatment (adjusted OR 0.95, 95% CI 0.74–1.21) after correcting for comorbidities and ECOG performance score. Anti-PD-1 therapy was halted due to toxicity (any grade irAE) more often in the adjuvant setting than in the advanced setting, 20% versus 15% (p < 0.01). Conclusions: Higher ECOG performance status and the presence of any comorbidity were independently associated with an increased risk of grade III-IV irAEs in adjuvant and advanced treated melanoma patients. Patients treated in the adjuvant setting did not have an increased risk of developing severe irAEs compared to advanced melanoma patients. These findings are of clinical significance when counseling patients about adjuvant anti-PD-1 treatment. Full article
(This article belongs to the Special Issue Feature Paper in Section “Cancer Therapy” in 2024)

10 pages, 700 KiB  
Protocol
Head Nurse Leadership: Facilitators and Barriers to Adherence to Infection Prevention and Control Programs—A Qualitative Study Protocol
by Eva Cappelli, Jacopo Fiorini, Francesco Zaghini, Federica Canzan and Alessandro Sili
Nurs. Rep. 2024, 14(3), 1849-1858; https://doi.org/10.3390/nursrep14030138 - 26 Jul 2024
Abstract
Background: The effective management of Healthcare-Associated Infections (HAIs) relies on the implementation of good practice across the entire multidisciplinary team. The organizational context and the role of head nurses influence the team’s performance and behavior. Understanding how decision-making processes influence healthcare professionals’ behavior in the management of HAIs could help identify alternative interventions for reducing the risk of infection in healthcare organizations. This study aims to explore how the behaviors promoted and actions implemented by the head nurse can influence healthcare professionals’ adherence to Infection Prevention and Control (IPC) programs. Methods: A multi-center qualitative study will be conducted using a Grounded Theory approach. Observations will be conducted, followed by individual interviews and/or focus groups. A constructive and representative sample of healthcare professionals who care directly for patients will be enrolled in the study. The COnsolidated criteria for REporting Qualitative research (COREQ) checklist will be followed to ensure the quality of this study protocol. A multistep inductive process will be used to analyze the data. Conclusions: The study results will provide an understanding of how nurses perceive the influence of leadership and how they modify their behaviors and activities toward patients according to IPC programs. The study will identify barriers and facilitators to IPC compliance and suggest strategies to minimize negative patient outcomes, such as the development of an HAI. Full article

19 pages, 8525 KiB  
Article
MTrans: M-Transformer and Knowledge Graph-Based Network for Predicting Drug–Drug Interactions
by Shiqi Wu, Baisong Liu, Xueyuan Zhang, Xiaowen Shao and Chennan Lin
Electronics 2024, 13(15), 2935; https://doi.org/10.3390/electronics13152935 - 25 Jul 2024
Abstract
The combined use of multiple medications is common in treatment, which may lead to severe drug–drug interactions (DDIs). Deep learning methods have been widely used to predict DDIs in recent years. However, current models struggle to fully understand the characteristics of drugs and the relationships between these characteristics, resulting in inaccurate and inefficient feature representations. Beyond that, existing studies predominantly focus on analyzing a single DDI, failing to explore multiple similar DDIs simultaneously, thus limiting the discovery of common mechanisms underlying DDIs. To address these limitations, this research proposes a method based on an M-Transformer and a knowledge graph for predicting DDIs, comprising a dual-pathway approach and a neural network. In the first pathway, we leverage the interpretability of the transformer to capture the intricate relationships between drug features using the multi-head attention mechanism, identifying and discarding redundant information to obtain a more refined and information-dense drug representation. However, because a single transformer model may have difficulty understanding features from multiple semantic spaces, we adopted the M-Transformer to understand the structural and pharmacological information of the drug as well as the connections between them. In the second pathway, we constructed a drug–drug interaction knowledge graph (DDIKG) using drug representation vectors obtained from the M-Transformer as nodes and DDI types as edges. Subsequently, drug edges with similar interactions were aggregated using a graph neural network (GNN). This facilitates the exploration and extraction of shared mechanisms underlying drug–drug interactions. Extensive experiments demonstrate that our MTrans model accurately predicts DDIs and outperforms state-of-the-art models. Full article
(This article belongs to the Special Issue Medical Applications of Artificial Intelligence)
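The GNN aggregation in the second pathway boils down to message passing: each node blends its embedding with an aggregate of its neighbors'. A pure-Python sketch of one mean-aggregation step (the toy graph, embeddings, and mixing weight are hypothetical; in the paper the embeddings would come from the M-Transformer):

```python
# Toy DDI knowledge graph: drug -> neighbors linked by an interaction type.
graph = {
    "drugA": ["drugB", "drugC"],
    "drugB": ["drugA"],
    "drugC": ["drugA"],
}
# Hypothetical 2-D node embeddings.
emb = {"drugA": [1.0, 0.0], "drugB": [0.0, 1.0], "drugC": [1.0, 1.0]}

def gnn_step(graph, emb, alpha=0.5):
    """One message-passing step: blend each node with its neighbors' mean embedding."""
    new_emb = {}
    for node, vec in emb.items():
        nbrs = graph.get(node, [])
        if not nbrs:
            new_emb[node] = vec[:]  # isolated node keeps its embedding
            continue
        mean = [sum(emb[n][d] for n in nbrs) / len(nbrs) for d in range(len(vec))]
        new_emb[node] = [(1 - alpha) * v + alpha * m for v, m in zip(vec, mean)]
    return new_emb

out = gnn_step(graph, emb)
print(out["drugA"])  # [0.75, 0.5]
```

Stacking several such steps lets information from drugs with similar interactions propagate across the graph, which is how shared DDI mechanisms can surface in the learned representations.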

18 pages, 20454 KiB  
Article
RCRFNet: Enhancing Object Detection with Self-Supervised Radar–Camera Fusion and Open-Set Recognition
by Minwei Chen, Yajun Liu, Zenghui Zhang and Weiwei Guo
Sensors 2024, 24(15), 4803; https://doi.org/10.3390/s24154803 - 24 Jul 2024
Abstract
Robust object detection in complex environments, poor visual conditions, and open scenarios presents significant technical challenges in autonomous driving. These challenges necessitate the development of advanced fusion methods for millimeter-wave (mmWave) radar point cloud data and visual images. To address these issues, this paper proposes a radar–camera robust fusion network (RCRFNet), which leverages self-supervised learning and open-set recognition to effectively utilise the complementary information from both sensors. Specifically, the network uses matched radar–camera data through a frustum association approach to generate self-supervised signals, enhancing network training. The integration of global and local depth consistencies between radar point clouds and visual images, along with image features, helps construct object class confidence levels for detecting unknown targets. Additionally, these techniques are combined with a multi-layer feature extraction backbone and a multimodal feature detection head to achieve robust object detection. Experiments on the nuScenes public dataset demonstrate that RCRFNet outperforms state-of-the-art (SOTA) methods, particularly in conditions of low visual visibility and when detecting unknown class objects. Full article

21 pages, 38700 KiB  
Article
Transformative Noise Reduction: Leveraging a Transformer-Based Deep Network for Medical Image Denoising
by Rizwan Ali Naqvi, Amir Haider, Hak Seob Kim, Daesik Jeong and Seung-Won Lee
Mathematics 2024, 12(15), 2313; https://doi.org/10.3390/math12152313 - 24 Jul 2024
Abstract
Medical image denoising has numerous real-world applications. Despite their widespread use, existing medical image denoising methods fail to address complex noise patterns and typically generate artifacts in numerous cases. This paper proposes a novel medical image denoising method that learns denoising using an end-to-end learning strategy. Furthermore, the proposed model introduces a novel deep–wider residual block to capture long-distance pixel dependencies for medical image denoising. Additionally, this study proposes leveraging multi-head attention-guided image reconstruction to effectively denoise medical images. Experimental results illustrate that the proposed method outperforms existing methods in both qualitative and quantitative evaluations across numerous medical image modalities, with a significant performance gain over its counterparts and a cumulative PSNR score of 8.79 dB. The proposed method can also denoise noisy real-world medical images and improve performance in clinical applications such as abnormality detection. Full article
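The PSNR figures quoted above follow the standard definition PSNR = 10·log10(MAX²/MSE). A minimal stdlib sketch (the pixel values below are made up for illustration):

```python
import math

def psnr(reference, distorted, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences."""
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)

clean = [100, 120, 130, 140]
noisy = [102, 118, 133, 137]
print(round(psnr(clean, noisy), 2))  # 40.0
```

Higher is better, so a cumulative gain is simply the summed PSNR improvement of the denoised output over the noisy input (or over competing methods) across test images.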

11 pages, 4963 KiB  
Article
Impact of Dual-Depth Head-Up Displays on Vehicle Driver Performance
by Chien-Yu Chen, Tzu-An Chou, Chih-Hao Chuang, Ching-Cheng Hsu, Yi-Sheng Chen and Shi-Hwa Huang
Appl. Sci. 2024, 14(15), 6441; https://doi.org/10.3390/app14156441 - 24 Jul 2024
Abstract
In recent years, the display information of head-up displays for vehicles has gradually developed from single-depth to multi-depth. To reduce the workload of driving and the number of eye adjustments, researchers use the visual perception of human eyes to realize the image information integrated with the real world. In this study, HoloLens2 is used to demonstrate head-up displays of different depths. An electroencephalogram, an electro-ophthalmogram, and a NASA-TLX questionnaire were used to evaluate the fatigue of drivers during long-term driving. The results showed that a dual-depth head-up display could effectively reduce the driver’s workload. Full article
(This article belongs to the Special Issue Virtual Models for Autonomous Driving Systems)

27 pages, 3744 KiB  
Article
Multi-Head Self-Attention-Based Fully Convolutional Network for RUL Prediction of Turbofan Engines
by Zhaofeng Liu, Xiaoqing Zheng, Anke Xue, Ming Ge and Aipeng Jiang
Algorithms 2024, 17(8), 321; https://doi.org/10.3390/a17080321 - 23 Jul 2024
Abstract
Remaining useful life (RUL) prediction is widely applied in prognostic and health management (PHM) of turbofan engines. Although some of the existing deep learning-based models for RUL prediction of turbofan engines have achieved satisfactory results, there are still some challenges. For example, the spatial features and importance differences hidden in the raw monitoring data are not sufficiently addressed or highlighted. In this paper, a novel multi-head self-attention fully convolutional network (MSA-FCN) is proposed for predicting the RUL of turbofan engines. MSA-FCN combines a fully convolutional network with a multi-head structure, focusing on the degradation correlation among various components of the engine and extracting spatially characteristic degradation representations. Furthermore, by introducing dual multi-head self-attention modules, MSA-FCN can capture the differential contributions of sensor data and extracted degradation representations to RUL prediction, emphasizing key data and representations. The experimental results on the C-MAPSS dataset demonstrate that, under various operating conditions and failure modes, MSA-FCN can effectively predict the RUL of turbofan engines. Compared with 11 mainstream deep neural networks, MSA-FCN achieves competitive advantages in terms of both accuracy and timeliness for RUL prediction, delivering more accurate and reliable forecasts. Full article

18 pages, 5597 KiB  
Article
Spatiotemporal Feature Fusion Transformer for Precipitation Nowcasting via Feature Crossing
by Taisong Xiong, Weiping Wang, Jianxin He, Rui Su, Hao Wang and Jinrong Hu
Remote Sens. 2024, 16(14), 2685; https://doi.org/10.3390/rs16142685 - 22 Jul 2024
Abstract
Precipitation nowcasting plays an important role in mitigating the damage caused by severe weather. The objective of precipitation nowcasting is to forecast the weather conditions 0–2 h ahead. Traditional models based on numerical weather prediction and radar echo extrapolation obtain relatively better results. In recent years, models based on deep learning have also been applied to precipitation nowcasting and have shown improvement. However, the forecast accuracy is decreased with longer forecast times and higher intensities. To mitigate the shortcomings of existing models for precipitation nowcasting, we propose a novel model that fuses spatiotemporal features for precipitation nowcasting. The proposed model uses an encoder–forecaster framework that is similar to U-Net. First, in the encoder, we propose a spatial and temporal multi-head squared attention module based on MaxPool and AveragePool to capture every independent sequence feature, as well as a global spatial and temporal feedforward network, to learn the global and long-distance relationships between whole spatiotemporal sequences. Second, we propose a cross-feature fusion strategy to enhance the interactions between features. This strategy is applied to the components of the forecaster. Based on the cross-feature fusion strategy, we constructed a novel multi-head squared cross-feature fusion attention module and cross-feature fusion feedforward network in the forecaster. Comprehensive experimental results demonstrated that the proposed model more effectively forecasted high-intensity levels than other models. These results prove the effectiveness of the proposed model in terms of predicting convective weather. This indicates that our proposed model provides a feasible solution for precipitation nowcasting. Extensive experiments also proved the effectiveness of the components of the proposed model. Full article
(This article belongs to the Special Issue Deep Learning Techniques Applied in Remote Sensing)

18 pages, 10628 KiB  
Article
A CNN- and Self-Attention-Based Maize Growth Stage Recognition Method and Platform from UAV Orthophoto Images
by Xindong Ni, Faming Wang, Hao Huang, Ling Wang, Changkai Wen and Du Chen
Remote Sens. 2024, 16(14), 2672; https://doi.org/10.3390/rs16142672 - 22 Jul 2024
Abstract
The accurate recognition of maize growth stages is crucial for effective farmland management strategies. In order to overcome the difficulty of quickly obtaining precise information about maize growth stage in complex farmland scenarios, this study proposes a Maize Hybrid Vision Transformer (MaizeHT) that combines a convolutional algorithmic structure with self-attention for maize growth stage recognition. The MaizeHT model utilizes a ResNet34 convolutional neural network to extract image features, which are then transformed into sequence vectors (tokens) using Patch Embedding; category and position information are simultaneously inserted as tokens. A Transformer architecture with multi-head self-attention is employed to extract token features and predict maize growth stage categories using a linear layer. In addition, the MaizeHT model is standardized and encapsulated, and a prototype platform for intelligent maize growth stage recognition is developed for deployment on a website. Finally, a performance validation test of MaizeHT was carried out. Specifically, MaizeHT achieves an accuracy of 97.71% when the input image resolution is 224 × 224 and 98.71% when the input image resolution is 512 × 512 on the self-built dataset; the number of parameters is 15.446 M, and the floating-point operations are 4.148 G. The proposed maize growth stage recognition method could provide computational support for maize farm intelligence. Full article
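The tokenization step described above (feature maps split into patch tokens, with an extra class token prepended before the Transformer) can be sketched in NumPy. The image size, patch size, and zero-initialized class token below are illustrative, not MaizeHT's actual configuration:

```python
import numpy as np

def patchify(image, patch):
    """Split an (H, W, C) array into flattened non-overlapping patch tokens."""
    H, W, C = image.shape
    tokens = (image.reshape(H // patch, patch, W // patch, patch, C)
                   .transpose(0, 2, 1, 3, 4)       # (rows, cols, patch, patch, C)
                   .reshape(-1, patch * patch * C))
    return tokens

rng = np.random.default_rng(0)
img = rng.normal(size=(8, 8, 3))
tokens = patchify(img, patch=4)          # (4, 48): four 4x4x3 patches
cls = np.zeros((1, tokens.shape[1]))     # a learnable class token in the real model
sequence = np.concatenate([cls, tokens]) # (5, 48) fed to multi-head self-attention
print(sequence.shape)  # (5, 48)
```

In a hybrid model like this, `img` would be a CNN feature map rather than the raw image, and a linear projection plus position embeddings would follow, but the reshape-to-tokens idea is the same.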

21 pages, 15805 KiB  
Article
Wet-ConViT: A Hybrid Convolutional–Transformer Model for Efficient Wetland Classification Using Satellite Data
by Ali Radman, Fariba Mohammadimanesh and Masoud Mahdianpari
Remote Sens. 2024, 16(14), 2673; https://doi.org/10.3390/rs16142673 - 22 Jul 2024
Abstract
Accurate and efficient classification of wetlands, as one of the most valuable ecological resources, using satellite remote sensing data is essential for effective environmental monitoring and sustainable land management. Deep learning models have recently shown significant promise for identifying wetland land cover; however, they are often constrained by practical efficiency issues when seeking high accuracy with limited training ground truth samples. To address these limitations, in this study, a novel deep learning model, namely Wet-ConViT, is designed for the precise mapping of wetlands using multi-source satellite data, combining the strengths of multispectral Sentinel-2 and SAR Sentinel-1 datasets. The proposed architecture combines the local information capture of convolutions with the long-range feature extraction capabilities of transformers. Specifically, the key to Wet-ConViT’s foundation is the multi-head convolutional attention (MHCA) module that integrates convolutional operations into a transformer attention mechanism. By leveraging convolutions, MHCA optimizes the efficiency of the original transformer self-attention mechanism. This results in high-precision land cover classification accuracy with minimal computational complexity compared with other state-of-the-art models, including two convolutional neural networks (CNNs), two transformers, and two hybrid CNN–transformer models. In particular, Wet-ConViT demonstrated superior performance, classifying land cover with approximately 95% overall accuracy and exceeding the next best model, the hybrid CoAtNet, by about 2%. The results highlighted the proposed architecture’s high precision and efficiency in terms of parameters, memory usage, and processing time. Wet-ConViT could be useful for practical wetland mapping tasks, where precision and computational efficiency are paramount. Full article
(This article belongs to the Special Issue Satellite-Based Climate Change and Sustainability Studies)