Autoencoder (AE) is widely used in image fusion. However, AE-based fusion methods usually use the same encoder to extract the features of images from different sensors/modalities without considering the differences between them. In addition, these methods cannot fuse the images in real time. To solve these problems, an end-to-end fusion network is proposed for fast infrared image and visible image fusion. We design an end-to-end W-shaped network (W-Net), which consists of two independent encoders, one shared decoder and skip connections. The two encoders extract the representative features of images from different sources respectively, and the decoder combines the hierarchical features from corresponding layers and reconstructs the fused image without using additional fusion layer or any handcrafted fusion rules. Skip connections are added to help retain the details and salient features in the fused image. Specifically, W-Net is lightweight, with fewer parameters than the existing AE-based methods. The experimental results show that our fusion network performs well in terms of subjective and objective visual assessments compared with other state-of-the-art fusion methods. It can fuse the images very fast (e.g., the fusion time of 20 pairs of images in the TNO dataset is 0.871 to 1.081 ms), operating above real-time speed. |
ACCESS THE FULL ARTICLE
No SPIE Account? Create one
Image fusion
Infrared imaging
Infrared radiation
Visible radiation
Education and training
Feature fusion
Feature extraction