Digital Library

Title:      MID-LEVEL TRANSFORMER FUSION OF LOCAL AND GLOBAL FEATURES FOR REMOTE SENSING SCENE CLASSIFICATION
Author(s):      Vian Abdulmajeed, Khaled Jouini and Ouajdi Korbaa
ISBN:      978-989-8704-71-9
Editors:      Paula Miranda and Pedro Isaías
Year:      2025
Edition:      Single
Keywords:      Remote Sensing Scene Classification, Transformer-Based Fusion, Swin Transformer, CBAM, Channel Attention, Spatial Attention
Type:      Full Paper
First Page:      75
Last Page:      82
Language:      English
Paper Abstract:      Accurate remote sensing scene classification requires the effective integration of fine-grained local details with broader global context. This paper proposes a novel mid-level fusion framework that uses a Transformer Encoder to dynamically fuse these complementary features. The dual-branch architecture extracts local details with EfficientNet-B0, enhanced by a Convolutional Block Attention Module (CBAM), and captures global context with a Swin Tiny Transformer. Because the features are integrated at an intermediate stage, the model can learn complex interactions between local and global representations. The proposed approach attains 98.67% accuracy on EuroSAT and 96.06% on RESISC45, outperforming several recent methods while maintaining a practical model size. Ablation studies validate the contributions of the Transformer Encoder and CBAM, demonstrating their synergistic effect on feature refinement.
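The fusion idea in the abstract can be illustrated with a minimal, framework-free sketch. It is not the authors' implementation. It assumes that each branch yields a short list of feature tokens of equal width, that the two token lists are concatenated, and that one single-head self-attention step with a residual connection lets local and global tokens attend to each other before mean pooling. The names `fuse`, `self_attention`, and `rand_matrix`, the token counts, and the random weights are all hypothetical, and the sketch omits multi-head attention, layer normalization, and the feed-forward sublayer of a full Transformer Encoder.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply an (n x d) matrix by a (d x m) matrix, both as nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def self_attention(tokens, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention over all tokens."""
    q, k, v = matmul(tokens, w_q), matmul(tokens, w_k), matmul(tokens, w_v)
    scale = math.sqrt(len(q[0]))
    scores = [[sum(a * b for a, b in zip(qi, kj)) / scale for kj in k] for qi in q]
    weights = [softmax(r) for r in scores]
    return matmul(weights, v), weights


def fuse(local_tokens, global_tokens, w_q, w_k, w_v):
    """Assumed mid-level fusion: concatenate tokens, attend, add residual, pool."""
    tokens = local_tokens + global_tokens  # concatenate along the token axis
    attended, weights = self_attention(tokens, w_q, w_k, w_v)
    fused = [[t + a for t, a in zip(t_row, a_row)]
             for t_row, a_row in zip(tokens, attended)]
    pooled = [sum(col) / len(fused) for col in zip(*fused)]  # mean over tokens
    return pooled, weights


# Toy stand-ins for the two branches (random values, not real backbone features).
rng = random.Random(0)
dim = 4


def rand_matrix(rows, cols):
    return [[rng.gauss(0.0, 0.5) for _ in range(cols)] for _ in range(rows)]


local_tokens = rand_matrix(3, dim)   # hypothetical tokens from the local branch
global_tokens = rand_matrix(2, dim)  # hypothetical tokens from the global branch
pooled, weights = fuse(local_tokens, global_tokens,
                       rand_matrix(dim, dim), rand_matrix(dim, dim),
                       rand_matrix(dim, dim))
```

Each row of `weights` shows how strongly one token attends to every local and global token, which is the cross-branch interaction that fusing at an intermediate stage makes possible; `pooled` is the vector a classifier head would consume under these assumptions.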
   
