Channel-Attentive Transformer-Based Multimodal Semantic Segmentation Model for Early Detection of Wheat Yellow Rust Disease
18th International Conference on Machine Vision, ICMV 2025, Paris, France, 19-22 October 2025, vol. 14114, (Full Text Paper)
- Publication Type: Conference Paper / Full Text Paper
- Volume Number: 14114
- DOI Number: 10.1117/12.3090951
- City of Publication: Paris
- Country of Publication: France
- Keywords: Multimodal semantic segmentation, remote sensing, wheat yellow rust disease
- Affiliated with Ankara Üniversitesi: Yes
Abstract
Early detection of wheat yellow rust is vital for timely fungicide application before infections exceed 5% of all plants in the monitored plot. While RGB imagery offers high spatial detail, NIR sensing captures early biochemical changes (chlorophyll loss and water stress) that are undetectable in RGB alone. We therefore propose a multimodal semantic segmentation model that uses a Transformer architecture to fuse the RGB and NIR modalities. To further improve the Transformer-based model, we incorporate adaptive channel re-weighting through lightweight squeeze-and-excitation blocks. When evaluated on UAV-collected field data specifically curated for wheat yellow rust disease, our model achieves an IoU of 0.689, outperforming CNN-based multimodal baselines by 14.1% and the best NIR-only CNN-based model by 11.3%. These findings highlight the potential of channel-attentive multimodal Transformer architectures for precise wheat yellow rust monitoring.
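
For readers unfamiliar with the two technical ingredients named in the abstract, the following is a minimal NumPy sketch of a generic squeeze-and-excitation channel re-weighting step and of the IoU metric. It is not the authors' implementation: the function names, the reduction ratio, the random weights, and the idea of stacking RGB and NIR into a single 4-channel array are all assumptions made purely for illustration, and the abstract does not say where in the Transformer the blocks are placed.

```python
import numpy as np

def squeeze_excite(x, w1, w2):
    """Generic squeeze-and-excitation re-weighting (illustrative only).

    x  : feature map of shape (C, H, W)
    w1 : bottleneck weights of shape (C // r, C)   (r is a hypothetical ratio)
    w2 : expansion weights of shape (C, C // r)
    Returns the re-weighted map and the per-channel gates.
    """
    z = x.mean(axis=(1, 2))                  # squeeze: global average pool -> (C,)
    h = np.maximum(w1 @ z, 0.0)              # excitation, step 1: ReLU bottleneck
    s = 1.0 / (1.0 + np.exp(-(w2 @ h)))      # excitation, step 2: sigmoid gates
    return x * s[:, None, None], s           # scale each channel by its gate

def iou(pred, target):
    """Intersection over union of two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:                           # both masks empty: treat as perfect match
        return 1.0
    return np.logical_and(pred, target).sum() / union

# Hypothetical input: R, G, B and NIR stacked as 4 channels of an 8x8 patch.
rng = np.random.default_rng(0)
features = rng.random((4, 8, 8))
w1 = rng.standard_normal((2, 4))             # reduction ratio r = 2 (assumed)
w2 = rng.standard_normal((4, 2))
reweighted, gates = squeeze_excite(features, w1, w2)
print(reweighted.shape, gates.shape)
```

In a trained network the weights `w1` and `w2` would be learned, so that the gates can emphasise the channels that are most informative for the disease class; here they are random and only demonstrate the data flow.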