Infrared small target detection algorithm with U-shaped multiscale transformer network

Peipei DUAN; Yan ZHANG; Mingshi LUO; Xiaoying YAN

doi:10.1051/jnwpu/20254310154

JNWPU 2025, 43(1): 154–162

Open Access

Issue: JNWPU Volume 43, Number 1, February 2025
Page(s): 154–162
DOI: https://doi.org/10.1051/jnwpu/20254310154
Published online: 18 April 2025

基于U型多尺度Transformer网络的红外小目标检测算法

Peipei DUAN (段沛沛)¹, Yan ZHANG (张严)², Mingshi LUO (雒明世)¹ and Xiaoying YAN (闫效莺)¹

¹ School of Computer Science, Xi'an Shiyou University, Xi'an 710065, China
² School of Electric Information and Artificial Intelligence, Shaanxi University of Science & Technology, Xi'an 710021, China

Received: 23 September 2023

Abstract

Features of infrared small targets are difficult to extract, and the targets are easily overwhelmed by noise and complex backgrounds. To address these problems, a detection method based on a U-shaped multiscale Transformer network is proposed. Within the U-shaped multiscale architecture, the method uses convolution operations to extract and enhance the local salient features of small targets. Concurrently, it uses the Transformer mechanism to model global image features, which facilitates the extraction and suppression of the image background. Self-attention operations on the generated target confidence maps and feature maps then fuse the shallow and deep features of the image. The result is a pixel-level segmentation of infrared small targets, which accomplishes target detection. Experiments on the data set for dim and small aircraft target detection and tracking in infrared sequence images show that, even on noisy images with complex backgrounds, the proposed method outperforms the compared state-of-the-art detection methods, with good robustness and high detection accuracy. When the threshold is selected to maximize the average FM, the detection rate of the method reaches 0.9972, the false alarm rate is 2.82×10^-7, the precision is 0.9127, and the recall is 0.921.
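The threshold selection mentioned at the end of the abstract can be illustrated with a small sketch. It assumes that FM denotes the F-measure (the harmonic mean of precision and recall) and that the false alarm rate is the fraction of all pixels wrongly marked as target; the abstract does not define either term, so both are assumptions. The function names `pixel_metrics` and `best_threshold` are hypothetical and are not taken from the paper.

```python
from typing import List, Sequence, Tuple

Image = Tuple[Sequence[float], Sequence[int]]  # (confidence scores, labels)


def pixel_metrics(scores: Sequence[float], labels: Sequence[int],
                  threshold: float) -> Tuple[float, float, float, float]:
    """Return (precision, recall, f_measure, false_alarm_rate) at one threshold.

    scores: per-pixel confidence values of a flattened image.
    labels: per-pixel ground truth, 1 for target and 0 for background.
    """
    tp = fp = fn = 0
    for score, label in zip(scores, labels):
        predicted = score >= threshold
        if predicted and label == 1:
            tp += 1
        elif predicted and label == 0:
            fp += 1
        elif not predicted and label == 1:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Assumed meaning of FM: harmonic mean of precision and recall.
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    # Assumed definition: false-alarm pixels divided by all pixels.
    false_alarm_rate = fp / len(scores) if len(scores) else 0.0
    return precision, recall, f_measure, false_alarm_rate


def best_threshold(images: List[Image],
                   thresholds: Sequence[float]) -> Tuple[float, float]:
    """Pick the threshold whose F-measure, averaged over all images, is largest."""
    best_t, best_mean_f = thresholds[0], -1.0
    for t in thresholds:
        mean_f = sum(pixel_metrics(s, y, t)[2] for s, y in images) / len(images)
        if mean_f > best_mean_f:
            best_t, best_mean_f = t, mean_f
    return best_t, best_mean_f


# Two toy 4-pixel "images"; real confidence maps would come from the network.
toy_images = [
    ([0.9, 0.2, 0.1, 0.8], [1, 0, 0, 1]),
    ([0.6, 0.4, 0.7, 0.1], [1, 0, 0, 0]),
]
chosen, mean_f = best_threshold(toy_images, [0.3, 0.5, 0.75])
print(chosen, round(mean_f, 4))
```

On this toy data the middle threshold wins: a lower threshold admits more false alarms in the second image, while a higher one misses its only target. The detection rate quoted in the abstract is not reproduced here, because its definition (for example, target-level versus pixel-level) is not given in this section.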

Key words: infrared small target detection / image segmentation / deep learning / self-attention mechanism

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
