Dataset Information

LAGSwin: Local attention guided Swin-transformer for thermal infrared sports object detection.


ABSTRACT: Compared with visible-light images, thermal infrared images have lower resolution, contrast, and signal-to-noise ratio, appear blurred, and carry less information. Thermal infrared sports target detection methods that rely on traditional convolutional networks capture the rich semantics of high-level features but blur spatial details. They also ignore the differences in physical information content and spatial distribution between high-level and low-level features, which leads to a mismatch between the region of interest and the target. To address these issues, we propose a local attention-guided Swin-transformer method for thermal infrared sports object detection (LAGSwin) that encodes the spatial transformation and orientation information of sports objects. On the one hand, a Swin-transformer guided by local attention enriches the semantic knowledge of low-level features by embedding local focus from high-level features, generating high-quality anchors while increasing the embedding of contextual information. On the other hand, an active rotation filter encodes orientation information, producing both orientation-sensitive and orientation-invariant features that reduce the inconsistency between classification and localization regression. In the feature fusion stage, a bidirectional criss-cross fusion strategy enables better interaction between, and embedding of, features at different resolutions. Finally, evaluation on multiple open-source sports target datasets shows that the proposed LAGSwin detection framework has good robustness and generalization ability.
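The idea that rotated copies of one filter can yield both orientation-sensitive and orientation-invariant features can be illustrated with a small sketch. This is not the authors' implementation: it assumes a single square kernel, only four 90-degree rotations, and max pooling over orientations, and every function name here (`rot90`, `correlate_valid`, `oriented_responses`, `pooled_response`) is hypothetical.

```python
def rot90(m):
    """Rotate a 2-D list counter-clockwise by 90 degrees."""
    return [list(row) for row in zip(*m)][::-1]


def correlate_valid(image, kernel):
    """Plain 'valid' 2-D cross-correlation on nested lists."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]


def oriented_responses(image, kernel):
    """One response map per kernel rotation (0, 90, 180, 270 degrees).

    The stack is orientation-sensitive: which map responds most strongly
    depends on how the pattern in the image is oriented.
    Assumes a square kernel so that all four maps share one shape.
    """
    maps = []
    k = kernel
    for _ in range(4):
        maps.append(correlate_valid(image, k))
        k = rot90(k)
    return maps


def pooled_response(image, kernel):
    """Element-wise max over the four orientation maps.

    The pooled value no longer depends on which rotated copy of the
    kernel matched, which is the (simplified) invariant feature.
    """
    maps = oriented_responses(image, kernel)
    rows, cols = len(maps[0]), len(maps[0][0])
    return [[max(m[i][j] for m in maps) for j in range(cols)]
            for i in range(rows)]


# Toy input: a vertical edge and a vertical-edge kernel (illustrative values).
image = [[0, 0, 1, 1, 1] for _ in range(5)]
kernel = [[1, 0, -1], [1, 0, -1], [1, 0, -1]]

per_orientation = oriented_responses(image, kernel)
pooled = pooled_response(image, kernel)
pooled_rotated = pooled_response(rot90(image), kernel)
```

Under these assumptions, rotating the input by 90 degrees only changes which orientation map carries the strong response, and the pooled map of the rotated input equals the rotated pooled map of the original input.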

SUBMITTER: Meng H 

PROVIDER: S-EPMC11003622 | biostudies-literature | 2024

REPOSITORIES: biostudies-literature

Publications

LAGSwin: Local attention guided Swin-transformer for thermal infrared sports object detection.

Meng Hengran, Si Shuqi, Mao Bingfei, Zhao Jia, Wu Liping

PLoS One, 2024-04-09, issue 4


Similar Datasets

| S-EPMC10956868 | biostudies-literature
| S-EPMC9575930 | biostudies-literature
| S-EPMC10536945 | biostudies-literature
| S-EPMC10119175 | biostudies-literature
| S-EPMC6166630 | biostudies-literature
| S-EPMC10099222 | biostudies-literature
2024-09-13 | GSE262953 | GEO
| S-EPMC9214338 | biostudies-literature
| S-EPMC10703013 | biostudies-literature
| S-EPMC10040655 | biostudies-literature