Automated Concrete Bridge Damage Detection Using an Efficient Vision Transformer-Enhanced Anchor-Free YOLO

Engineering, 2025, Vol. 51, Issue 8: 311-326. DOI: 10.1016/j.eng.2025.02.018

Research Article

Abstract

Deep learning has recently become the most popular approach for automatically detecting bridge damage in images captured by unmanned aerial vehicles (UAVs). However, its wider application to real-world scenarios is hindered by three challenges: ① defect scale variance, motion blur, and strong illumination significantly affect the accuracy and reliability of damage detectors; ② commonly used anchor-based damage detectors struggle to generalize to harsh real-world scenarios; and ③ convolutional neural networks (CNNs) lack the capability to model long-range dependencies across the entire image. This paper presents an efficient Vision Transformer-enhanced anchor-free YOLO (you only look once) method to address these challenges. First, a concrete bridge damage dataset was established and augmented with motion blur and varying brightness. Four key enhancements were then applied to an anchor-based YOLO method: ① four detection heads were introduced to alleviate the multi-scale damage detection issue; ② decoupled heads were employed to address the conflict between classification and bounding box regression tasks inherent in the original coupled head design; ③ an anchor-free mechanism was incorporated to reduce the computational complexity and improve generalization to real-world scenarios; and ④ a novel Vision Transformer block, C3MaxViT, was added to enable CNNs to model long-range dependencies. These enhancements were integrated into an advanced anchor-based YOLOv5l algorithm, and the proposed Vision Transformer-enhanced anchor-free YOLO method was then compared against cutting-edge damage detection methods. The experimental results demonstrated the effectiveness of the proposed method, with increases of 8.1% in mean average precision at an intersection over union (IoU) threshold of 0.5 (mAP50) and 8.4% in mAP@[0.5:.05:.95]. Furthermore, extensive ablation studies revealed that the four detection heads, decoupled head design, anchor-free mechanism, and C3MaxViT contributed improvements of 2.4%, 1.2%, 2.6%, and 1.9% in mAP50, respectively.
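
To make the two reported metrics concrete, the following minimal sketch shows how an intersection over union (IoU) value relates to the thresholds behind mAP50 and mAP@[0.5:.05:.95]. It is not the authors' evaluation code: the function names, the (x1, y1, x2, y2) box format, and the example boxes are assumptions chosen for illustration. A full mAP computation additionally ranks predictions by confidence and averages precision over recall levels and classes, which is omitted here.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes yield no intersection area.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# The ten thresholds 0.50, 0.55, ..., 0.95 denoted by mAP@[0.5:.05:.95].
# mAP50 uses only the first of them.
IOU_THRESHOLDS = [round(0.5 + 0.05 * i, 2) for i in range(10)]


def matched_thresholds(pred_box, gt_box):
    """Thresholds at which the prediction would count as a true positive."""
    overlap = iou(pred_box, gt_box)
    return [t for t in IOU_THRESHOLDS if overlap >= t]


# Hypothetical boxes, not taken from the paper's dataset.
gt = (0.0, 0.0, 10.0, 10.0)
pred = (1.0, 1.0, 10.0, 10.0)

if __name__ == "__main__":
    print(iou(pred, gt))
    print(matched_thresholds(pred, gt))
```

In this example the prediction counts as correct at the looser thresholds but not at the strictest ones, which is why mAP@[0.5:.05:.95] rewards tighter localization than mAP50 does.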

Key words

Computer vision / Deep learning techniques / Vision Transformer / Object detection / Bridge visual inspection

Cite this article

Automated Concrete Bridge Damage Detection Using an Efficient Vision Transformer-Enhanced Anchor-Free YOLO [J]. Engineering, 2025, 51(8): 311-326. DOI: 10.1016/j.eng.2025.02.018
