Bridge target detection in large-scale ultra-high-resolution remote sensing images based on holistic learning (including dataset download address)

2024-07-12

Article Summary

Bridge detection in remote sensing images (RSIs) plays a vital role in many applications, but compared with the detection of other objects it faces unique challenges. Bridges in RSIs vary considerably in spatial scale and aspect ratio, so holistic bridge detection in large-size very-high-resolution (VHR) RSIs is necessary to preserve both the visibility and the integrity of each bridge. However, the lack of datasets of large-size VHR RSIs limits the performance of deep learning algorithms on this task. In addition, because GPU memory is limited, deep learning-based object detection methods usually crop large images into patches, which inevitably leads to fragmented labels and discontinuous predictions.

To alleviate the scarcity of data, the paper proposes a large-scale dataset called GLH-Bridge, which comprises 6,000 VHR RSIs sampled from different geographical locations around the world. The images range in size from 2,048 × 2,048 to 16,384 × 16,384 pixels and contain a total of 59,737 bridges across different backgrounds. Each bridge is manually annotated with both an oriented bounding box (OBB) and a horizontal bounding box (HBB).

The authors further propose an efficient holistic bridge detection network (HBD-Net) for large-size RSIs. HBD-Net adopts a separate detector-based feature fusion (SDFF) architecture and is optimized with a shape-sensitive sample reweighting (SSRW) strategy. The SDFF architecture performs inter-layer feature fusion (IFF) to fuse multi-scale context in a dynamic image pyramid (DIP) built from the large-size image, while the SSRW strategy balances the regression weights of bridges with different aspect ratios. Based on GLH-Bridge, the authors establish a bridge detection benchmark covering both OBB and HBB tasks and verify the effectiveness of HBD-Net. In addition, cross-dataset generalization experiments on two publicly available datasets demonstrate the strong generalization ability of the GLH-Bridge dataset.
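To put the GPU memory constraint in concrete terms, the short calculation below estimates how much memory the raw input tensor alone would occupy at the smallest and largest image sizes in the dataset. The three-channel layout and 32-bit floating-point storage are assumptions made for illustration; the summary does not state them.

```python
def input_tensor_bytes(height, width, channels=3, bytes_per_value=4):
    """Bytes needed to hold one image as a dense tensor.

    channels=3 and bytes_per_value=4 (float32) are assumptions for this sketch.
    """
    return height * width * channels * bytes_per_value


GIB = 1024 ** 3

smallest = input_tensor_bytes(2048, 2048)
largest = input_tensor_bytes(16384, 16384)

print(f"2,048 x 2,048 input:   {smallest / GIB:.3f} GiB")
print(f"16,384 x 16,384 input: {largest / GIB:.3f} GiB")
print(f"ratio: {largest // smallest}x")
```

Under these assumptions the largest image needs 3 GiB before a single layer has run, 64 times as much as the smallest one. The intermediate feature maps and gradients of a detector take considerably more memory than the input itself, which is why whole-image processing is not straightforward on a single GPU.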

Paper address:

https://ieeexplore.ieee.org/document/10509806

The dataset is large (more than 20 GB), and downloading it from the original source may require a VPN. We have already downloaded a copy for you.

Dataset download address:

https://www.dilitanxianjia.com/15644/

Past and Present

Bridges are critical infrastructure components that span various terrains and serve as basic transportation facilities. They are of great importance in civil transportation, military operations, and disaster relief [1]. Bridges are also built rapidly and renovated frequently. For example, in 2012 there were approximately 617,000 bridges in the United States; their deterioration will increase over the next 50 years, and more than $125 billion is needed to clear the repair backlog. Efficient and effective bridge detection is therefore crucial for keeping navigation maps up to date and for further monitoring the structural health and condition of bridges [2], [3]. Remote sensing images (RSIs) are well suited as the underlying data for bridge detection because of their wide geographical coverage and high revisit frequency. Given the powerful feature representation ability of deep networks, deep learning-based bridge detection in RSIs has great potential and has become a research focus [4].

As shown in Figure 1, detecting multi-scale bridges in RSIs is more challenging than detecting other common objects because of two main characteristics: (i) Diverse object scales. In very-high-resolution (VHR) RSIs, the length of bridge instances ranges from a few pixels to thousands of pixels. (ii) Extreme aspect ratios. Different bridges differ significantly in elongation. To ensure that small or narrow bridges remain detectable, it is essential to use VHR images. At the same time, to preserve the structural integrity of large and long bridges in VHR images, bridge detection must be performed holistically on large-size images, which imposes strict requirements on both datasets and methods. Despite significant progress in multi-class object detection [12], [13], [14], [15], [16] and bridge detection [4], [11], [17], large-scale datasets and appropriate methods for holistic bridge detection in large-size VHR RSIs are still lacking.

As shown in Table 1, although many popular object detection datasets for RSIs have been created [6], [7], [8], [18], the number of bridges in these datasets is limited. In addition, datasets created specifically for bridge detection [4], [11] are often limited in both sample size and image size. Some existing datasets provide only horizontal bounding box (HBB) annotations rather than precise oriented bounding box (OBB) annotations. It is therefore unrealistic to train a robust and widely adaptable bridge detection model on these datasets. To address this data limitation, the authors constructed GLH-Bridge, a large-scale dataset for bridge detection in large-size VHR RSIs. GLH-Bridge contains 6,000 VHR RSIs sampled globally and more than 59,000 manually annotated bridges. By annotating multi-scale bridges in large-size VHR RSIs across a variety of background types such as vegetation, dry riverbeds, and roads, GLH-Bridge captures the characteristics of bridges in real scenes better than existing bridge detection datasets. In short, GLH-Bridge offers clear and comprehensive advantages over existing bridge detection datasets.
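The difference between HBB and OBB annotations matters most for long, thin, rotated objects such as bridges. The sketch below converts an oriented box to its enclosing horizontal box and measures how much of the horizontal box the object actually fills. The `(cx, cy, w, h, angle)` parameterization and the example bridge are assumptions for illustration; the summary does not describe the annotation format used by GLH-Bridge.

```python
import math


def obb_to_corners(cx, cy, w, h, angle_deg):
    """Corners of a rotated box given its center, size, and rotation angle."""
    theta = math.radians(angle_deg)
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    half = ((-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2))
    # Rotate each corner offset about the center, then translate.
    return [(cx + dx * cos_t - dy * sin_t, cy + dx * sin_t + dy * cos_t)
            for dx, dy in half]


def corners_to_hbb(corners):
    """Smallest axis-aligned box (x_min, y_min, x_max, y_max) around the corners."""
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return min(xs), min(ys), max(xs), max(ys)


def hbb_area(hbb):
    x_min, y_min, x_max, y_max = hbb
    return (x_max - x_min) * (y_max - y_min)


# Hypothetical bridge: 1000 px long, 20 px wide, rotated by 45 degrees.
length, width = 1000.0, 20.0
corners = obb_to_corners(500.0, 500.0, length, width, 45.0)
hbb = corners_to_hbb(corners)
fill = (length * width) / hbb_area(hbb)

print(f"HBB: {tuple(round(v, 1) for v in hbb)}")
print(f"fraction of the HBB covered by the bridge: {fill:.3f}")
```

For this example the bridge covers less than 4% of its horizontal box; the rest is background, which is why an HBB alone is a loose description of a diagonal bridge.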

To advance research on this fundamental and practical problem, the authors propose a new, challenging, and meaningful task: holistic bridge detection in large-size VHR RSIs. Potential solutions to this task fall into four categories: (i) Given the limitations of GPU memory, mainstream deep learning-based object detection methods [15], [16], [19], [20], [21] usually adopt cropping strategies [7], [22]. However, these strategies have inherent limitations and are prone to cutting large bridges apart, as shown in Figure 1. Besides cropping, some object detection methods process raw large-size images with fixed-window downsampling strategies [23], [24], [25], which causes significant loss of image information. (ii) Streaming methods [26] perform forward and backward passes on small patches of a large-size image, but cannot support deep neural networks (DNNs) that use normalization. (iii) LMS methods [27] use memory offloading to share data between system memory (CPU DRAM) and GPU memory. However, they introduce significant time overhead and are limited by the maximum memory expansion rate. (iv) Multi-GPU tensor parallelization techniques [28], [29] could scale deep networks to process large-size images as a whole. However, they are resource-intensive and difficult to run under ordinary conditions. In summary, existing methods cannot effectively perform holistic bridge detection in large-size VHR RSIs with ordinary computing resources (such as a single GPU with 24 GB of memory).
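The way cropping cuts long bridges apart can be seen in a small sketch. It tiles an image with an overlapping sliding window and clips one bridge box against every tile. The image size, tile size, overlap, and bridge coordinates are hypothetical values chosen for illustration, and the box is axis-aligned to keep the clipping simple.

```python
def tile_origins(image_size, tile, stride):
    """Top-left offsets of sliding-window tiles along one axis, covering the full extent."""
    if image_size <= tile:
        return [0]
    origins = list(range(0, image_size - tile + 1, stride))
    if origins[-1] + tile < image_size:
        origins.append(image_size - tile)  # final tile flush with the border
    return origins


def fragments(box, image_size, tile, stride):
    """Clip an axis-aligned box (x0, y0, x1, y1) against every tile; return the pieces."""
    x0, y0, x1, y1 = box
    origins = tile_origins(image_size, tile, stride)
    pieces = []
    for ty in origins:
        for tx in origins:
            ix0, iy0 = max(x0, tx), max(y0, ty)
            ix1, iy1 = min(x1, tx + tile), min(y1, ty + tile)
            if ix0 < ix1 and iy0 < iy1:
                pieces.append((ix0, iy0, ix1, iy1))
    return pieces


# Hypothetical case: a 3000 px long bridge in an 8192 x 8192 image,
# cropped into 1024 x 1024 tiles with a 200 px overlap (stride 824).
bridge = (2000, 4000, 5000, 4040)
pieces = fragments(bridge, image_size=8192, tile=1024, stride=824)
longest = max(x1 - x0 for x0, _, x1, _ in pieces)

print(f"{len(pieces)} fragments, longest is {longest} px of {bridge[2] - bridge[0]} px")
```

In this setup the bridge is split into five fragments and no tile sees more than 1,024 px of its 3,000 px length, so a detector working on tiles has to reassemble the bridge from partial predictions.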

Considering the limitations of the above potential solutions, the authors propose a holistic bridge detection network (HBD-Net) designed for bridge detection in large-size VHR RSIs. The approach has two key advantages: (i) The separate detector-based feature fusion (SDFF) architecture, applied to the dynamic image pyramid (DIP), handles large-size images efficiently with minimal resource consumption. (ii) The shape-sensitive sample reweighting (SSRW) strategy balances the regression weights of bridges with different aspect ratios. Experimental results on GLH-Bridge demonstrate the excellent performance of the proposed HBD-Net.
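The general idea of reweighting regression samples by object shape can be illustrated with a toy example. The weighting function below is purely hypothetical and is not the SSRW formula from the paper, which this summary does not give. It also assumes, only for the sake of the example, that more elongated boxes should receive larger weights.

```python
import math


def aspect_ratio(w, h):
    """Long side divided by short side, so the value is always >= 1."""
    return max(w, h) / min(w, h)


def shape_weights(boxes, alpha=0.5):
    """Hypothetical weights: 1 + alpha * ln(aspect ratio), rescaled to a mean of 1.

    This is an illustrative stand-in, not the paper's SSRW strategy.
    """
    raw = [1.0 + alpha * math.log(aspect_ratio(w, h)) for w, h in boxes]
    mean = sum(raw) / len(raw)
    return [r / mean for r in raw]


def weighted_regression_loss(per_sample_losses, weights):
    """Average of per-sample regression losses after applying the shape weights."""
    return sum(l * w for l, w in zip(per_sample_losses, weights)) / len(weights)


# Hypothetical (width, height) pairs in pixels: square, elongated, very elongated.
boxes = [(40, 40), (200, 20), (1500, 15)]
weights = shape_weights(boxes)
loss = weighted_regression_loss([0.2, 0.2, 0.2], weights)

for box, weight in zip(boxes, weights):
    print(box, round(weight, 3))
```

Rescaling the weights to a mean of 1 keeps the overall loss magnitude unchanged, so only the relative contribution of each shape shifts.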

In summary, to the best of the authors' knowledge, this paper is the first to explore holistic bridge detection in large-size VHR RSIs. The main contributions of the paper are as follows:

(i) The authors propose GLH-Bridge, the first large-scale dataset for bridge detection in large-size VHR RSIs. The dataset contains 59,737 bridges covering a variety of backgrounds, providing a comprehensive representation of bridges in real-world scenarios.
(ii) The authors propose a low-cost holistic bridge detection network (HBD-Net), which efficiently processes large-size images and holistically detects multi-scale bridges through a well-designed SDFF architecture and SSRW strategy.
(iii) Using the proposed GLH-Bridge dataset, the authors create a bridge detection benchmark covering both OBB and HBB tasks, on which HBD-Net outperforms existing state-of-the-art algorithms. In addition, the authors conduct cross-dataset generalization experiments to demonstrate the strong generalization ability of GLH-Bridge. The authors hope that this benchmark can contribute to the basic evaluation of object detection in large-size images.

Unique ingenuity

The authors developed a new dataset for bridge detection with two goals: (i) Fill the gap left by the absence of a large-scale dataset for bridge detection in large-size very-high-resolution remote sensing images (VHR RSIs). (ii) Promote a novel and challenging task: holistic bridge detection in large-size VHR RSIs.

Figure 2. Geographical distribution of sampled images from the proposed GLH-Bridge dataset.

Figure 3. Examples of the annotation standard, where yellow circles indicate unannotated cases. (a) Roads crossing water that are too curved or irregular in shape are not annotated. (b) Connections between two terminals are not annotated.

Figure 4. Illustration of the features of the GLH-Bridge dataset. (a) Comparison of bridge characteristics in different datasets. (b) Distribution of bridge areas in GLH-Bridge. (c) Distribution of bridge lengths in GLH-Bridge. (d) Distribution of bridge density in GLH-Bridge.

Figure 5. Examples of bridges in different contexts from the GLH-Bridge dataset. (a) A bridge across vegetation. (b) A bridge across a dry riverbed. (c) A bridge across a road. (d) A bridge across a body of water.

Figure 6. The proposed HBD-Net pipeline. It contains the proposed SDFF architecture and SSRW strategy. The SDFF architecture consists of separate detectors and IFF modules. Starting from the input large-size VHR image, the authors build a DIP and feed it to the separate detectors of SDFF to extract features. The features of all SDFF detectors are then fused through the IFF modules to share context and detailed texture information. The SSRW strategy is applied in the sample selection stage of the object detector to balance the regression weights. Finally, the fused features are passed to the head of the object detector to obtain the results of each layer, which are used to compute the loss against the corresponding ground-truth labels.
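As a rough illustration of what an image pyramid whose depth depends on the input size could look like, the sketch below lists the resolution of each level for the smallest and largest images in the dataset. The halving factor, the 1,024 px stopping size, and the reading of "dynamic" as "the number of levels adapts to the image size" are all assumptions made for this sketch; the summary does not specify how the paper constructs the DIP.

```python
def pyramid_levels(height, width, base=1024, factor=2):
    """List (scale, height, width) per level, downsampling until the image fits in `base`.

    `base` and `factor` are hypothetical parameters, not values from the paper.
    """
    levels = []
    scale = 1
    while True:
        h = -(-height // scale)  # ceiling division
        w = -(-width // scale)
        levels.append((scale, h, w))
        if max(h, w) <= base:
            break
        scale *= factor
    return levels


for size in (2048, 16384):
    levels = pyramid_levels(size, size)
    print(f"{size} x {size}: {len(levels)} levels -> {levels}")
```

With these assumed parameters a 2,048 × 2,048 image yields two levels and a 16,384 × 16,384 image yields five, so the coarsest level always shows the whole scene at a size a detector can process in one pass, while the finer levels keep the detail needed for small bridges.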

Figure 7. Schematic diagram of the proposed IFF module. The figure shows how features are fused between two adjacent layers.

Figure 8. Schematic diagram of the proposed SSRW strategy. The red and blue dots represent the positive and negative samples selected by the object detector, respectively. For anchor-based detectors, these dots correspond to the feature map locations where anchors or proposals are generated. For anchor-free detectors, these dots indicate the grid on the feature map. For clarity and simplicity, the anchors or proposals associated with the sample points (for anchor-based methods) are not shown in this diagram.

Superior performance

Future Outlook

The paper proposes a large-scale dataset named GLH-Bridge for holistic bridge detection in large-size very-high-resolution remote sensing images. The dataset contains 6,000 VHR remote sensing images ranging in size from 2,048 × 2,048 to 16,384 × 16,384 pixels, with 59,737 bridges across different backgrounds annotated with both OBBs and HBBs. Its large image size, large sample size, and diversity of object scales and background types make GLH-Bridge a valuable dataset with the potential to promote a new, challenging, and far-reaching task: holistic bridge detection in large-size VHR remote sensing images. Furthermore, the authors propose HBD-Net, a cost-effective solution tailored for holistic bridge detection in large-size images. Based on the GLH-Bridge dataset, the authors establish a benchmark and empirically verify the effectiveness of HBD-Net. In future work, the authors will continue to enlarge the sample size and enrich the subcategory annotations of the GLH-Bridge dataset. They also aim to generalize HBD-Net to multi-class object detection in large-size images, and to explore methods that improve accuracy on large and small bridges simultaneously, thereby expanding the applicability and effectiveness of HBD-Net in various scenarios.
