By Nidhi Dhull. Reviewed by Susha Cheriyedath, M.Sc. Sep 2, 2024
A recent article published in Data in Brief introduced an image dataset for deep learning-based detection of building facade features. The data were prepared from static street view images (SSVIs) of buildings in London and Scotland.
Background
Building characteristics play a critical role in construction management and architectural design. SSVIs can be analyzed with deep learning techniques to interpret building characteristics without a physical visit, as they provide a human-like view of building facades, detailing materials, window types, and other features. Thus, SSVIs allow building characteristics to be identified more quickly and cost-effectively than actual site visits.
Facade datasets already exist for tasks such as window state recognition, tile defect categorization, window identification, and building segmentation. However, different tasks require different annotations, even when they use similar images, which makes repeated annotation inefficient and laborious.
Moreover, no raw and labeled data are freely available for the following four SSVI-based tasks: categorizing the number of stories, identifying usable SSVIs, classifying building typologies, and classifying external cladding materials.
Deep learning methods have high generalization capability and can automate many tasks involving SSVIs. Thus, this study aimed to establish labeled datasets for the four tasks that can help develop robust deep-learning models.
Data Description
Datasets were constructed for four tasks: classifying the number of stories, building typologies, external cladding materials, and usable SSVIs. The number-of-stories dataset comprised images classified by their corresponding floors (1F to 5F, or others). The ‘others’ category in each dataset accounted for images unrelated to the building characteristic targeted by that task.
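For readers who want to work with such class-labeled image sets using standard tooling, a common convention is one folder per class. The layout, folder names, and loading code below are illustrative assumptions, not the authors' published structure.

```python
# Hypothetical folder layout (names assumed for illustration):
#   stories/train/1F/*.jpg ... stories/train/5F/*.jpg, stories/train/others/*.jpg
#   stories/test/1F/*.jpg  ... stories/test/others/*.jpg
from torchvision import datasets, transforms

to_tensor = transforms.Compose([
    transforms.Resize((640, 640)),  # SSVIs in the dataset are 640x640 pixels
    transforms.ToTensor(),
])

# ImageFolder infers the class labels (1F ... 5F, others) from the subfolder names.
train_set = datasets.ImageFolder("stories/train", transform=to_tensor)
test_set = datasets.ImageFolder("stories/test", transform=to_tensor)
print(train_set.classes)  # e.g. ['1F', '2F', '3F', '4F', '5F', 'others']
```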
The number-of-stories task used 700 initial images, divided into 418 (about 60%) for training and 282 (about 40%) for testing. The dataset for categorizing building typologies included images labeled as residential or non-residential; the 450 images collected for this task were divided into 270 (60%) for training and 180 (40%) for testing.
The usable-SSVI dataset was divided into two subsets, one covering the complete building facade and one covering the first story, with images in both categorized as usable, potential, or non-usable. The former used 1000 raw images (600 (60%) for training, 200 (20%) for validation, and 200 (20%) for testing), while the latter used 700 raw images (418 (60%) for training, 140 (20%) for validation, and 142 (20%) for testing). The augmentation methods generated 3600 augmented images in each case.
The dataset for analyzing external cladding materials categorized images by material type: brick, concrete, glass, stone, mixed, and others. The ‘others’ category accounted for rare textures and appearances such as wood, metal, and synthetic sidings.
This dataset used 1550 raw images from London (split into 928 (60%) for training, 311 (20%) for validation, and 311 (20%) for testing) and 1017 from Scotland (split into 608 (60%) for training, 204 (20%) for validation, and 205 (20%) for testing). Accordingly, 5568 augmented images for London and 3648 for Scotland were generated.
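As a rough illustration of the 60/20/20 partitioning described above, the split can be reproduced with two calls to scikit-learn's train_test_split. This is a sketch rather than the authors' script; the input lists, stratification, and random seed are assumptions.

```python
from sklearn.model_selection import train_test_split

# image_paths: list of raw SSVI file paths; labels: their class labels (assumed inputs).
def split_60_20_20(image_paths, labels, seed=42):
    # First carve out 60% for training, leaving 40% for validation + testing.
    x_train, x_rest, y_train, y_rest = train_test_split(
        image_paths, labels, train_size=0.6, stratify=labels, random_state=seed)
    # Split the remaining 40% in half: 20% validation, 20% testing.
    x_val, x_test, y_val, y_test = train_test_split(
        x_rest, y_rest, train_size=0.5, stratify=y_rest, random_state=seed)
    return (x_train, y_train), (x_val, y_val), (x_test, y_test)
```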
Methods
The original dataset was prepared using SSVIs captured from buildings in London and Scotland. Their addresses were obtained from the United Kingdom’s national mapping agency (Ordnance Survey Data Hub). A randomly chosen sample of buildings from the northwest and eastern regions was also used.
Building images were downloaded via the Google Street View (GSV) Static API (application programming interface) using the selected addresses. Specific API parameters were manually tuned to maximize the visual cues available for identifying building characteristics across the four tasks.
For instance, the ‘Field of View’ parameter was set between 10 and 50 degrees, and the ‘pitch’ parameter was set between 25 and 30 degrees. The resolution of the captured images was 640×640 pixels, the maximum of the GSV platform.
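A minimal sketch of how such an image could be requested from the GSV Static API is shown below. The address, API key, and exact parameter values are assumptions; the article only reports the parameter ranges quoted above.

```python
import requests

GSV_URL = "https://maps.googleapis.com/maps/api/streetview"

params = {
    "location": "10 Downing Street, London",  # hypothetical address from the sampled list
    "size": "640x640",      # maximum resolution offered by the Static API
    "fov": 30,              # field of view; the study used values between 10 and 50 degrees
    "pitch": 25,            # camera pitch; the study used values between 25 and 30 degrees
    "key": "YOUR_API_KEY",  # placeholder Google API key
}

response = requests.get(GSV_URL, params=params, timeout=30)
with open("facade.jpg", "wb") as f:
    f.write(response.content)
```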
Augmentation techniques (brightness, contrast, perspective, rotation, scale, shear, and translation) were applied exclusively to the training subset of each dataset, resulting in nine different datasets. The parameters of these augmentation techniques were selected by trial and error to generate close-to-reality images, and each technique effectively doubled the number of images. Voids introduced by the geometric transformations were filled with 255-value pixels, padding these areas with white.
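Such a pipeline could look roughly like the torchvision sketch below. The specific parameter values are illustrative, not the trial-and-error values chosen by the authors, but the white padding (fill=255) mirrors the void-filling described above.

```python
from torchvision import transforms

# Illustrative augmentations; applied to training images only.
augment = transforms.Compose([
    transforms.ColorJitter(brightness=0.3, contrast=0.3),                 # brightness / contrast
    transforms.RandomPerspective(distortion_scale=0.2, p=1.0, fill=255),  # perspective
    transforms.RandomAffine(
        degrees=10,            # rotation
        translate=(0.1, 0.1),  # translation
        scale=(0.9, 1.1),      # scale
        shear=10,              # shear
        fill=255,              # pad voids with white (pixel value 255)
    ),
])
```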
Conclusion
Overall, the researchers prepared a detailed dataset to classify the number of stories, building typologies, exterior cladding materials, and usable SSVIs. The well-labeled dataset can serve as a foundational basis for training and validating deep-learning algorithms, paving the way for advancements in building management and design.
However, the dataset has certain limitations, such as its geographic specificity. It captures the architectural styles, materials, and environmental conditions of buildings in London and Scotland and may not represent global architectural diversity.
Moreover, the captured image resolution is sufficient for the tasks targeted in this dataset but may not suffice for finer-grained tasks such as crack recognition, crack segmentation, and tile segmentation.
Thus, the dataset needs to be complemented with additional data from diverse regions, building types, and temporal settings to improve the robustness and applicability of the models trained on it.
Journal Reference
Wang, S., Park, S., Park, S., & Kim, J. (2024). Building façade datasets for analyzing building characteristics using deep learning. Data in Brief, 110885. DOI: 10.1016/j.dib.2024.110885, https://www.sciencedirect.com/science/article/pii/S2352340924008485