Improving acoustic species identification using data augmentation within a deep learning framework

Author(s): MacIsaac, J., Newson, S., Ashton-Butt, A., Pearce, H. & Milner, B.

Published: October 2024  

Journal: Ecological Informatics

Volume: 83

Article No.: 102851

Digital Identifier No. (DOI): 10.1016/j.ecoinf.2024.102851


Abstract

Convolutional neural networks (CNNs) are effective tools for acoustic classification tasks such as species identification. Developing CNN classifiers requires large datasets of labelled recordings, which can be difficult to obtain, particularly if species are rare or vocalise infrequently. Additionally, data often require manual labelling, which is time consuming and demands expert analysis. Artificially generating data using augmentation can address these challenges; however, the impact of data augmentation on CNN performance is poorly understood and often omitted in bioacoustic studies. Here, we empirically test the impact of CNN architecture and 20 data augmentation methods on classifier performance. We use acoustic identification of 18 small mammal species as a case study of a species group that can be effectively surveyed by acoustic monitoring, but for which recordings for training data are scarce and difficult to collect. The networks that achieved the highest accuracy across all sample sizes were a 10-layer CNN (96.43%) and a pre-trained ResNet50 model (96.37%). Overall, all augmentation effects improved ResNet50 model performance and 17 effects improved Conv10 performance, increasing relative change in accuracy (RCA) by 0.021–0.641. Three augmentation effects negatively impacted Conv10 RCA, by −0.042 to −0.182. We also show that adding augmented data when the number of original samples is low has the greatest positive impact on accuracy, and that this effect is larger with ResNet50 models. Our work demonstrates that using data augmentation where few original samples are available can considerably improve model performance, and highlights the potential of augmentation in developing acoustic classifiers for species where data are limited or difficult to obtain.
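The abstract does not enumerate the 20 augmentation effects or give the exact definition of RCA, so the sketch below is purely illustrative: it shows two waveform-level augmentations commonly used in bioacoustics (noise addition at a target SNR and random time shifting) and one plausible reading of relative change in accuracy as the augmented model's accuracy gain relative to the baseline. Function names, parameters, and the RCA formula are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def add_noise(waveform, snr_db=20.0, rng=None):
    """Add Gaussian noise at a target signal-to-noise ratio (in dB)."""
    rng = rng or np.random.default_rng()
    signal_power = np.mean(waveform ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=waveform.shape)
    return waveform + noise

def time_shift(waveform, max_fraction=0.2, rng=None):
    """Circularly shift the waveform by a random fraction of its length."""
    rng = rng or np.random.default_rng()
    max_shift = int(len(waveform) * max_fraction)
    shift = rng.integers(-max_shift, max_shift + 1)
    return np.roll(waveform, shift)

def relative_change_in_accuracy(acc_augmented, acc_baseline):
    """One plausible RCA definition: relative gain over the baseline accuracy."""
    return (acc_augmented - acc_baseline) / acc_baseline

# Example: augment a synthetic 1-second clip sampled at 22.05 kHz
clip = np.sin(2 * np.pi * 440 * np.linspace(0, 1, 22050))
augmented = time_shift(add_noise(clip, snr_db=15.0))
print(relative_change_in_accuracy(0.96, 0.90))  # e.g. ~0.067
```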

Notes

This work was supported by the Natural Environment Research Council and the ARIES Doctoral Training Partnership [grant number NE/S007334/1], the Endangered Landscapes and Seascapes Programme, managed by the Cambridge Conservation Initiative in partnership with Arcadia and Frankfurt Zoological Society. Data collection was facilitated by APB and Anton Kuzmickij in Belarus, Kaunas Tadas Ivanauskas Museum of Zoology in Lithuania and Roger Trout in the UK. The research presented in this paper was carried out on the High Performance Computing Cluster supported by the Research and Specialist Computing Support service at the University of East Anglia.