Interpretable deep learning for biodiversity monitoring: Introducing AudioProtoPNet


Global biodiversity has declined sharply in recent decades; bird populations in North America alone have fallen by 29% since 1970. Several factors drive this decline, including land use change, resource development, pollution, climate change, and invasive species. Because birds serve as important indicators of environmental health, effective monitoring systems are essential to counter biodiversity loss. Passive acoustic monitoring (PAM) has emerged as a cost-effective way to collect bird data without disturbing habitats. Traditional PAM analysis is time-consuming, but recent advances in deep learning offer promising ways to identify bird species automatically from audio recordings. For these tools to be useful in practice, however, their decisions must be understandable to ornithologists and biologists.

Although explainable AI (XAI) techniques have been widely studied for image and text processing, research on their application to audio data is limited. Post-hoc explanation methods, such as counterfactual, gradient-based, perturbation-based, and attention-based attribution methods, have mainly been studied in medical contexts. Early work on interpretable deep learning for audio includes deep prototype learning, which was first proposed for image classification. Despite advances such as DeformableProtoPNet, applications to complex multi-label problems such as bioacoustic bird classification remain unexplored.

Researchers from the Fraunhofer Institute for Energy Economics and Energy Systems Technology (IEE) and from Intelligent Embedded Systems (IES) at the University of Kassel have introduced AudioProtoPNet, an adaptation of the ProtoPNet architecture tailored to complex multi-label audio classification that emphasizes the interpretability inherent in that architecture. The approach uses a ConvNeXt backbone for feature extraction and learns prototypical patterns for each bird species from the spectrograms of the training data. New recordings are classified by comparing them with these prototypes in the latent space, which yields an easy-to-understand explanation of the model's decisions.

The model consists of a convolutional neural network (CNN) backbone, a prototype layer, and a fully connected final layer. The backbone extracts embeddings from the input spectrogram, the prototype layer compares them with the learned prototypes in latent space using cosine similarity, and the model is trained with a weighted loss function. Training proceeds in two phases to optimize prototype adaptation and the interplay between the model's components. Each prototype is visualized by projecting it onto the most similar patch from the training spectrograms, which keeps the explanations faithful and meaningful.
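The comparison step can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the backbone output is replaced by random numbers, and the max-pooling over patches, the sigmoid output for multi-label scores, and all names and shapes (`forward`, `cosine_similarity_map`, three species with two prototypes each) are assumptions made for illustration.

```python
import numpy as np

def cosine_similarity_map(features, prototypes, eps=1e-8):
    """Cosine similarity between every latent patch and every prototype.

    features:   (H, W, D) latent patches, standing in for backbone output
    prototypes: (P, D) prototype vectors
    returns:    (P, H, W) similarity maps with values in [-1, 1]
    """
    f = features / (np.linalg.norm(features, axis=-1, keepdims=True) + eps)
    p = prototypes / (np.linalg.norm(prototypes, axis=-1, keepdims=True) + eps)
    return np.einsum("hwd,pd->phw", f, p)

def forward(features, prototypes, weights, bias):
    """Prototype layer followed by a fully connected layer (hypothetical sketch)."""
    sim = cosine_similarity_map(features, prototypes)    # (P, H, W)
    # Assumption: keep the best-matching patch per prototype (max-pooling).
    scores = sim.reshape(sim.shape[0], -1).max(axis=1)   # (P,)
    logits = weights @ scores + bias                     # (C,)
    # Assumption: independent sigmoid per species, as is usual for multi-label output.
    probs = 1.0 / (1.0 + np.exp(-logits))
    return probs, scores, sim

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 6, 16))          # fake 4x6 grid of 16-dim patches
prototypes = rng.normal(size=(6, 16))           # 3 species x 2 prototypes each
weights = np.kron(np.eye(3), np.ones((1, 2)))   # each species reads only its own prototypes
bias = np.zeros(3)

probs, scores, sim = forward(features, prototypes, weights, bias)
print(probs.shape, scores.shape, sim.shape)
```

The similarity maps in `sim` are what makes such a model inspectable: for each prototype they indicate which region of the spectrogram matched it most strongly.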

The main contributions of this study are:

1. The researchers developed a prototype learning model (AudioProtoPNet) for bioacoustic bird classification. The model identifies prototypical parts in the spectrograms of training samples and uses them for effective multi-label classification.

2. The model is evaluated on eight datasets of bird vocalization recordings from different geographic regions. The results show that it learns relevant and interpretable prototypes.

3. A comparison with two state-of-the-art black-box deep learning models for avian bioacoustic classification shows that the interpretable model achieves similar performance on the eight evaluation datasets, demonstrating the applicability of interpretable models to bioacoustic monitoring.
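The "prototypical parts" in the first contribution depend on tying each learned prototype to a real patch of a training spectrogram. The sketch below shows one plausible way to do that with a nearest-patch search under cosine similarity. It is a simplified illustration rather than the paper's procedure: the function name `project_prototypes` and all shapes are hypothetical, the features are random stand-ins, and the search runs over all training patches instead of being restricted to recordings of the prototype's own species.

```python
import numpy as np

def project_prototypes(prototypes, train_features, eps=1e-8):
    """Replace each prototype by its most similar training patch.

    prototypes:     (P, D) prototype vectors
    train_features: (N, H, W, D) latent patches of N training spectrograms
    returns:        projected prototypes (P, D) and their
                    (sample, row, column) locations as a (P, 3) array
    """
    n, h, w, d = train_features.shape
    patches = train_features.reshape(-1, d)                       # (N*H*W, D)
    patches_n = patches / (np.linalg.norm(patches, axis=1, keepdims=True) + eps)
    protos_n = prototypes / (np.linalg.norm(prototypes, axis=1, keepdims=True) + eps)
    sim = protos_n @ patches_n.T                                  # (P, N*H*W)
    best = sim.argmax(axis=1)                                     # flat index of best patch
    locations = np.stack(np.unravel_index(best, (n, h, w)), axis=1)
    return patches[best], locations

rng = np.random.default_rng(1)
train_features = rng.normal(size=(5, 3, 4, 8))
# First prototype is a scaled copy of a known patch, so its match is known in advance.
prototypes = np.stack([2.0 * train_features[2, 1, 3], rng.normal(size=8)])

projected, locations = project_prototypes(prototypes, train_features)
print(locations[0])  # sample, row, column of the patch closest to the first prototype
```

The returned locations are what allow a prototype to be shown as a region of an actual recording, which an ornithologist can then inspect or listen to.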

In conclusion, this study introduces AudioProtoPNet, an interpretable model for bioacoustic bird classification that addresses the limitations of black-box approaches. Evaluation across diverse datasets supports both its performance and its interpretability, indicating its potential for biodiversity monitoring.


All credit for this study goes to the researchers of this project.

Asjad is an intern consultant at Marktechpost. He is pursuing a degree in mechanical engineering from the Indian Institute of Technology, Kharagpur. Asjad is a machine learning and deep learning enthusiast and is constantly researching the applications of machine learning in healthcare.
