Article

A novel deep transfer learning models for recognition of birds sounds in different environment

Journal

SOFT COMPUTING
Volume 26, Issue 3, Pages 1003-1023

Publisher

SPRINGER
DOI: 10.1007/s00500-021-06640-1

Keywords

Voice classifications; Bird species; ResNet50; DenseNet201; Efficient net; Mel frequency cepstral coefficients; Intelligent systems

This study aims to develop an intelligent system that predicts bird species from audio data. By combining deep transfer learning models with several feature extraction techniques, we extracted and recognized bird sounds effectively and reached high prediction accuracy.
Automatic detection of calling bird species is valuable for monitoring the environment at broad temporal and spatial scales. Many earlier investigations have drawn on feature representations from the field of automatic voice recognition. In this study, we applied deep neural networks to a dataset of 12,061 audio files covering 22 bird species. Our methodology departs from existing approaches by integrating transfer learning. We also used multiple feature extraction techniques to analyze the bird sounds, including the Fourier Transform, the Mel-Spectrogram, and Mel Frequency Cepstral Coefficients. The main objective is to develop intelligent systems that can predict the bird species from a collected set of audio recordings. The work verifies that deep transfer learning models such as ResNet50, DenseNet201, InceptionV3, Xception, and EfficientNet can effectively extract and recognize the audio signals of different bird species with high prediction accuracy. The best classification accuracy on the validation set was 97.43%, attained by both the DenseNet201 and ResNet50 models. DenseNet201 also incurred the lowest validation loss (0.1080). The Xception model performed best on the training data, reaching 100% training accuracy with the lowest training loss (0.0011). Our study thus offers a way to quantify and test deep learning models appropriately.
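The three feature types named in the abstract build on one another: a short-time Fourier transform gives a power spectrogram, a bank of triangular filters maps it onto the Mel scale, and a discrete cosine transform of the log-Mel energies gives the cepstral coefficients. The sketch below illustrates that chain with NumPy only. It is not the authors' implementation: the function names, the frame size, hop length, number of Mel bands, the HTK-style Mel formula, and the synthetic 3 kHz test tone are all assumptions chosen for illustration.

```python
import numpy as np

def hz_to_mel(hz):
    # HTK-style Mel scale (an assumed convention; others exist)
    return 2595.0 * np.log10(1.0 + np.asarray(hz, dtype=float) / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (np.asarray(mel, dtype=float) / 2595.0) - 1.0)

def power_spectrogram(signal, n_fft=512, hop=256):
    # Short-time Fourier transform with a Hann window, one row per frame
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack([signal[i * hop:i * hop + n_fft] * window
                       for i in range(n_frames)])
    return np.abs(np.fft.rfft(frames, axis=1)) ** 2

def mel_filterbank(sr, n_fft=512, n_mels=40):
    # Triangular filters whose edges are evenly spaced on the Mel scale
    mel_points = np.linspace(float(hz_to_mel(0.0)),
                             float(hz_to_mel(sr / 2.0)), n_mels + 2)
    hz_points = mel_to_hz(mel_points)
    fft_freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    bank = np.zeros((n_mels, len(fft_freqs)))
    for m in range(n_mels):
        left, center, right = hz_points[m], hz_points[m + 1], hz_points[m + 2]
        rising = (fft_freqs - left) / (center - left)
        falling = (right - fft_freqs) / (right - center)
        bank[m] = np.maximum(0.0, np.minimum(rising, falling))
    return bank

def log_mel_spectrogram(signal, sr, n_fft=512, hop=256, n_mels=40):
    power = power_spectrogram(signal, n_fft, hop)
    mel_energy = power @ mel_filterbank(sr, n_fft, n_mels).T
    return 10.0 * np.log10(mel_energy + 1e-10)  # decibels, with a small floor

def mfcc(log_mel, n_mfcc=13):
    # Unnormalized DCT-II across the Mel bands of each frame
    n_mels = log_mel.shape[1]
    k = np.arange(n_mfcc)[:, None]
    n = np.arange(n_mels)[None, :]
    basis = np.cos(np.pi * k * (n + 0.5) / n_mels)
    return log_mel @ basis.T

# Synthetic stand-in for a recording: one second of a 3 kHz tone
sr = 22050
t = np.arange(sr) / sr
tone = np.sin(2.0 * np.pi * 3000.0 * t)

log_mel = log_mel_spectrogram(tone, sr)   # frames x Mel bands
coeffs = mfcc(log_mel)                    # frames x cepstral coefficients
peak_band = int(np.argmax(log_mel.mean(axis=0)))
print(log_mel.shape, coeffs.shape, peak_band)
```

A log-Mel spectrogram of this kind is a two-dimensional array, which is one reason image-classification networks such as those listed in the abstract can be reused on audio; how the paper itself resizes or stacks these features before feeding them to the networks is not stated here.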
