Article

Bird and whale species identification using sound images

Journal

IET COMPUTER VISION
Volume 12, Issue 2, Pages 178-184

Publisher

WILEY
DOI: 10.1049/iet-cvi.2017.0075

Keywords

image texture; sound images; visual features; percussion images; harmonic images; bird vocalisations; texture features; audio signal approaches; image identification

Funding

  1. CAPES
  2. CNPq
  3. Fundacao Araucaria

Image identification of animals is mostly centred on identifying them based on their appearance, but there are other ways images can be used to identify animals, including by representing the sounds they make with images. In this study, the authors present a novel and effective approach for automated identification of birds and whales using some of the best texture descriptors in the computer vision literature. The visual features of sounds are built starting from the audio file and are taken from images constructed from different spectrograms and from harmonic and percussion images. These images are divided into sub-windows from which sets of texture descriptors are extracted. The experiments reported in this study, which use a dataset of bird vocalisations targeted for species recognition and a dataset of right whale calls targeted for whale detection (as well as three well-known benchmarks for music genre classification), demonstrate that the fusion of different texture features enhances performance. The experiments also demonstrate that the fusion of different texture features with audio features is not only comparable with existing audio signal approaches but also statistically improves some of the stand-alone audio features. The code for the experiments will be publicly available at https://www.dropbox.com/s/bguw035yrqz0pwp/ElencoCode.docx?dl=0.
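
The pipeline described in the abstract (audio file, then spectrogram image, then sub-windows, then texture descriptors) can be illustrated with a minimal sketch. This is not the authors' code. It assumes NumPy only, uses a basic 8-neighbour local binary pattern (LBP) histogram as a stand-in texture descriptor, and splits the image into three horizontal frequency bands as one possible sub-window scheme. The function names `spectrogram_image`, `lbp_histogram`, and `subwindow_features`, the frame sizes, and the synthetic chirp input are all hypothetical choices made for illustration.

```python
import numpy as np


def spectrogram_image(signal, frame_len=256, hop=128):
    """Log-magnitude STFT scaled to an 8-bit grayscale image (frequency x time)."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])
    mag = np.abs(np.fft.rfft(frames, axis=1)).T  # rows are frequency bins
    log_mag = np.log1p(mag)
    span = log_mag.max() - log_mag.min() + 1e-12
    return (255 * (log_mag - log_mag.min()) / span).astype(np.uint8)


def lbp_histogram(img):
    """Normalised 256-bin histogram of basic 8-neighbour LBP codes.

    Assumption: plain LBP stands in for the descriptors used in the paper.
    """
    h, w = img.shape
    centre = img[1:-1, 1:-1]
    codes = np.zeros(centre.shape, dtype=np.uint8)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbour = img[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        # Set this bit wherever the neighbour is at least as bright as the centre.
        codes |= (neighbour >= centre).astype(np.uint8) * np.uint8(1 << bit)
    hist = np.bincount(codes.ravel(), minlength=256).astype(float)
    return hist / hist.sum()


def subwindow_features(img, n_bands=3):
    """Concatenate LBP histograms from horizontal frequency bands.

    Assumption: equal-height frequency bands are one possible sub-window layout.
    """
    bands = np.array_split(img, n_bands, axis=0)
    return np.concatenate([lbp_histogram(band) for band in bands])


# Synthetic one-second chirp as a placeholder for a real recording.
sr = 8000
t = np.arange(sr) / sr
chirp = np.sin(2 * np.pi * (500 + 1500 * t) * t)

img = spectrogram_image(chirp)
features = subwindow_features(img)
print(img.shape, features.shape)
```

The resulting feature vector could be passed to any classifier. The harmonic and percussion images mentioned in the abstract would go through the same sub-window and descriptor steps once they have been computed.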

Recommended

Article Ecology

Acoustic monitoring reveals diversity and surprising dynamics in tropical freshwater soundscapes

Benjamin L. Gottesman, Dante Francomano, Zhao Zhao, Kristen Bellisario, Maryam Ghadiri, Taylor Broadhead, Amandine Gasc, Bryan C. Pijanowski

FRESHWATER BIOLOGY (2020)

Article Ecology

Contributions of MIR to Soundscape Ecology. Part 2: Spectral timbral analysis for discriminating soundscape components

Kristen M. Bellisario, Jack VanSchaik, Zhao Zhao, Amandine Gasc, Hichem Omrani, Bryan C. Pijanowski

ECOLOGICAL INFORMATICS (2019)

Article Biodiversity Conservation

How well do acoustic indices measure biodiversity? Computational experiments to determine effect of sound unit shape, vocalization intensity, and frequency of vocalization occurrence on performance of acoustic indices

Zhao Zhao, Zhi-yong Xu, Kristen Bellisario, Rui-wen Zeng, Ning Li, Wen-yang Zhou, Bryan C. Pijanowski

ECOLOGICAL INDICATORS (2019)

Review Computer Science, Information Systems

OMR metrics and evaluation: a systematic review

Luciano Mengarelli, Bruno Kostiuk, Joao G. Vitorio, Maicon A. Tibola, William Wolff, Carlos N. Silla

MULTIMEDIA TOOLS AND APPLICATIONS (2020)

Article Computer Science, Artificial Intelligence

MLTL: A multi-label approach for the Tomek Link undersampling algorithm

Rodolfo M. Pereira, Yandre M. G. Costa, Carlos N. Silla Jr

NEUROCOMPUTING (2020)

Article Engineering, Biomedical

Vibration pattern recognition using a compressed histogram of oriented gradients for snoring source analysis

Yi Zhang, Zhao Zhao, Hui-jie Xu, Chong He, Hao Peng, Zhan Gao, Zhi-yong Xu

BIO-MEDICAL MATERIALS AND ENGINEERING (2020)

Article Acoustics

Ensemble of convolutional neural networks to improve animal audio classification

Loris Nanni, Yandre M. G. Costa, Rafael L. Aguiar, Rafael B. Mangolin, Sheryl Brahnam, Carlos N. Silla Jr

EURASIP JOURNAL ON AUDIO SPEECH AND MUSIC PROCESSING (2020)

Article Computer Science, Interdisciplinary Applications

COVID-19 identification in chest X-ray images on flat and hierarchical classification scenarios

Rodolfo M. Pereira, Diego Bertolini, Lucas O. Teixeira, Carlos N. Silla Jr, Yandre M. G. Costa

COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE (2020)

Article Engineering, Electrical & Electronic

A Novel Method for Microphone Channel Frequency Response Calibration Based on Newton Algorithm

Ziyi Wang, Zhao Zhao, Zhiyong Xu

Summary: Microphone arrays are widely used in practical applications, but frequency response mismatches among channels can negatively impact performance. Existing calibration methods have varying accuracy and suffer from passband bandwidth shrinkage. To address these issues, we propose a novel calibration method based on the Newton algorithm.

IEEE TRANSACTIONS ON INSTRUMENTATION AND MEASUREMENT (2023)

Article Engineering, Electrical & Electronic

DOA Estimation for Multiple Speech Sources Based on Flexible Single-Source Zones and Concentration Weighting

Zhao Zhao, Hongrui Kan, Jiale Lin, Zhiyong Xu

Summary: In this article, a DOA estimation algorithm for multiple speech sources based on flexible SSZs and concentration weighting is proposed. The algorithm identifies single-source points using correlation coefficients of time delay vectors across adjacent frequency bins and constructs flexible SSZs. The number of single-source points in each flexible SSZ is considered as a weighting factor to form the pooled histogram. A matching pursuit-based approach is used to obtain multisource DOA estimates. Simulation results and real-world experiments demonstrate the effectiveness and improved performance of the proposed method.

IEEE SENSORS JOURNAL (2023)

Article Biodiversity Conservation

A frequency-dependent acoustic diversity index: A revision to a classic acoustic index for soundscape ecological research

Zhi-yong Xu, Lei Chen, Bryan C. Pijanowski, Zhao Zhao

Summary: This article discusses the application of passive acoustic monitoring (PAM) in soundscape ecology and the importance of acoustic indices. However, existing acoustic indices are susceptible to noise. To address this issue, a revised acoustic diversity index (FADI) is proposed, which is less affected by noise compared to the original index.

ECOLOGICAL INDICATORS (2023)

Article Computer Science, Information Systems

A multimodal approach for multi-label movie genre classification

Rafael B. Mangolin, Rodolfo M. Pereira, Alceu S. Britto, Carlos N. Silla, Valeria D. Feltrim, Diego Bertolini, Yandre M. G. Costa

Summary: This paper addresses the multi-label classification of movie genres using a multimodal approach. A large dataset is created, consisting of various sources of information, and different descriptors and classifiers are used for experimental evaluation. The results demonstrate the complementarity among classifiers trained on different sources of information in movie genre classification.

MULTIMEDIA TOOLS AND APPLICATIONS (2022)

Proceedings Paper Engineering, Civil

Research On Vehicle Speed Estimation Method Based on Microphone Array

Zijun Xu, Zhao Zhao, Edmund Sowah

Summary: A novel vehicle speed measurement method based on differential beamforming techniques is proposed in this paper, which shows lower cost and reduced computational complexity. Experimental results demonstrate better accuracy and performance in vehicle speed estimation.

INTERNATIONAL CONFERENCE ON SMART TRANSPORTATION AND CITY ENGINEERING 2021 (2021)

Proceedings Paper Automation & Control Systems

3-D Localization of UAV and Detection based on Harmonics Index and Spectral Entropy Criteria

Muhammad Amjad Iqbal, Zhao Zhao, Xu ZhiYong, Saad Ur Rehman

2020 6TH INTERNATIONAL CONFERENCE ON ELECTRICAL ENGINEERING, CONTROL AND ROBOTICS (EECR 2020) (2020)

Proceedings Paper Computer Science, Artificial Intelligence

Multi-label Emotion Classification in Music Videos Using Ensembles of Audio and Video Features

Bruno Kostiuk, Yandre M. G. Costa, Alceu S. Britto Jr, Xiao Hu, Carlos N. Silla Jr

2019 IEEE 31ST INTERNATIONAL CONFERENCE ON TOOLS WITH ARTIFICIAL INTELLIGENCE (ICTAI 2019) (2019)