Article

Low-frequency Fourier analysis of speech rhythm

JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA (2008)

Volume 124, Issue 2, Pages EL34-EL39

Publisher

ACOUSTICAL SOC AMER AMER INST PHYSICS

DOI: 10.1121/1.2947626

Categories

Acoustics; Audiology & Speech-Language Pathology

Funding

NIDCD NIH HHS [R01 DC004421] Funding Source: Medline

Abstract

A method for studying speech rhythm is presented, using Fourier analysis of the amplitude envelope of bandpass-filtered speech. Rather than quantifying rhythm with time-domain measurements of interval durations, a frequency-domain representation is used: the rhythm spectrum. This paper describes the method in detail and discusses approaches to characterizing rhythm with low-frequency spectral information. © 2008 Acoustical Society of America.
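The pipeline named in the abstract (bandpass filter, amplitude envelope, low-frequency Fourier analysis) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract gives no parameter values, so the passband, the envelope cutoff, the 10 Hz upper limit, the brick-wall FFT filters, and the function name `rhythm_spectrum` are all assumptions chosen for illustration. The input here is a synthetic tone modulated at 4 Hz rather than real speech.

```python
import numpy as np


def rhythm_spectrum(signal, fs, band=(700.0, 1300.0), env_cutoff=10.0, max_freq=10.0):
    """Return (freqs, power): the normalized low-frequency power spectrum
    of the amplitude envelope of a band-pass filtered signal.

    All default parameter values are illustrative assumptions.
    """
    n = len(signal)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)

    # 1. Band-pass filter with an ideal (brick-wall) mask in the frequency domain.
    spec = np.fft.rfft(signal)
    spec[(freqs < band[0]) | (freqs > band[1])] = 0.0
    filtered = np.fft.irfft(spec, n=n)

    # 2. Amplitude envelope: full-wave rectify, then low-pass the result.
    env_spec = np.fft.rfft(np.abs(filtered))
    env_spec[freqs > env_cutoff] = 0.0
    envelope = np.fft.irfft(env_spec, n=n)

    # 3. Remove the mean and taper the ends to limit spectral leakage.
    windowed = (envelope - envelope.mean()) * np.hanning(n)

    # 4. Power spectrum, restricted to the low-frequency range and
    #    normalized so that the retained bins sum to 1.
    power = np.abs(np.fft.rfft(windowed)) ** 2
    keep = (freqs > 0) & (freqs <= max_freq)
    return freqs[keep], power[keep] / power[keep].sum()


# Synthetic stand-in for speech: a 1 kHz carrier whose amplitude
# fluctuates at 4 Hz (a hypothetical syllable-like rate).
fs = 8000
t = np.arange(4 * fs) / fs
speech_like = (1.0 + 0.8 * np.cos(2 * np.pi * 4.0 * t)) * np.sin(2 * np.pi * 1000.0 * t)

freqs, power = rhythm_spectrum(speech_like, fs)
peak_hz = freqs[np.argmax(power)]
print(f"dominant envelope frequency: {peak_hz:.2f} Hz")
```

With real recordings one would typically replace the brick-wall masks with a designed filter and analyze fixed-length windows of speech; those choices are left open here because the abstract does not specify them.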

Recommended

Article Automation & Control Systems

MSLEFC: A low-frequency focused underwater acoustic signal classification and analysis system

Yunqi Zhang, Qunfeng Zeng

Summary: A Multi-scale short-time Fourier transform (MS-STFT) method is proposed to increase the low-frequency information and preserve the detailed information in underwater acoustic classification. An effective data augmentation method and a Ladder-like Encode (LE) architecture are utilized to enhance the model's generalization and accuracy. Additionally, a Frequency-CAM (FC) method is introduced to analyze the neural network's interest in frequency band locations during classification tasks. The integrated approach, MSLEFC, achieves high accuracy on ship radiation noise datasets and outperforms previous state-of-the-art methods on the ShipsEar dataset. The proposed model architecture also shows improvements in parameters and computation compared to ResNet50.

ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE (2023)

Article Acoustics

The effect of low frequency reverberation on Chinese speech intelligibility in two classrooms

Shihan Xu, Jianxin Peng, Yi Xiao, Wuqiong Huang

Summary: The study found that improving the low frequency characteristic of reverberation time can help to enhance Chinese speech intelligibility in classrooms.

APPLIED ACOUSTICS (2021)

Article Psychology, Multidisciplinary

Differential sensitivity to speech rhythms in young and older adults

Dylan V. Pearson, Yi Shen, J. Devin McAuley, Gary R. Kidd

Summary: Sensitivity to temporal properties of auditory patterns tends to be poorer in older listeners, contributing to their poorer speech understanding. This study examined sensitivity to speech rhythms in young and older normal-hearing subjects, finding that both rely on speech rhythms to generate temporal expectancies for upcoming events. However, older listeners do not exhibit lower thresholds for shortened gaps, indicating a change in speech-timing expectancies with age.

FRONTIERS IN PSYCHOLOGY (2023)

Article Engineering, Electrical & Electronic

Analysis of Reassignment Operators Used in Synchrosqueezing Transforms: With an Application to Instantaneous Frequency Estimation

Sylvain Meignen, Neha Singh

Summary: This paper investigates the behavior of reassignment operators in synchrosqueezing transforms applied to multicomponent signals. The study shows that the quality of the original instantaneous frequency estimators deteriorates significantly when the modes interfere in the time-frequency plane or when there is noise present. Based on this analysis, a novel instantaneous frequency estimator that uses specific points on the ridges of synchrosqueezing transforms is proposed, and its performance is compared with state-of-the-art techniques based on the same type of time-frequency representations.

IEEE TRANSACTIONS ON SIGNAL PROCESSING (2022)

Article Remote Sensing

Low frequency error analysis and calibration for multiple star sensors system of GaoFen7 satellite

Yanli Wang, Mi Wang, Ying Zhu, Xiaoxiang Long

Summary: The GF7 optical satellite is equipped with an advanced camera and a laser altimeter for high-resolution mapping. The satellite's attitude determination system includes four star sensors for accuracy. However, thermal deformation and attitude low-frequency error (ALFE) can affect the consistency of attitude determination results, so a compensation model is proposed for precise attitude determination.

GEO-SPATIAL INFORMATION SCIENCE (2022)

Article Neurosciences

Test of Prosody via Syllable Emphasis (TOPsy): Psychometric Validation of a Brief Scalable Test of Lexical Stress Perception

Srishti Nayak, Daniel E. Gustavson, Youjia Wang, Jennifer E. Below, Reyna L. Gordon, Cyrille L. Magne

Summary: Perception of prosody is crucial for spoken language communication, including comprehension, pragmatics, and phonological awareness. This study introduces the Test of Prosody via Syllable Emphasis (TOPsy) as a phenotyping tool to measure lexical stress sensitivity. The test demonstrates excellent reliability and predictive validity, suggesting its potential for large-scale investigations of prosody and reading.

FRONTIERS IN NEUROSCIENCE (2022)

Article Audiology & Speech-Language Pathology

Adult Cochlear Implant Users Versus Typical Hearing Persons: An Automatic Analysis of Acoustic-Prosodic Parameters

Tomas Arias-Vergara, Anton Batliner, Tobias Rader, Daniel Polterauer, Catalina Hoegerle, Joachim Mueller, Juan-Rafael Orozco-Arroyave, Elmar Noeth, Maria Schuster

Summary: The aim of this study was to compare the speech prosody of post-lingually deaf cochlear implant users with control speakers without hearing or speech impairment. The results showed differences in duration and rhythm features between CI users and control speakers. The findings suggest that even after cochlear implantation and rehabilitation, the speech of postlingually deaf adults deviates from the speech of control speakers, possibly due to changes in auditory feedback. Future rehabilitation strategies may need to consider these changes.

JOURNAL OF SPEECH LANGUAGE AND HEARING RESEARCH (2022)

Article Engineering, Electrical & Electronic

Low-Latency Short-Time Fourier Transform of Microwave Photonics Processing

Jilong Li, Songnian Fu, Xiangzhi Xie, Meng Xiang, Yitang Dai, Feifei Yin, Yuwen Qin

Summary: This paper proposes a low-latency short-time Fourier transform (STFT) method in microwave photonic processing, which uses a passive fiber loop to provide high-value equivalent dispersion, achieving high acquisition frame rates and low processing latency, and enabling real-time acquisition of spectrograms of microwave signals.

JOURNAL OF LIGHTWAVE TECHNOLOGY (2023)

Article Neurosciences

Neural generators of the frequency-following response elicited to stimuli of low and high frequency: A magnetoencephalographic (MEG) study

Natalia Gorina-Careta, Jari L. O. Kurkela, Jarmo Hamalainen, Piia Astikainen, Carles Escera

Summary: Studies have found that the FFR to high frequency sounds mainly originates from the inferior colliculus and the medial geniculate body of the thalamus, with no significant cortical contribution. In contrast, the FFR to low frequency sounds has a major contribution from the auditory cortices, as well as from midbrain and thalamic structures. These findings support the multiple generator hypothesis of the FFR and suggest a hierarchical organization of periodicity encoding in the auditory system.

NEUROIMAGE (2021)

Article Acoustics

Performance analysis of low complexity fully connected neural networks for monaural speech enhancement

Himavanth Reddy, Asutosh Kar, Jan Ostergaard

Summary: This study compares the run-time complexity of recent deep neural network (DNN) and non-DNN based monaural speech enhancement algorithms. The results demonstrate that a simple fully connected DNN has the lowest run-time computational complexity for the same speech enhancement performance.

APPLIED ACOUSTICS (2022)

Article Acoustics

The interplay of prosodic cues in the L2: How intonation, rhythm, and speech rate in speech by Spanish learners of Dutch contribute to L1 Dutch perceptions of accentedness and comprehensibility

Lieke van Maastricht, Tim Zee, Emiel Krahmer, Marc Swerts

Summary: This study shows that improving the prosody of L2 speakers positively affects L1 perceptions of L2 speech. Dutch listeners were influenced by intonation and speech rate transfer in terms of accentedness perception, while comprehensibility ratings were mediated by interactions between prosodic features.

SPEECH COMMUNICATION (2021)

Article Biochemistry & Molecular Biology

Effect of high-frequency stimulation on the complexity of low-Mg2+-induced epileptiform discharge rhythm waves in the CA3 region of rat hippocampal slices

Lei Dong, Tong Zhao, Jia-Kang Duan, Lei Tian, Yu Zheng

Summary: This study found that high-frequency stimulation can inhibit the pathological rhythm caused by epilepsy, and the effect mainly occurs in the CA3 region of the hippocampus.

BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS (2023)

Article Psychology, Experimental

Quantifying the role of rhythm in infants' language discrimination abilities: A meta-analysis

Loretta Gasparini, Alan Langus, Sho Tsuji, Natalie Boll-Avetisyan

Summary: The study found that smaller differences in vowel interval variability and larger differences in successive consonantal interval variability were associated with more successful language discrimination, and better accounted for discrimination results than the factor of rhythm class. Results on preference studies were affected by age: the older infants get, the more they prefer non-native languages that are rhythmically similar to their native language, but not non-native languages that are rhythmically distinct.

COGNITION (2021)

Article Acoustics

Speech rhythm convergence in a dyadic reading task

Karina Cerda-Onate, Gloria Toledo Vega, Mikhail Ordin

Summary: The study found that the presence of others plays a crucial role in entrainment to speech rhythm, helping participants to synchronize their speech better. Synchronization between speakers was improved during co-present reading, and texts with strong meter only affected speech rhythm convergence in more challenging conditions.

SPEECH COMMUNICATION (2021)

Article Behavioral Sciences

The role of the basal ganglia and cerebellum in adaptation to others' speech rate and rhythm: A study of patients with Parkinson's disease and cerebellar degeneration

Mona Spaeth, Ingrid Aichert, Dagmar Timmann, Andres O. Ceballos-Baumann, Edith Wagner-Sonntag, Wolfram Ziegler

Summary: This article explores the role of cerebellar and basal ganglia dysfunctions in unintentional adaptation to speech rhythm and articulation rate of a second speaker. The findings suggest that continuous auditory-motor adaptation takes place in interactive language use and the plasticity of auditory-motor representations of speech persists throughout life.

CORTEX (2022)

Review Computer Science, Artificial Intelligence

Analysis of speech production real-time MRI

Vikram Ramanarayanan, Sam Tilsen, Michael Proctor, Johannes Toger, Louis Goldstein, Krishna S. Nayak, Shrikanth Narayanan

COMPUTER SPEECH AND LANGUAGE (2018)

Article Neurosciences

Encoding of Articulatory Kinematic Trajectories in Human Speech Sensorimotor Cortex

Josh Chartier, Gopala K. Anumanchipalli, Keith Johnson, Edward F. Chang

NEURON (2018)

Article Acoustics

Modeling the effect of palate shape on the articulatory-acoustics mapping

Sarah Bakst, Keith Johnson

JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA (2018)

Review Multidisciplinary Sciences

Space and time in models of speech rhythm

Sam Tilsen

ANNALS OF THE NEW YORK ACADEMY OF SCIENCES (2019)

Article Multidisciplinary Sciences

Speaker-normalized sound representations in the human auditory cortex

Matthias J. Sjerps, Neal P. Fox, Keith Johnson, Edward F. Chang

NATURE COMMUNICATIONS (2019)

Article Acoustics

Spectral and temporal measures of coarticulation in child speech

Margaret Cychosz, Jan R. Edwards, Benjamin Munson, Keith Johnson

JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA (2019)

Article Linguistics

Temporal differences between high vowels and glides are more robust than spatial differences

Dan Cameron Burgdorf, Sam Tilsen

Summary: The study investigated articulatory and acoustic differences between English vowels and glides, finding that temporal organization may be more important than constriction degree for the vowel-glide distinction.

JOURNAL OF PHONETICS (2021)

Article Multidisciplinary Sciences

Localizing category-related information in speech with multi-scale analyses

Sam Tilsen, Seung-Eun Kim, Claire Wang

Summary: This article introduces a multi-scale analysis method that uses machine learning algorithms to localize category-related information in speech signals, and investigates phonemic/gestural categories and syntactic relative clause categories. The study found that the neural network algorithm is more efficient in detecting category-related information in speech compared to discriminant analyses.

PLOS ONE (2021)

Article Acoustics

Parameters of unit-based measures of speech rate

Sam Tilsen, Mark Tiede

Summary: Speech rate is a useful concept for understanding variation in speech. There are various ways to measure speech rate, such as counting linguistic units within a given time period. A corpus study explores how different parameters affect the correlation between rate measures and target segment durations. The study finds that phone-based rate measures have stronger correlations with segment durations and that proper rates (events per second) outperform inverse rates (average durations per event). Including intervals associated with target segments in the rate calculation window leads to artificial increases in correlation.

SPEECH COMMUNICATION (2023)

Article Linguistics

Hierarchical distinctions in the production and perception of nuclear tunes in American English

Jennifer Cole, Jeremy Steffman, Stefanie Shattuck-Hufnagel, Sam Tilsen

Summary: In this study, an eight-way distinction in nuclear tune shape in American English is examined in speech production and perception. The results show that tune shapes mainly differ in the scaling of final f0, rather than small differences in final f0. The observed distinctions align with a binary pitch accent contrast {H*, L*} and a maximally ternary {H%, M%, L%} boundary tone contrast, but not with distinct tonal specifications for the phrase accent and boundary tone from the AM model.

LABORATORY PHONOLOGY (2023)

Article Linguistics

Detecting anticipatory information in speech with signal chopping

Sam Tilsen

JOURNAL OF PHONETICS (2020)

Article Linguistics

Incongruencies between phonological theory and phonetic measurement

Doris Muecke, Anne Hermes, Sam Tilsen

PHONOLOGY (2020)

Article Psychology, Multidisciplinary

Motoric Mechanisms for the Emergence of Non-local Phonological Patterns

Sam Tilsen

FRONTIERS IN PSYCHOLOGY (2019)

Article Linguistics

Exertive modulation of speech and articulatory phasing

Sam Tilsen

JOURNAL OF PHONETICS (2017)

Article Multidisciplinary Sciences

Localizing category-related information in speech with multi-scale analyses

Sam Tilsen, Seung-Eun Kim, Claire Wang

Summary: A multi-scale analysis method is proposed to localize category-related information in an ensemble of speech signals using machine learning algorithms. Findings suggest that both linear discriminant analysis and neural networks detected category-related information earlier or later than expected, with neural networks outperforming discriminant analyses in identifying category-related information.

PLOS ONE (2021)

