Article
Automation & Control Systems
Yunqi Zhang, Qunfeng Zeng
Summary: A multi-scale short-time Fourier transform (MS-STFT) method is proposed to increase the low-frequency information and preserve the detailed information in underwater acoustic classification. An effective data augmentation method and a Ladder-like Encode (LE) architecture are utilized to enhance the model's generalization and accuracy. Additionally, a Frequency-CAM (FC) method is introduced to analyze the neural network's interest in frequency band locations during classification tasks. The integrated approach, MSLEFC, achieves high accuracy on ship radiation noise datasets and outperforms previous state-of-the-art methods on the ShipsEar dataset. The proposed model architecture also shows improvements in parameters and computation compared to ResNet50.
ENGINEERING APPLICATIONS OF ARTIFICIAL INTELLIGENCE
(2023)
Article
Acoustics
Shihan Xu, Jianxin Peng, Yi Xiao, Wuqiong Huang
Summary: The study found that improving the low-frequency characteristics of reverberation time can help to enhance Chinese speech intelligibility in classrooms.
Article
Psychology, Multidisciplinary
Dylan V. Pearson, Yi Shen, J. Devin McAuley, Gary R. Kidd
Summary: Sensitivity to temporal properties of auditory patterns tends to be poorer in older listeners, contributing to their poorer speech understanding. This study examined sensitivity to speech rhythms in young and older normal-hearing subjects, finding that both rely on speech rhythms to generate temporal expectancies for upcoming events. However, older listeners do not exhibit lower thresholds for shortened gaps, indicating a change in speech-timing expectancies with age.
FRONTIERS IN PSYCHOLOGY
(2023)
Article
Engineering, Electrical & Electronic
Sylvain Meignen, Neha Singh
Summary: This paper investigates the behavior of reassignment operators in synchrosqueezing transforms applied to multicomponent signals. The study shows that the quality of the original instantaneous frequency estimators deteriorates significantly when the modes interfere in the time-frequency plane or when there is noise present. Based on this analysis, a novel instantaneous frequency estimator that uses specific points on the ridges of synchrosqueezing transforms is proposed, and its performance is compared with state-of-the-art techniques based on the same type of time-frequency representations.
IEEE TRANSACTIONS ON SIGNAL PROCESSING
(2022)
Article
Remote Sensing
Yanli Wang, Mi Wang, Ying Zhu, Xiaoxiang Long
Summary: The GF7 optical satellite is equipped with an advanced camera and a laser altimeter for high-resolution mapping. The satellite's attitude determination system includes four star sensors for accuracy. However, thermal deformation and ALFE can affect the consistency of attitude determination results, so a compensation model is proposed for precise attitude determination.
GEO-SPATIAL INFORMATION SCIENCE
(2022)
Article
Neurosciences
Srishti Nayak, Daniel E. Gustavson, Youjia Wang, Jennifer E. Below, Reyna L. Gordon, Cyrille L. Magne
Summary: Perception of prosody is crucial for spoken language communication, including comprehension, pragmatics, and phonological awareness. This study introduces the Test of Prosody via Syllable Emphasis (TOPsy) as a phenotyping tool to measure lexical stress sensitivity. The test demonstrates excellent reliability and predictive validity, suggesting its potential for large-scale investigations of prosody and reading.
FRONTIERS IN NEUROSCIENCE
(2022)
Article
Audiology & Speech-Language Pathology
Tomas Arias-Vergara, Anton Batliner, Tobias Rader, Daniel Polterauer, Catalina Hoegerle, Joachim Mueller, Juan-Rafael Orozco-Arroyave, Elmar Noeth, Maria Schuster
Summary: The aim of this study was to compare the speech prosody of post-lingually deaf cochlear implant users with control speakers without hearing or speech impairment. The results showed differences in duration and rhythm features between CI users and control speakers. The findings suggest that even after cochlear implantation and rehabilitation, the speech of postlingually deaf adults deviates from the speech of control speakers, possibly due to changes in auditory feedback. Future rehabilitation strategies may need to consider these changes.
JOURNAL OF SPEECH LANGUAGE AND HEARING RESEARCH
(2022)
Article
Engineering, Electrical & Electronic
Jilong Li, Songnian Fu, Xiangzhi Xie, Meng Xiang, Yitang Dai, Feifei Yin, Yuwen Qin
Summary: This paper proposes a low-latency short-time Fourier transform (STFT) method in microwave photonic processing, which uses a passive fiber loop to provide high-value equivalent dispersion, achieving high acquisition frame rates and low processing latency, and enabling real-time acquisition of spectrograms of microwave signals.
JOURNAL OF LIGHTWAVE TECHNOLOGY
(2023)
Article
Neurosciences
Natalia Gorina-Careta, Jari L. O. Kurkela, Jarmo Hamalainen, Piia Astikainen, Carles Escera
Summary: Studies have found that the FFR to high-frequency sounds mainly originates from the inferior colliculus and the medial geniculate body of the thalamus, with no significant cortical contribution. In contrast, the FFR to low-frequency sounds has a major contribution from the auditory cortices, as well as from midbrain and thalamic structures. These findings support the multiple generator hypothesis of the FFR and suggest a hierarchical organization of periodicity encoding in the auditory system.
Article
Acoustics
Himavanth Reddy, Asutosh Kar, Jan Ostergaard
Summary: This study compares the run-time complexity of recent deep neural network (DNN)-based and non-DNN-based monaural speech enhancement algorithms. The results demonstrate that a simple fully connected DNN has the lowest run-time computational complexity for the same speech enhancement performance.
Article
Acoustics
Lieke van Maastricht, Tim Zee, Emiel Krahmer, Marc Swerts
Summary: This study shows that improving the prosody of L2 speakers positively affects L1 perceptions of L2 speech. Dutch listeners were influenced by intonation and speech rate transfer in terms of accentedness perception, while comprehensibility ratings were mediated by interactions between prosodic features.
SPEECH COMMUNICATION
(2021)
Article
Biochemistry & Molecular Biology
Lei Dong, Tong Zhao, Jia-Kang Duan, Lei Tian, Yu Zheng
Summary: This study found that high-frequency stimulation can inhibit the pathological rhythm caused by epilepsy, and the effect mainly occurs in the CA3 region of the hippocampus.
BIOCHEMICAL AND BIOPHYSICAL RESEARCH COMMUNICATIONS
(2023)
Article
Psychology, Experimental
Loretta Gasparini, Alan Langus, Sho Tsuji, Natalie Boll-Avetisyan
Summary: The study found that smaller differences in vowel interval variability and larger differences in successive consonantal interval variability were associated with more successful language discrimination, and better accounted for discrimination results than the factor of rhythm class. Results on preference studies were affected by age: the older infants get, the more they prefer non-native languages that are rhythmically similar to their native language, but not non-native languages that are rhythmically distinct.
Article
Acoustics
Karina Cerda-Onate, Gloria Toledo Vega, Mikhail Ordin
Summary: The study found that the presence of others plays a crucial role in entrainment to speech rhythm, helping participants to synchronize their speech better. Synchronization between speakers was improved during co-present reading, and texts with strong meter only affected speech rhythm convergence in more challenging conditions.
SPEECH COMMUNICATION
(2021)
Article
Behavioral Sciences
Mona Spaeth, Ingrid Aichert, Dagmar Timmann, Andres O. Ceballos-Baumann, Edith Wagner-Sonntag, Wolfram Ziegler
Summary: This article explores the role of cerebellar and basal ganglia dysfunctions in unintentional adaptation to speech rhythm and articulation rate of a second speaker. The findings suggest that continuous auditory-motor adaptation takes place in interactive language use and the plasticity of auditory-motor representations of speech persists throughout life.
Review
Computer Science, Artificial Intelligence
Vikram Ramanarayanan, Sam Tilsen, Michael Proctor, Johannes Toger, Louis Goldstein, Krishna S. Nayak, Shrikanth Narayanan
COMPUTER SPEECH AND LANGUAGE
(2018)
Article
Neurosciences
Josh Chartier, Gopala K. Anumanchipalli, Keith Johnson, Edward F. Chang
Article
Acoustics
Sarah Bakst, Keith Johnson
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA
(2018)
Review
Multidisciplinary Sciences
Sam Tilsen
ANNALS OF THE NEW YORK ACADEMY OF SCIENCES
(2019)
Article
Multidisciplinary Sciences
Matthias J. Sjerps, Neal P. Fox, Keith Johnson, Edward F. Chang
NATURE COMMUNICATIONS
(2019)
Article
Acoustics
Margaret Cychosz, Jan R. Edwards, Benjamin Munson, Keith Johnson
JOURNAL OF THE ACOUSTICAL SOCIETY OF AMERICA
(2019)
Article
Linguistics
Dan Cameron Burgdorf, Sam Tilsen
Summary: The study investigated articulatory and acoustic differences between English vowels and glides, finding that temporal organization may be more important than constriction degree for the vowel-glide distinction.
JOURNAL OF PHONETICS
(2021)
Article
Multidisciplinary Sciences
Sam Tilsen, Seung-Eun Kim, Claire Wang
Summary: This article introduces a multi-scale analysis method that uses machine learning algorithms to localize category-related information in speech signals, and investigates phonemic/gestural categories and syntactic relative clause categories. The study found that the neural network algorithm is more efficient in detecting category-related information in speech compared to discriminant analyses.
Article
Acoustics
Sam Tilsen, Mark Tiede
Summary: Speech rate is a useful concept for understanding variation in speech. There are various ways to measure speech rate, such as counting linguistic units within a given time period. A corpus study explores how different parameters affect the correlation between rate measures and target segment durations. The study finds that phone-based rate measures have stronger correlations with segment durations and that proper rates (events per second) outperform inverse rates (average durations per event). Including intervals associated with target segments in the rate calculation window leads to artificial increases in correlation.
SPEECH COMMUNICATION
(2023)
Article
Linguistics
Jennifer Cole, Jeremy Steffman, Stefanie Shattuck-Hufnagel, Sam Tilsen
Summary: In this study, an eight-way distinction in nuclear tune shape in American English is examined in speech production and perception. The results show that tune shapes mainly differ in the scaling of final f0, rather than small differences in final f0. The observed distinctions align with a binary pitch accent contrast {H*, L*} and a maximally ternary {H%, M%, L%} boundary tone contrast, but not with distinct tonal specifications for the phrase accent and boundary tone from the AM model.
LABORATORY PHONOLOGY
(2023)
Article
Linguistics
Sam Tilsen
JOURNAL OF PHONETICS
(2020)
Article
Linguistics
Doris Muecke, Anne Hermes, Sam Tilsen
Article
Psychology, Multidisciplinary
Sam Tilsen
FRONTIERS IN PSYCHOLOGY
(2019)
Article
Linguistics
Sam Tilsen
JOURNAL OF PHONETICS
(2017)
Article
Multidisciplinary Sciences
Sam Tilsen, Seung-Eun Kim, Claire Wang
Summary: A multi-scale analysis method is proposed to localize category-related information in an ensemble of speech signals using machine learning algorithms. Findings suggest that both linear discriminant analysis and neural networks detected category-related information earlier or later than expected, with neural networks outperforming discriminant analyses in identifying category-related information.