4.8 Article

Enhancement of Speech Recognitions for Control Automation Using an Intelligent Particle Swarm Optimization

Journal

IEEE TRANSACTIONS ON INDUSTRIAL INFORMATICS
Volume 8, Issue 4, Pages 869-879

Publisher

IEEE-INST ELECTRICAL ELECTRONICS ENGINEERS INC
DOI: 10.1109/TII.2012.2187910

Keywords

Beamformer; intelligent fuzzy systems; particle swarm optimization; speech control; speech recognition

Funding

  1. RGC Grant [PolyU. (5365/09E)]
  2. NSFC [10901170]
  3. Research Committee of the Hong Kong Polytechnic University

For over two decades, speech control mechanisms have been widely applied in manufacturing systems such as factory automation, warehouse automation, and industrial robotic control. To implement speech control, a commercial speech recognizer is used as the interface between users and the automation system. However, users' commands are often contaminated by environmental noise, which degrades the performance of speech recognition for controlling automation systems. This paper presents a multichannel signal enhancement methodology to improve the performance of commercial speech recognizers. The proposed methodology aims to optimize the recognition accuracy of a commercial speech recognizer in a noisy environment by means of a beamformer, which is developed using an intelligent particle swarm optimization. It overcomes a limitation of existing signal enhancement approaches, which require the parameters inside commercial speech recognizers to be tuned; this is impossible in a real-world situation. It also overcomes a limitation of existing optimization algorithms, including gradient descent methods, genetic algorithms, and classical particle swarm optimization, which are unlikely to develop optimal beamformers for maximizing speech recognition accuracy. The performance of the proposed methodology was evaluated by developing beamformers for a commercial speech recognizer, which was implemented on warehouse automation. Results indicate a significant improvement in speech recognition accuracy.
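
For readers unfamiliar with the baseline mentioned in the abstract, the following is a minimal sketch of classical particle swarm optimization, not the intelligent variant proposed in the paper. Every name and parameter value here is an assumption made for illustration (`pso_minimize`, `toy_error`, the inertia and acceleration constants), and `toy_error` is a hypothetical stand-in for the recognition error that a real system would obtain by passing beamformed audio through the speech recognizer.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Classical PSO: minimize `objective` over a box-bounded search space."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position and value.
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    # The swarm shares one global best.
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity blends momentum, pull to personal best, pull to global best.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val


def toy_error(weights):
    # Hypothetical stand-in for recognition error; minimum at all weights = 0.5.
    return sum((w - 0.5) ** 2 for w in weights)


best_w, best_err = pso_minimize(toy_error, dim=4, bounds=(-1.0, 1.0))
print(best_w, best_err)
```

The objective is treated as a black box, which mirrors the setting described in the abstract: the recognizer's internal parameters cannot be tuned, so only the beamformer coefficients (here the `weights` vector) are searched.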

Recommended

Article Operations Research & Management Science

Technical note on the existence of solutions for generalized symmetric set-valued quasi-equilibrium problems utilizing improvement set

Zai-Yun Peng, Jing-Jing Wang, Ka Fai Cedric Yiu, Yun-Bin Zhao

Summary: This paper establishes existence results for the solution of the generalized symmetric set-valued quasi-equilibrium problem (GSSQEP) and introduces new forms of the problem. Sufficient conditions for the existence of solutions to GSSQEP are developed using fixed point method, maximal element principle, and nonlinear scalarization technique. The paper also provides applications to related problems and improves existing results.

OPTIMIZATION (2023)

Article Computer Science, Artificial Intelligence

A Bayesian Filter for Multi-View 3D Multi-Object Tracking With Occlusion Handling

Jonah Ong, Ba-Tuong Vo, Ba-Ngu Vo, Du Yong Kim, Sven Nordholm

Summary: This paper presents an online multi-camera multi-object tracker that can be trained with a monocular detector and is independent of the multi-camera configurations. It operates in the 3D world frame and provides 3D trajectory estimates of objects. The proposed algorithm integrates track management, state estimation, clutter rejection, and occlusion/misdetection handling into a single Bayesian recursion using a high fidelity yet tractable 3D occlusion model.

IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE (2022)

Article Computer Science, Information Systems

Asymptotic Behaviors and Confidence Intervals for the Number of Operating Sensors in a Sensor Network

Mingjie Gao, Ka-Fai Cedric Yiu

Summary: In this paper, we investigate the asymptotic behaviors of an estimator for the number of operating sensors in a sensor network using the Good-Turing estimator. We obtain the asymptotic normality, moderate deviations, and deviation inequalities of the estimator. Our approach is based on tail probability estimates and moderate deviations for occupancy problems. By applying these asymptotic behaviors, we provide a performance analysis for the estimator of the number N of operating nodes when the deviations of the estimator are within (root N, o(N)). These estimates also offer a method for constructing confidence intervals for N.

IEEE TRANSACTIONS ON INFORMATION THEORY (2023)

Article Computer Science, Artificial Intelligence

Multi-layer segmentation of retina OCT images via advanced U-net architecture

N. Man, S. Guo, K. F. C. Yiu, C. K. S. Leung

Summary: Optical Coherence Tomography (OCT) is a non-invasive method for early diagnosis of ocular diseases. This research focuses on retinal layer segmentation in OCT images, exploring algorithms and network structures, and proposing a method to reduce complexity when training a large volume of data on a cloud platform.

NEUROCOMPUTING (2023)

Article Mathematics

Design of Confidence-Integrated Denoising Auto-Encoder for Personalized Top-N Recommender Systems

Zeshan Aslam Khan, Naveed Ishtiaq Chaudhary, Waqar Ali Abbasi, Sai Ho Ling, Muhammad Asif Zahoor Raja

Summary: A recommender system aims to gain users' confidence and reduce their time and effort. In this study, an improved, confidence-integrated denoising auto-encoder (DAE) is proposed to enhance the performance of recommender systems. The proposed model achieves improved scores in various evaluation metrics and proves to be efficient and accurate in generating recommendations.

MATHEMATICS (2023)

Article Mathematics, Applied

ON THE SIMULTANEOUS DESIGN OF BROADBAND BEAMFORMER FILTERS AND CONFIGURATION

Mingjie Gao, Ka-Fai Cedric Yiu

Summary: This paper addresses the importance of beamforming in signal enhancement and proposes a method that considers both filters and microphone positions as design variables. The Gauss-Newton algorithm is employed to simultaneously update these two variables during iterations. The effectiveness of the proposed method is demonstrated through several design examples.

NUMERICAL ALGEBRA CONTROL AND OPTIMIZATION (2023)

Article Computer Science, Artificial Intelligence

Conjoining congestion speed-cycle patterns and deep learning neural network for short-term traffic speed forecasting

W. M. Tang, K. F. C. Yiu, K. Y. Chan, K. Zhang

Summary: Accurate traffic forecasting is crucial for regional traffic management. The proposed NSDNN approach combines DNN and subset selection method to extract useful inputs from nearby roads. By selecting appropriate input subsets based on congestion cycle patterns, the method reduces input data dimensions and avoids artificial high correlations. Experimental results show that NSDNN achieves higher accuracy compared to other conventional methods. It is also comparable to NSLSTM when the same selected input subset is used. The forecasting system can benefit logistic companies in route planning and fleet management.

APPLIED SOFT COMPUTING (2023)

Article Automation & Control Systems

Consensus of multi-agent systems with one-sided Lipschitz nonlinearity via nonidentical double event-triggered control subject to deception attacks

Maolin Wang, Xinsong Yang, Shuoyu Mao, Ka Fai Cedric Yiu, Ju H. Park

Summary: This article studies the leader-following consensus problem in multi-agent systems (MASs) with time-varying switching subject to deception attacks. The one-sided Lipschitz (OSL) condition is used for the nonlinear functions, resulting in more general and relaxed results than those obtained using Lipschitz condition. Nonidentical double event-triggering mechanisms (ETMs) are adopted for only a fraction of agents, and each agent transmits data according to its own necessity. The switching topology is modeled using semi-Markov process with time-varying switching probability, and deception attacks in the transmission channel are considered. Sufficient conditions for MASs to achieve consensus in mean square are obtained using the cumulative distribution function (CDF) and linear matrix inequality (LMI) technology. An effective algorithm is presented for obtaining event-based control gains. The advantages of the proposed control scheme are demonstrated through a simulation example.

JOURNAL OF THE FRANKLIN INSTITUTE-ENGINEERING AND APPLIED MATHEMATICS (2023)

Article Computer Science, Artificial Intelligence

Multi-classification for EEG motor imagery signals using data evaluation-based auto-selected regularized FBCSP and convolutional neural network

Yang An, Hak Keung Lam, Sai Ho Ling

Summary: This paper develops a single-channel-based convolutional neural network to tackle multi-classification motor imagery tasks. The proposed method uses a single-channel learning strategy to extract effective information from each independent channel, making the information between adjacent channels not affect each other. It also proposes a data evaluation method and a mutual information-based regularization parameters auto-selection algorithm to generate effective spatial filters.

NEURAL COMPUTING & APPLICATIONS (2023)

Article Operations Research & Management Science

Moderate deviations for stochastic variational inequalities

Mingjie Gao, Ka-Fai Cedric Yiu

Summary: This paper investigates the convergence of the sample average approximation (SAA) solution for stochastic variational inequalities in regimes of moderate deviations. By using the delta method and exponential approximation, some results on moderate deviations are established. The results are applied to hypotheses testing, showing that the rejection region constructed by the central limit theorem has the probability of the type II error with exponential decay speed. Simulations and numerical results for the tail probabilities are also provided.

OPTIMIZATION (2023)

Article Mathematics, Applied

MODERATE DEVIATIONS AND INVARIANCE PRINCIPLES FOR SAMPLE AVERAGE APPROXIMATIONS

Mingjie Gao, Ka-Fai Cedric Yiu

Summary: This paper studies the moderate deviations and convergence rates for the optimal values and optimal solutions of sample average approximations. It gives an extension of the Delta method in large deviations and establishes a moderate deviation principle for the optimal value under Lipschitz continuity on the objective function. It also obtains a moderate deviation principle for the optimal solution and a Cramer-type moderate deviation for the optimal value when the objective function is twice continuously differentiable and the optimal solution of true optimization problem is unique.

SIAM JOURNAL ON OPTIMIZATION (2023)

Article Chemistry, Analytical

Advancing Fault Detection in HVAC Systems: Unifying Gramian Angular Field and 2D Deep Convolutional Neural Networks for Enhanced Performance

Wunna Tun, Kwok-Wai (Johnny) Wong, Sai-Ho Ling

Summary: This article presents a framework for HVAC fault detection using HVACSIM+ simulated data and GAF-2DCNN method. By converting time-series sensor data into informative 2D images and extracting features using 2DCNN, this method captures hidden temporal relationships in 1D signals. Experimental results demonstrate high accuracy and precision in HVAC fault detection using this method.

SENSORS (2023)

Article Acoustics

Distributed Microphone Array Localization Problem via SDP-SOCP Method

Qi He, Mingjie Gao, Ka Fai Cedric Yiu, Sven Nordholm

Summary: In multimedia applications, it is common to use acoustic sensors collectively to enhance signals and locate sound sources. This article investigates the microphone array localization problem in a distributed acoustic network with TDOA measurements and proposes a mixed model to solve the problem. Experimental results demonstrate that the proposed model can successfully estimate sensor locations in noisy and reverberant environments, outperforming other relaxation methods.

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING (2023)

Article Engineering, Multidisciplinary

MEAN-VARIANCE PORTFOLIO SELECTION WITH RANDOM INVESTMENT HORIZON

Jingzhen Liu, Ka-Fai Cedric Yiu, Xun Li, Tak Kuen Siu, Kok Lay Teo

Summary: This paper examines a continuous-time securities market and discusses how to minimize the variance of a portfolio's return given a random investment horizon and a targeted terminal mean return. It finds that the variance of an investment portfolio is no longer minimized when all assets are invested in a risk-free security. Additionally, the random investment horizon has a significant impact on the efficient frontier.

JOURNAL OF INDUSTRIAL AND MANAGEMENT OPTIMIZATION (2023)

Article Acoustics

Audio-Visual Based Online Multi-Source Separation

Jonah Ong, Ba Tuong Vo, Sven Nordholm, Ba-Ngu Vo, Diluka Moratuwage, Changbeom Shim

Summary: This paper proposes a novel solution for online separation of an unknown and time-varying number of moving sources using only a single microphone array co-located with a single visual device. The approach exploits the complementary nature of simultaneous audio and visual measurements, accomplishing separation through a model-centric 3-stage process.

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING (2022)
