Article
Medicine, Legal
Nabanita Basu, Agnes S. Bali, Philip Weber, Claudia Rosas-Aguilar, Gary Edmond, Kristy A. Martire, Geoffrey Stewart Morrison
Summary: This paper investigates the accuracy of speaker identification by individual lay listeners, such as judges, compared to a forensic-voice-comparison system based on state-of-the-art automatic-speaker-recognition technology. The study tests listeners with different language backgrounds and considers different courtroom contexts.
FORENSIC SCIENCE INTERNATIONAL
(2022)
Article
Computer Science, Information Systems
Nianchang Huang, Yi Liu, Qiang Zhang, Jungong Han
Summary: The study proposes a novel RGB-D salient object detection model that effectively combines cross-modal features from RGB-D images and unimodal features from RGB and depth images, achieving significant performance improvement on four benchmark datasets compared to state-of-the-art methods.
IEEE TRANSACTIONS ON MULTIMEDIA
(2021)
Article
Computer Science, Information Systems
Fiseha B. Tesema, Jason Gu, Wei Song, Hong Wu, Shiqiang Zhu, Zheyuan Lin
Summary: Active speaker detection (ASD) is about identifying the person speaking in a video among visible human instances. This study proposes an efficient audiovisual fusion (AVF) approach that captures correlations between facial regions and sound signals, focusing on discriminative facial features and associating them with corresponding audio features, resulting in improved detection accuracy.
Article
Computer Science, Information Systems
Lionel Pibre, Francisco Madrigal, Cyrille Equoy, Frederic Lerasle, Thomas Pellegrini, Julien Pinquier, Isabelle Ferrane
Summary: In this paper, we propose two different fusion techniques (naive fusion and attention-based fusion) to combine audio and visual information for active speaker detection in meetings. By using neural networks to process audio and video data, and supplementing with motion information, the system performance is improved. The method shows good detection capability in a public benchmark test.
MULTIMEDIA TOOLS AND APPLICATIONS
(2023)
Article
Computer Science, Information Systems
Martina Slivova, Miroslav Voznak, Jaromir Tovarek, Pavol Partila
Summary: The article introduces a new speaker liveness test that protects speaker verification (SV) systems against presentation attacks. Experiments that verify the reliability of isolated word recognition indicate that the method is a simple yet effective security component for existing SV systems.
MULTIMEDIA TOOLS AND APPLICATIONS
(2022)
Article
Computer Science, Information Systems
Ning Ding, Sheng-wei Tian, Long Yu
Summary: This paper surveys the state of sarcastic content on social media and the research status of sarcasm detection. It proposes a multilevel late-fusion learning framework that draws on multiple information sources, including text, audio, and images. Extensive experiments show that the framework outperforms the compared methods.
MULTIMEDIA TOOLS AND APPLICATIONS
(2022)
Article
Computer Science, Hardware & Architecture
Chao-Lung Chou
Summary: This paper proposes a multimodal presentation attack detection method that defends face recognition systems against photo and video attacks by using score-level fusion and a challenge-response scenario. Experimental results show that the proposed method achieves the best error rate and effectively improves the security of face recognition systems.
JOURNAL OF SUPERCOMPUTING
(2021)
Article
Computer Science, Artificial Intelligence
Daniele Salvati, Carlo Drioli, Gian Luca Foresti
Summary: Speaker identification aims to determine the speaker identity by analyzing voice characteristics, often using statistical models or machine learning techniques. Frequency-domain features are commonly used for sound recognition, but recent studies have also explored the use of time-domain raw waveform (RW) with deep neural network (DNN) architectures. This paper proposes a method that combines RWs and gammatone cepstral coefficients (GTCCs) using a late fusion DNN, hypothesizing that the use of both time-domain and frequency-domain features can enhance the accuracy of speaker identification in noisy and reverberant conditions. Experimental results show that the proposed method improves accuracy in adverse conditions compared to other feature choices.
EXPERT SYSTEMS WITH APPLICATIONS
(2023)
Article
Computer Science, Information Systems
Jessica Sena, Artur Jorda, William Robson Schwartz
Summary: Building on the various pedestrian detectors proposed in recent works, this paper presents Content-Based Spatial Consensus (CSBC), a novel method that combines spatial consensus and content to learn a weighted fusion of pedestrian detectors. The approach reduces false alarms and improves detection accuracy, outperforming state-of-the-art fusion methods on the ETH and Caltech datasets while requiring fewer detectors for effective results.
JOURNAL OF VISUAL COMMUNICATION AND IMAGE REPRESENTATION
(2021)
Article
Computer Science, Artificial Intelligence
Nianchang Huang, Yongjiang Luo, Qiang Zhang, Jungong Han
Summary: A novel end-to-end RGB-D salient object detection model is proposed in this paper, which combines a Semantic-Guided Modality-Weight Map Generation sub-network and a Bi-directional Multi-scale Cross-modal Feature Fusion module to effectively address the issue of poor input image quality affecting discriminative ability.
PATTERN RECOGNITION
(2022)
Article
Computer Science, Information Systems
Zesheng Chen
Summary: This article aims to design a detector that can distinguish original audio from audio contaminated by adversarial attacks. The proposed MEH-FEST detector calculates the minimum energy in the high-frequency band of the audio signal and performs well in determining whether an audio signal has been corrupted by FAKEBOB attacks.
IEEE INTERNET OF THINGS JOURNAL
(2023)
Article
Nanoscience & Nanotechnology
Reza Ghiaasiaan, Nabeel Ahmad, Paul R. Gradl, Shuai Shao, Nima Shamsaei
Summary: This study compares the effects of different heat treatment temperatures on Haynes 282 and finds that, even with larger gamma' precipitates, low-temperature heat treatment can achieve tensile properties comparable to those of conventional heat treatment.
MATERIALS SCIENCE AND ENGINEERING A-STRUCTURAL MATERIALS PROPERTIES MICROSTRUCTURE AND PROCESSING
(2022)
Article
Acoustics
Donghui Zhu, Ning Chen
Summary: This paper proposes a multiple source domain adaptation and fusion model to make the speaker verification (SV) task more robust to domain mismatch. The experimental results show that the proposed fusion model outperforms existing models and that the strategies and methods employed improve performance. The fusion effectiveness is not greatly influenced by the hyper-parameter setting of the Modified Similarity Network Fusion (MSNF) scheme.
IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING
(2022)
Article
Computer Science, Artificial Intelligence
K. VijayKumar, R. Rajeswara Rao
Summary: Speaker diarization is the process of partitioning an audio source stream into homogeneous segments based on the speaker's identity. This paper proposes the use of a hybrid optimization technique, FrACWOA, along with the Deep Embedded Clustering (DEC) algorithm for speaker diarization. The proposed method outperforms existing approaches in terms of accuracy, with a significant improvement of 12.97% when using six speakers.
DATA & KNOWLEDGE ENGINEERING
(2023)
Article
Biology
Shamim Mobram, Mansour Vali
Summary: This study proposes depression detection systems based on the i-vector framework, aiming to classify speakers and predict depression levels. By combining linear and non-linear features and utilizing variability compensation techniques, the proposed system achieves significant improvements in accuracy.
COMPUTERS IN BIOLOGY AND MEDICINE
(2022)
Article
Computer Science, Artificial Intelligence
Mohammad Farhad Bulbul, Saiful Islam, Zannatul Azme, Preksha Pareek, Md Humaun Kabir, Hazrat Ali
Summary: In this paper, a new method is proposed to use auto-correlation gradient features in depth action data for action recognition. By calculating depth motion map sequences and utilizing the STACOG descriptor, three vectors of 3D auto-correlation gradient features are obtained and passed to an unsupervised classifier for action recognition.
INTERNATIONAL JOURNAL OF MULTIMEDIA INFORMATION RETRIEVAL
(2022)
Article
Computer Science, Information Systems
Imran Ul Haq, Haider Ali, Hong Yu Wang, Cui Lei, Hazrat Ali
Summary: Breast cancer remains a serious concern worldwide. A computer-aided diagnosis (CAD) system based on a deep convolutional neural network (DCNN) and feature fusion is proposed to improve the detection and classification of abnormalities in mammographic scans. The model achieved high sensitivity, specificity, and accuracy when evaluated on two publicly available datasets.
JOURNAL OF KING SAUD UNIVERSITY-COMPUTER AND INFORMATION SCIENCES
(2022)
Article
Engineering, Biomedical
Hazrat Ali, Johannes Umander, Robin Rohlen, Oliver Roehrle, Christer Gronlund
Summary: This study proposes a deep learning approach to model authentic intra-muscular skeletal muscle contraction patterns, bridging the gap between simulated and experimental image sequences. The results demonstrate the potential of generating spatio-temporal features similar to in vivo data, offering insights for neuromuscular imaging research.
BIOMEDICAL ENGINEERING ONLINE
(2022)
Article
Computer Science, Artificial Intelligence
Mehreen Mubashar, Hazrat Ali, Christer Gronlund, Shoaib Azmat
Summary: U-Net is a widely used neural network in the field of medical image segmentation. However, its performance on complex datasets is not satisfactory. To address this issue, several variants such as R2U-Net and UNET++ have been proposed. In this paper, we propose a new U-Net-based medical image segmentation architecture called R2U++, which overcomes the limitations of traditional U-Net by incorporating deeper recurrent residual convolutional blocks and dense skip connections. Experimental results on multiple medical imaging datasets demonstrate that R2U++ achieves significant improvements in IoU and dice scores compared to UNET++ and R2U-Net.
NEURAL COMPUTING & APPLICATIONS
(2022)
Article
Telecommunications
Asif Khan, Alam Zaib, Hazrat Ali, Shahid Khattak
Summary: This paper presents a novel framework for link-level performance abstraction for MIMO receivers using a neural network model. The framework achieves good performance in different channel conditions by training the neural network with data generated from link-level simulations.
TELECOMMUNICATION SYSTEMS
(2022)
Article
Computer Science, Hardware & Architecture
Muhammad Awais, Ali Zahir, Syed Ayaz Ali Shah, Pedro Reviriego, Anees Ullah, Nasim Ullah, Adam Khan, Hazrat Ali
Summary: Domain-specific accelerators for signal processing, image processing, and machine learning are increasingly implemented on SRAM-based field-programmable gate arrays (FPGAs). This article presents an optimized carry-aware approximate radix-4 Booth multiplier design that leverages built-in slice look-up tables (LUTs) and carry-chain resources in a novel configuration. The proposed design offers significant improvements in FPGA resource usage, power delay product, performance metric, and errors compared to the latest state-of-the-art designs, making it an attractive choice for multiplication on FPGA-based accelerators.
ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS
(2023)
Article
Engineering, Electrical & Electronic
Owais Ali, Hazrat Ali, Syed Ayaz Ali Shah, Aamir Shahzad
Summary: This paper presents the implementation of a modified U-Net model for medical image segmentation on low-power devices. The experimental results demonstrate performance comparable to that of the traditional U-Net model while consuming fewer resources.
IEEE TRANSACTIONS ON CIRCUITS AND SYSTEMS II-EXPRESS BRIEFS
(2022)
Article
Computer Science, Artificial Intelligence
Guan Huang, Son N. Tran, Quan Bai, Jane Alty
Summary: There is a pressing need for remote evaluation of hand movements by clinicians and neuroscientists, especially in light of the COVID-19 pandemic, to detect and monitor degenerative brain disorders prevalent in older adults. This study developed a computer vision-based method that accurately detects hand gestures of older adults using real-life video data. Through validated hand movement tests conducted at home or in clinic, data were collected and analyzed to evaluate different network structures. The newly developed RGRNet showed promising results in detecting hand gestures, offering potential applications in medical and research fields.
NEURAL COMPUTING & APPLICATIONS
(2023)
Article
Computer Science, Artificial Intelligence
Sofia Kanwal, Sohail Asghar, Hazrat Ali
Summary: This study presents a speech feature enhancement strategy for improving speech emotion recognition. The method outperforms state-of-the-art methods on datasets in two different languages, demonstrating its effectiveness.
PEERJ COMPUTER SCIENCE
(2022)
Article
Engineering, Biomedical
Hazrat Ali, Emma Nyman, Ulf Naslund, Christer Gronlund
Summary: This work evaluated a model that translates features of atherosclerotic disease onto healthy carotid ultrasound images. The cycleGAN model successfully translated disease features onto healthy images while retaining the overall anatomical contents. This has important implications for education, cardiovascular risk communication, and personalized modeling in precision medicine.
BIOMEDICAL SIGNAL PROCESSING AND CONTROL
(2023)
Review
Computer Science, Artificial Intelligence
Sulaiman Khan, Hazrat Ali, Zubair Shah
Summary: This scoping review examines the use of vision transformers for skin lesion detection and finds that this approach has shown outstanding performance in detecting skin cancer. However, there are some challenges that hinder the trustworthiness of vision transformers in skin cancer diagnosis, such as intrinsic visual ambiguities and irregular lesion shapes. The findings of this review provide new insights and suggest the best segmentation techniques for accurate lesion boundary identification and melanoma diagnosis.
FRONTIERS IN ARTIFICIAL INTELLIGENCE
(2023)
Review
Computer Science, Information Systems
Rizwan Qureshi, Muhammad Irfan, Hazrat Ali, Arshad Khan, Aditya Shekhar Nittala, Shawkat Ali, Abbas Shah, Taimoor Muzaffar Gondal, Ferhat Sadak, Zubair Shah, Muhammad Usman Hadi, Sheheryar Khan, Qasem Al-Tashi, Jia Wu, Amine Bermak, Tanvir Alam
Summary: Data generated from various sources in the medical field has rapidly increased in the past decade, and computational hardware advancements have enabled the utilization of this data through sophisticated AI techniques. This article provides an overview of recent advancements in AI and biosensors in medical and life sciences, including the role of machine learning in medical imaging, precision medicine, and IoT-based biosensors. The article also discusses the progress in wearable biosensing technologies and computing technologies for medical data, as well as challenges and future prospects.
Proceedings Paper
Computer Science, Artificial Intelligence
Kinza Rohail, Saba Bashir, Hazrat Ali, Tanvir Alam, Sheheryar Khan, Jia Wu, Pingjun Chen, Rizwan Qureshi
Summary: Based on historical data, the survival rate of patients with CLL is about 65%. The aggressive and rare variants of CLL, aCLL and RT-DLBL, are associated with lower survival rates, which worsen with age. A framework combining Graph Theory, Gaussian Mixture Modeling, and Fuzzy C-means Clustering was developed for studying neoplastic lymphomas, providing quantitative analysis of pathological facts through the integration of image-level and nuclei-level analysis. The proposed algorithm outperforms existing algorithms with a mean diagnosis accuracy of 0.70833.
COMPUTER VISION - ACCV 2022 WORKSHOPS
(2023)
Review
Medical Informatics
Hazrat Ali, Rizwan Qureshi, Zubair Shah
Summary: This study reviews the application of different vision transformers (ViTs) in brain cancer diagnosis and tumor segmentation, focusing on how different architectures enhance brain tumor segmentation and how ViT-based models improve the performance of convolutional neural networks.
JMIR MEDICAL INFORMATICS
(2023)
Article
Telecommunications
Adnan O. M. Abuassba, Zhang Dezheng, Hazrat Ali, Fan Zhang, Khan Ali
Summary: This paper proposes a general framework for creating ensembles in the context of classification. The framework consists of four stages and takes into account diversity and efficiency. Experimental results validate the effectiveness of the proposed approach.
DIGITAL COMMUNICATIONS AND NETWORKS
(2022)