Article

Overview and comparative study of dimensionality reduction techniques for high dimensional data

Journal

INFORMATION FUSION
Volume 59, Pages 44-58

Publisher

ELSEVIER
DOI: 10.1016/j.inffus.2020.01.005

Keywords

Dimensionality reduction; Features; High dimensional data; Linear techniques; Nonlinear techniques

Recent developments in modern data collection tools, techniques, and storage capabilities are producing huge volumes of data. The dimensionality of a dataset is the number of features measured for each observation, and analyzing high dimensional data has become a challenging task. Various dimensionality reduction techniques are available in the literature to eliminate irrelevant and redundant features. Selecting an appropriate dimensionality reduction technique can increase processing speed and reduce the time and effort required to extract valuable information. This paper presents state-of-the-art dimensionality reduction techniques and their suitability for different types of data and application areas. It also highlights issues with dimensionality reduction techniques that can affect the accuracy and relevance of results.

Recommended

Article Computer Science, Information Systems

Employing Deep Learning and Time Series Analysis to Tackle the Accuracy and Robustness of the Forecasting Problem

Haseeb Tariq, Muhammad Kashif Hanif, Muhammad Umer Sarwar, Sabeen Bari, Muhammad Shahzad Sarfraz, Rozita Jamili Oskouei

Summary: This study utilizes time series to predict crime rates in order to find practical crime prevention solutions. Machine learning plays a crucial role in understanding and analyzing future trends in violations. Different time-series forecasting models are used to predict crimes.

SECURITY AND COMMUNICATION NETWORKS (2021)

Article Mathematics, Interdisciplinary Applications

Employing Machine Learning-Based Predictive Analytical Approaches to Classify Autism Spectrum Disorder Types

Muhammad Kashif Hanif, Naba Ashraf, Muhammad Umer Sarwar, Deleli Mesay Adinew, Reehan Yaqoob

Summary: Autism spectrum disorder is a neurological disorder that typically begins in early childhood and has complex causes. Early detection of autism spectrum disorder is beneficial for children's mental health. This study applied various machine and deep learning algorithms to classify the severity of autism spectrum disorder and utilized optimization techniques to improve performance. The deep neural network outperformed other approaches.

COMPLEXITY (2022)

Article Mathematics, Interdisciplinary Applications

Monitoring Population Phenology of Asian Citrus Psyllid Using Deep Learning

Maria Bibi, Muhammad Kashif Hanif, Muhammad Umer Sarwar, Muhammad Irfan Khan, Shouket Zaman Khan, Casper Shikali Shivachi, Asad Anees

Summary: Multiple prediction models were developed to monitor the population dynamics of Asian citrus psyllid in citrus-growing regions of Pakistan, using regression algorithms of machine learning. A deep neural network-based prediction model resulted in the least root mean squared error values when predicting egg, nymph, and adult populations of the pest.

COMPLEXITY (2021)

Article Computer Science, Information Systems

Accelerating all-pairs shortest path algorithms for bipartite graphs on graphics processing units

Muhammad Kashif Hanif, Karl-Heinz Zimmermann, Asad Anees

Summary: Bipartite graphs are commonly used in biological and physical sciences, and finding shortest paths in these graphs can be efficiently solved using dynamic programming algorithms. This study introduces parallel versions of Floyd-Warshall and Torgasin-Zimmermann algorithms, implemented on graphics processing unit using tropical matrix product. The performance of these algorithms is compared under different scenarios and parameters.

MULTIMEDIA TOOLS AND APPLICATIONS (2022)

Article Computer Science, Information Systems

Performance Enhancement of Predictive Analytics for Health Informatics Using Dimensionality Reduction Techniques and Fusion Frameworks

Shaeela Ayesha, Muhammad Kashif Hanif, Ramzan Talib

Summary: Predictive analytics is crucial in health informatics, with multi-source, multi-modal data improving disease prediction, diagnosis, and medication processes. Dimensionality reduction techniques and multi-modal data fusion enhance analysis performance, although handling diverse data poses challenges.

IEEE ACCESS (2022)

Article Mathematics, Interdisciplinary Applications

Caricature Face Photo Facial Attribute Similarity Generator

Muhammad Irfan Khan, Muhammad Kashif Hanif, Ramzan Talib

Summary: This study proposes a cross-domain qualitative feature-based approach to match a caricature with a mugshot. It uses Haar-like features and a point distribution measure to locate exaggerated facial features and calculates the difference vector based on the ratios between different facial features. The implementation based on a convolutional neural network achieves better performance.

COMPLEXITY (2022)

Article Computer Science, Hardware & Architecture

Contextual Text Mining Framework for Unstructured Textual Judicial Corpora through Ontologies

Zubair Nabi, Ramzan Talib, Muhammad Kashif Hanif, Muhammad Awais

Summary: Digitalization has changed the way information is processed, and new techniques for legal data processing are evolving. This research paper presents a three-tier contextual text mining framework through ontologies for judicial corpora, and the experimental results and evaluations show significant improvements.

COMPUTER SYSTEMS SCIENCE AND ENGINEERING (2022)

Article Computer Science, Information Systems

Deep Learning and Machine Learning-Based Model for Conversational Sentiment Classification

Sami Ullah, Muhammad Ramzan Talib, Toqir A. Rana, Muhammad Kashif Hanif, Muhammad Awais

Summary: This paper proposes a model that utilizes deep learning and machine learning approaches for the classification of users' emotions from Urdu conversational text. The experimental evaluation shows encouraging results with 67% accuracy for Urdu dialogue datasets.

CMC-COMPUTERS MATERIALS & CONTINUA (2022)

Article Computer Science, Information Systems

A Deep Learning Approach for Prediction of Protein Secondary Structure

Muhammad Zubair, Muhammad Kashif Hanif, Eatedal Alabdulkreem, Yazeed Ghadi, Muhammad Irfan Khan, Muhammad Umer Sarwar, Ayesha Hanif

Summary: The secondary structure of a protein is crucial for understanding its tertiary structure. In this study, deep learning models were proposed to predict the protein secondary structure by processing amino acid sequences, achieving high accuracy rates.

CMC-COMPUTERS MATERIALS & CONTINUA (2022)

Proceedings Paper Computer Science, Information Systems

HarX: Real-time harassment detection tool using machine learning

Kainat Rizwan, Sehar Babar, Sania Nayab, Muhammad Kashif Hanif

Summary: Cybersecurity is crucial in the digital market as users often face harassment in online chats, and it is important for organizations to tackle this issue effectively.

2021 INTERNATIONAL CONFERENCE OF MODERN TRENDS IN INFORMATION AND COMMUNICATION TECHNOLOGY INDUSTRY (MTICTI 2021) (2021)

Article Computer Science, Information Systems

DarkDetect: Darknet Traffic Detection and Categorization Using Modified Convolution-Long Short-Term Memory

Muhammad Bilal Sarwar, Muhammad Kashif Hanif, Ramzan Talib, Muhammad Younas, Muhammad Umer Sarwar

Summary: This paper proposes a method for detecting and categorizing darknet traffic using deep learning, achieving significant results in darknet traffic identification through steps such as data preprocessing, feature selection, and machine learning algorithms.

IEEE ACCESS (2021)

Article Computer Science, Information Systems

A Paradigm-Shifting From Domain-Driven Data Mining Frameworks to Process-Based Domain-Driven Data Mining-Actionable Knowledge Discovery Framework

Fakeeha Fatima, Ramzan Talib, Muhammad Kashif Hanif, Muhammad Awais

IEEE ACCESS (2020)

Article Computer Science, Information Systems

Accelerating Forward Algorithm for Stochastic Automata on Graphics Processing Units

Muhammad Umer Sarwar, Muhammad Kashif Hanif, Ramzan Talib, Muhammad Haris Aziz

IEEE ACCESS (2020)

Article Computer Science, Information Systems

Comparative Study of Machine Learning Techniques for Population Genetics

Muhammad Arslan Amin, Muhammad Kashif Hanif, Muhammad Umer Sarwar, Mohsin Abbas, Muhammad Haroon Jilani, Usman Nasir, Muhammad Bilal Sarwar, Hafiz Muhammad Talha

INTERNATIONAL JOURNAL OF COMPUTER SCIENCE AND NETWORK SECURITY (2019)

Article Computer Science, Artificial Intelligence

Ultrametrics for context-aware comparison of binary images

C. Lopez-Molina, S. Iglesias-Rey, B. De Baets

Summary: Quantitative image comparison is a critical topic in image processing literature, with diverse applications. Existing measures of comparison often overlook the context in which the comparison takes place. This paper presents a context-aware comparison method for binary images, tested on the BSDS500 benchmark.

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

Preemptively pruning Clever-Hans strategies in deep neural networks

Lorenz Linhardt, Klaus-Robert Mueller, Gregoire Montavon

Summary: This paper investigates the issue of mismatches between the decision strategy of the explainable model and the user's domain knowledge, and proposes a new method EGEM to mitigate hidden flaws in the model. Experimental results demonstrate that the approach can significantly reduce reliance on Clever Hans strategies and improve the accuracy of the model on new data.

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

Dual-level Deep Evidential Fusion: Integrating multimodal information for enhanced reliable decision-making in deep learning

Zhimin Shao, Weibei Dou, Yu Pan

Summary: This paper proposes a novel algorithm, Dual-level Deep Evidential Fusion (DDEF), to integrate multimodal information at both the BBA level and multimodal level, aiming to enhance accuracy, robustness, and reliability. The DDEF approach utilizes the Dirichlet framework and BBA methods for effective uncertainty estimation and employs the Dempster-Shafer Theory for dual-level fusion. The experimental results show that the proposed DDEF outperforms existing methods in synthetic digit classification and real-world medical prognosis after BCI treatment.

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

Multi-modal detection of fetal movements using a wearable monitor

Abhishek K. Ghosh, Danilo S. Catelli, Samuel Wilson, Niamh C. Nowlan, Ravi Vaidyanathan

Summary: The inability of current FM monitoring methods to be used outside clinical environments has made it challenging to understand the nature and evolution of FM. This investigation introduces a novel wearable FM monitor with a heterogeneous sensor suite and a data fusion architecture to efficiently capture and separate FM from interfering artifacts. The performance of the device and architecture was validated through at-home use, demonstrating high accuracy in detecting and recognizing FM events. This research is a major milestone in the development of low-cost wearable FM monitors for pervasive monitoring of FM in unsupervised environments.

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

ADCT-Net: Adaptive traffic forecasting neural network via dual-graphic cross-fused transformer

Jianlei Kong, Xiaomeng Fan, Min Zuo, Muhammet Deveci, Xuebo Jin, Kaiyang Zhong

Summary: In this study, we propose an intelligent traffic flow prediction framework based on the adaptive dual-graphic transformer with a cross-fusion strategy, aiming to uncover latent graphic feature representations that transcend temporal and spatial limitations. By establishing a traffic spatiotemporal prediction model using a cross-fusion attention mechanism, our proposed model achieves superior prediction performance on practical urban traffic flow datasets, particularly for long-term predictions.

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

Component similarity based conflict analysis: An information fusion viewpoint

Huilai Zhi, Jinhai Li

Summary: This article addresses the issue that conflict analysis based on single-valued information systems is no longer valid. It proposes a conflict analysis method based on component similarity, which uses three-way n-valued concept lattices to handle set-valued formal contexts and realizes fast conflict analysis from an information fusion viewpoint. Experimental results verify the effectiveness of this method in reducing time consumption.

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

Mining and fusing unstructured online reviews and structured public index data for hospital selection

Huchang Liao, Jiaxin Qi, Jiawei Zhang, Chonghui Zhang, Fan Liu, Weiping Ding

Summary: In this paper, a hospital selection approach based on a fuzzy multi-criterion decision-making method is proposed. This approach considers sentiment evaluation values of unstructured data from online reviews and structured data of public indexes simultaneously. The methodology involves collecting and processing online reviews, classifying topics and sentiments, quantifying sentiment analysis results using fuzzy numbers, and obtaining final preference scores of hospitals based on patients' preferences. A case study and robustness analysis are conducted to validate the effectiveness of the method.

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

Fake news detection: Taxonomy and comparative study

Faramarz Farhangian, Rafael M. O. Cruz, George D. C. Cavalcanti

Summary: The proliferation of social networks has posed challenges in combating fake news, but automatic fake news detection using artificial intelligence has become more feasible. This paper revisits the definitions and perspectives of fake news and proposes an updated taxonomy, based on multiple criteria, for the field. The study finds that optimal feature extraction techniques vary depending on the dataset, and context-dependent models based on transformer models consistently exhibit superior performance.

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

A dynamic multiple classifier system using graph neural network for high dimensional overlapped data

Mariana A. Souza, Robert Sabourin, George D. C. Cavalcanti, Rafael M. O. Cruz

Summary: In this study, a dynamic selection technique is proposed to handle sparse and overlapped data. The technique leverages the relationships between instances and classifiers to learn a dynamic classifier combination rule. Experimental results show that the proposed method outperforms static selection and other dynamic selection techniques.

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

A clustering method based on multi-positive-negative granularity and attenuation-diffusion pattern

Bin Yu, Ruihui Xu, Mingjie Cai, Weiping Ding

Summary: This paper introduces a clustering method based on non-Euclidean metric and multi-granularity staged clustering to address the challenges posed by complex spatial structure data to traditional clustering methods. The method improves the similarity measure and employs an attenuation-diffusion pattern for local to global clustering, achieving good clustering results.

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

Fast metric multi-view hashing for multimedia retrieval

Jian Zhu, Pengbo Hu, Bingqian Li, Yi Zhou

Summary: The acquisition of multi-view hash representation for heterogeneous data is highly important for multimedia retrieval. Current approaches suffer from limited retrieval precision due to insufficient integration of multi-view features and failure to effectively utilize metric information from diverse samples. In this paper, we propose an innovative method called Fast Metric Multi-View Hashing (FMMVH), which demonstrates the superiority of gate-based fusion over traditional methods. We also introduce a novel deep metric loss function to leverage metric information from dissimilar samples. By optimizing and employing model compression techniques, our FMMVH method significantly outperforms existing state-of-the-art methods on benchmark datasets, with up to 7.47% improvement in mean Average Precision (mAP).

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

SwinWave-SR: Multi-scale lightweight underwater image super-resolution

Fayaz Ali Dharejo, Iyyakutti Iyappan Ganapathi, Muhammad Zawish, Basit Alawode, Moath Alathbah, Naoufel Werghi, Sajid Javed

Summary: The resource-limited nature of underwater vision equipment affects underwater robotics and ocean engineering tasks. Super-resolution methods, particularly using Vision Transformers (ViTs), have emerged to enhance low-resolution underwater images. However, ViTs face challenges in handling severe degradation in underwater imaging. In contrast, Multi-scale ViTs (MViTs) overcome these challenges by preserving long-range dependencies through evolving channel capacity. This study proposes a novel algorithm, SwinWave-SR, for efficient and accurate multi-scale super-resolution for underwater images.

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

Federated split learning for sequential data in satellite-terrestrial integrated networks

Weiwei Jiang, Haoyu Han, Yang Zhang, Jianbin Mu

Summary: This study incorporates federated learning and split learning paradigms with satellite-terrestrial integrated networks and introduces a split-then-federated learning framework and federated split learning with long short-term memory to handle sequential data in STINs. The proposed solution is demonstrated to be effective through a case study of electricity theft detection based on a real-world dataset.

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

Software defined radio frequency sensing framework for Internet of Medical Things

Najah Abuali, Mohammad Bilal Khan, Farman Ullah, Mohammad Hayajneh, Hikmat Ullah, Shahid Mumtaz

Summary: The demand for innovative solutions in biomedical systems for precise diagnosis and management of critical diseases is increasing. A promising technology, non-invasive and intelligent Internet of Medical Things (IoMT) system, emerges to assess patients with reduced health risks. This research introduces a comprehensive framework for early diagnosis of respiratory abnormalities through RF sensing and SDR technology. The results highlight the superior performance of deep learning frameworks in classifying respiratory anomalies.

INFORMATION FUSION (2024)

Article Computer Science, Artificial Intelligence

Global-local fusion based on adversarial sample generation for image-text matching

Shichen Huang, Weina Fu, Zhaoyue Zhang, Shuai Liu

Summary: In the era of adversarial machine learning (AML), developing robust and generalized algorithms has become a key research topic. This study proposes a global similarity matching module and a global-local cognition fusion training mechanism based on relationship adversarial sample generation to improve image-text matching. Experimental results show significant improvements in accuracy and robustness, with the approach holding up well against security challenges and promoting the fusion of visual and linguistic modalities.

INFORMATION FUSION (2024)