Article
Computer Science, Artificial Intelligence
Soumen Kumar Pati, Manan Kumar Gupta, Rinita Shai, Ayan Banerjee, Arijit Ghosh
Summary: Microarray data analysis is important in cancer research. However, the complexity of the data extraction process leads to missing values, which disrupt the analysis. This study proposes a novel method, Sim-GAN, which uses a similarity index and a generative adversarial network to estimate missing values. Experimental results show that the proposed method performs well in predicting meaningful expression values and outperforms existing techniques.
KNOWLEDGE AND INFORMATION SYSTEMS
(2022)
Article
Computer Science, Information Systems
Aiguo Wang, Jing Yang, Ning An
Summary: This study formalizes the problem of missing values in microarray data under a regularized sparse framework and proposes local learning-based imputation models with elastic net regularization to accurately estimate missing entries in gene expression profiles. Experimental results demonstrate the superiority of elastic net over other methods in terms of statistical analysis metrics.
Article
Environmental Sciences
Xiwen Sun, Tieding Lu, Shunqiang Hu, Jiahui Huang, Xiaoxing He, Jean-Philippe Montillet, Xiaping Ma, Zhengkai Huang
Summary: Accurate identification of noise models for GNSS time series is crucial for reliable GNSS velocity field and uncertainty estimation in geodynamics and geodesy. By considering time span and missing data effects, we analyzed the impact of duration and data gaps on noise model selection and velocity estimation using four combined noise models. Results show that longer time series improve the convergence of selected noise models and homogenize the velocity uncertainty estimation. However, inaccurate estimation of noise models can occur for shorter time series, affecting the accuracy of GNSS velocity estimation. Finally, a minimum time series duration of 12 years is recommended for reliable selection of noise models and velocity estimation.
Article
Computer Science, Artificial Intelligence
Jangho Park, Juliane Muller, Bhavna Arora, Boris Faybishenko, Gilberto Pastorello, Charuleka Varadharajan, Reetik Sahu, Deborah Agarwal
Summary: We propose a deep learning model, specifically a multilayer perceptron (MLP), for estimating missing values in multivariate time series data. Our algorithm focuses on filling long continuous gaps in the data and uses an automated method to determine the optimal MLP architecture. We compared our approach to existing R-based methods and found that using an MLP leads to better results, particularly for nonlinear data.
NEURAL COMPUTING & APPLICATIONS
(2023)
Article
Physics, Fluids & Plasmas
Sangwon Lee, Vipul Periwal, Junghyo Jo
Summary: Inferring dynamics from incomplete time series data is challenging, but an expectation maximization algorithm proposed in this study demonstrates effectiveness in restoring missing data points and inferring underlying network models. Balancing consistency between observed and missing data points is crucial for accurate model inference during iterative processes.
Article
Computer Science, Artificial Intelligence
Eunseo Oh, Hyunsoo Lee
Summary: As data-based predictive maintenance frameworks grow in importance, missing values in industrial data have become an emerging issue. This study proposes a missing value estimation method based on Gaussian process regression and corrects the estimates using a quantum mechanics-based stochastic differential equation and Ito's lemma. This method enables more accurate data analysis.
EXPERT SYSTEMS WITH APPLICATIONS
(2024)
Article
Computer Science, Interdisciplinary Applications
Francesco Camastra, Vincenzo Capone, Angelo Ciaramella, Angelo Riccio, Antonino Staiano
Summary: This paper introduces an Iterated Imputation and Prediction algorithm for predicting time series with missing data. The algorithm uses Correlation Dimension Estimation and Support Vector Machine Regression. Experimental validation shows that the algorithm has a small average percentage prediction error on three environmental time series.
ENVIRONMENTAL MODELLING & SOFTWARE
(2022)
Article
Computer Science, Artificial Intelligence
Sunghoon Lim, Sun Jun Kim, YoungJae Park, Nahyun Kwon
Summary: The study introduces a time series model based on LSTM, a deep learning technique, to predict future values of liquid cargo traffic. The model can handle missing values in liquid cargo traffic records and enhance prediction performance with additional indices.
EXPERT SYSTEMS WITH APPLICATIONS
(2021)
Article
Computer Science, Information Systems
Parikshit Bansal, Prathamesh Deshpande, Sunita Sarawagi
Summary: DeepMVI is a deep learning method for missing value imputation in multidimensional time-series datasets. Experimental results show that DeepMVI outperforms conventional methods in accuracy, reducing error by more than 50% in over half of the cases.
PROCEEDINGS OF THE VLDB ENDOWMENT
(2021)
Article
Energy & Fuels
Antonio Liguori, Romana Markovic, Martina Ferrando, Jerome Frisch, Francesco Causone, Christoph van Treeck
Summary: This study investigates the use of data augmentation techniques for reconstructing missing energy time series in limited data scenarios. A convolutional denoising autoencoder is chosen as the base imputation model, and an optimal augmentation rate is determined based on preliminary results. The results show that augmenting a nine-day training set 80 times can significantly reduce the initial average RMSE and outperform benchmark methods.
Article
Automation & Control Systems
Ali Eshragh, Fred Roosta, Asef Nazari, Michael W. Mahoney
Summary: This paper applies methods from RandNLA to develop improved algorithms for analyzing large-scale time series data. A fast algorithm is developed to estimate the leverage scores of an AR model in big data regimes, showing high accuracy. Using these theoretical results, an efficient algorithm called LSAR is proposed to fit an appropriate AR model to big time series data, with high probability of finding maximum likelihood estimates and significantly improving running time compared to state-of-the-art alternatives.
JOURNAL OF MACHINE LEARNING RESEARCH
(2022)
Article
Statistics & Probability
Zhan Liu, Chun Yip Yau
Summary: This paper proposes a method for handling nonignorable missing data in longitudinal surveys, focusing on time series models. Results from simulation studies and an empirical example are presented to demonstrate the usefulness of the proposed methodology.
JOURNAL OF STATISTICAL PLANNING AND INFERENCE
(2021)
Article
Engineering, Electrical & Electronic
Rui Wu, Scott D. Hamshaw, Lei Yang, Dustin W. Kincaid, Randall Etheridge, Amir Ghasemkhani
Summary: This paper proposes a framework to improve the accuracy of the popular multivariate imputation by chained equations (MICE) method for dealing with missing data. The framework involves reshaping the original sensor data and leveraging the correlation between missing and observed data. Experimental results using water quality monitoring data demonstrate a significant improvement in MICE model accuracy with these strategies.
IEEE SENSORS JOURNAL
(2022)
Article
Neurosciences
Evan M. Dastin-van Rijn, Nicole R. Provenza, Gregory S. Vogt, Michelle Avendano-Ortega, Sameer A. Sheth, Wayne K. Goodman, Matthew T. Harrison, David A. Borton
Summary: Recent advances in wireless data transmission technology have the potential to revolutionize clinical neuroscience. We developed a method called PELP to accurately reconstruct time-domain neural signals impacted by data loss during wireless transmission. Applying PELP enables a better understanding of the brain-behavior relationships.
FRONTIERS IN HUMAN NEUROSCIENCE
(2022)
Article
Automation & Control Systems
Huan Xu, Feng Ding, Erfu Yang
Summary: This paper explores data filtering-based identification algorithms for an exponential autoregressive time-series model with moving average noise. By using the hierarchical identification principle, the model is transformed into three sub-identification models and a new extended stochastic gradient algorithm is derived. Through simulation results, it is shown that the proposed algorithm can effectively improve parameter estimation accuracy.
INTERNATIONAL JOURNAL OF ROBUST AND NONLINEAR CONTROL
(2021)
Article
Automation & Control Systems
Xiao-Fei Zhang, Le Ou-Yang, Ting Yan, Xiaohua Tony Hu, Hong Yan
Summary: The study introduces a joint graphical model to estimate multiple gene networks simultaneously, leveraging network decomposition and group lasso penalty to examine similarities and differences among different subpopulations and data types, leading to improved accuracy in gene network reconstruction.
IEEE TRANSACTIONS ON CYBERNETICS
(2021)
Article
Automation & Control Systems
Yu-Feng Yu, Guoxia Xu, Min Jiang, Hu Zhu, Dao-Qing Dai, Hong Yan
Summary: In this paper, a robust graph matching (RGM) model is proposed to improve the effectiveness and robustness in matching graphs with deformations, rotations, outliers, and noise. The RGM model embeds joint geometric transformation and utilizes the $L_{2,1}$-norm as the similarity metric for enhanced robustness. Extensive experiments demonstrate the competitive performance of the RGM model in various graph matching tasks.
IEEE TRANSACTIONS ON CYBERNETICS
(2021)
Article
Computer Science, Information Systems
Rizwan Qureshi, Mengxu Zhu, Hong Yan
Summary: This study investigates the mechanism of drug resistance caused by EGFR mutations in NSCLC, using molecular dynamics simulations and a PCA-based method to analyze drug resistance. The establishment of a systematic method for visualizing protein-drug interactions provides an effective framework for the atomic-level analysis of drug resistance in lung cancer.
IEEE JOURNAL OF BIOMEDICAL AND HEALTH INFORMATICS
(2021)
Article
Automation & Control Systems
Chuan-Xian Ren, Pengfei Ge, Dao-Qing Dai, Hong Yan
Summary: A new kernel learning method called KLN is proposed in this paper to enhance the discrimination performance of Conditional Maximum Mean Discrepancy (CMMD) by iteratively operating on deep network features. By considering a compound kernel, the effectiveness of CMMD for data category description is improved, leading to state-of-the-art classification performance on benchmark datasets.
IEEE TRANSACTIONS ON CYBERNETICS
(2021)
Article
Biochemical Research Methods
Mengxu Zhu, Avirup Ghosh, Hong Yan
Summary: This study analyzed the binding affinity between Dexamethasone and a SARS-CoV-2 protein through geometrical methods and studied the theoretical effectiveness of Dexamethasone as a potential treatment for COVID-19. The results showed that the behavior of Dexamethasone was similar to other inhibitors, with its efficacy possibly due to its glucocorticoid properties and potent inhibition.
CURRENT BIOINFORMATICS
(2021)
Article
Engineering, Electrical & Electronic
Jianjun Liu, Dunbin Shen, Zebin Wu, Liang Xiao, Jun Sun, Hong Yan
Summary: This paper proposes a patch-aware deep fusion approach for hyperspectral image fusion, aiming to improve the fusion result by utilizing patch information under subspace representation. The proposed approach builds a fusion model and solves it using an optimization algorithm, resulting in a structured deep fusion network. The performance is further improved by an aggregation fusion technique.
IEEE JOURNAL OF SELECTED TOPICS IN APPLIED EARTH OBSERVATIONS AND REMOTE SENSING
(2022)
Article
Biochemical Research Methods
Yu-Ting Tan, Le Ou-Yang, Xingpeng Jiang, Hong Yan, Xiao-Fei Zhang
Summary: Learning how gene regulatory networks change under different conditions is important. Existing methods for inferring differential networks have limitations. In this study, a new method is proposed and shown to outperform other methods in simulation studies and applications.
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
(2022)
Article
Biochemical Research Methods
Rizwan Qureshi, Avirup Ghosh, Hong Yan
Summary: This study examines the complete structure of the multi-domain EGFR protein and its mutants using molecular dynamics simulations and normal mode analysis. The findings reveal different patterns of correlated motions in each domain of EGFR mutants compared to the wildtype, and the mutant structures have fewer hydrogen bonds. These findings are important for understanding the dynamics and communications in EGFR domains.
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
(2022)
Article
Computer Science, Artificial Intelligence
You-Wei Luo, Chuan-Xian Ren, Dao-Qing Dai, Hong Yan
Summary: This paper proposes a Riemannian manifold learning framework for achieving transferability and discriminability simultaneously in unsupervised domain adaptation. A probabilistic discriminant criterion is established on the target domain using soft labels, and manifold metric alignment is used to be compatible with the embedding space. Experimental results demonstrate the superiority of the proposed method.
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
(2022)
Article
Biochemical Research Methods
Meng-Guo Wang, Le Ou-Yang, Hong Yan, Xiao-Fei Zhang
Summary: In this study, a novel method called prior network-dependent gene network inference (pGNI) is proposed to estimate gene co-expression networks by integrating gene expression data and prior protein interaction network data. The method successfully captures the modular structures in the networks and is demonstrated to be effective through simulation studies and real datasets.
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
(2022)
Article
Biochemical Research Methods
Bo Li, Ke Jin, Le Ou-Yang, Hong Yan, Xiao-Fei Zhang
Summary: The single-cell RNA sequencing (scRNA-seq) technique is used to analyze gene expression patterns in complex tissues at single-cell resolution, but dropout events can hinder downstream analyses. We developed a new imputation method, scTSSR2, which combines matrix decomposition with two-side sparse self-representation to effectively impute dropout events in scRNA-seq data. Comparative experiments show that scTSSR2 outperforms existing imputation methods in terms of computational speed and memory usage. We also provide a user-friendly R package, scTSSR2, for denoising scRNA-seq data and improving data quality.
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
(2023)
Article
Computer Science, Artificial Intelligence
Ali Raza Shahid, Mehmood Nawaz, Xinqi Fan, Hong Yan
Summary: This article proposes a view-adaptive mechanism that transforms the skeleton view into a more consistent virtual perspective, reducing the influence of view variations.
IEEE TRANSACTIONS ON COGNITIVE AND DEVELOPMENTAL SYSTEMS
(2023)
Article
Biochemical Research Methods
Subin Qian, Huiyi Liu, Xiaofeng Yuan, Wei Wei, Shuangshuang Chen, Hong Yan
Summary: This paper proposes a biclustering method called RCSBC, which aims to find checkerboard patterns within gene expression data. By exploiting the relationship between the row/column structure of a gene expression matrix and the structure of biclusters, the method outperforms existing algorithms in both clustering accuracy and time/space complexity.
IEEE-ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS
(2022)
Article
Computer Science, Information Systems
Mehmood Nawaz, Hong Yan
Summary: By utilizing high-level features and affinity-based techniques, a precise salient region can be extracted effectively from noisy and cluttered backgrounds, controlling foreground and background information to enhance detection quality.
IEEE TRANSACTIONS ON MULTIMEDIA
(2021)
Article
Computer Science, Information Systems
Mengxu Zhu, Rizwan Qureshi, Hong Yan
Summary: EGFR plays a vital role in lung cell proliferation, and mutations in its kinase domain may lead to cancer. This study investigated the binding mechanism of EGFR mutant-drug complexes through molecular dynamics simulation and geometrical properties analysis. The results showed that drug-sensitive mutants have tighter interactions, while drug-resistant mutants have looser binding.