Article
Automation & Control Systems
Hanyi Zheng, Qing Wang, Jingshan Li
Summary: Improving workflow efficacy in operating rooms is crucial for hospital management. Current literature lacks efficient methods for detailed analysis of surgical workflow. This study introduces a system-theoretic approach, presenting a Markov chain-based analytical model, which is validated through numerical experiments.
IEEE TRANSACTIONS ON AUTOMATION SCIENCE AND ENGINEERING
(2023)
Article
Genetics & Heredity
Maciej Zakarczemny, Malgorzata Zajecka
Summary: The paper discusses the mathematical modification and redesign of DNA using Markov chains. It presents a simple mathematical technique for filling in missing parts of DNA while preserving frequency consistency in the amino acid sequences of proteins by finding synonymous codons. The paper uses a Markov chain to analyze dependencies in the DNA sequence of the human gene Alpha 1,3-Galactosyltransferase 2. It also provides a theoretical introduction for non-mathematicians, especially biologists unfamiliar with the theory of Markov chains.
Article
Physics, Multidisciplinary
D. Bernal-Casas, J. M. Oller
Summary: This work introduces a mathematical framework based on information geometry to understand the relationship between physical matter and information theory. It explores how information can be represented and distributed over quantum harmonic oscillators, and demonstrates the quantization and lower bound of the estimator's variance. The study also connects quantum harmonic oscillators with Bayes' theorem, showing the relationship between the global probability density function and the sources of information.
Article
Computer Science, Artificial Intelligence
Rianne M. Schouten, Marcos L. P. Bueno, Wouter Duivesteijn, Mykola Pechenizkiy
Summary: This study utilizes the framework of Exceptional Model Mining to discover sequences with varying order transition behavior, proposing three new quality measures based on information-theoretic scoring functions. Controlled experiments demonstrate the sensitivity and robustness of these quality measures, with the measure based on Akaike's Information Criterion showing the most stability regarding the number of observations. Overall, the research adds to existing work by focusing on subgroups of sequences rather than transitions, with practical relevance in identifying originators of exceptional sequences such as patients.
DATA MINING AND KNOWLEDGE DISCOVERY
(2022)
Article
Multidisciplinary Sciences
Farideh Jalali-najafabadi, Michael Stadler, Nick Dand, Deepak Jadon, Mehreen Soomro, Pauline Ho, Helen Marzo-Ortega, Philip Helliwell, Eleanor Korendowych, Michael A. Simpson, Jonathan Packham, Catherine H. Smith, Jonathan N. Barker, Neil McHugh, Richard B. Warren, Anne Barton, John Bowes
Summary: The study focuses on developing feature selection and risk prediction models for psoriatic arthritis patients. Through the use of information theoretic criteria methods, the research demonstrates strategies for addressing potential confounding features in high-dimensional datasets.
SCIENTIFIC REPORTS
(2021)
Article
Mathematics
Manuel L. Esquivel, Nadezhda P. Krasii, Gracinda R. Guerreiro
Summary: This study addresses the problem of finding a natural continuous time Markov type process in open populations using information provided by discrete time open Markov chains. Two main approaches are proposed: calibrating a continuous time Markov process using a discrete time transition matrix and directly extending discrete time theory to continuous time theory using semi-Markov processes and open Markov schemes.
Article
Microbiology
Kai Song
Summary: The researcher developed a Markov model-based method, VirMC, to identify viral sequences from metagenomic data. It outperformed two other state-of-the-art methods, particularly for short contigs and contaminated metagenomic samples. This alignment-free method showed better performance in assembling viral-genome sequences and could help classify healthy and diseased statuses based on human gut metagenomes.
FRONTIERS IN MICROBIOLOGY
(2021)
Review
Physics, Multidisciplinary
Jan Mielniczuk
Summary: This paper reviews information-theoretic tools and their application to feature selection, focusing on classification problems with discrete features. It discusses various ways of constructing counterparts to conditional mutual information, together with their properties and limitations, and proposes a unified method based on truncation of the Mobius expansion of conditional mutual information. The paper also discusses the main approaches to feature selection using the introduced measures of conditional dependence, along with methods for assessing the quality of the obtained predictors, including recent results on asymptotic distributions of empirical criteria and advances in resampling.
Article
Computer Science, Artificial Intelligence
Tom Lefebvre
Summary: This study addresses the problem of imitation from observations, specifically the case of feature-only demonstrations. The approach combines elements from probability and information theory to develop a behavioral cloning method that extracts an executable policy directly from the given features.
PATTERN RECOGNITION LETTERS
(2022)
Article
Multidisciplinary Sciences
Cheuk Chi A. Ng, Wai Man Tam, Haidi Yin, Qian Wu, Pui-Kin So, Melody Yee-Man Wong, Francis C. M. Lau, Zhong-Ping Yao
Summary: The study demonstrates the use of peptide sequences for durable and high-density data storage. By optimizing analytical protocols and developing software, the researchers successfully stored and retrieved text and music files using 18-mer peptides. This innovative method could potentially revolutionize data storage and stimulate advancements in related fields.
NATURE COMMUNICATIONS
(2021)
Review
Engineering, Industrial
Juan Jesus Rico-Pena, Raquel Arguedas-Sanz, Carmen Lopez-Martin
Summary: Blockchain has the potential to transform business management by improving operational efficiency. However, there are performance and vulnerability issues with different types of blockchain applications in different domains. This paper provides a systematic literature review and analysis of models used to evaluate these issues, as well as a bibliometric analysis of the research. The main contribution is the overview of blockchain modeling and its direct applications in vulnerability and performance analysis of existing applications, and in the implementation of new applications.
Article
Physics, Multidisciplinary
Weng Hoe Lam, Weng Siew Lam, Saiful Hafizah Jaaman, Pei Fun Lee
Summary: This paper presents a bibliometric analysis of information theoretic publications listed on the Scopus database, revealing trends in publication growth, subject areas, geographical contributions, country co-authorship, and citation metrics. The study finds that the focus of information theoretic approaches is shifting towards technology-driven applications.
Article
Engineering, Industrial
M. L. Gamiz, F. Navas-Gomez, R. Raya-Miranda, M. C. Segovia-Garcia
Summary: The main objective of this paper is to build stochastic models to describe the time evolution of a system and estimate its characteristics in the absence of direct observations of the system state. It focuses on the application of sensor networks in industrial equipment observation and control. The model uses hidden Markov processes where observations depend on both the current hidden state and previous observations. Reliability measures are defined to control false positive (negative) signals. The paper also considers system maintenance and introduces the concept of signal-runs. A simulation study and a real application related to a water-pump system are discussed.
RELIABILITY ENGINEERING & SYSTEM SAFETY
(2023)
Article
Mathematics
Yousif Alyousifi, Kamarulzaman Ibrahim, Mahmod Othamn, Wan Zawiah Wan Zin, Nicolas Vergne, Abdullah Al-Yaari
Summary: This study focused on air pollution index (API) data from seven stations in the central region of Peninsular Malaysia. Using the Bayesian information criterion (BIC), the optimum order of Markov chain (MC) models was determined for hourly and daily API sequences. The analysis revealed that second- and third-order MC models best fit hourly API occurrences, while a first-order MC model best described the dynamics of daily API. Understanding the delay effect of air pollution is crucial for managing air quality.
Article
Medical Informatics
Amin Jalali, Paul Johannesson, Erik Perjons, Ylva Askfors, Abdolazim Rezaei Kalladj, Tero Shemeikka, Aniko Veg
Summary: Data-driven process analysis relies on software support, with process variant analysis as an important technique. While current software supports process cohort comparison based on activity frequencies and performance metrics, it lacks the ability to compare cohorts based on transition probabilities in healthcare settings.
BMC MEDICAL INFORMATICS AND DECISION MAKING
(2021)
Article
Computer Science, Hardware & Architecture
Boris Ryabko, Anton Rakitskiy
JOURNAL OF CIRCUITS SYSTEMS AND COMPUTERS
(2020)
Article
Mathematics
Jan Levenets, Anna Novikovskaya, Sofia Panteleeva, Zhanna Reznikova, Boris Ryabko
Article
Physics, Multidisciplinary
Boris Ryabko
Article
Mathematics
Konstantin Chirikhin, Boris Ryabko
Summary: The article proposes forecasting methods based on real-world data compressors that can effectively predict univariate and multivariate data with automatic selection of the best algorithm. Additionally, the use of time-universal codes can reduce computation time without sacrificing accuracy.
Article
Computer Science, Theory & Methods
Boris Ryabko
Summary: This paper introduces a PRNG class that has been tested successfully and consists of generators that can produce normal sequences. The generators in this class also satisfy a specific mathematical property.
INTERNATIONAL JOURNAL OF FOUNDATIONS OF COMPUTER SCIENCE
(2021)
Article
Physics, Multidisciplinary
Boris Ryabko, Nadezhda Savina
Summary: This article proposes a methodology for authorship attribution of literary texts based on the use of data compressors, which allows for statistically verified results. The method is used to solve two problems of attribution in Russian literature.
Article
Statistics & Probability
Boris Ryabko
Summary: This article explores the construction of the most powerful test and effective statistical tests for RNGs used in various fields such as data protection, modeling and simulation systems, and computer games. The effectiveness of RNG statistical tests is estimated through experiments and a model suitable for binary sequences in encryption systems is proposed.
JOURNAL OF STATISTICAL PLANNING AND INFERENCE
(2022)
Article
Physics, Multidisciplinary
Boris Ryabko, Nadezhda Savina
Summary: In recent years, the task of translation has gained attention from researchers due to its practical applications. This paper proposes an information-theoretic method to assess translation quality, focusing on the impact of the author's unconscious style on translation. The method is applied to translations of classic English works into Russian and vice versa, successfully determining the attribution of literary texts.
Article
Computer Science, Theory & Methods
Boris Ryabko
Summary: This article discusses the problem of creating an unconditionally secure cipher when the key length is shorter than the encrypted message. It proposes a cipher method based on data compression, randomization, and entropy-secure encryption, and applies it to two scenarios: knowing the statistics of encrypted messages, and generating messages using a Markov chain with known memory or connectivity. In both cases, the length of the secret key is negligible compared to the message length.
DESIGNS CODES AND CRYPTOGRAPHY
(2023)
Proceedings Paper
Computer Science, Information Systems
Boris Ryabko
Summary: We discuss the problem of constructing an unconditionally secure cipher when the key length is shorter than the encrypted message. By combining data compression, randomization techniques, and entropically-secure encryption, we propose a solution for encryption with known message statistics. The resulting cipher allows for key length independent of entropy or encrypted message length, but determined by the desired security level.
2022 IEEE INFORMATION THEORY WORKSHOP (ITW)
(2022)
Proceedings Paper
Computer Science, Information Systems
Boris Ryabko
PROCEEDINGS OF 2020 INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY AND ITS APPLICATIONS (ISITA2020)
(2020)
Proceedings Paper
Computer Science, Information Systems
Boris Ryabko, Viacheslav Zhuravlev
PROCEEDINGS OF 2020 INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY AND ITS APPLICATIONS (ISITA2020)
(2020)
Meeting Abstract
Radiology, Nuclear Medicine & Medical Imaging
J. Sauerbeck, L. Beyer, S. Schoenecker, C. Palleis, G. Hoeglinger, E. Schuh, R. Boris, G. Rohrer, S. Sonnenfeld, K. Boetzel, A. Danek, A. Rominger, J. Levin, B. Matthias
EUROPEAN JOURNAL OF NUCLEAR MEDICINE AND MOLECULAR IMAGING
(2019)
Proceedings Paper
Computer Science, Information Systems
Boris Ryabko
2019 IEEE INTERNATIONAL SYMPOSIUM ON INFORMATION THEORY (ISIT)
(2019)
Article
Computer Science, Artificial Intelligence
Boris Ryabko
Article
Computer Science, Interdisciplinary Applications
Blair Robertson, Chris Price
Summary: Spatial sampling designs are crucial for accurate estimation of population parameters. This study proposes a new design method that generates samples with good spatial spread and performs favorably compared to existing designs.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Hiroya Yamazoe, Kanta Naito
Summary: This paper focuses on the simultaneous confidence region of a one-dimensional curve embedded in multi-dimensional space. An estimator of the curve is obtained through local linear regression on each variable in multi-dimensional data. A method to construct a simultaneous confidence region based on this estimator is proposed, and theoretical results for the estimator and the region are developed. The effectiveness of the region is demonstrated through simulation studies and applications to artificial and real datasets.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Cheng Peng, Drew P. Kouri, Stan Uryasev
Summary: This paper introduces a novel optimal experimental design method for quantifying the distribution tails of uncertain system responses. The method minimizes the variance or conditional value-at-risk of the upper bound of the predicted quantile, and estimates the data uncertainty using quantile regression. The optimal design problems are solved as linear programming problems, making the proposed methods efficient even for large datasets.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Xiaofei Wu, Hao Ming, Zhimin Zhang, Zhenyu Cui
Summary: This paper proposes a model that combines quantile regression with a fused LASSO penalty, and introduces an iterative algorithm based on ADMM to fit it on high-dimensional datasets. The paper proves the global convergence and comparable convergence rates of the algorithm, and analyzes the theoretical properties of the model. Numerical experimental results support the superior performance of the model.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Xin He, Xiaojun Mao, Zhonglei Wang
Summary: This paper proposes a nonparametric imputation method with sparsity to estimate the finite population mean, using an efficient kernel method and sparse learning for estimation. An augmented inverse probability weighting framework is adopted to achieve a central limit theorem for the proposed estimator under regularity conditions.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Christian H. Weiss, Fukang Zhu
Summary: This study introduces multiplicative error models (CMEMs) for discrete-valued count time series, which are closely related to the integer-valued generalized autoregressive conditional heteroscedasticity (INGARCH) models. It derives the stochastic properties of different types of INGARCH-CMEMs, develops estimation approaches for them, and demonstrates their performance and application through simulations and real-world data examples.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Ming-Hung Kao, Ping-Han Huang
Summary: Optimal designs for sparse functional data under the functional empirical component (FEC) settings are investigated. New computational methods and theoretical results are developed to efficiently obtain optimal exact and approximate designs. A hybrid exact-approximate design approach is proposed and demonstrated to be efficient through simulation studies and a real example.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Mateus Maia, Keefe Murphy, Andrew C. Parnell
Summary: The Bayesian additive regression trees (BART) model is a powerful ensemble method for regression tasks, but its lack of smoothness and explicit covariance structure can limit its performance. The Gaussian processes Bayesian additive regression trees (GP-BART) model addresses this limitation by incorporating Gaussian process priors, resulting in superior performance in various scenarios.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Xichen Mou, Dewei Wang
Summary: Human biomonitoring is a method of monitoring human health by measuring the accumulation of harmful chemicals in the body. To reduce the high cost of chemical analysis, researchers have adopted a cost-effective approach that combines specimens and analyzes the concentration of toxic substances in the pooled samples. To effectively interpret these aggregated measurements, a new regression framework is proposed by extending the additive partially linear model (APLM). The APLM is versatile in capturing the complex association between outcomes and covariates, making it valuable in assessing the complex interplay between chemical bioaccumulation and potential risk factors.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Lili Yu, Yichuan Zhao
Summary: The classical accelerated failure time model is a linear model commonly used for right censored survival data, but it cannot handle heteroscedastic survival data. This paper proposes a Laplace approximated quasi-likelihood method with a continuous estimating equation to address this issue, and provides estimation bias and confidence interval estimation formulas.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Shaobo Jin, Youngjo Lee
Summary: Hierarchical generalized linear models are widely used for fitting random effects models, but the standard error estimators receive less attention. Current standard error estimation methods are not necessarily accurate, and a sandwich estimator is proposed to improve the accuracy of standard error estimation.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Rebeca Pelaez, Ingrid Van Keilegom, Ricardo Cao, Juan M. Vilar
Summary: This article proposes an estimator for the probability of default (PD) in credit risk, derived from a nonparametric conditional survival function estimator based on cure models. The asymptotic expressions for bias, variance, and normality of the estimator are presented. Through simulation and empirical studies, the performance and practical behavior of the nonparametric estimator are compared with other methods.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
L. M. Andre, J. L. Wadsworth, A. O'Hagan
Summary: This paper proposes a dependence model that captures the entire data range in multi-variable cases. By blending two copulas with different characteristics and using a dynamic weighting function for smooth transition, the model is able to flexibly capture various dependence structures.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Niwen Zhou, Xu Guo, Lixing Zhu
Summary: The paper investigates hypothesis testing regarding the potential additional contributions of other covariates to the structural function, given the known covariates. The proposed distance-based test, based on Neyman's orthogonality condition, effectively detects local alternatives and is robust to the influence of nuisance functions. Numerical studies and real data analysis demonstrate the importance of this test in exploring covariates associated with AIDS treatment effects.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)
Article
Computer Science, Interdisciplinary Applications
Blake Moya, Stephen G. Walker
Summary: A full posterior analysis method for nonparametric mixture models using Gibbs-type prior distributions, including the well known Dirichlet process mixture (DPM) model, is presented. The method removes the random mixing distribution and enables a simple-to-implement Markov chain Monte Carlo (MCMC) algorithm. The removal procedure reduces some of the posterior uncertainty and introduces a novel replacement approach. The method only requires the probabilities of a new or an old value associated with the corresponding Gibbs-type exchangeable sequence, without the need for explicit representations of the prior or posterior distributions. This allows the implementation of mixture models with full posterior uncertainty, including one introduced by Gnedin. The paper also provides numerous illustrations and introduces an R-package called CopRe that implements the methodology.
COMPUTATIONAL STATISTICS & DATA ANALYSIS
(2024)