Review

Intrinsic Dimension Estimation: Relevant Techniques and a Benchmark Framework

Journal

MATHEMATICAL PROBLEMS IN ENGINEERING
Volume 2015

Publisher

HINDAWI LTD
DOI: 10.1155/2015/759567

When dealing with datasets of high-dimensional points, it is usually advantageous to discover some structure in the data. A fundamental piece of information for this purpose is the minimum number of parameters required to describe the data while minimizing information loss. This number, usually called the intrinsic dimension, can be interpreted as the dimension of the manifold from which the input data are assumed to be drawn. Because of its usefulness in many theoretical and practical problems, the concept of intrinsic dimension has gained considerable attention in the scientific community over the last decades, motivating the large number of intrinsic dimensionality estimators proposed in the literature. However, the problem is still open, since most techniques cannot efficiently deal with datasets drawn from manifolds of high intrinsic dimension that are nonlinearly embedded in higher-dimensional spaces. This paper surveys some of the most interesting, widely used, and advanced state-of-the-art methodologies. Unfortunately, since no benchmark database exists in this research field, an objective comparison among different techniques is not possible. Consequently, we suggest a benchmark framework and apply it to comparatively evaluate relevant state-of-the-art estimators.
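To make the notion of intrinsic dimension concrete, the following is a minimal sketch of one simple, global, linear estimator: counting the principal components needed to explain most of the variance. It is not taken from the paper and is not one of its benchmarked implementations. The function name `pca_intrinsic_dimension`, the 95% variance threshold, and the synthetic dataset (a 3-dimensional Gaussian embedded linearly in 10 dimensions) are all assumptions chosen for illustration.

```python
import numpy as np


def pca_intrinsic_dimension(points, variance_threshold=0.95):
    """Hypothetical helper: smallest number of principal components whose
    cumulative explained variance reaches `variance_threshold`."""
    centered = points - points.mean(axis=0)
    # Squared singular values are proportional to the variance
    # captured along each principal axis.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    variances = singular_values ** 2
    explained = np.cumsum(variances) / variances.sum()
    # Index of the first component at which the threshold is reached, plus one.
    return int(np.searchsorted(explained, variance_threshold) + 1)


# Synthetic data (an assumption for this sketch): 500 points with
# 3 latent degrees of freedom, embedded linearly in 10 dimensions.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
# Orthonormal basis of a random 3-dimensional subspace of R^10.
basis, _ = np.linalg.qr(rng.normal(size=(10, 3)))
data = latent @ basis.T

estimate = pca_intrinsic_dimension(data)
print(estimate)
```

In this linear setting the estimate matches the latent dimension of 3. Because the approach is global and linear, it tends to overestimate the dimension of curved manifolds, which is the nonlinearly embedded case the abstract describes as still open.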

Recommended

Article Environmental Sciences

Volume-of-Interest Aware Deep Neural Networks for Rapid Chest CT-Based COVID-19 Patient Risk Assessment

Anargyros Chatzitofis, Pierandrea Cancian, Vasileios Gkitsas, Alessandro Carlucci, Panagiotis Stalidis, Georgios Albanis, Antonis Karakottas, Theodoros Semertzidis, Petros Daras, Caterina Giannitto, Elena Casiraghi, Federica Mrakic Sposta, Giulia Vatteroni, Angela Ammirabile, Ludovica Lofino, Pasquala Ragucci, Maria Elena Laino, Antonio Voza, Antonio Desai, Maurizio Cecconi, Luca Balzarini, Arturo Chiti, Dimitrios Zarpalas, Victor Savevski

Summary: This paper introduces an artificially intelligent tool for CT-based risk assessment in COVID-19 patients to improve treatment and patient care. The authors utilize a VoI-based approach to address the high dimensionality of CT inputs, create a new labeled CT dataset, and demonstrate the effectiveness of their method in patient risk assessment. Achieving high accuracy and performance, this approach shows promise in enhancing healthcare practices during the COVID-19 pandemic.

INTERNATIONAL JOURNAL OF ENVIRONMENTAL RESEARCH AND PUBLIC HEALTH (2021)

Article Multidisciplinary Sciences

Automated image analysis to assess hygienic behaviour of honeybees

Gianluigi Paolillo, Alessandro Petrini, Elena Casiraghi, Maria Grazia De Iorio, Stefano Biffani, Giulio Pagnacco, Giulietta Minozzi, Giorgio Valentini

Summary: The focus of this study is to develop an automated image processing pipeline for images acquired in uncontrolled conditions. The pipeline is specifically tested on honeybee comb images for identifying and counting uncapped brood cells. The model shows good performance in handling various acquisition conditions and achieves high correlation between automated and manual counts.

PLOS ONE (2022)

Article Virology

NSAID use and clinical outcomes in COVID-19 patients: a 38-center retrospective cohort study

Justin T. Reese, Ben Coleman, Lauren Chan, Hannah Blau, Tiffany J. Callahan, Luca Cappelletti, Tommaso Fontana, Katie Rebecca Bradwell, Nomi L. Harris, Elena Casiraghi, Giorgio Valentini, Guy Karlebach, Rachel Deer, Julie A. McMurry, Melissa A. Haendel, Christopher G. Chute, Emily Pfaff, Richard Moffitt, Heidi Spratt, Jasvinder Singh, Christopher J. Mungall, Andrew E. Williams, Peter N. Robinson

Summary: This study found that the use of non-steroidal anti-inflammatory drugs (NSAIDs) is not associated with increased severity or other adverse outcomes in COVID-19 inpatients. The results confirm and extend the findings of previous observational studies and provide evidence against the initial concerns raised about the use of NSAIDs in COVID-19 patients.

VIROLOGY JOURNAL (2022)

Review Biochemical Research Methods

Heterogeneous data integration methods for patient similarity networks

Jessica Gliozzo, Marco Mesiti, Marco Notaro, Alessandro Petrini, Alex Patak, Antonio Puertas-Gallardo, Alberto Paccanaro, Giorgio Valentini, Elena Casiraghi

Summary: Patient similarity networks (PSNs) are widely used in clinical research to summarize patient relationships and predict outcomes, phenotypes, and disease risk. PSNs can be visualized and offer explainability of machine learning predictions. This article reviews methods for integrating multiple biomedical data views and patient similarity measures to construct PSNs, while also providing a resource to navigate machine learning literature on this topic.

BRIEFINGS IN BIOINFORMATICS (2022)

Article Biochemical Research Methods

Boosting tissue-specific prediction of active cis-regulatory regions through deep learning and Bayesian optimization techniques

Luca Cappelletti, Alessandro Petrini, Jessica Gliozzo, Elena Casiraghi, Max Schubach, Martin Kircher, Giorgio Valentini

Summary: CRRs play a central role in regulating transcription under physiological and pathological conditions. Accurately identifying CRRs and their tissue-specific activity status using machine learning methods is essential for studying the impact of genetic variants on human diseases.

BMC BIOINFORMATICS (2022)

Article Endocrinology & Metabolism

Metformin is associated with reduced COVID-19 severity in patients with prediabetes

Lauren E. Chan, Elena Casiraghi, Bryan Laraway, Ben Coleman, Hannah Blau, Adnin Zaman, Nomi L. Harris, Kenneth Wilkins, Blessy Antony, Michael Gargano, Giorgio Valentini, David Sahner, Melissa Haendel, Peter N. Robinson, Carolyn Bramante, Justin Reese

Summary: Studies have shown that the use of metformin is associated with reduced severity of COVID-19, particularly for patients with prediabetes or PCOS.

DIABETES RESEARCH AND CLINICAL PRACTICE (2022)

Article Computer Science, Interdisciplinary Applications

A method for comparing multiple imputation techniques: A case study on the US national COVID cohort collaborative

Elena Casiraghi, Rachel Wong, Margaret Hall, Ben Coleman, Marco Notaro, Michael D. Evans, Jena S. Tronieri, Hannah Blau, Bryan Laraway, Tiffany J. Callahan, Lauren E. Chan, Carolyn T. Bramante, John B. Buse, Richard A. Moffitt, Til Sturmer, Steven G. Johnson, Yu Raymond Shao, Justin Reese, Peter N. Robinson, Alberto Paccanaro, Giorgio Valentini, Jared D. Huling, Kenneth J. Wilkins

Summary: Healthcare datasets from Electronic Health Records are valuable for assessing associations between patients' predictors and outcomes. However, missing values are common in these datasets, and removing them may introduce bias. Multiple imputation algorithms have been proposed to recover missing information, but there is no consensus on which algorithm works best. Choosing algorithm parameters and data-related modeling choices is also challenging.

JOURNAL OF BIOMEDICAL INFORMATICS (2023)

Article Health Care Sciences & Services

Ontologizing health systems data at scale: making translational discovery a reality

Tiffany J. Callahan, Adrianne L. Stefanski, Jordan M. Wyrwa, Chenjie Zeng, Anna Ostropolets, Juan M. Banda, William A. Baumgartner, Richard D. Boyce, Elena Casiraghi, Ben D. Coleman, Janine H. Collins, Sara J. Deakyne Davies, James A. Feinstein, Asiyah Y. Lin, Blake Martin, Nicolas A. Matentzoglu, Daniella Meeker, Justin Reese, Jessica Sinclair, Sanya B. Taneja, Katy E. Trinkley, Nicole A. Vasilevsky, Andrew E. Williams, Xingmin A. Zhang, Joshua C. Denny, Patrick B. Ryan, George Hripcsak, Tellen D. Bennett, Melissa A. Haendel, Peter N. Robinson, Lawrence E. Hunter, Michael G. Kahn

Summary: Common data models address standardization challenges in EHR data, but fail to integrate all resources for deep phenotyping. OBO ontologies provide computable representations of biological knowledge, but mapping EHR data to OBO ontologies requires manual curation. OMOP2OBO is an algorithm that maps OMOP vocabularies to OBO ontologies, enabling deep phenotyping and identification of undiagnosed patients.

NPJ DIGITAL MEDICINE (2023)

Article Medicine, General & Internal

Predictive models of long COVID

Blessy Antony, Hannah Blau, Elena Casiraghi, Johanna J. Loomba, Tiffany J. Callahan, Bryan J. Laraway, Kenneth J. Wilkins, Corneliu C. Antonescu, Giorgio Valentini, Andrew E. Williams, Peter N. Robinson, Justin T. Reese, T. M. Murali

Summary: Predicting the incidence of long COVID using electronic health record data and machine learning methods is effective. Specific features like drugs and certain symptoms showed the highest influence on the prediction task.

EBIOMEDICINE (2023)

Review Medicine, General & Internal

Image-Guided Intraoperative Assessment of Surgical Margins in Oral Cavity Squamous Cell Cancer: A Diagnostic Test Accuracy Review

Giorgia Carnicelli, Luca Disconzi, Michele Cerasuolo, Elena Casiraghi, Guido Costa, Armando De Virgilio, Andrea Alessandro Esposito, Fabio Ferreli, Federica Fici, Antonio Lo Casto, Silvia Marra, Luca Malvezzi, Giuseppe Mercante, Giuseppe Spriano, Guido Torzilli, Marco Francone, Luca Balzarini, Caterina Giannitto

Summary: This study shows that intraoperative imaging techniques such as magnetic resonance imaging (MRI) and intraoral ultrasound (ioUS) have great potential in improving the assessment of surgical margins in oral cavity squamous cell cancer (OCSCC) surgery. IoUS has comparable accuracy to ex vivo MRI and is more affordable and reproducible.

DIAGNOSTICS (2023)

Article Computer Science, Interdisciplinary Applications

GRAPE for fast and scalable graph processing and random-walk-based embedding

Luca Cappelletti, Tommaso Fontana, Elena Casiraghi, Vida Ravanmehr, Tiffany J. Callahan, Carlos Cano, Marcin P. Joachimiak, Christopher J. Mungall, Peter N. Robinson, Justin Reese, Giorgio Valentini

Summary: GRAPE is a software resource for graph processing and embedding that can scale with big graphs, showing substantial improvements in space and time complexity compared to existing resources. It offers efficient graph-processing utilities, node embedding methods, and inference models, making it a valuable tool for graph representation learning. GRAPE is capable of handling millions of nodes and billions of edges, enabling large-graph analysis in various real-world applications.

NATURE COMPUTATIONAL SCIENCE (2023)

Article Biochemical Research Methods

An expectation-maximization framework for comprehensive prediction of isoform-specific functions

Guy Karlebach, Leigh Carmody, Jagadish Chandrabose Sundaramurthi, Elena Casiraghi, Peter Hansen, Justin Reese, Christopher J. Mungall, Giorgio Valentini, Peter N. Robinson

Summary: This article proposes a method called isoform interpretation to infer isoform-specific functions using expectation-maximization. It predicts specific functional annotations for 85,617 isoforms of 17,900 protein-coding genes and outperforms other methods in comparison to manually annotated results.

BIOINFORMATICS (2023)

Article Medicine, General & Internal

Generalisable long COVID subtypes: Findings from the NIH N3C and RECOVER programmes

Justin T. Reese, Hannah Blau, Elena Casiraghi, Timothy Bergquist, Johanna J. Loomba, Tiffany J. Callahan, Bryan Laraway, Corneliu Antonescu, Ben Coleman, Michael Gargano, Kenneth J. Wilkins, Luca Cappelletti, Tommaso Fontana, Nariman Ammar, Blessy Antony, T. M. Murali, J. Harry Caufield, Guy Karlebach, Julie A. McMurry, Andrew Williams, Richard Moffitt, Jineta Banerjee, Anthony E. Solomonides, Hannah Davis, Kristin Kostka, Giorgio Valentini, David Sahner, Christopher G. Chute, Charisse Madlock-Brown, Melissa A. Haendel, Peter N. Robinson

Summary: By computationally modelling PASC phenotype data and assessing semantic similarity, we identified six distinct clusters of PASC patients with different clinical features and severity, including diverse manifestations. This semantic phenotypic clustering approach provides a foundation for stratifying and studying PASC patients for natural history or therapy studies.

EBIOMEDICINE (2023)

Proceedings Paper Computer Science, Artificial Intelligence

ParSMURF-NG: A Machine Learning High Performance Computing System for the Analysis of Imbalanced Big Omics Data

Alessandro Petrini, Marco Notaro, Jessica Gliozzo, Tiziana Castrignano, Peter N. Robinson, Elena Casiraghi, Giorgio Valentini

Summary: In the context of Genomic and Precision Medicine, this study presents ParSMURF-NG, a High Performance Computing-oriented Machine Learning approach that can effectively handle big omics data. It has shown its usefulness in the detection of pathogenic single nucleotide variants in the non-coding regions of the human genome, providing a powerful model for Genomic Medicine.

ARTIFICIAL INTELLIGENCE APPLICATIONS AND INNOVATIONS. AIAI 2022 IFIP WG 12.5 INTERNATIONAL WORKSHOPS (2022)

Article Radiology, Nuclear Medicine & Medical Imaging

An approach to evaluate the quality of radiological reports in Head and Neck cancer loco-regional staging: experience of two Academic Hospitals

Caterina Giannitto, Andrea Alessandro Esposito, Giuseppe Spriano, Armando De Virgilio, Emanuele Avola, Giada Beltramini, Gianpaolo Carrafiello, Elena Casiraghi, Alessandra Coppola, Valentina Cristofaro, Davide Farina, Francesca Gaino, Giulia Lastella, Ludovica Lofino, Roberto Maroldi, Francesca Piccoli, Lorenzo Pignataro, Lorenzo Preda, Elena Russo, Lorenzo Solimeno, Giulia Vatteroni, Antonello Vidiri, Luca Balzarini, Giuseppe Mercante

Summary: This study evaluated the quality of loco-regional staging CT and MRI reports in head and neck cancer. The results showed that the quality of tumor description was low, while the quality of lymph node description was high.

RADIOLOGIA MEDICA (2022)
