Article
Automation & Control Systems
Daniel R. Kowal
Summary: Subset selection is a valuable tool for interpretability, scientific discovery, and data compression. We propose a Bayesian approach to address the challenges in classical subset selection, and introduce a strategy that focuses on finding near-optimal subsets rather than a single best subset. We apply Bayesian decision analysis to derive the optimal linear coefficients for any subset of variables, and our approach outperforms competing methods in prediction, interval estimation, and variable selection. By analyzing a large education dataset, we gain unique insights into the factors that predict educational outcomes and identify over 200 distinct subsets of variables that offer near-optimal predictive accuracy.
JOURNAL OF MACHINE LEARNING RESEARCH
(2022)
Article
Environmental Sciences
Karel Dieguez-Santana, Manuel Mesias Nachimba-Mayanchi, Amilkar Puris, Roldan Torres Gutierrez, Humberto Gonzalez-Diaz
Summary: This study developed Quantitative Structure-Toxicity Relationship (QSTR) models using multiple statistical models and machine learning algorithms, and found that the Random Forest regression model performed best. The results suggest that the developed QSTR models can reliably predict pesticide toxicity in Americamysis bahia and can be applied in pesticide screening and prioritization.
ENVIRONMENTAL RESEARCH
(2022)
Article
Agronomy
Hong Li, Junwei Wang, Jixiong Zhang, Tongqing Liu, Gifty E. Acquah, Huimin Yuan
Summary: This study explores the potential of combining multivariate methods and spectral variable selection for estimating soil organic matter and total nitrogen using mid-infrared spectroscopy. Through analysis of 510 soil samples, the study finds that the model combining multiple linear regression with stability competitive adaptive reweighted sampling yields the most accurate estimation results.
Article
Toxicology
Seong Kwang Lim, Jean Yoo, Haewon Kim, Yeon-Mi Lim, Woong Kim, Ilseob Shim, Ha Ryong Kim, Pilje Kim, Ig-chun Eom
Summary: Recent studies have found that the submerged 2-D cell culture model can predict the in vivo acute inhalation toxicity of semi-volatile and non-volatile organic compounds with a water solubility of >= 1 g/L. These substances show a strong correlation between 24-h IC50 and 4-h LC50, allowing for accurate classification based on acute inhalation toxicity levels.
JOURNAL OF APPLIED TOXICOLOGY
(2021)
Article
Genetics & Heredity
Zongliang Hu, Yan Zhou, Tiejun Tong
Summary: A robust variable selection algorithm based on logistic regression was developed for meta-analyzing high-dimensional datasets. It combines least trimmed squares estimates with a hierarchical bi-level variable selection technique to achieve more reliable results.
FRONTIERS IN GENETICS
(2021)
Article
Cell Biology
Yen-Hsiu Yeh, Chia-Chih Tsai, Tien-Wen Chen, Chieh-Hua Lee, Wei-Jer Chang, Mei-Yi Hsieh, Tsai-Kun Li
Summary: Cadmium exposure activates multiple proteolytic systems and ROS formation, leading to intracellular damage and cytotoxicity.
MOLECULAR AND CELLULAR BIOCHEMISTRY
(2022)
Article
Agronomy
Franco Suarez, Cecilia Bruno, Franca Kurina Giannini, M. Paz Gimenez Pecci, Patricia Rodriguez Pardina, Monica Balzarini
Summary: This study aimed to evaluate different combinations of variable selection methods with linear and non-linear predictors to fit climate-based disease models and predict the occurrence of diseases in pathosystems. The results showed that the feature selection methods had no impact on the accuracy of predictions in the random forest algorithm, while the stepwise regression combined with VIF and p-value criteria outperformed other methods in fitting the logistic linear regression model.
EUROPEAN JOURNAL OF AGRONOMY
(2023)
Article
Environmental Sciences
Yanfeng Zhang, Minwei Xie, David M. Spadaro, Stuart L. Simpson
Summary: In order to assess the quality of sediments, it is desirable to predict the toxicity risk levels for aquatic organisms based on simple environmental measurements. This study evaluates the improvement in toxicity risk predictions by using bioavailability-modified guidelines, considering factors such as particle size, total organic carbon, and acid volatile sulfide. The results show that incorporating these factors can enhance the accuracy of toxicity risk assessments.
ENVIRONMENTAL POLLUTION
(2023)
Article
Pharmacology & Pharmacy
Jie Liu, Wenjing Guo, Fan Dong, Jason Aungst, Suzanne Fitzpatrick, Tucker A. A. Patterson, Huixiao Hong
Summary: This study demonstrates the potential of using machine learning models to evaluate the reproductive toxicity of chemicals. By developing predictive models based on rat multigeneration reproductive toxicity testing data and seven machine learning algorithms, researchers were able to achieve good performance in predicting reproductive toxicity. The findings suggest that machine learning can be a promising alternative approach in assessing the potential reproductive toxicity of chemicals.
FRONTIERS IN PHARMACOLOGY
(2022)
Article
Agronomy
Po-Ya Wu, Jen-Hsiang Ou, Chen-Tuo Liao
Summary: Genomic prediction (GP) is a statistical method that predicts genomic estimated breeding values (GEBVs) for individuals in a breeding population using a combination of phenotypic and genotypic data. The determination of the optimal sample size for a training set remains a challenge in GP studies. This study proposes a practical approach using a logistic growth curve to determine a cost-effective training set size for a given genome dataset with known genotypic data.
THEORETICAL AND APPLIED GENETICS
(2023)
Article
Computer Science, Artificial Intelligence
Joaquin Pacheco, Silvia Casado
Summary: This paper analyzes the variable selection problem in linear regression and proposes a Branch & Bound method designed to handle very large databases effectively. Computational experiments show that this method performs well compared to other known methods and commercial software.
APPLIED INTELLIGENCE
(2021)
Article
Environmental Sciences
Gwilym A. V. Price, Jenny L. Stauber, Dianne F. Jolley, Darren J. Koppel, Eric J. Van Genderen, Adam C. Ryan, Aleicia Holland
Summary: Multiple linear regression (MLR) models were developed to predict chronic zinc toxicity to a freshwater microalga, Chlorella sp., using three toxicity-modifying factors (TMFs): pH, hardness, and dissolved organic carbon (DOC). The study found that hardness was the most influential TMF in the models, while pH, DOC, and interactive terms had variable influence. However, the models performed poorly when predicting toxicity in zinc-spiked natural waters during validation, indicating a potential unaccounted-for TMF or a limitation in applying synthetic laboratory water-based models to natural water samples.
ENVIRONMENTAL TOXICOLOGY AND CHEMISTRY
(2023)
Article
Chemistry, Analytical
Luciana dos Santos Canova, Federico Danilo Vallese, Marcelo Fabian Pistonesi, Adriano de Araujo Gomes
Summary: This study proposes a new algorithm called fSPA-MLR, which enhances the performance of the original SPA-MLR method by adding a filter step to reduce the number of uninformative variables. The fSPA-MLR models demonstrate superior performance compared to PLS and the original SPA-MLR models in both cross-validation and external prediction.
ANALYTICA CHIMICA ACTA
(2023)
Article
Engineering, Environmental
Ali Jozaghi, Haojing Shen, Mohammadvaghef Ghazvinian, Dong-Jun Seo, Yu Zhang, Edwin Welles, Seann Reed
Summary: This study introduces a novel multiple linear regression technique, CBP-MLR, to reduce attenuation bias and improve prediction over tails, while maintaining MLR performance near the median. Additionally, the use of composite MLR (CompMLR) linearly averaging MLR and CBP-MLR estimates is shown to be generally superior to existing forecasts in terms of mean squared error under varying conditions of predictability and predictive skill.
STOCHASTIC ENVIRONMENTAL RESEARCH AND RISK ASSESSMENT
(2021)
Article
Computer Science, Interdisciplinary Applications
Takayuki Shuku, Kok-Kwang Phoon
Summary: This study develops a lasso method specialized for geotechnical problems, and proposes an efficient implementation to address the computational challenges in 3D scenarios.
COMPUTERS AND GEOTECHNICS
(2021)
Article
Biochemical Research Methods
Hee-Sung Ahn, Seong-Jun Park, Hyun-Gyo Jung, Se Joon Woo, Cheolju Lee
JOURNAL OF MASS SPECTROMETRY
(2018)
Article
Multidisciplinary Sciences
Jihye Shin, Sang-Yun Song, Hee-Sung Ahn, Byung Chull An, Yoo-Duk Choi, Eun Gyeong Yang, Kook-Joo Na, Seung-Taek Lee, Jae-Il Park, Seon-Young Kim, Cheolju Lee, Seung-won Lee
Article
Multidisciplinary Sciences
Jeonghun Yeom, Mohammad Humayun Kabir, Byungho Lim, Hee-Sung Ahn, Seon-Young Kim, Cheolju Lee
SCIENTIFIC REPORTS
(2016)
Article
Biochemistry & Molecular Biology
Hee-Sung Ahn, Jong Ho Kim, Hwangkyo Jeong, Jiyoung Yu, Jeonghun Yeom, Sang Heon Song, Sang Soo Kim, In Joo Kim, Kyunggon Kim
INTERNATIONAL JOURNAL OF MOLECULAR SCIENCES
(2020)
Article
Oncology
Hee-Sung Ahn, Jeonghun Yeom, Jiyoung Yu, Young-Il Kwon, Jae-Hoon Kim, Kyunggon Kim
Article
Oncology
Hee-Sung Ahn, Jung Yoon Ho, Jiyoung Yu, Jeonghun Yeom, Sanha Lee, Soo Young Hur, Yuyeon Jung, Kyunggon Kim, Youn Jin Choi
Summary: Most hereditary ovarian cancer is associated with BRCA1/2 variants, and plasma protein biomarkers, specifically SPARC and THBS1, may indicate increased risk of developing ovarian cancer in BRCA1/2 carriers. Monitoring these protein concentrations could potentially help in deciding whether to undergo oophorectomy for disease prevention.
Article
Respiratory System
Soo Han Kim, Hee-Sung Ahn, Jin-Soo Park, Jeonghun Yeom, Jiyoung Yu, Kyunggon Kim, Yeon-Mok Oh
Summary: In this study, potential plasma biomarkers for AECOPD were identified using a proteomics method, with SERPINA3 potentially serving as a promising diagnostic biomarker. The research revealed associations between AECOPD and proteins related to acute-phase response and lipid metabolism.
INTERNATIONAL JOURNAL OF CHRONIC OBSTRUCTIVE PULMONARY DISEASE
(2021)
Article
Oncology
Sungchan Gwark, Hee-Sung Ahn, Jeonghun Yeom, Jiyoung Yu, Yumi Oh, Jae Ho Jeong, Jin-Hee Ahn, Kyung Hae Jung, Sung-Bae Kim, Hee Jin Lee, Gyungyub Gong, Sae Byul Lee, Il Yong Chung, Hee Jeong Kim, Beom Seok Ko, Jong Won Lee, Byung Ho Son, Sei Hyun Ahn, Kyunggon Kim, Jisun Kim
Summary: Proteomic analysis of plasma proteins in breast cancer patients undergoing neoadjuvant chemotherapy identified potential biomarkers such as MBL2 and P4HB that were significantly associated with survival outcomes and could differentiate between low- and high-risk groups. Further investigation is needed to understand the role of these markers in predicting prognosis and their therapeutic potential for preventing recurrence.
Article
Multidisciplinary Sciences
Yeon-Sook Choi, Myung Ji Kim, Eun A. Choi, Sinae Kim, Eun Ji Lee, Min Ji Park, Mi-Ju Kim, Yeon Wook Kim, Hee-Sung Ahn, Jae Yun Jung, Gayoung Jang, Yongsub Kim, Hyori Kim, Kyunggon Kim, Jin Young Kim, Seung-Mo Hong, Song Cheol Kim, Suhwan Chang
Summary: Gal-3BP is identified as a highly secreted protein in PDAC and its overexpression is closely related to cell proliferation, migration, and invasion. It is found that Gal-3BP enhances galectin-3-mediated signaling, leading to increased PDAC metastasis. Blocking Gal-3BP with antibodies can effectively suppress PDAC metastasis.
PROCEEDINGS OF THE NATIONAL ACADEMY OF SCIENCES OF THE UNITED STATES OF AMERICA
(2022)
Article
Pediatrics
Mi-Jin Kang, Hee-Sung Ahn, So-Yeon Lee, Jeonghun Yeom, Kyunggon Kim, Soo-Jong Hong
Summary: Proteomic analysis was used to compare biomarker differences between children with asthma and PIBO. Findings revealed that TGF-beta 1 and periostin were unique biomarkers for PIBO and asthma in children, respectively. The mechanism regulated by IKBKB may have therapeutic relevance for PIBO and asthma.
PEDIATRIC PULMONOLOGY
(2022)
Article
Andrology
Hee-Sung Ahn, Jeonghun Yeom, Hwangkyo Jeong, Won Young Park, Ja Yoon Ku, Byeong Jin Kang, Kyung Hwan Kim, Chan Ho Lee, Sangheon Song, Sun Sik Bae, Kyunggon Kim, Hong Koo Ha
Summary: This study establishes a standard procedure for preparing prostate tissues and evaluates the quality of proteomics and phosphoproteomics data from tissue obtained by transurethral resection of the prostate (TUR-P) under different surgical conditions. The results show that the proteomic profiles of prostate tissue collected by TUR-P are not significantly affected by energy levels, tissue location, or surgical technique. This study provides important guidance for handling tissue samples from castration-resistant prostate cancer patients, from whom tissue is difficult to obtain.
WORLD JOURNAL OF MENS HEALTH
(2022)
Article
Respiratory System
So-Yeon Lee, Hee-Sung Ahn, Eun Mi Kim, Kyung Kon Kim, Mi-Jin Kang, Min Jee Park, Hyun-Ju Cho, Eun Lee, Sungsu Jung, Jisun Yoon, Song- Yang, Dong-Uk Park, Soo-Jong Hong
Summary: The study revealed that asthma associated with low-intensity exposure to PHMG is characterized by lower lung function, lower positive rates of bronchial hyperresponsiveness, and varied distributions of plasma proteins.
ANNALS OF THE AMERICAN THORACIC SOCIETY
(2021)
Article
Biochemistry & Molecular Biology
Eun Young Kim, Hee-Sung Ahn, Min Young Lee, Jiyoung Yu, Jeonghun Yeom, Hwangkyo Jeong, Hophil Min, Hyun Jeong Lee, Kyunggon Kim, Yong Min Ahn
Article
Biology
Sung-Yup Cho, Seungun Lee, Jeonghun Yeom, Hyo-Jun Kim, Jin-Haeng Lee, Ji-Woong Shin, Mee-ae Kwon, Ki Baek Lee, Eui Man Jeong, Hee Sung Ahn, Dong-Myung Shin, Kyunggon Kim, In-Gyu Kim
LIFE SCIENCE ALLIANCE
(2020)
Article
Biochemistry & Molecular Biology
Joon Hee Kang, Seon-Hyeong Lee, Dongwan Hong, Jae-Seon Lee, Hee-Sung Ahn, Ju-Hyun Ahn, Tae Wha Seong, Chang-Hun Lee, Hyonchol Jang, Kyeong Man Hong, Cheolju Lee, Jae-Ho Lee, Soo-Youl Kim
EXPERIMENTAL AND MOLECULAR MEDICINE
(2016)