Article

Independent Component Analysis (ICA) based-clustering of temporal RNA-seq data

Journal

PLOS ONE
Volume 12, Issue 7

Publisher

Public Library of Science
DOI: 10.1371/journal.pone.0181195

Funding

  1. CAPES
  2. FAPEMIG
  3. FUNARBE

Gene expression time series (GETS) analysis aims to characterize sets of genes according to their longitudinal patterns of expression. Because GETS analysis evaluates a large number of genes, a useful strategy for summarizing biological functional processes and regulatory mechanisms is to cluster genes that show similar expression patterns over time. Traditional clustering methods usually ignore the challenges of GETS, such as the lack of data normality and the small number of temporal observations. Independent Component Analysis (ICA) is a statistical procedure that uses a transformation to convert raw time series data into sets of values of independent variables, which can then be used in cluster analysis to identify sets of genes with similar temporal expression patterns. ICA allows clustering of short series of distribution-free data while accounting for the dependence between subsequent time points. Using simulated and real temporal RNA-seq data sets (the real data being four libraries of two pig breeds at 21, 40, 70 and 90 days of gestation), we present a methodology (ICAclust) that combines ICA with a hierarchical method for clustering GETS. We compare ICAclust results with those obtained with K-means clustering. ICAclust presented, on average, an absolute gain of 5.15% over the best K-means scenario. Considering the worst scenario for K-means, the gain was 84.85% when compared with the best ICAclust result. For the real data set, genes were grouped into six distinct clusters with 89, 51, 153, 67, 40, and 58 genes, respectively. In general, the six clusters presented very distinct expression patterns. Overall, the proposed two-step clustering method (ICAclust) performed well compared to K-means, a traditional method used for cluster analysis of temporal gene expression data. In ICAclust, genes with similar expression patterns over time were clustered together.
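The two-step idea in the abstract (an ICA transformation followed by hierarchical clustering) can be illustrated with a minimal sketch. This is not the authors' ICAclust implementation: it assumes scikit-learn's `FastICA` and Ward-linkage `AgglomerativeClustering` as stand-ins, a log2 transform chosen only for illustration, and Poisson counts simulated from three made-up temporal patterns at four time points. The function names `simulate_counts` and `ica_then_hierarchical` are hypothetical.

```python
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.decomposition import FastICA


def simulate_counts(n_per_group=30, seed=0):
    """Simulate count-like expression for three hypothetical temporal patterns.

    Rows are genes, columns are four time points. The patterns and the
    Poisson noise model are illustrative assumptions, not the paper's design.
    """
    rng = np.random.default_rng(seed)
    patterns = np.array([
        [20.0, 60.0, 150.0, 300.0],   # rising over time
        [300.0, 150.0, 60.0, 20.0],   # falling over time
        [20.0, 300.0, 300.0, 20.0],   # peak at the middle time points
    ])
    means = np.repeat(patterns, n_per_group, axis=0)
    counts = rng.poisson(means)
    truth = np.repeat(np.arange(len(patterns)), n_per_group)
    return counts, truth


def ica_then_hierarchical(counts, n_components=2, n_clusters=3, seed=0):
    """Step 1: ICA on the gene-by-time matrix. Step 2: hierarchical clustering."""
    # Assumed preprocessing: log2 transform, then remove each gene's mean level
    # so that clustering reflects the shape of the profile, not its magnitude.
    logged = np.log2(counts + 1.0)
    centered = logged - logged.mean(axis=1, keepdims=True)

    # Step 1: one row of independent-component scores per gene.
    ica = FastICA(n_components=n_components, random_state=seed, max_iter=1000)
    scores = ica.fit_transform(centered)

    # Step 2: agglomerative (hierarchical) clustering of the component scores.
    model = AgglomerativeClustering(n_clusters=n_clusters, linkage="ward")
    return model.fit_predict(scores)


if __name__ == "__main__":
    counts, truth = simulate_counts()
    labels = ica_then_hierarchical(counts)
    for group in np.unique(truth):
        print("true pattern", group, "-> cluster labels", set(labels[truth == group].tolist()))
```

With real data the number of clusters is not known in advance, so it would have to be chosen from the dendrogram or a validity index rather than fixed as it is here.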
