miRNA-seq / Differential expression analysis using edgeR

Description

This tool performs differential expression analysis using the edgeR Bioconductor package.

Parameters


Details


Given an input table of raw counts, the edgeR package performs statistical analysis to identify genomic features that are differentially expressed between two experimental conditions. Note that in its current implementation, the tool supports only single-factor experimental designs. The experimental conditions to be compared should be defined in the phenodata.tsv file, and the appropriate column selected using the 'Column describing groups' parameter.
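As a rough illustration of what a single-factor, two-condition design looks like in practice, the following Python sketch counts the samples per condition in a phenodata-like table and reports obvious design problems. The table contents, the column name "group", and the helper names replicates_per_group and check_design are assumptions made for this example; they are not part of the tool.

```python
import csv
import io
from collections import Counter

# Hypothetical phenodata.tsv content; the real file may use other columns.
PHENODATA = (
    "sample\tgroup\n"
    "s1.tsv\tcontrol\n"
    "s2.tsv\tcontrol\n"
    "s3.tsv\ttreated\n"
    "s4.tsv\ttreated\n"
)

def replicates_per_group(tsv_text, column="group"):
    """Count how many samples carry each value of the chosen column."""
    rows = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return Counter(row[column] for row in rows)

def check_design(counts):
    """Return a list of problems with the design (empty if none found)."""
    problems = []
    if len(counts) != 2:
        problems.append(f"expected 2 conditions, found {len(counts)}")
    for name, n in counts.items():
        if n < 2:
            problems.append(f"condition {name!r} has fewer than 2 replicates")
    return problems

groups = replicates_per_group(PHENODATA)
print(dict(groups), check_design(groups))
```

A check of this kind is only a convenience before running the analysis; the column actually used by the tool is the one selected with the 'Column describing groups' parameter.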

Normalization starts from the library size of each sample, which is either given by the user in the phenodata.tsv file or obtained by summing the counts for that sample. The TMM (trimmed mean of M-values) method is then used to calculate normalization factors that reduce RNA composition bias, which can arise, for example, when a small number of genes are very highly expressed in one experimental condition but not in the other.
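The effect of composition bias can be seen in a small worked example. The Python sketch below is a deliberately simplified stand-in for TMM, not the edgeR implementation: it takes an unweighted trimmed mean of per-feature log2 ratios and omits the precision weights, the additional trimming on average expression, and the final rescaling of factors that edgeR applies. The function name tmm_like_factor and the toy counts are assumptions made for illustration.

```python
import math

def tmm_like_factor(sample, ref, trim=0.3):
    """Simplified scaling factor: trimmed mean of log2 proportion ratios.

    Features with a zero count in either library are skipped, and a
    fraction `trim` of the log ratios is discarded from each end.
    """
    n_s, n_r = sum(sample), sum(ref)  # library sizes from summed counts
    log_ratios = sorted(
        math.log2((y / n_s) / (r / n_r))
        for y, r in zip(sample, ref)
        if y > 0 and r > 0
    )
    k = int(len(log_ratios) * trim)
    kept = log_ratios[k:len(log_ratios) - k]
    return 2 ** (sum(kept) / len(kept))

# Ten features; nine are unchanged, one is very highly expressed in `treated`.
reference = [100] * 10
treated = [100] * 9 + [1100]

factor = tmm_like_factor(treated, reference)
effective_size = sum(treated) * factor
print(factor, effective_size)
```

In this toy data the single highly expressed feature doubles the raw library size of the treated sample, so scaling by raw library size alone would make the nine unchanged features look halved. The trimmed factor shrinks the effective library size back to that of the reference, which removes the apparent change.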

Dispersion is estimated using the quantile-adjusted conditional maximum likelihood (qCML) method. The tool can estimate either a common dispersion for all genomic features or a separate dispersion for each individual feature using an empirical Bayes strategy.

It is highly recommended to always have at least two biological replicates for each experimental condition. If this is not possible, the analysis can still be run by setting the dispersion manually through the 'Dispersion estimate' parameter. By default the dispersion estimate is set to 0.1, which lies between the values typically observed for technical replicates (0.01) and for human data (0.4). It is recommended to experiment with different values for this parameter.
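To get a feel for what a given dispersion value implies, it can help to look at the negative binomial variance function, in which the variance of a count with mean mu and dispersion phi is mu + phi * mu^2, and the square root of phi is the biological coefficient of variation. The Python sketch below evaluates this for a few dispersion values; the helper names nb_variance and bcv and the chosen mean of 100 are assumptions made for this example.

```python
import math

def nb_variance(mean, dispersion):
    """Negative binomial variance: Poisson part plus extra-Poisson part."""
    return mean + dispersion * mean ** 2

def bcv(dispersion):
    """Biological coefficient of variation, the square root of dispersion."""
    return math.sqrt(dispersion)

# Compare how strongly the variance of a count with mean 100 depends on
# the dispersion value supplied to the analysis.
for phi in (0.01, 0.1, 0.4):
    print(phi, round(nb_variance(100, phi), 1), round(bcv(phi), 3))
```

Larger dispersion values make the test more conservative, because more of the observed difference between conditions can be attributed to variability between samples.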

Statistical testing is performed using the exact test based on qCML methods.

Output

The analysis output consists of the following files:


References

This tool uses the edgeR package for statistical analysis. Please read the following article for more detailed information:

MD Robinson, DJ McCarthy, and GK Smyth. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data. Bioinformatics, 26(1):139-140, Jan 2010.
