Question

Complex and Complicated RNAseq analysis

9.6 years ago

mfahim ▴ 10

I carried out RNAseq and smallRNAseq on WT, Mutant1, Mutant2 and Double Mutant M1M2 with NO replicates at two different temperatures.

Will the absence of replicates affect my data analysis?

Can anyone advise how I can correlate the transcriptome data (RNAseq) with the data generated from smallRNAseq (ncRNAs)?

How can I run GO and Pathway enrichment analysis?

Is cummeRbund going to help me make sense of the data?

Differential-Analysis cummeRbund Enrichment • 2.2k views

updated 22 months ago by Ram 44k • written 9.6 years ago by mfahim ▴ 10

Ram · Answer 1 · 2015-05-06

Yes, it will. It is always good practice to have biological replicates of your samples. Without them you cannot estimate the variability within a condition, so statistical power is very low. I always recommend at least 3 biological replicates. If you do two, how do you know one isn't bad? If you do three and one is bad, you can at least eliminate it and continue.

For example, if you want to identify the genes that are differentially expressed between two strains of yeast, you will most likely grow each of the two strains in a different flask. Growing the strains in different flasks introduces biological variance: if you grew two flasks of the same yeast strain, their expression would still differ. Biological replicates are therefore required in this experiment to estimate that flask-to-flask variance, so it can be separated from real differences between the strains.

You can read this thread about this topic: Rna-Seq Biological Replicates...
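To make the replicate point concrete, here is a minimal sketch in plain Python (standard library only). The function name `summarize_replicates` and the expression values are made up for illustration and do not come from any RNAseq package. It only shows that a within-condition variance cannot be computed from a single sample.

```python
from statistics import mean, variance


def summarize_replicates(expression_values):
    """Return (mean, sample variance) for one gene in one condition.

    The variance is None when fewer than two replicates are available,
    because a sample variance needs at least two observations.
    """
    n = len(expression_values)
    if n == 0:
        raise ValueError("need at least one expression value")
    avg = mean(expression_values)
    var = variance(expression_values) if n >= 2 else None
    return avg, var


# Hypothetical normalized expression values for one gene.
print(summarize_replicates([10.0]))              # one sample: no variance estimate
print(summarize_replicates([10.0, 12.0, 14.0]))  # three replicates: variance available
```

With one sample per condition, any difference between WT and a mutant cannot be told apart from ordinary sample-to-sample variation.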
What do you mean by correlate? What is your goal?
There are different tools for doing Gene Ontology enrichment analysis and Pathway enrichment analysis. For the first one you can use:
- DAVID (http://david.abcc.ncifcrf.gov/).
- GOrilla (http://cbl-gorilla.cs.technion.ac.il/)
- Ontologizer (http://compbio.charite.de/contao/index.php/ontologizer2.html)
- There are also R packages available through Bioconductor, such as GOstats.
- In this thread you can find more (Tools To Find Gene Ontology Term Enrichment)
For the pathway analysis I generally use KEGG. See this thread for more info: Best Way To Do Pathway Analysis Of A Set Of Genes?
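Many GO and pathway enrichment tools are built around some form of over-representation test, often a hypergeometric (one-sided Fisher) test. The exact statistic and multiple-testing correction differ between tools, so the following is only a rough sketch of the idea in plain Python, not the method of any tool listed above. The function name and the gene counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population_size, annotated_in_population,
                           sample_size, annotated_in_sample):
    """One-sided hypergeometric p-value for over-representation.

    Probability of seeing at least `annotated_in_sample` annotated genes
    when drawing `sample_size` genes at random, without replacement, from
    a background of `population_size` genes of which
    `annotated_in_population` carry the term.
    """
    N, K = population_size, annotated_in_population
    n, k = sample_size, annotated_in_sample
    upper = min(K, n)  # cannot draw more annotated genes than exist or than drawn
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


# Hypothetical numbers: 20000 background genes, 200 annotated with a GO term,
# 500 differentially expressed genes, 15 of which carry the term.
p = hypergeom_enrichment_p(20000, 200, 500, 15)
print(f"enrichment p-value: {p:.3g}")
```

In a real analysis this test is repeated for every term, so the p-values need a multiple-testing correction, which the dedicated tools typically handle for you.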
cummeRbund creates an SQLite database from your analysis results, describing the relationships between genes, transcripts, transcription start sites, and CDS regions. Once the data is in the database, it can be retrieved efficiently and easily, letting you explore subfeatures of individual genes or gene sets as the analysis requires. So yes, if you use cufflinks and cuffdiff to perform your DE analysis, it could be a good idea to create a database with the cummeRbund package in order to store, access, explore and manipulate your data in a clear and easy way. Furthermore, cummeRbund provides numerous plotting functions for commonly used visualizations.

Hope it helps.