6.5 years ago
eyonesi
Hi, I am running a de novo RNA-seq project with 3 treatments and 3 replicates per treatment. I ran the "Quality Check Your Samples and Biological Replicates" step as part of the Trinity/RSEM/edgeR differential expression pipeline, but the replicates do not cluster with their treatments. How can I solve this problem? Any help would be appreciated. Thank you.
Can you elaborate on what that means? What tests have you done that say that?
Hello genomax, I used the Trinity/RSEM/edgeR pipeline for the analysis. I ran the quality check of samples and biological replicates to make sure that my biological replicates are well correlated. Pearson correlation and PCA were used, and a heatmap and a PCA plot were generated. Since the outputs of RSEM were two count matrix files, gene.isoform and isoform.isoform, I ran the check on both. The problem is that my replicates do not group with their corresponding treatments. Images of the plots are attached. Thanks for your time.
How to add images to a Biostars post
Note: You may want to post the full images instead of just the previews I could get from the links you had posted.
Please describe your analysis in more detail: did you assemble all samples together? What is the sequencing depth, that is, how many reads per sample did you obtain? Was the sequencing single- or paired-end, unstranded or stranded? What were the commands used for assembly and differential expression? Did you apply a minimal-expression filter before the differential expression analysis? How many genes / transcripts did you do the clustering on?
RNA-seq based differential gene / transcript expression analysis with a reference genome already has a considerable amount of noise, especially for lowly expressed genes, which have large variances in general. A transcriptome assembly from short reads is also very noisy, with tens or hundreds of thousands of transcripts. Finally, three biological replicates isn't really a big sample size. Summing it all up, it is no surprise that the results seem puzzling.
What is happening is that these exploratory analyses are performed using the genes with the most variance, which aren't necessarily the differentially expressed genes. I don't think there is "solving" to do, maybe "mitigation". Try filtering out lowly expressed genes / transcripts (see the ExN50 stats), and adjust your analysis for the number of differentially expressed genes found.
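To make the filtering suggestion concrete, here is a minimal sketch in plain Python of dropping lowly expressed features and then recomputing the sample-to-sample Pearson correlation on log2(CPM + 1). It is not what the Trinity or edgeR scripts do internally. The function names, the `min_cpm=1.0` cutoff, and the `min_samples=3` requirement (chosen here to match the smallest group size of 3 replicates) are all assumptions for illustration, and the input is assumed to be a raw count matrix loaded as a dict mapping feature name to per-sample counts.

```python
import math


def cpm(counts):
    """Convert raw counts (feature -> list of per-sample counts) to counts per million."""
    n_samples = len(next(iter(counts.values())))
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    return {
        feature: [1e6 * c / lib_sizes[j] if lib_sizes[j] else 0.0
                  for j, c in enumerate(row)]
        for feature, row in counts.items()
    }


def filter_low_expression(counts, min_cpm=1.0, min_samples=3):
    """Keep features with CPM >= min_cpm in at least min_samples samples.

    Both thresholds are illustrative assumptions, not pipeline defaults.
    """
    cpms = cpm(counts)
    return {
        feature: counts[feature]
        for feature, row in cpms.items()
        if sum(v >= min_cpm for v in row) >= min_samples
    }


def pearson(x, y):
    """Pearson correlation of two equal-length lists; NaN if either is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def sample_correlations(counts):
    """Sample-by-sample Pearson correlation matrix on log2(CPM + 1).

    CPM is recomputed from the counts passed in, so library sizes reflect
    only the features that survived filtering.
    """
    cpms = cpm(counts)
    n_samples = len(next(iter(cpms.values())))
    columns = [[math.log2(row[j] + 1) for row in cpms.values()]
               for j in range(n_samples)]
    return [[pearson(a, b) for b in columns] for a in columns]
```

Calling `sample_correlations(filter_low_expression(counts))` and comparing the result with `sample_correlations(counts)` on the unfiltered matrix shows how much of the poor replicate clustering is driven by low-count noise. As noted above, this may mitigate the problem rather than solve it.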