Hello, I have computed the coverage of my reads with bamQC and I got, for one of my samples:
>>>>>>> Coverage
mean coverageData = 33.5984X
std coverageData = 4,599.979X
There is a 1,3% of reference with a coverageData >= 1X
There is a 1,14% of reference with a coverageData >= 2X
There is a 0,85% of reference with a coverageData >= 3X
A mean of 33X would be good for RNA-seq analysis, but the standard deviation is very large and only a very small percentage of the reference is covered at 3X or more, which seems very bad. What should I conclude? Does such a high standard deviation mean the coverage is bad despite the high mean, or is the mean coverage still good?
Please help me interpret these results.
Thank you
What is the motivation for this analysis, i.e. what question are you trying to answer? With the information given, it is hard to interpret anything without knowing the question.
I want to know if my reads are good enough to use for differential expression analysis. I have read that 2X coverage is enough for that. What do you think?
Fold coverage is generally a useless concept in RNAseq.
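A quick back-of-the-envelope check on the numbers in the question shows why. This is only a sketch: it assumes the "1,3%" in the report uses a decimal comma (1.3%), and the function name is made up for illustration. Since the mean is taken over the whole reference but only a small fraction of positions has any reads, all of the depth is concentrated on that small fraction:

```python
def mean_depth_on_covered(mean_coverage, fraction_covered):
    """Mean depth restricted to positions with at least 1X coverage.

    Total depth = mean_coverage * reference_length, and all of it lies on
    fraction_covered * reference_length positions, so the length cancels.
    """
    return mean_coverage / fraction_covered


# Values copied from the bamQC report in the question.
mean_coverage = 33.5984    # mean depth over the whole reference (X)
fraction_covered = 0.013   # "1,3%" read as 1.3% (assumption: decimal comma)

depth = mean_depth_on_covered(mean_coverage, fraction_covered)
print(f"Mean depth on covered positions: {depth:.0f}X")
```

So the 33X mean comes from a tiny part of the reference carrying a depth in the thousands, while almost 99% of it has no reads at all. In RNA-seq that kind of skew is expected to some degree, because depth follows expression level, which is why the mean alone tells you very little.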
So how do I know if my FASTQ file is good enough for differential expression analysis? For this sample, only 4.7% of the reads mapped to the reference.
4.7% is crazy low; there is obviously something wrong. Focus on figuring out why such a small percentage of your reads are aligning.
Okay, thank you, but I have read that 2X is the minimum mapped coverage for using the reads for differential expression analysis. I want to know if this FASTQ is good enough or if I need to repeat the sequencing of this sample.
I don't know where you read that, but please follow guidelines such as this one:
Bioconductor RNA-seq workflow: gene-level exploratory analysis and differential expression
Hello Corentin, thank you for your detailed answer. So would looking at the percentage of uniquely mapped reads be enough? How can I decide if the percentage of uniquely mapped reads is good enough for analysis?
Please use ADD COMMENT or ADD REPLY to respond to previous reactions, so that the thread remains logically structured and easy to follow. I have now moved your reaction, but as you can see it's not optimal. Adding an answer should only be used for providing a solution to the question asked.
"Good enough" is subjective and depends on your experiment, your genome, your read quality, etc. But the higher the percentage of mapped reads, the better.
You can look in the literature and tutorials for additional steps to check and analyse your data. RNA-seq is a popular method, and a lot of resources are available that will explain it more thoroughly and better than I can.
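If you want to compute the mapped percentage yourself rather than rely on the aligner's log, here is a minimal sketch that works on SAM-formatted text. It only uses the standard SAM flag bits (0x4 unmapped, 0x100 secondary, 0x800 supplementary); the function name is made up for illustration, and it does not distinguish uniquely mapped from multi-mapped reads, since how those are marked depends on the aligner:

```python
def mapping_rate(sam_lines):
    """Fraction of primary alignment records that are mapped."""
    total = 0
    mapped = 0
    for line in sam_lines:
        if line.startswith("@"):      # skip header lines
            continue
        fields = line.split("\t")
        if len(fields) < 2:           # skip blank or malformed lines
            continue
        flag = int(fields[1])
        if flag & 0x900:              # skip secondary (0x100) and supplementary (0x800)
            continue
        total += 1
        if not flag & 0x4:            # 0x4 set means the read is unmapped
            mapped += 1
    return mapped / total if total else 0.0


if __name__ == "__main__":
    example = [
        "@HD\tVN:1.6",
        "r1\t0\tchr1\t100\t60\t4M\t*\t0\t0\tACGT\tIIII",
        "r2\t4\t*\t0\t0\t*\t*\t0\t0\tTTTT\tIIII",
        "r1\t256\tchr2\t50\t0\t4M\t*\t0\t0\tACGT\tIIII",
    ]
    print(f"{mapping_rate(example):.1%} of reads mapped")
```

Each mate of a pair is counted as its own record here, so the result may differ slightly from a tool that reports per-pair statistics.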
This is the table of some of the samples after mapping:
Something appears to have gone wrong with your first 4 samples; presumably they're heavily contaminated or something similar. BLAST some of the reads that didn't align to try to determine what went wrong.
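As a rough sketch of how to pull out a handful of unaligned reads for BLAST, assuming your aligner kept unmapped reads in the BAM: the script below reads SAM text on stdin and writes FASTA for records with the unmapped flag (0x4) set. The function name, script name, file names, and the limit of 100 reads are all placeholders, not anything from your pipeline.

```python
import sys


def unmapped_reads_to_fasta(sam_lines, max_reads=100):
    """Return FASTA text for up to max_reads unmapped records in SAM input."""
    records = []
    for line in sam_lines:
        if line.startswith("@"):          # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:              # SAM has 11 mandatory columns
            continue
        name, flag, seq = fields[0], int(fields[1]), fields[9]
        if flag & 0x4 and seq != "*":     # unmapped, and a sequence is stored
            records.append(f">{name}\n{seq}")
            if len(records) >= max_reads:
                break
    return "\n".join(records)


if __name__ == "__main__":
    fasta = unmapped_reads_to_fasta(sys.stdin)
    if fasta:
        print(fasta)
```

Usage would be something like `samtools view sample.bam | python unmapped_to_fasta.py > unmapped.fa`, after which you can paste some of the sequences into the NCBI BLAST web page. If most of the hits come from another organism or from rRNA, that points to contamination or a failed depletion step rather than a problem with the mapping itself. For paired-end data both mates share a read name, so names may repeat in the output.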