3.6 years ago
glady
Hello,
If I have a sample sequenced at two different depths, 15-20M and 80-100M reads:
1) How much will the expression (raw count) of a given coding gene differ between these two libraries?
2) Is the read count directly proportional to sequencing depth? If yes, how can we compare genes/features across samples that have different sequencing depths?
is this your homework???
No! I had this doubt because, when we download sequences from an open data source, the downloaded samples may have been generated with different sequencing protocols.
So how valid is it to compare a gene/feature across samples that have different sequencing depths?
OK, answering your questions:
1) Low-expressed genes will be affected the most. Highly expressed genes can be normalized, but in the shallower library many low-expressed genes will get few or no reads and will be lost. The question is whether those missing genes affect the DEG analysis (and they could).
2) In general, yes: more reads means higher raw counts for the same underlying expression. To compare conditions you will need to normalize your libraries. R packages such as edgeR and DESeq2 have specific normalization methods (TMM in edgeR, median-of-ratios size factors in DESeq2).
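To make the normalization idea concrete, here is a minimal sketch in plain Python. It is not the edgeR or DESeq2 implementation; it only illustrates two common ideas: counts per million (CPM), and a simplified median-of-ratios size factor similar in spirit to what DESeq2 does. The gene names, the counts, and the function names `cpm` and `size_factors` are all made up for illustration, and the "deep" library is assumed to be exactly 5x the "shallow" one for the genes detected in both.

```python
import math
from statistics import median

def cpm(counts):
    """Counts per million: scale each raw count by the library's total count."""
    total = sum(counts.values())
    return {gene: c * 1_000_000 / total for gene, c in counts.items()}

def size_factors(samples):
    """Simplified median-of-ratios size factors (one factor per sample).

    Only genes with a non-zero count in every sample are used, because the
    geometric mean is undefined when a count is zero.
    """
    genes = [g for g in samples[0] if all(s[g] > 0 for s in samples)]
    # Log of the per-gene geometric mean across samples.
    log_geo_mean = {
        g: sum(math.log(s[g]) for s in samples) / len(samples) for g in genes
    }
    # Per sample: median of the log ratios to the geometric mean.
    return [
        math.exp(median(math.log(s[g]) - log_geo_mean[g] for g in genes))
        for s in samples
    ]

# Hypothetical toy counts for the same sample at two depths.
shallow = {"geneA": 1500, "geneB": 300, "geneC": 2, "geneD": 0}
deep = {"geneA": 7500, "geneB": 1500, "geneC": 10, "geneD": 3}

factors = size_factors([shallow, deep])
normalized = [
    {g: c / f for g, c in sample.items()}
    for sample, f in zip([shallow, deep], factors)
]

print("size factors:", factors)
print("CPM (shallow):", cpm(shallow))
print("normalized geneA:", normalized[0]["geneA"], normalized[1]["geneA"])
```

After dividing by the size factors, geneA and geneB become comparable between the two libraries. geneD, however, stays at zero in the shallow library: normalization rescales counts, but it cannot recover genes that were never sampled at the lower depth.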