5.9 years ago
mohammedtoufiq91
Hi,
I would like to know whether sub-sampling an unaligned FASTQ file and sub-sampling the corresponding aligned SAM/BAM file produce similar outputs. Is it better to sub-sample the FASTQ file or the aligned SAM/BAM file?
Thank you, Toufiq
Not part of your question, but if you are interested in expression, the easiest approach would probably be to sub-sample the counts file. I think all of these approaches give more or less the same result, but you did not tell us anything about the aim of the analysis.
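To illustrate the counts-file route, here is a minimal sketch in Python using binomial thinning: each read counted for a gene is kept independently with probability `fraction`. The function name `downsample_counts` and the example numbers are hypothetical, and the sketch assumes integer raw counts (not normalized values) with NumPy available. It only approximates sub-sampling the reads themselves, since it ignores effects such as multi-mapping.

```python
import numpy as np

def downsample_counts(counts, fraction, seed=0):
    """Thin a vector or matrix of raw integer counts to roughly `fraction` of its depth.

    Each count n is replaced by a draw from Binomial(n, fraction), which is
    equivalent to keeping every counted read independently with that probability.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = np.random.default_rng(seed)  # fixed seed so the result is reproducible
    counts = np.asarray(counts, dtype=np.int64)
    return rng.binomial(counts, fraction)

# Hypothetical genes-by-samples matrix of raw counts (made-up numbers).
example = np.array([[1000, 2000],
                    [10, 0],
                    [500000, 450000]])
half_depth = downsample_counts(example, 0.5)
```

The thinned matrix keeps the same shape as the input, so it can go through the same filtering and normalization steps as the full-depth counts.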
Thank you. We have generated a set of RNA-seq samples from blood (not globin-depleted). These are human paired-end samples with a read length of 150 bp. After alignment against the hg19 genome, the alignment rate ranges from 84% to 91% across samples. Quantification gave a rough estimate that 30-62% of the read counts correspond to the globin genes (HBA1, HBA2, HBB); these would later be filtered out before normalization. After this, differential expression and fold-change comparisons between three subjects would be performed. We are currently doing technical validation to check whether the RNA-seq analysis and results would be comparable or different at a depth of 100M versus 50M reads.
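For a 100M versus 50M comparison on paired-end data, sub-sampling at the FASTQ level has to keep both mates of each pair together. A minimal sketch of one way to do that in Python follows; it assumes uncompressed, 4-line-per-record FASTQ files with R1 and R2 in the same order, and the helper names `read_records` and `subsample_pairs` are hypothetical. Each pair is kept with probability `fraction`, so the output size is approximately, not exactly, that fraction of the input.

```python
import random
from itertools import islice

def read_records(handle):
    """Yield FASTQ records as lists of 4 lines (assumes no wrapped sequence lines)."""
    while True:
        record = list(islice(handle, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record

def subsample_pairs(r1, r2, out1, out2, fraction, seed=0):
    """Write each read pair to out1/out2 with probability `fraction`.

    A single random draw decides the fate of both mates, so the two output
    files stay in sync. Returns the number of pairs kept.
    """
    rng = random.Random(seed)  # fixed seed makes the selection reproducible
    kept = 0
    for rec1, rec2 in zip(read_records(r1), read_records(r2)):
        if rng.random() < fraction:
            out1.writelines(rec1)
            out2.writelines(rec2)
            kept += 1
    return kept
```

With file handles opened on the R1 and R2 files, calling `subsample_pairs(r1, r2, out1, out2, 0.5)` would give roughly half of the pairs. Note that in this dataset the globin fraction differs between samples, so the same read depth does not mean the same usable depth after globin filtering.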
I think this could be useful: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4296149/
Thank you. This was helpful.