Sub-sampling RNA-seq data


6.5 years ago

mohammedtoufiq91 ▴ 260

Hi,

I would like to know whether sub-sampling an unaligned FASTQ file and sub-sampling the corresponding aligned SAM/BAM file produce similar outputs. Is it better to sub-sample the FASTQ file or the aligned SAM/BAM file?

Thank you, Toufiq

RNA-Seq fastq sam bam downsampling • 2.5k views



Not part of your question, but if you are interested in expression, the easiest approach would probably be to sample the counts file. I think all approaches give more or less the same result, but you did not tell us anything about the aim of the analysis.
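As a rough illustration of sampling the counts, here is a minimal sketch that thins a per-gene count table by keeping each read independently with a fixed probability (binomial thinning). The function name `downsample_counts`, the toy counts, and the choice of binomial thinning are assumptions for illustration, not something taken from a specific tool.

```python
import random


def downsample_counts(counts, fraction, seed=0):
    """Thin a {gene: count} table, keeping each read with probability `fraction`.

    Hypothetical helper: every read is an independent coin flip, so the
    result for each gene follows a binomial distribution.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)  # fixed seed makes the draw reproducible
    thinned = {}
    for gene, n in counts.items():
        # One coin flip per read; fine for a toy table, slow for real depths.
        thinned[gene] = sum(1 for _ in range(n) if rng.random() < fraction)
    return thinned


# Toy counts for one sample (made-up numbers, gene set chosen for illustration).
full = {"HBB": 6000, "HBA1": 2500, "GAPDH": 900, "ACTB": 600, "TP53": 0}
half = downsample_counts(full, 0.5)
```

The per-read loop is only meant to make the idea explicit. For a real count matrix, a vectorised binomial draw (for example NumPy's `Generator.binomial`) would do the same thing much faster.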

6.5 years ago by WouterDeCoster 48k


Thank you. We have generated a set of RNA-seq samples from blood (not globin-depleted). These are human paired-end samples with a read length of 150 bp. After alignment against the hg19 genome, the alignment rate ranges from 84% to 91% across samples. Quantification gave a rough estimate that 30-62% of the read counts correspond to the globin genes (HBA1, HBA2, HBB); these would later be filtered out before normalization. After this, differential expression and fold-change comparison between three subjects would be performed. We are currently doing a technical validation to check whether the RNA-seq analysis and results are comparable or different at a depth of 100M versus 50M reads.
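For paired-end data, the main thing to watch when sub-sampling at the FASTQ level is that both mates of a pair are kept or dropped together. A minimal sketch, assuming plain 4-line FASTQ records in the same order in both files; `read_fastq` and `subsample_pairs` are hypothetical helper names and the reads below are synthetic:

```python
import io
import random
from itertools import islice


def read_fastq(handle):
    """Yield FASTQ records as tuples of 4 lines (assumes no wrapped sequences)."""
    while True:
        record = tuple(islice(handle, 4))
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield record


def subsample_pairs(r1, r2, out1, out2, fraction, seed=42):
    """Keep each read pair with probability `fraction`; return the number kept.

    One random draw decides the fate of both mates, so R1 and R2 stay in sync.
    Note: zip() stops at the shorter input if the files differ in length.
    """
    rng = random.Random(seed)
    kept = 0
    for rec1, rec2 in zip(read_fastq(r1), read_fastq(r2)):
        if rng.random() < fraction:
            out1.writelines(rec1)
            out2.writelines(rec2)
            kept += 1
    return kept


# Synthetic in-memory example standing in for the R1/R2 files.
r1 = io.StringIO("".join(f"@read{i}/1\nACGT\n+\nIIII\n" for i in range(1000)))
r2 = io.StringIO("".join(f"@read{i}/2\nTGCA\n+\nIIII\n" for i in range(1000)))
out1, out2 = io.StringIO(), io.StringIO()
kept = subsample_pairs(r1, r2, out1, out2, fraction=0.5)
```

Because each pair is an independent draw, the output depth is only approximately `fraction` times the input (about 50M from 100M at `fraction=0.5`), not an exact read count.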

6.5 years ago by mohammedtoufiq91 ▴ 260


I think this could be useful: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC4296149/