Hi:
I am interested in using human tissue data from SRA dataset SRP007412.
However, after running fastq-dump and TopHat,
I found that the mapping rate is rather poor: most samples were below 70%, and some were as low as 45%.
What would you typically do when you encounter public data with such a low mapping rate?
Thanks in advance!
Best wishes
Hi Sukhdeep:
I used the mapping rate reported in the align_summary.txt file, which is produced by TopHat.
I think this is the same as what you suggested.
Thanks.
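For anyone who wants to collect these numbers across many samples, here is a minimal Python sketch. It assumes each align_summary.txt contains a line of the form "45.0% overall read mapping rate.", which is what TopHat2 writes as far as I know; older versions may differ. The function names and the 70% threshold are made up for illustration.

```python
import re

# Assumed format: the summary contains a line such as
# "68.3% overall read mapping rate."
RATE_PATTERN = re.compile(r"(\d+(?:\.\d+)?)%\s+overall read mapping rate")


def overall_mapping_rate(summary_text):
    """Return the overall mapping rate in percent, or None if the line is absent."""
    match = RATE_PATTERN.search(summary_text)
    return float(match.group(1)) if match else None


def flag_low_samples(summaries, threshold=70.0):
    """Given {sample name: summary text}, return (name, rate) pairs below
    the threshold, sorted from lowest to highest rate."""
    rates = {name: overall_mapping_rate(text) for name, text in summaries.items()}
    low = [(name, rate) for name, rate in rates.items()
           if rate is not None and rate < threshold]
    return sorted(low, key=lambda item: item[1])
```

To use it, read each sample's align_summary.txt into a string (for example with `pathlib.Path(...).read_text()`), build a dictionary keyed by sample name, and pass it to `flag_low_samples`.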
Hey, that's right then. I don't think we can generalise that public data has lower mappability. You could try pulling some other recent datasets just to test that. It could also be that the library is over-sequenced and is therefore producing a lot of duplicates, or that some samples are contaminated. Run the downstream processing and see if you are happy with the results; if the saturation limit has been reached, you might not care, or might not be able to do anything about it anyway.
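As a rough way to check the duplicates idea, here is a small Python sketch that estimates the fraction of exact-duplicate sequences in a FASTQ file. It is only a crude proxy: it assumes plain four-line FASTQ records (no wrapped sequences), looks only at identical read sequences, and RNA-seq naturally shows many duplicates from highly expressed genes. The function name and the `max_reads` cap are hypothetical.

```python
from collections import Counter
from itertools import islice


def duplicate_fraction(fastq_lines, max_reads=1_000_000):
    """Fraction of reads whose exact sequence repeats an earlier read,
    computed over at most the first max_reads records."""
    # In four-line FASTQ, the sequence is the 2nd line of each record,
    # i.e. line indices 1, 5, 9, ...
    sequences = islice(fastq_lines, 1, max_reads * 4, 4)
    counts = Counter(seq.strip() for seq in sequences)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # Every read beyond the first copy of a sequence counts as a duplicate.
    return 1.0 - len(counts) / total
```

Any iterable of lines works, so you can pass an open file handle for an uncompressed FASTQ, or one from `gzip.open(path, "rt")` for a compressed file.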
Also, this might help:
why low mapping rates for RNAseq?