8.2 years ago
IP
Hi everyone!
I am analysing some RNA-seq samples from cancer patients with TopHat using default parameters. My reads are paired-end, and both the mapping percentage and the quality control of the right reads are good.
However, the mapping rate of the left reads is around 50%, and I see high enrichment of k-mers in the middle of these reads across all the samples in the experiment. The rest of the QC looks good for the samples.
I am afraid that I am getting low mapping rates because of this. What should I do? How should I remove these k-mers, given that they are in the middle of the reads?
I am attaching some figures of the kmer content.
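For anyone wanting to see which k-mers are enriched and where, here is a minimal sketch that counts k-mers by start position across reads. It is not how FastQC computes its k-mer module; it is just a plain count, and the function names and the choice of k=7 are assumptions for illustration.

```python
from collections import Counter


def read_fastq_sequences(lines):
    """Yield the sequence line of each 4-line FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 1:
            yield line.strip()


def kmers_by_position(sequences, k=7):
    """Count every k-mer at every start position across the reads."""
    counts = {}  # start position -> Counter of k-mers seen there
    for seq in sequences:
        for pos in range(len(seq) - k + 1):
            counts.setdefault(pos, Counter())[seq[pos:pos + k]] += 1
    return counts


def top_kmer_per_position(counts):
    """Return (position, kmer, count) for the most common k-mer at each position."""
    return [(pos, *counts[pos].most_common(1)[0]) for pos in sorted(counts)]


# Tiny in-memory demo; with real data, pass an open FASTQ file handle
# to read_fastq_sequences() instead of this list.
demo_fastq = [
    "@r1", "ACGTAAAA", "+", "IIIIIIII",
    "@r2", "TTGTAAAC", "+", "IIIIIIII",
    "@r3", "GGGTAAAG", "+", "IIIIIIII",
]
demo_counts = kmers_by_position(read_fastq_sequences(demo_fastq), k=4)
for pos, kmer, n in top_kmer_per_position(demo_counts):
    print(pos, kmer, n)
```

A k-mer that dominates a run of consecutive middle positions in most reads usually points to one specific sequence, which you can then search for directly.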
What about the rest of the QC? Q Scores, adapter contamination, is that all looking reasonable?
Yes, the rest of the QC looks good for the reads
Were these low-input RNA samples? Is there any chance of over-amplification during library prep? Have you run BLAST on the 50% of reads that don't align?
They weren't low input RNA samples, and I am going to check those reads with blast, thank you
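A small random subset is usually enough for a BLAST check. Below is a rough sketch that draws a random sample of read sequences and formats them as FASTA for pasting into a BLAST search. It assumes you have already extracted the unaligned reads as plain sequences; the helper names and the `read` prefix are made up for the example.

```python
import random


def reservoir_sample(items, n, seed=0):
    """Draw up to n items uniformly at random from an iterable of unknown length."""
    rng = random.Random(seed)
    sample = []
    for i, item in enumerate(items):
        if i < n:
            sample.append(item)
        else:
            # Replace an existing element with probability n / (i + 1).
            j = rng.randint(0, i)
            if j < n:
                sample[j] = item
    return sample


def to_fasta(sequences, prefix="read"):
    """Format sequences as FASTA text with numbered headers."""
    return "".join(f">{prefix}_{i}\n{seq}\n" for i, seq in enumerate(sequences, 1))


# Demo with made-up sequences standing in for unaligned reads.
unaligned = ["ACGTACGTAC", "GGGGCCCCGG", "TTTTAAAATT", "ACACACACAC"]
print(to_fasta(reservoir_sample(unaligned, 2)))
```

If most of the sampled reads hit the same organism or the same short sequence, that tells you what the extra material is.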
Are you sure you don't have a spike somewhere in the GC distribution? This is the sort of thing I often see with contamination of a single short fragment.
Yes! I have a small spike in the G+C distribution. I thought it wasn't important. How should I deal with it?
You don't need to do anything; just be aware that you have a bit of something extra in there (presumably adapter dimers or something like that).
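If you are curious what sits under the GC spike, one option is to bin reads by GC percentage and list the most frequent sequences in the bin where the spike is. This is only a sketch under that assumption; the function names and the 40-60% window in the demo are illustrative, not taken from the thread.

```python
from collections import Counter


def gc_percent(seq):
    """Return the percentage of G and C bases in a sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def gc_histogram(sequences):
    """Count reads per integer GC percentage (truncated, 0 to 100)."""
    return Counter(int(gc_percent(s)) for s in sequences)


def reads_in_gc_bin(sequences, low, high):
    """Count identical reads whose GC percentage falls within [low, high]."""
    return Counter(s for s in sequences if low <= gc_percent(s) <= high)


# Demo with made-up reads; replace with your own sequences.
reads = ["ATGC", "ATGC", "AAAA", "GGCC"]
print(sorted(gc_histogram(reads).items()))
print(reads_in_gc_bin(reads, 40, 60).most_common(1))
```

A single sequence dominating the spike bin would be consistent with contamination by one short fragment, as suggested above.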