K-mers in the middle of the read
8.2 years ago
IP ▴ 770

Hi everyone!

I am analysing some RNA-seq samples from cancer patients using TopHat with default parameters. My reads are paired-end, and for the right reads both the mapping percentage and the quality control look good.

However, the mapping rate of the left reads is only around 50%, and these reads show a high enrichment of k-mers in the middle of the read, across all samples in the experiment. The rest of the QC looks good for the samples.

I am afraid that this k-mer enrichment is the reason for the low mapping rate. What should I do? How can I remove these k-mers, given that they are in the middle of the reads?

I am attaching some figures of the kmer content.

kmer profile of a control sample

kmer profile of another control sample

kmer profile of a disease sample

RNA-Seq kmers mapping • 1.9k views

What about the rest of the QC? Q Scores, adapter contamination, is that all looking reasonable?

Yes, the rest of the QC looks good for the reads

Were these low-input RNA samples? Is there any chance of over-amplification during prep? Have you checked the ~50% of reads that don't align with BLAST?
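One way to do the BLAST check is to pull a small sample of the unaligned reads into FASTA and paste it into a BLAST search. Here is a minimal sketch, assuming the unaligned left reads have already been extracted to a plain four-line-per-record FASTQ file; the function name `fastq_to_fasta` and the file name in the comment are made up for illustration.

```python
from itertools import islice


def fastq_to_fasta(fastq_lines, max_reads=100):
    """Convert the first `max_reads` FASTQ records to FASTA text.

    Assumes the simple four-line FASTQ layout (header, sequence, '+', qualities).
    """
    records = []
    lines = iter(fastq_lines)
    while len(records) < max_reads:
        chunk = list(islice(lines, 4))
        if len(chunk) < 4:
            break  # end of input or a truncated final record
        header = chunk[0].strip()
        seq = chunk[1].strip()
        # Drop the leading '@' of the FASTQ header and use '>' instead.
        records.append(">" + header[1:] + "\n" + seq)
    return "\n".join(records) + ("\n" if records else "")


# Hypothetical usage (the file name is an assumption, not from the thread):
# with open("unaligned_left.fastq") as fh:
#     print(fastq_to_fasta(fh, max_reads=50))
```

A few dozen reads are usually enough to see whether the hits are dominated by one organism or one short sequence.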

They weren't low-input RNA samples. I am going to check those reads with BLAST, thank you.

Are you sure you don't have a spike somewhere in the GC distribution? This is the sort of thing I often see with contamination of a single short fragment.
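If you want to look at the GC distribution outside of the QC report, a rough sketch like the one below tallies per-read GC percentage so a spike shows up as one bin with an unexpectedly high count. The helper names `gc_percent` and `gc_histogram` are illustrative, and rounding to whole percent is an arbitrary choice.

```python
from collections import Counter


def gc_percent(seq):
    """Return GC content of one read as a whole-number percent, or None if no called bases."""
    called = [base for base in seq.upper() if base in "ACGT"]
    if not called:
        return None  # read consists only of N or other ambiguity codes
    gc = sum(base in "GC" for base in called)
    return round(100 * gc / len(called))


def gc_histogram(seqs):
    """Count how many reads fall into each GC-percent bin."""
    hist = Counter()
    for seq in seqs:
        pct = gc_percent(seq)
        if pct is not None:
            hist[pct] += 1
    return hist
```

Printing `sorted(gc_histogram(reads).items())` for the left reads gives a table you can scan for a single bin that stands well above its neighbours.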

Yes! I have a small spike in the G+C distribution. I thought it wasn't important. How should I deal with it?

You don't need to do anything; just be aware that you have a bit of extra something in there (presumably adapter dimers or something like that).
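If you are curious what that extra something is, one option is to tally k-mers by their start position in the read and look at the most frequent one around the middle. This is only a sketch: `positional_kmer_counts` and `top_kmer_per_position` are made-up names, and `k=7` is just an illustrative default.

```python
from collections import Counter, defaultdict


def positional_kmer_counts(seqs, k=7):
    """Map each 0-based start position to a Counter of the k-mers starting there."""
    counts = defaultdict(Counter)
    for seq in seqs:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            counts[i][seq[i:i + k]] += 1
    return counts


def top_kmer_per_position(seqs, k=7):
    """Return {position: (most frequent k-mer, count)} for every start position seen."""
    counts = positional_kmer_counts(seqs, k)
    return {pos: counter.most_common(1)[0] for pos, counter in sorted(counts.items())}
```

Joining the top k-mers from neighbouring positions often reconstructs the overrepresented sequence, which can then be compared against the adapter sequences used in the library prep.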
