My kmer distribution is weird
8.1 years ago
Picasa ▴ 650

Hi,

I used KmerGenie and SGA preqc to evaluate my data before assembly.

However, my k-mer distribution graphs look a bit weird, and I am not sure how to interpret them.

I know that low-count k-mers typically contain sequencing errors, and that I should see a peak somewhere at higher counts.

But here I have no peak at all. Do you have any idea what is going on?

Sga preqc

http://imgur.com/a/muB6k

Kmergenie

http://imgur.com/a/UEt9e


Did you by any chance pre-filter your data by quality? The absence of a peak usually indicates either too little coverage for your species or contamination in your samples.

If you did filter it, try running the analysis without filtering, or with more lenient quality trimming (something like q=10 instead of the usual 20 or 30).


I used Trimmomatic to filter my data with Q>30 and a minimum length of 40.

Those graphs are for the raw reads. However, I discard only 5% of the reads in the trimming step, so the graphs for the trimmed data look very similar.


If they are for the raw reads, then it is probably just a coverage problem, i.e. you need to sequence your species much more deeply. Try estimating how much coverage you have as TotalBases / GenomeSize.
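
The TotalBases / GenomeSize check can be sketched in a few lines of Python. This is only a rough illustration: the function name `estimate_coverage` and the numbers used (2 million reads of 100 bp, a 5 Mb genome) are made up for the example and are not taken from your data.

```python
def estimate_coverage(n_reads, mean_read_length, genome_size):
    """Expected sequencing depth: total sequenced bases / genome size."""
    if genome_size <= 0:
        raise ValueError("genome_size must be positive")
    total_bases = n_reads * mean_read_length
    return total_bases / genome_size


# Hypothetical values, for illustration only.
coverage = estimate_coverage(n_reads=2_000_000,
                             mean_read_length=100,
                             genome_size=5_000_000)
print(f"Estimated coverage: {coverage:.1f}x")
```

Plug in your own read count, mean read length and expected genome size. If the result is only a few x, a missing peak in the k-mer histogram would not be surprising.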

If you ran these on the trimmed reads, note that your minimum length is 40 while k=51. A read shorter than k cannot contribute any k-mers, so a significant amount of data might be lost; just increase the minimum length to 52. Q>30 is also already too strict.
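
To make the length issue concrete: a read of length L contains L - k + 1 k-mers, and none at all when L < k. Here is a minimal sketch of that arithmetic. The helper names `kmers_per_read` and `usable_fraction` and the example read lengths are hypothetical, chosen only for illustration.

```python
def kmers_per_read(read_length, k):
    """Number of k-mers in one read: L - k + 1, or 0 if the read is shorter than k."""
    return max(0, read_length - k + 1)


def usable_fraction(read_lengths, k):
    """Fraction of reads that are long enough to contain at least one k-mer."""
    if not read_lengths:
        return 0.0
    usable = sum(1 for length in read_lengths if length >= k)
    return usable / len(read_lengths)


# Hypothetical read lengths after trimming with a minimum length of 40.
lengths = [40, 45, 51, 100]
for length in lengths:
    print(length, kmers_per_read(length, k=51))
print("usable fraction:", usable_fraction(lengths, k=51))
```

With k=51, the 40 bp and 45 bp reads above contribute nothing to the histogram, even though they passed the trimming filter.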
