kmergenie does not predict a best k value
6.4 years ago
StudentBio • 0

Hi, I am trying out kmergenie to determine the optimal k-mer value for a Phoenix dactylifera genome assembly, and I get the error below. Any suggestions, please?

./kmergenie /home/vshare/outils/trimmed.fastq --diploid

running histogram estimation

Setting maximum kmer length to: 133 bp

computing histograms (from k=21 to k=121): 21 31 41 51 61 71 81 91 101 111 121 

ntCard wall-clock time over all k values: 2172 seconds 

fitting model to histograms to estimate best k

could not fit histograms-k101.histo

could not fit histograms-k111.histo

could not fit histograms-k121.histo

could not fit histograms-k21.histo

could not fit histograms-k31.histo

could not fit histograms-k41.histo

could not fit histograms-k51.histo

could not fit histograms-k61.histo

could not fit histograms-k71.histo

could not fit histograms-k81.histo

could not fit histograms-k91.histo

could not predict a best k value

No best k found
kmergenie assembly • 2.5k views

What are the expected genome size and ploidy, and what is the target sequencing coverage? Did you check for contaminants (bacterial, human, etc.), and did you remove sequencing adapters?


I'm sorry, but I don't know how to estimate the genome size, ploidy, and target sequencing coverage. That is why I am trying to find the best k to use with GenomeScope, which detects the genome characteristics. According to the FastQC report: sequence length 20-397 and %GC 42.

As for my reads, I trimmed them using sickle.

(I use the --diploid option because, according to a study, the Phoenix dactylifera genome contains 18 pairs of chromosomes.)


According to another study, the genome size should be around 670 Mb. You can calculate the target sequencing coverage from this genome size estimate. These considerations are important for designing the best sequencing strategy and choosing an appropriate assembler.
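
To make the coverage calculation concrete, here is a minimal Python sketch. It assumes average coverage is simply the total number of sequenced bases divided by the genome size, and that the reads are in an uncompressed FASTQ file with exactly four lines per record. The function names `total_bases_in_fastq` and `estimated_coverage` are made up for this illustration; they are not part of kmergenie or any other tool.

```python
from itertools import islice

# Assumption: ~670 Mb, the genome size estimate quoted in this reply.
GENOME_SIZE_BP = 670_000_000


def total_bases_in_fastq(path):
    """Sum the read lengths in an uncompressed, 4-lines-per-record FASTQ file."""
    total = 0
    with open(path) as handle:
        # The sequence is the 2nd line of every 4-line record,
        # i.e. 0-based line indices 1, 5, 9, ...
        for seq in islice(handle, 1, None, 4):
            total += len(seq.strip())
    return total


def estimated_coverage(total_bases, genome_size_bp=GENOME_SIZE_BP):
    """Average depth of coverage = total sequenced bases / genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return total_bases / genome_size_bp
```

For example, `estimated_coverage(total_bases_in_fastq("trimmed.fastq"))` would give the approximate coverage of the trimmed reads against a 670 Mb genome. As a sanity check on the arithmetic, 20.1 Gb of reads over 670 Mb corresponds to 30x.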

Why do you want to assemble if there is a reference genome available? If all the data you have at hand are these short reads (length 20-397), your assembly will most likely be worse than the published genome. What analyses do you intend to perform downstream? I have the feeling that mapping to this reference genome will be a better approach.
