Dear all,
I am trying to run a hybrid short/long-read de novo assembly of MiSeq 2x75 paired-end data and TruSeq long-read data (1500-16608 bp) using three different assemblers: Velvet, ABySS, and SOAPdenovo.
I have already generated:
1) Velvet contigs using MiSeq only
2) ABySS contigs using MiSeq only
3) SOAP contigs using MiSeq only
Then I combined each set of contigs with the long reads. For example:
cat Velvet_67/contigs.fa LongRead.fasta > Velvet_Hybrid.fasta
Now I want to run assemblies on each hybrid FASTA file with the three assemblers mentioned above.
My question is, how do I select the k-mer size?
For example, my Velvet_Hybrid.fasta has a minimum read length of 133 and a maximum read length of 16608.
Kenny
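As a rough starting point for the k-mer question, here is a minimal Python sketch that measures sequence lengths in a FASTA file and lists odd k values below the shortest sequence. It relies on two Velvet facts: k must be odd, and sequences shorter than k contribute nothing to the graph. The function names, the floor of 21, and the step of 10 are illustrative assumptions, not recommendations. The largest k Velvet accepts is also capped by the MAXKMERLENGTH value it was compiled with.

```python
def fasta_lengths(lines):
    """Yield the length of every record in an iterable of FASTA lines."""
    length = None
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if length is not None:
                yield length
            length = 0
        elif length is not None:
            length += len(line)
    if length is not None:
        yield length


def candidate_kmers(min_len, step=10, floor=21):
    """Return odd k values from floor up to, but not including, min_len."""
    start = floor if floor % 2 == 1 else floor + 1
    step = step if step % 2 == 0 else step + 1  # an even step keeps k odd
    return list(range(start, min_len, step))


def summarize(path):
    """Return (count, min, max) of sequence lengths for a FASTA file."""
    with open(path) as handle:
        lengths = list(fasta_lengths(handle))
    return len(lengths), min(lengths), max(lengths)
```

Calling summarize("Velvet_Hybrid.fasta") and passing the minimum to candidate_kmers gives a list of values to try. This only bounds the search so that no sequence is dropped; the best k still has to be chosen by comparing the resulting assemblies.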
Which read category should I pick, since I only have a single file?
I would choose long for your data. Here's what the author and developer wrote:
Source: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2952100/
Thank you Kevin! Appreciated :)
Kenny
I am currently running velvet on my velvet_hybrid.fasta file. My command is:
However, I get some unusual messages, for example:
[0.000000] Reading FastA file Velvet_Hybrid.fasta;
[16.050157] 199270 sequences found
[16.050174] Done
[16.097246] Reading read set file ./Velvet_VelvetContigs/k93/Sequences;
[16.750961] 199270 sequences found
[18.714973] Done
[18.714992] 199270 sequences in total.
[18.715532] Writing into roadmap file ./Velvet_VelvetContigs/k93/Roadmaps...
[25.688339] Inputting sequences...
[25.692383] Inputting sequence 0 / 199270
Killed

[0.000001] Reading roadmap file ./Velvet_VelvetContigs/k93/Roadmaps
[0.850929] 199270 roadmaps read

[0.000000] Reading roadmap file ./Velvet_VelvetContigs/k95/Roadmaps
[0.951589] 199270 roadmaps read
[0.953281] Creating insertion markers
[1.024402] Ordering insertion markers
[1.074139] Counting preNodes
[1.155569] 1735089 preNodes counted, creating them now
[5.339031] Irregular sequence file: are you sure your Sequence and Roadmap file come from the same source?

[0.000000] Reading FastA file Velvet_Hybrid.fasta;
[13.756826] 199270 sequences found
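One possible reading of this log: velveth was killed at k93 while writing the Roadmaps file (the log does not say why; running out of memory is a common cause), and velvetg was then started anyway on that directory. The "Irregular sequence file" message at k95 would be consistent with Sequences and Roadmaps files left behind by an interrupted run. Since the actual command was not posted, the Python sketch below is only an assumption about how the two steps could be chained so that velvetg never runs after a failed velveth. The function names are hypothetical; the velveth arguments follow its documented form (output directory, hash length, file format, read category, file name).

```python
import subprocess


def velvet_commands(out_dir, k, fasta, category="-long"):
    """Build the velveth and velvetg argument lists for one k value."""
    velveth = ["velveth", out_dir, str(k), "-fasta", category, fasta]
    velvetg = ["velvetg", out_dir]
    return velveth, velvetg


def run_velvet(out_dir, k, fasta, runner=subprocess.run):
    """Run velveth, then velvetg only if velveth exited cleanly.

    Returns (True, None) on success, or (False, failed_command).
    """
    velveth, velvetg = velvet_commands(out_dir, k, fasta)
    for cmd in (velveth, velvetg):
        result = runner(cmd)
        # On POSIX a negative return code means the process was ended
        # by a signal, for example -9 for SIGKILL ("Killed").
        if result.returncode != 0:
            return False, cmd
    return True, None
```

Looping over k values with run_velvet and stopping, or deleting the output directory, whenever it returns False avoids feeding velvetg a half-written directory.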