velveth (from Velvet) not making Sequence files?
8.8 years ago
sturne29 ▴ 10

Hello. I apologize if this has been answered or addressed elsewhere, and I just wasn't able to find it through searching.

I am attempting to run velveth on paired-end Illumina short reads (the two initial read files have been preprocessed, trimmed, and merged into a single .fa file). To find the best hash length, I wanted to use the ./velveth output_directory/ m,M,s (..data files..) form, where m to M is the range of values and s is the step. This is all being run on a cluster, so computing resources shouldn't be an issue. I compiled Velvet with a max kmer length of 149.

My command: ./velveth /output_directory_on_cluster/ 51,81,2 -shortPaired /path/to/sequence/files.fa with varying values for the m,M,s bit.

The job takes a while to run, and usually the first kmer value comes out correctly (it makes a folder with Log, Roadmaps, and Sequences files in it). However, the other folders don't have valid Sequences files; those files are all empty. The typical velveth format (./velveth /output_directory_on_cluster/ hash_length -shortPaired /path/to/sequence/files.fa) works if I run each kmer value that I'd like to test separately, but I was hoping to streamline things a little by using the m,M,s form.

Has anyone experienced this problem before or know of a solution?

Assembly software error • 3.6k views
8.8 years ago
mastal511 ★ 2.1k

When you run velveth with multiple kmer lengths, it usually writes a full Sequences file only for the first kmer; for each subsequent kmer it creates a symbolic link to that first Sequences file. It should still produce Roadmaps and Log files for every kmer, though.


I see, thank you! That's very helpful to know.


A new question has come up: for all of the output folders where the Sequences file is a "symbolic link", I can't run ./velvetg on them. It gives me an error saying that it can't open the Sequences file. Any advice?
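One way to narrow this down is to check what each Sequences entry actually is on disk. The sketch below is only a diagnostic aid: the function name `sequences_status` is made up for illustration, and the idea that a link which no longer resolves (for example after output folders were moved or copied) explains the velvetg error is an assumption, not something confirmed in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: classify the Sequences entry inside one velveth
# output directory. Prints one of:
#   broken-link  a symbolic link whose target does not exist
#   missing      no Sequences entry at all
#   empty        exists (file or resolved link) but has zero size
#   link         a symbolic link that resolves to a non-empty file
#   file         a regular non-empty file
sequences_status() {
    local f="$1/Sequences"
    if [ -L "$f" ] && [ ! -e "$f" ]; then
        echo "broken-link"
    elif [ ! -e "$f" ]; then
        echo "missing"
    elif [ ! -s "$f" ]; then
        echo "empty"
    elif [ -L "$f" ]; then
        echo "link"
    else
        echo "file"
    fi
}
```

Calling `sequences_status` on each kmer folder, and running `ls -l` on any Sequences entry reported as `broken-link`, shows where the link points and whether that target is still there.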

8.8 years ago
k.kathirvel93 ▴ 310

From my point of view, you should check the MAXKMERLENGTH value in your Makefile. If you are using 64-bit Linux, the default is 31, but your command, ./velveth /output_directory_on_cluster/ 51,81,2 -shortPaired /path/to/sequence/files.fa, asks for k-mers from 51 to 81 in steps of 2. So increase MAXKMERLENGTH from 31 to a value that covers your largest k-mer (81 here), then try again. One more thing: if you want to find the best k-mer length for your sequence, why not try VelvetOptimiser? All the best.
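The check suggested here can be written down as a small shell test. This is a minimal sketch: `kmer_range_fits` is a made-up name, and the limit you pass in is whatever MAXKMERLENGTH your own build was compiled with, which the script cannot detect by itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed (exit status 0) only if the requested
# hash-length range min..max is well formed and does not exceed the
# limit the Velvet binaries were compiled with.
kmer_range_fits() {
    local limit=$1 min=$2 max=$3
    [ "$min" -le "$max" ] && [ "$max" -le "$limit" ]
}
```

For example, `kmer_range_fits 31 51 81` fails while `kmer_range_fits 149 51 81` succeeds. If the range does not fit, Velvet's build takes the limit as a make variable, for example `make 'MAXKMERLENGTH=149'`.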


The original poster mentioned that Velvet had been compiled with a max kmer length of 149.


VelvetOptimiser has dependencies that aren't easily available on the cluster I'm on, or else I'd definitely be using it. I've tried kmergenie, but it doesn't necessarily give good results for the kmers it tries, so I'm doing it manually.
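For the manual route, the one-run-per-kmer approach can be wrapped in a short loop. This is a rough sketch rather than a replacement for the m,M,s form: the function names, the `VELVETH` override variable, and the `<prefix>_<k>` folder naming are all illustrative choices, and the loop includes the upper bound, which may or may not match how velveth itself treats M.

```shell
#!/usr/bin/env bash
# Print each hash length from min to max (inclusive) in steps of step.
kmer_values() {
    local min=$1 max=$2 step=$3
    seq "$min" "$step" "$max"
}

# Run velveth once per hash length, writing each run to its own folder
# named <out_prefix>_<k>. Set VELVETH to point at the velveth binary
# (defaults to ./velveth). Stops at the first run that fails.
run_velveth_range() {
    local out_prefix=$1 reads=$2 min=$3 max=$4 step=$5 k
    for k in $(kmer_values "$min" "$max" "$step"); do
        "${VELVETH:-./velveth}" "${out_prefix}_${k}" "$k" -shortPaired "$reads" || return 1
    done
}
```

A call such as `run_velveth_range /output_directory_on_cluster/run /path/to/sequence/files.fa 51 81 2` would then run the single-kmer command once for each value. Every folder gets its own full Sequences file this way, at the cost of re-reading the input each time.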


You can also use VelvetStats; it might be a good way to find the best k-mer, and you don't have to run each k-mer length manually. If you give 11,135,2 on the VelvetStats command line, it will run 11, 13, 15, ..., 133, 135, each with a separate output folder.

8.8 years ago
k.kathirvel93 ▴ 310

Hi sturne29, one more thing I forgot to tell you: if you are using Illumina paired-end sequence, you must subsample your reads randomly. Only then will it be easy to run Velvet; otherwise you cannot.
