Merging Datasets
12.7 years ago
khikho ▴ 100

To prepare a larger dataset for Velvet, I merged two files of unmapped human reads, and now I have three questions:

1) Is this the right way to improve the quality of the Velvet output (i.e., the length of the contigs it produces)?

2) After merging, the new file contains nearly 320,000,000 sequences, but when I run Velvet on a machine with 72 GB of RAM it uses all of the memory and the run time balloons (nearly 4 days just to complete the hashing step). How much memory should I allocate?

3) Can I split the new file into 320 smaller files, run Velvet on each of them in parallel, and then merge the Velvet outputs into a single assembly?

P.S.: I use the color-space version of Velvet.
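
For reference, the merge itself is just concatenation; the file names below are placeholders, and for SOLiD color-space data the .csfasta and matching .qual files should be concatenated in the same order so reads and qualities stay paired:

    # Concatenate two sets of unmapped reads into one dataset
    # (placeholder file names; keep reads and quality files in the same order).
    cat unmapped_1.csfasta unmapped_2.csfasta > merged.csfasta
    cat unmapped_1.qual    unmapped_2.qual    > merged.qual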

velvet next-gen sequencing assembly human
12.7 years ago
Nikolay Vyahhi ★ 1.3k

1) Yes, it is. The more data you have, the better the results.

2) You can try adjusting k to fit in memory (in Velvet, a larger hash length generally lowers peak RAM). But assembling a mammalian genome with Velvet is close to impossible in practice, and assembling a human genome in 72 GB of RAM is out of reach for essentially any de novo assembler. It usually takes 200+ GB of RAM, depending on the dataset.
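
As a rough guide, there is a rule-of-thumb regression posted on the velvet-users list (by Simon Gladman) that estimates Velvet's peak RAM; treat it as an approximation only. The values below are assumptions for this dataset (the read length in particular is a guess):

    # Velvet RAM rule of thumb (velvet-users list); result is in kB.
    # R = read length (bases), G = genome size (Mb),
    # N = number of reads (millions), K = hash length.
    R=50; G=3100; N=320; K=21        # assumed values; adjust to your data
    KB=$(( -109635 + 18977*R + 86326*G + 233353*N - 51092*K ))
    echo "~$(( KB / 1048576 )) GB"   # roughly 326 GB for these inputs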

Consider one of the following:

  • Reference-guided assembly (mapping with Bowtie / BWA) instead of de novo; a minimal sketch follows this list.
  • Using ALLPATHS / SOAPdenovo, which are built with large genomes in mind.
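
A minimal reference-guided sketch with classic BWA, assuming base-space reads and placeholder file names (color-space reads would instead need an aligner's color-space mode):

    bwa index hg19.fa                                  # build the reference index once
    bwa aln hg19.fa merged.fastq > merged.sai          # align the merged reads
    bwa samse hg19.fa merged.sai merged.fastq > merged.sam   # write SAM output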

3) Definitely not. Splitting the data into 320 files would split the coverage with it: each subset would assemble into short, fragmented contigs, and there is no sound way to merge 320 independent Velvet assemblies into one result.
