Best strategy for de novo assembly with Illumina reads?
4
3
8.7 years ago
Picasa ▴ 650

Hello,

I have paired-end and mate-pair reads from Illumina. My expected genome size is about 1 Gb. I'm not a bioinformatician and I'm trying to figure out how to assemble my data.

1) Do you have any software recommendations?

2) For paired-end data, is it worth merging the reads with third-party software, or is that done by the assembler?

Assembly • 6.8k views
0

I used SPAdes and I've had good results with it.

0

SPAdes is great, but is designed for bacterial assemblies, which is probably not the case here based on the genome size.

0

I am trying to use SPAdes with a bacterial genome and also have paired-end and mate-pair reads. I cannot work out how to supply the mate-pair reads, since SPAdes only accepts "high quality reads" (I have the same problem with Ion Torrent mate pairs). How are you doing this? Thanks.

7
8.7 years ago
igor 13k

There was a big project, Assemblathon, that published a thorough review of different assemblers: http://gigascience.biomedcentral.com/articles/10.1186/2047-217X-2-10

They used three different species with 1.0-1.6 Gb genomes, so it's especially relevant in your case.

0

Thanks for the paper.

1
8.7 years ago
Buffo ★ 2.4k

Trinity or IDBA-UD work really well for Illumina reads.

1
8.7 years ago

I have used the BBMap toolset to get some initial information about paired-end data and then used those parameters to run the data through Trinity to produce the final assembly. I would also recommend running the result through TransDecoder to find ORFs.

1
8.7 years ago
Shyam ▴ 150

You can use the ABySS assembler with both mate-pair and paired-end reads. The recent wheat genome survey sequences were assembled with it. You can also use SOAPdenovo; I have seen 4 Gb genomes assembled with it. You need to try several k-mer sizes to find the sweet spot for your data.
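To make the k-mer sweep concrete, here is a rough Python sketch. It only builds one `abyss-pe` command per k value (it does not run anything) and computes N50 from contig lengths so the runs can be compared. The file names, the `asm` prefix, the library labels `pe1` and `mp1`, and the helper names are all placeholders I made up. The `name=`, `k=`, `lib=` and `mp=` parameters follow the ABySS README as I remember it, so check them against your installed version. N50 is only one possible criterion, not necessarily the best one for your data.

```python
from typing import Dict, List, Sequence


def abyss_commands(k_values: Sequence[int], pe_r1: str, pe_r2: str,
                   mp_r1: str, mp_r2: str, prefix: str = "asm") -> List[List[str]]:
    """Build one abyss-pe argument list per k-mer size (nothing is executed)."""
    commands = []
    for k in k_values:
        commands.append([
            "abyss-pe",
            f"name={prefix}_k{k}",     # output prefix, one per k (placeholder name)
            f"k={k}",                  # k-mer size being tested
            "lib=pe1",                 # paired-end library label (assumed label)
            "mp=mp1",                  # mate-pair library label (assumed label)
            f"pe1={pe_r1} {pe_r2}",    # forward and reverse files stay separate
            f"mp1={mp_r1} {mp_r2}",
        ])
    return commands


def n50(lengths: Sequence[int]) -> int:
    """Length L such that contigs of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


def best_k(contig_lengths_by_k: Dict[int, Sequence[int]]) -> int:
    """Pick the k whose assembly has the highest N50 (a simple, imperfect criterion)."""
    return max(contig_lengths_by_k, key=lambda k: n50(contig_lengths_by_k[k]))


if __name__ == "__main__":
    for cmd in abyss_commands(range(40, 97, 8), "pe_1.fq", "pe_2.fq",
                              "mp_1.fq", "mp_2.fq"):
        print(" ".join(cmd))
```

Each argument list can be passed to `subprocess.run` once you are happy with it. For a genome of this size, each run will take a long time and a lot of memory, so it may be worth starting with a few widely spaced k values.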

2) For paired-end data, is it worth merging the reads with third-party software, or is that done by the assembler?

You mean merging the forward and reverse reads by their overlap? The two programs I mentioned take separate files for forward and reverse reads, so you don't need to merge them.
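In case "merging by overlap" is unclear, here is a toy Python illustration of the idea. It is not what any particular tool does: it only accepts exact overlaps, ignores base qualities, and the function names and the `min_overlap` default are my own assumptions. Real mergers tolerate mismatches and use quality scores.

```python
from typing import Optional

# Complement table for reverse-complementing the reverse read.
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(forward: str, reverse: str, min_overlap: int = 10) -> Optional[str]:
    """Merge a read pair if the end of the forward read exactly matches the
    start of the reverse-complemented reverse read; otherwise return None."""
    rc = reverse_complement(reverse)
    longest = min(len(forward), len(rc))
    # Try the longest possible overlap first, then shrink.
    for size in range(longest, min_overlap - 1, -1):
        if forward[-size:] == rc[:size]:
            return forward + rc[size:]
    return None


if __name__ == "__main__":
    fragment = "ACGTTGCAAGGCTTAACCGGTATCGA"          # made-up 26 bp insert
    read1 = fragment[:18]                            # forward read
    read2 = reverse_complement(fragment[-18:])       # reverse read, as sequenced
    print(merge_pair(read1, read2))
```

Merging only works when the insert is shorter than the combined read length, so it does not apply to mate-pair libraries, whose reads come from ends several kilobases apart.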
