Question

Hybrid Assembly + Gapcloser With 454 And Illumina Data

1

11.9 years ago

HG ★ 1.2k

Hi everyone, I am fairly new to genome assembly. I have some 454 reads as well as Illumina reads for the same bacterial strain. I assembled the 454 data and the Illumina data separately, but gaps remain in the genome after assembly. I want to minimize the gaps with an in silico method. Can anyone suggest how to do this? If I do a hybrid assembly, how helpful will it be, and which software would be useful?

Or should I keep the separate assemblies and try to fill the gaps with a gap-filling tool? Which would be the ideal way to handle this problem?

Thank you.

454 illumina • 5.4k views

ADD COMMENT • link updated 11.9 years ago by rtliu ★ 2.2k • written 11.9 years ago by HG ★ 1.2k

score 0 · Answer 1 · 2013-08-31

0

11.9 years ago

rtliu ★ 2.2k

You can try Minimus2 to merge contigs, or use another assembler such as MaSuRCA for a combined assembly.

Minimus2 is a modified version of the minimus pipeline designed for merging one or two sequence sets (S1, S2). It uses a nucmer-based overlap detector, which is much faster than the Smith-Waterman hash-overlap program used by minimus.

MaSuRCA is whole-genome assembly software. It combines the efficiency of the de Bruijn graph and Overlap-Layout-Consensus (OLC) approaches. MaSuRCA can assemble data sets containing only short reads from Illumina sequencing or a mixture of short reads and long reads (Sanger, 454).
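
For the Minimus2 route, here is a rough Python sketch that prepares the two contig sets and builds the commands without running them. It assumes the usual AMOS recipe (concatenate both sets, convert with toAmos, then run minimus2 with REFCOUNT set to the number of sequences in the first set). The helper names and file names are hypothetical, and the flags should be checked against your own AMOS installation.

```python
from pathlib import Path


def count_fasta_records(path):
    """Count sequences in a FASTA file by counting header lines."""
    with open(path) as handle:
        return sum(1 for line in handle if line.startswith(">"))


def prepare_minimus2(set1, set2, prefix):
    """Concatenate two contig sets and return the commands to run.

    Assumption: the AMOS recipe of toAmos followed by minimus2, with
    REFCOUNT equal to the number of sequences in the first set.
    """
    combined = Path(f"{prefix}.seq")
    with open(combined, "w") as out:
        for source in (set1, set2):
            text = Path(source).read_text()
            # Make sure each input ends with a newline before appending the next.
            out.write(text if text.endswith("\n") else text + "\n")
    refcount = count_fasta_records(set1)
    return [
        ["toAmos", "-s", str(combined), "-o", f"{prefix}.afg"],
        ["minimus2", prefix, "-D", f"REFCOUNT={refcount}"],
    ]
```

Each returned command can be passed to subprocess.run(cmd, check=True) once the AMOS tools are on your PATH.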

ADD COMMENT • link 11.9 years ago by rtliu ★ 2.2k

0

I did the 454 assembly with Newbler, which follows OLC, and for my Illumina data set I used SPAdes, which follows the de Bruijn graph approach. If I now merge all the contigs with Minimus2, will that be the same as the algorithm MaSuRCA follows, or will the result be different? What is your opinion? I have not tried any hybrid assembly approach because they use either OLC or de Bruijn graphs. Could you please help me out?

ADD REPLY • link 11.9 years ago by HG ★ 1.2k

0

MaSuRCA uses the raw reads from both 454 and Illumina, and it is easy to run, while Minimus2 only merges contigs using OLC. The GAGE-B paper has a detailed discussion in section 2.6, "Combination of assemblies". (GAGE-B paper: http://bioinformatics.oxfordjournals.org/content/29/14/1718.full)
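
To give a feel for what overlap-based contig merging means, here is a toy Python sketch that joins two contigs on an exact suffix-prefix overlap. It is only an illustration of the idea: Minimus2 itself relies on nucmer alignments, which tolerate mismatches, so this is not its actual algorithm. The function names and the min_len threshold are made up for the example.

```python
def longest_overlap(left, right, min_len=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left[-size:] == right[:size]:
            return size
    return 0


def merge_contigs(left, right, min_len=3):
    """Join two contigs on their exact overlap, or return None if none is found."""
    size = longest_overlap(left, right, min_len)
    if size == 0:
        return None
    # Keep `left` whole and append only the non-overlapping tail of `right`.
    return left + right[size:]


print(merge_contigs("ACGTTGCA", "TGCAGGTC"))  # prints ACGTTGCAGGTC
```

Merging like this only sees the finished contigs, whereas a hybrid assembler starts again from the raw reads of both platforms, so the two routes can give different results.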

ADD REPLY • link 11.9 years ago by rtliu ★ 2.2k

0

Dear rtliu, thanks for your suggestion. I can now run it with Illumina paired-end reads. Could you please let me know how I can also include my 454 reads in the config file to make a hybrid assembly? My config file looks like this, and I have only an .ace file from the 454 reads. Please help me out.

PATHS
JELLYFISH_PATH=/home/hiren/Desktop/MaSuRCA-2.0.3.1/bin/
SR_PATH=/home/hiren/Desktop/MaSuRCA-2.0.3.1/bin/
CA_PATH=/home/hiren/Desktop/MaSuRCA-2.0.3.1/CA/Linux-amd64/bin
END
DATA
PE= pe 250 20 /home/hiren/Desktop/MaSuRCA-2.0.3.1/data/ResetH45_S5_L001_R1_001.fastq /home/hiren/Desktop/spades/MaSuRCA-2.0.3.1/data/ResetH45_S5_L001_R2_001.fastq
END
PARAMETERS
GRAPH_KMER_SIZE=auto
USE_LINKING_MATES=1
JF_SIZE=1800000000
DO_HOMOPOLYMER_TRIM=0
NUM_THREADS=12
END

ADD REPLY • link 11.9 years ago by HG ★ 1.2k

0

I have no idea how to convert an .ace file to an .sff file. Please ask for the raw 454 reads (*.sff files). In my case I use the sffToCA script: http://sourceforge.net/apps/mediawiki/wgs-assembler/index.php?title=SffToCA
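
As a rough idea of the conversion step, this Python sketch only builds the sffToCA command line and prints it; nothing is executed. It assumes that -libraryname and -output are the required options and that the .sff files follow as trailing arguments. The library name, output name, and helper function are placeholders, so please confirm the options against the help text of your Celera Assembler build.

```python
import shlex


def sff_to_ca_command(sff_files, library="lib454", output="454reads"):
    """Build an sffToCA command line (not executed here).

    Assumption: -libraryname and -output are required and the .sff
    files are given last; verify with the tool's own help output.
    """
    if not sff_files:
        raise ValueError("at least one .sff file is required")
    return ["sffToCA", "-libraryname", library, "-output", output, *sff_files]


cmd = sff_to_ca_command(["/FULL_PATH/reads.sff"])
print(shlex.join(cmd))
```

How the resulting fragment file is named can differ between versions, so check what was actually written before pointing the assembler at it.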

ADD REPLY • link 11.9 years ago by rtliu ★ 2.2k

0

I have already requested the .sff file. Could you please tell me how to use it on the command line after sffToCA?

ADD REPLY • link 11.9 years ago by HG ★ 1.2k

0

Dear rtliu, I now have the .sff files for all the strains. Please let me know how I can feed these two different file types, FASTQ and .sff, to the assembler.

Thank you.

ADD REPLY • link 11.9 years ago by HG ★ 1.2k

0

Using the sffToCA script above and the .sff file, generate an output file, say 454reads.frg, then add an "OTHER" line to your config file (config reference: sr_config_example.txt):

DATA
PE= pe 180 20 /FULL_PATH/frag_1.fastq /FULL_PATH/frag_2.fastq
OTHER=/FULL_PATH/454reads.frg
END

Then run the 'runSRCA.pl' script against your config file to generate a script called 'assemble.sh', and finally run 'assemble.sh' (e.g. bash assemble.sh).
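
If you have many strains, the DATA section can be generated instead of edited by hand. Below is a minimal Python sketch that writes the same kind of block as above. The function name and its defaults are hypothetical, and the paths are the same /FULL_PATH placeholders, so treat it as a starting point rather than a tested pipeline.

```python
def data_section(r1, r2, frg=None, prefix="pe", mean=180, stdev=20):
    """Return a MaSuRCA-style DATA block; `frg` adds an OTHER line."""
    lines = ["DATA", f"PE= {prefix} {mean} {stdev} {r1} {r2}"]
    if frg:
        # The converted 454 reads go in as an extra fragment file.
        lines.append(f"OTHER={frg}")
    lines.append("END")
    return "\n".join(lines) + "\n"


print(data_section("/FULL_PATH/frag_1.fastq",
                   "/FULL_PATH/frag_2.fastq",
                   frg="/FULL_PATH/454reads.frg"))
```

Paste the printed block into your config file in place of the old DATA section, keep the PATHS and PARAMETERS sections as they are, and then run runSRCA.pl and assemble.sh as described.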

ADD REPLY • link 11.9 years ago by rtliu ★ 2.2k