What Are You Using For A Reference Assembler?
2
1
Entering edit mode
12.5 years ago
diltsjeri ▴ 470

I need some information on reference assemblers. What are you using? Which is the most preferable reference assembler?

reference • 5.2k views
ADD COMMENT
1
Entering edit mode

What is a "reference assembler"? One that uses a reference genome, or one that you want to use as a reference for comparison with others?

ADD REPLY
0
Entering edit mode

Also: similar question, same user: http://www.biostars.org/post/show/44956/ion-torrent-reference-assembly/. Best to avoid posting multiple, highly-similar questions.

ADD REPLY
2
Entering edit mode
12.5 years ago
Lee Katz ★ 3.2k

AMOScmp-shortReads is working well for me, but it takes a bit longer.

ADD COMMENT
0
Entering edit mode

What files are you using to do the message file conversion with toAmos?

ADD REPLY
0
Entering edit mode

I have been using toAmos_new (in the new versions only) to convert the fastq to a bnk, and then I start on step 20 in AMOS using -s 20 so that I can trick it into starting on a bnk file instead of an afg file. It's buried in my script but I think it's something like:

toAmos_new -Q run.fastq -t SANGER -b amos.bnk
AMOScmp-shortReads -s 20 amos

You'll need amos.1con and amos.bnk in the same directory for this to work. You can use "amos" or any other prefix, but it must be the same between files.

ADD REPLY
1
Entering edit mode

What's the advantage to starting with a bnk file? Also, are the options you posted above new to toAmos_new? I'm starting a pipeline with Ion Torrent data, so all I have is an sff. I use sff_extract to get the fasta, qual, and xml, and I was going to use toAmos to convert to afg, but should I not?

ADD REPLY
0
Entering edit mode

The only advantage is that toAmos_new can read in a fastq file and therefore you skip 1) converting to fasta/qual and then 2) converting to afg. Internally, AMOScmp converts first to a bnk anyway and doesn't use the afg anymore.

ADD REPLY
0
Entering edit mode

thanks! this is really helpful.

ADD REPLY
0
Entering edit mode

After validating it, AMOScmp unfortunately does not perform as well as I thought it should. I had a few more false-positives than when I worked with bowtie2. Sorry to do this, but I withdraw this recommendation in favor of newer tools. BWA came in as a close second to bowtie2 and was still better than AMOScmp.

Edit: I mean AMOScmp-shortReads.

ADD REPLY
0
Entering edit mode

Thanks for following up.

ADD REPLY
0
Entering edit mode

Bowtie doesn't output contigs though, correct? I need my reads to be assembled.

ADD REPLY
1
Entering edit mode

You'll have to follow up with samtools and "vcfutils.pl vcf2fq".

I am writing a script to automate this step but it is not finalized yet.
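For readers who want to try this, here is a rough sketch of the align-then-consensus route, not the script mentioned above. The function name `reference_consensus`, its argument order, and the output file names are all made up for illustration. It assumes bowtie2, samtools 1.x, bcftools 1.x, and vcfutils.pl are on your PATH; older releases spell some of these steps differently, so check the flags against your installed versions.

```shell
# Hypothetical wrapper; argument order and output names are illustrative.
# Usage: reference_consensus ref.fasta reads.fastq out_prefix
reference_consensus() {
  ref=$1
  reads=$2
  prefix=$3

  # 1. Index the reference and align the unpaired reads.
  bowtie2-build "$ref" "$prefix.index" >/dev/null || return 1
  bowtie2 -x "$prefix.index" -U "$reads" -S "$prefix.sam" || return 1

  # 2. Sort and index the alignments.
  samtools sort -o "$prefix.sorted.bam" "$prefix.sam" || return 1
  samtools index "$prefix.sorted.bam" || return 1

  # 3. Pile up the reads, call the consensus, and convert it to FASTQ.
  bcftools mpileup -Ou -f "$ref" "$prefix.sorted.bam" |
    bcftools call -c |
    vcfutils.pl vcf2fq > "$prefix.consensus.fastq"
}
```

Called as `reference_consensus ref.fasta run.fastq sample1`, this would leave the consensus in `sample1.consensus.fastq`. The result is a consensus of the aligned reads laid over the reference, not a set of independently assembled contigs.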

ADD REPLY
0
Entering edit mode

My reference sequence (to be indexed) has ambiguous nucleotides. This is apparently not supported by bowtie. Have you run into this problem, and if so, how did you work around it? I can't just replace those V, H, etc. with a nucleotide because it would bias the alignment.

I noticed bowtie offers a --ntoa option on bowtie-build, but that just changes all Ns to As. Wouldn't that create a bias? Also, I have other ambiguity codes like V and H, as stated above, which --ntoa wouldn't fix.

And I have some gaps :(

ADD REPLY
0
Entering edit mode

Replace all the ambiguous letters with N. You can do that with sed, or in a text editor like vim. I'd use sed to get rid of the - gap characters as well. Putting in an A will create a bias.

bwa will not crash on a genome like that. I'm pretty sure it will treat them all like N's.
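A minimal sketch of that cleanup, assuming a plain FASTA file. The function name `mask_ambiguous` and the file names in the usage line are made up for illustration. It leaves header lines alone, drops `-` gap characters from sequence lines, and turns every IUPAC ambiguity code (upper or lower case) into N.

```shell
# Hypothetical helper: read FASTA on stdin, write cleaned FASTA on stdout.
mask_ambiguous() {
  # On every line that is not a header (does not start with ">"):
  #   first delete gap characters, then replace ambiguity codes with N.
  sed -e '/^>/!s/-//g' \
      -e '/^>/!s/[RYSWKMBDHVryswkmbdhv]/N/g'
}

# Example usage (file names are placeholders):
# mask_ambiguous < reference.fasta > reference.masked.fasta
```

Note that removing gaps shifts the coordinates of everything downstream of each gap, so positions in the cleaned reference will not line up with the original gapped sequence.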

ADD REPLY
0
Entering edit mode

Bowtie won't take Ns, I wish it did ;(

ADD REPLY
1
Entering edit mode
12.5 years ago
Nikolay Vyahhi ★ 1.3k
ADD COMMENT
2
Entering edit mode

Those are aligners. Assemblers are programs like Velvet.

ADD REPLY
0
Entering edit mode

Velvet is a de novo assembler. If you need to assemble against a reference, then you need an aligner.

ADD REPLY
0
Entering edit mode

This seems to be an ongoing debate on this forum.

ADD REPLY
0
Entering edit mode

I need my reads to be assembled based on a reference. With tools like Bowtie and BWA I get the percentage aligned and I can see the aligned regions, but the reads are not being assembled based on the reference. I believe this is the difference between the two.

ADD REPLY
1
Entering edit mode

After alignment, you can construct (assemble) a consensus sequence from the BAM/SAM file using samtools: http://samtools.sourceforge.net/cns0.shtml

ADD REPLY
0
Entering edit mode

How can we have the SNPs and indels substituted into the consensus, and have the uncovered regions of the genome represented as a series of Ns in the consensus assembly?

ADD REPLY
0
Entering edit mode

I really wanna know the answer to this question.

ADD REPLY
