Detect Variants From 454 Ace File
13.9 years ago
Lhl ▴ 760

Hello everyone,

Is there any software that can extract variant information directly from a 454 ACE file (produced by the Newbler assembler)? I ask because I think an ACE file contains both alignment information and base qualities, which should be enough for SNP and indel detection.

PS: I know gsMapper can be used to identify variants after de novo assembly of 454 data.

snp • 3.6k views
13.9 years ago

The CLC Genomics Workbench comes with this functionality, but it has quite a hefty price tag along with it. Probably not feasible for most users.

There are several approaches one can take when doing SNP detection and quite a lot of things to consider with each approach, so in some cases writing your own SNP caller is the best idea (not a trivial task though). There are ACE parsers available in several of the Bio* packages (BioPython, BioPerl I think). However, I don't think any of these have SNP detection built in. At my last institution, our solution for SNP calling involved some BioPerl libraries along with some custom in-house libraries.
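
To give a feel for what "writing your own SNP caller" involves at its simplest, here is a minimal sketch of a column-wise pileup over an assembly. It is not the in-house method mentioned above. The function name `call_candidates` and the thresholds `min_depth` and `min_fraction` are hypothetical, and the sketch assumes an ACE parser has already given you the padded consensus plus, for each read, its 1-based padded start and its aligned, clipped sequence in consensus orientation. It ignores base qualities entirely.

```python
from collections import Counter


def call_candidates(consensus, reads, min_depth=3, min_fraction=0.2):
    """Report columns where reads disagree with the consensus.

    consensus: padded consensus string ('*' marks a pad/gap column).
    reads: iterable of (padded_start, aligned_sequence); padded_start is
           1-based (assumption: taken from the parser's read placements).
    Returns a list of (position, consensus_base, alt_base, alt_count, depth).
    """
    # One base counter per consensus column.
    columns = [Counter() for _ in consensus]
    for padded_start, seq in reads:
        for offset, base in enumerate(seq):
            idx = padded_start - 1 + offset
            # Skip bases that hang over either end of the consensus.
            if 0 <= idx < len(consensus):
                columns[idx][base.upper()] += 1

    candidates = []
    for idx, counts in enumerate(columns):
        depth = sum(counts.values())
        if depth < min_depth:
            continue  # too little coverage to say anything
        ref = consensus[idx].upper()
        for base, n in sorted(counts.items()):
            if base != ref and n / depth >= min_fraction:
                candidates.append((idx + 1, ref, base, n, depth))
    return candidates


if __name__ == "__main__":
    consensus = "ACGTACGT"
    reads = [(1, "ACGTACGT"), (1, "ACGAACGT"), (2, "CGAACG"), (3, "GTAC")]
    for hit in call_candidates(consensus, reads):
        print(hit)
```

A '*' in either the consensus or a read is treated like any other symbol here, so pad mismatches show up as crude indel candidates. A real caller would also need to weigh base quality, strand, and homopolymer context, the last of which matters a lot for 454 data.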

Thanks for your answer, Daniel. I would like to combine BioPerl packages with my own script to tackle this problem. I also have another idea: align each contig with the reads that make it up using BLAST-like software, convert the alignment into a format other software can read, and then call variants (say, BLAT to PSL format, convert the alignment to SAM format, and call SNPs using samtools). What do you think of this? I will certainly run into the problem of losing the base quality information; do you have any idea how to improve this?

This might work, although what you're suggesting might be a bit much. What I would suggest is converting the original sequence data into .fastq or .fasta/.qual files. If you can find a de novo assembler that calls SNPs, then I wouldn't worry about trying to use the .ace file at all. If you can't, I know there is reference mapping software that will not only map to a reference (the consensus sequence from the .ace file, in this case) but also call SNPs. I'll give a shameless plug here for the GNUMAP software, developed by some colleagues of mine and of which I am a happy user.
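
As a rough illustration of the .fasta/.qual to .fastq conversion suggested above, here is a minimal sketch in plain Python. The function names are hypothetical, and it assumes Phred scores written as space-separated integers in the .qual file and Sanger-style (Phred+33) encoding in the output. Existing libraries offer equivalent conversions, so this is only meant to show what the step does.

```python
def parse_fasta_like(lines):
    """Yield (name, payload_lines) from FASTA-style text (.fasta or .qual)."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, chunks
            # Keep only the identifier, drop the rest of the header.
            name, chunks = line[1:].split()[0], []
        else:
            chunks.append(line)
    if name is not None:
        yield name, chunks


def fasta_qual_to_fastq(fasta_lines, qual_lines):
    """Merge sequences and Phred scores into FASTQ text (Phred+33)."""
    quals = {
        name: [int(q) for q in " ".join(chunks).split()]
        for name, chunks in parse_fasta_like(qual_lines)
    }
    records = []
    for name, chunks in parse_fasta_like(fasta_lines):
        seq = "".join(chunks)
        scores = quals[name]  # KeyError if a read has no quality entry
        if len(scores) != len(seq):
            raise ValueError(f"length mismatch for read {name}")
        # Cap at 93 so every score maps to a printable ASCII character.
        qual_string = "".join(chr(min(q, 93) + 33) for q in scores)
        records.append(f"@{name}\n{seq}\n+\n{qual_string}\n")
    return "".join(records)


if __name__ == "__main__":
    fasta = [">r1 length=4", "ACGT"]
    qual = [">r1 length=4", "40 30 20 10"]
    print(fasta_qual_to_fastq(fasta, qual), end="")
```

Keeping the qualities attached to the reads this way means a downstream mapper or SNP caller can use them, which addresses the worry about losing base quality information.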

Thanks for your suggestion, Daniel. I will give it a go.

13.8 years ago
Marina Manrique ★ 1.3k

Have you tried using GSMapper? If you are working with a reference genome, you could map your reads against it; if you don't have a reference genome, you can use the de novo assembly from Newbler as the reference.

In the files called "454AllDiffs.txt" and "454HCDiffs.txt" you can find information about SNPs, indels, and so on.

Thanks Marina. I just wanted to try a different way, other than gsMapper, to call variants. ACE files contain both alignment and quality information, and a different variant caller may use a different algorithm to identify polymorphic sites.
