Indel Detection For 454 Resequencing
14.4 years ago
User 59 13k

I do some work for a small diagnostics company that has a requirement for small indel detection in 454 data. Those of you familiar with Roche's pipeline will be aware that AVA (Amplicon Variant Analyzer) blissfully ignores indels shorter than 3 bp. 454 reads are also prone to length miscalls in homopolymer runs, which tend to show up as spurious small indels.

I'd like to try some alternatives to AVA that focus more on indel than SNP discovery (although SNP discovery is still useful).

So far my non-exhaustive list of possibilities is:

  1. VarScan
  2. VAAL
  3. GigaBayes
  4. VARiD
  5. Variant Identification Pipeline

Does anyone have any experience of these packages for indel calling with 454 data? Or any additional suggestions? We have looked at certain commercial packages, but they tend to come up slightly short on features, largely ones of scriptability/automation.
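On the homopolymer point above, one cheap sanity check that works with any caller is to flag candidate indels that sit in or next to a homopolymer run in the reference amplicon, since those are the ones most likely to be 454 artefacts. The following is only a minimal sketch of that idea: the function names, the `min_len` threshold, and the `flank` parameter are illustrative assumptions, not part of any of the packages listed.

```python
from itertools import groupby


def homopolymer_runs(seq, min_len=4):
    """Return (start, base, length) for every run of identical bases
    that is at least min_len long. Positions are 0-based."""
    runs = []
    pos = 0
    for base, group in groupby(seq.upper()):
        length = len(list(group))
        if length >= min_len:
            runs.append((pos, base, length))
        pos += length
    return runs


def indel_in_homopolymer(indel_pos, runs, flank=1):
    """True if a 0-based reference position lies inside a run,
    or within `flank` bases of either end of it."""
    return any(
        start - flank <= indel_pos < start + length + flank
        for start, _base, length in runs
    )


# Hypothetical amplicon fragment, used only to show the calls.
reference = "ACGTTTTTAGGGGC"
runs = homopolymer_runs(reference, min_len=4)
suspect = indel_in_homopolymer(5, runs)
```

A flagged indel is not necessarily false; the flag is just a prompt to look at per-strand support and read depth before reporting it.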

indel next-gen sequencing • 5.2k views
14.1 years ago
Erik Garrison ★ 2.4k

I'm the author of a variant detector, FreeBayes, which detects both SNPs and short insertions and deletions using BAM format alignment files. I've posted a note about this in another thread on indel detection.

In short, I strongly recommend you don't use GigaBayes, and instead use FreeBayes, which is a major improvement over GigaBayes in terms of interface, performance, and algorithm. (I need to update our documentation to this effect.)

FreeBayes can handle any insertion or deletion short enough to be spanned by a single read and represented in a single alignment record. If you want to detect long insertions and deletions using 454 reads, you should also look into using Mosaik for your alignment step, as it can be configured to allow very long gapped alignments, although there is obviously a computational penalty for doing so. The insertion and deletion support of FreeBayes is still under development. I'm currently working to resolve some confusion about reporting them in the VCF as well as some algorithmic considerations.
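To make "represented in a single alignment record" concrete: in SAM/BAM, an indel inside a read appears as an `I` or `D` operation in that read's CIGAR string. The sketch below walks a CIGAR string and lists the indels it encodes. It is not FreeBayes code and makes no claim about how FreeBayes works internally; the function name and the tuple layout are assumptions made for illustration, and positions are 0-based (the SAM `POS` field is 1-based, so subtract 1 first).

```python
import re

# One CIGAR operation: a length followed by an op code from the SAM spec.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def indels_from_cigar(ref_start, cigar, read_seq):
    """List the indels encoded in one alignment record.

    ref_start is the 0-based reference position of the first aligned base.
    Returns tuples of (kind, ref_pos, length, inserted_bases). For an
    insertion, ref_pos is the reference base that follows the inserted
    sequence; for a deletion, it is the first deleted reference base.
    """
    events = []
    ref_pos = ref_start
    read_pos = 0
    for length, op in CIGAR_OP.findall(cigar):
        length = int(length)
        if op == "I":
            bases = read_seq[read_pos:read_pos + length]
            events.append(("INS", ref_pos, length, bases))
            read_pos += length          # consumes read only
        elif op == "D":
            events.append(("DEL", ref_pos, length, ""))
            ref_pos += length           # consumes reference only
        elif op in "M=X":
            ref_pos += length           # consumes both
            read_pos += length
        elif op == "S":
            read_pos += length          # soft clip: read only
        elif op == "N":
            ref_pos += length           # skipped region: reference only
        # H and P consume neither read nor reference
    return events


# Hypothetical record: 1 bp insertion and 2 bp deletion in a 13-base read.
events = indels_from_cigar(100, "5M1I4M2D3M", "ACGTAGCCGTTTA")
```

An indel longer than the aligner's maximum gap never makes it into a CIGAR string like this, which is why the aligner's gap settings matter for longer events.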

14.4 years ago
User 59 13k

To get the ball rolling I actually started with the Variant Identification Pipeline. Whilst it seemed a good match from the paper, it suffers from a number of issues.

Firstly, the source code does not work out of the box from download, and I had to make code-level changes to remove hard-coded paths. The configuration file, and its subsequent use by the pipeline, is very sensitive to missing or trailing slashes on paths. It relies on BioPerl modules deprecated in the 1.6.0 release (Bio::Tools::BPlite in this case). It also uses sequence names as primary keys in the back-end database in one case, meaning that you cannot re-run the pipeline on data that you have already run through once, as it complains about primary keys already in use.

So I'm hoping for a tool a little bit more robust than this.


VAAL is potentially better as it is designed for 454 reads.

14.1 years ago
David ▴ 20

We are also looking for software capable of detecting not only SNPs but also indels for diagnostic applications. On an 8-sample test run (BRCA1 and BRCA2), both VIP and AVA missed a single-nucleotide insertion. As a next step we will look into the 'segemehl' aligner (the one inconvenience with it is that it doesn't output SAM format).


The company in question, I should point out, eventually went for a commercial solution from BioGene which is working very well in their hands, but not, of course, open source or cheap ;)


There are a lot of good, and probably better, aligners for 454 reads that do support SAM output.

14.0 years ago
Lhl ▴ 760

SWAP454 is designed for 454 sequencing.
