Short Read Aligners / SNP Calling Comparison
14.2 years ago

Dear lazyweb,

My question is somewhat related to some previous questions asked here (http://biostar.stackexchange.com/questions/788, http://biostar.stackexchange.com/questions/31, http://biostar.stackexchange.com/questions/613, etc.).

I'm looking for a table that would summarize the pros and cons of each NGS tool (mappers and/or SNP callers): the algorithm used, whether the FASTQ quality scores are taken into account, how gaps and indels are handled, a publication, etc. For example, something that would look like the following (almost empty) page on Wikipedia: http://en.wikipedia.org/wiki/List_of_sequence_alignment_software#Short-Read_Sequence_Alignment

Thanks,

Pierre

UPDATE: @Julien_f suggested this link: http://lh3lh3.users.sourceforge.net/NGSalign.shtml

Update2: another cool resource cited in Marc's paper: http://seqanswers.com/wiki/Software/list

next-gen sequencing comparison mapping snp • 7.4k views

that would definitely be a valuable resource.


Maybe we should write a paper on this ;)


In general, I am against a long list. It tells me nothing about the speed, memory, friendliness and accuracy. After reading that, we still do not know what we should choose. It is more important to know how often these are used in practice.


In addition, you should ask on seqanswers.com. Your question will get better answers there.

14.2 years ago
lh3 33k

Although there are over 30 aligners, I guess 90% of researchers only choose from the list below:

  • Illumina: Bowtie, BWA, Mosaik and Novoalign (Stampy is also a good one)
  • SOLiD: Bfast, Bowtie and BWA (perhaps also NovoalignCS?)
  • 454: SSAHA2, LAST, BLAT and Mosaik (I would recommend BWA-SW actually)

Carefully studying their features would not take much time. In addition, to get a "feel" for a tool, it is better to use it than just to look at a feature table.

As for SNP callers, I guess 90% of those who do not write in-house tools are using GATK and SAMtools/MAQ. SNVMix is also a good one, especially for cancer sequencing. As there are only a few of them, it is easy to try each.

In addition, routine SNP calling on deep resequencing data is not hard. Many groups use their own in-house tools, and because these are highly tuned for specific data, they may perform better.

Lastly, as I keep recommending everywhere, apply BAQ as implemented in samtools. Doing this improves the accuracy of all SNP callers so far.
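
For anyone who wants to try this, here is a minimal sketch (not part of the original answer) of building a `samtools calmd` command that caps base qualities by BAQ before calling. It assumes a samtools build whose `calmd` accepts `-r` (compute BAQ), `-A` (overwrite the base qualities), `-E` (extended BAQ) and `-b` (BAM output); the helper name and the file names are hypothetical placeholders.

```python
import shlex


def baq_calmd_command(bam, reference, extended=True):
    """Build a `samtools calmd` argument list that caps base quality by BAQ.

    Hypothetical helper: it only assembles the command, it does not run it.
    """
    cmd = ["samtools", "calmd", "-A", "-b", "-r"]
    if extended:
        # Extended BAQ, assumed to be available in the installed samtools.
        cmd.append("-E")
    cmd += [bam, reference]
    return cmd


# Placeholder file names; calmd writes the adjusted BAM to standard output.
cmd = baq_calmd_command("sample.bam", "ref.fa")
print(shlex.join(cmd) + " > sample.baq.bam")
```

The command is only printed here so it can be inspected before running it in a shell. Check the flags against `samtools calmd` on your own installation first.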


I should note that I've observed a marked decrease in sensitivity when using BAQ. Is it still suitable for current Illumina data?

The sensitivity loss is observed in simulation and also in large-scale resequencing runs.


With better realignment, BAQ is optional. I agree that the original BAQ hurts sensitivity to MNPs, but extended BAQ is better. If I am right, BCM, UM and Sanger are still using extended BAQ at different steps. I do not remember them having obviously lower sensitivity than BC. I have not looked into the 1000g calls for quite some time, so I could be wrong.


+1. A very good review.


What happened to this link?

14.2 years ago
Marc Perry ▴ 50

Did you already check out this review from last year?

Nat Methods. 2009 Nov;6(11 Suppl):S6-S12. Sense from sequence reads: methods for alignment and assembly. by Flicek P, Birney E.

I accepted your answer as I found some cool resources in this paper.

11.4 years ago

Have you seen this paper? http://genomemedicine.com/content/5/3/28

Low concordance of multiple variant-calling pipelines: practical implications for exome and genome sequencing. Jason O'Rawe, Tao Jiang, Guangqing Sun, Yiyang Wu, Wei Wang, Jingchu Hu, Paul Bodily, Lifeng Tian, Hakon Hakonarson, W Evan Johnson, Zhi Wei, Kai Wang and Gholson J Lyon.

It is not a table per se, and the software used is a bit outdated (bwa is 0.5.9), but it gives the list of commands used and could be a useful template for doing the comparison yourself.
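
If you do run such a comparison yourself, the concordance step can be sketched roughly as below. This is only an illustration under my own assumptions, not the method used in the paper: variants are keyed by (chrom, pos, ref, alt) read from VCF-style tab-separated lines, multi-allelic records are split per allele, and no indel normalization is done. The function names and the toy records are made up for the example.

```python
def read_variants(vcf_lines):
    """Collect (chrom, pos, ref, alt) keys from VCF-formatted text lines."""
    variants = set()
    for line in vcf_lines:
        # Skip blank lines and header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        chrom, pos, ref, alts = fields[0], int(fields[1]), fields[3], fields[4]
        # Treat each alternate allele as its own variant.
        for alt in alts.split(","):
            variants.add((chrom, pos, ref, alt))
    return variants


def concordance(call_sets):
    """Return (shared, union, shared/union fraction) across all pipelines.

    Expects a non-empty dict mapping a pipeline name to its variant set.
    """
    sets = list(call_sets.values())
    shared = set.intersection(*sets)
    union = set.union(*sets)
    fraction = len(shared) / len(union) if union else 0.0
    return shared, union, fraction


# Toy records standing in for the output of two pipelines.
pipeline_a = ["#CHROM\tPOS\tID\tREF\tALT", "chr1\t100\t.\tA\tG", "chr1\t200\t.\tC\tT"]
pipeline_b = ["chr1\t100\t.\tA\tG", "chr1\t300\t.\tG\tA,C"]

calls = {"a": read_variants(pipeline_a), "b": read_variants(pipeline_b)}
shared, union, fraction = concordance(calls)
print(f"{len(shared)} shared of {len(union)} total ({fraction:.0%})")
```

For real data you would want to normalize indel representation before comparing, otherwise the same event written two ways counts as a disagreement.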

11.4 years ago

Here is a very handy list of short read aligners/mappers:

HTS Mappers

Tools for mapping high-throughput sequencing data (Pubmed)
