What Should I Use Blat For?
12.0 years ago
KCC ★ 4.1k

Is blat just general aligner like bwa or bowtie or does it have a niche application that it's especially good at?

alignment • 13k views
12.0 years ago

Blat is, in some respects, a predecessor of bwa and bowtie, in that it indexes (hashes) the reference genome rather than the query. However, it differs from the short-read aligners.

It was initially developed to map many ESTs (fragments of cDNA) to the genome. It is much better than the short-read aligners at mapping sequences with gaps (part of the sequence here, then a gap, then some more alignment). It also reports many hits (bwa only shows you one, bowtie more). The main drawbacks are speed and, to some extent, memory efficiency.
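To make "hashes the reference genome and not the query" concrete, here is a toy sketch of the general idea: build a k-mer lookup table over the reference once, then look up each query k-mer to get seed hits. This is only an illustration under my own assumptions, not blat's actual implementation; the function names, the tiny sequences and `k=4` are all made up for the example.

```python
from collections import defaultdict

def build_kmer_index(reference, k):
    """Map every k-mer in the reference to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index

def seed_hits(index, query, k):
    """Return (query_pos, ref_pos) pairs where a query k-mer occurs in the reference."""
    hits = []
    for qpos in range(len(query) - k + 1):
        for rpos in index.get(query[qpos:qpos + k], []):
            hits.append((qpos, rpos))
    return hits

# Hypothetical toy data: the index is built once and reused for every query.
reference = "ACGTACGTTTGACCA"
index = build_kmer_index(reference, k=4)
hits = seed_hits(index, "CGTTTG", k=4)
print(hits)  # seeds with the same (ref_pos - query_pos) lie on one diagonal
```

A real aligner would then cluster seeds that fall on nearby diagonals and extend them into alignments; that step is left out here.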

Accuracy:

BLAST > BLAT > Bwa|Bowtie

Speed:

Bwa|Bowtie > BLAT > BLAST

You might want to use BLAT to refine the alignment of some reads or of a particular region, if you have long reads/sequences, if you are looking for splice sites or chromosomal rearrangements, or if you are interested in many/all hits of some sequences...


actually you can expand to:

Accuracy:

BLAST > BLAT > Bwa > Bowtie

Speed:

Bowtie > Bwa > BLAT > BLAST

12.0 years ago
lh3 33k

Although many use it that way, blat is not the best choice when there are many gaps. Firstly, blat does not use Smith-Waterman to refine the alignment. It does not generate the best alignment and you cannot get something equivalent to CIGAR from PSL. Secondly, blat was designed with EST alignment in mind. Sometimes it produces a spurious split alignment when a better alignment is present. Blat is probably the first whole-genome cDNA aligner. It is not really an alternative to blast. SSAHA2 is.

The strength of blast/blat/ssaha2 is that they can give an exhaustive list of local hits for long and diverged sequences. The list is very helpful for investigating problems. It also gives users more control over what to accept. However, this is also why they are slower. Most recent fast long-read aligners do not attempt to go through all the local hits.

PS: bwa/bowtie1 are not general-purpose aligners. They are designed for short reads only. Bwa-sw is more tuned for general purpose, but it does not output multiple hits. Bowtie2, I think, cannot output an exhaustive list of local hits as blast/blat/ssaha2 do, either. It does not work very well with chimeric alignments. No aligner so far is good for everything. Choose based on your needs.

PPS: Blat/ssaha2 are good for Sanger reads, but they are very slow for >10kb sequences. For sequences that long, you need other tools designed for genome-to-genome or assembly-to-genome alignment.

12.0 years ago

Blat was written years before bowtie, and for a different purpose. It was made for the users of the UCSC Genome Browser, as a faster alternative to Blast. Also, Blat was designed to align sequences to a reference genome, while Blast is a more general-purpose tool.

  • Compared to Blast, Blat keeps its data in RAM instead of on disk. So it is much faster than Blast, but it also requires more expensive hardware.

  • Blat is good at aligning transcripts to the genome. In particular, Blat is good at recognizing exon/intron structure.

Compared to bowtie, the main difference should be that bowtie is designed to align short sequences, like short reads from shotgun sequencing, while Blat is better for aligning longer sequences.

12.0 years ago

I have used BLAT with some success to check suspicious alignments to pseudogenes from TopHat (by "suspicious" I mean, e.g., that they had a large number of mismatches on average). BLAT seems to be better at finding a more plausible alignment in these cases, usually by splitting the read and aligning it to two different exons, whereas TopHat often preferred to map to a contiguous pseudogenic stretch with mismatches.
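For anyone wanting to inspect such split alignments, a minimal sketch of pulling the aligned blocks out of one PSL line and listing the gaps on the target between them (which, for a read spanning two exons, would correspond to the putative intron). It relies on the standard 21-column PSL layout and assumes a nucleotide-vs-nucleotide alignment; the function names and the example record (read name, coordinates) are hypothetical.

```python
def psl_blocks(psl_line):
    """Return (strand, [(q_start, t_start, size), ...]) from one PSL data line."""
    f = psl_line.rstrip("\n").split("\t")
    strand = f[8]
    # blockSizes, qStarts and tStarts are comma-separated with a trailing comma
    sizes = [int(x) for x in f[18].rstrip(",").split(",")]
    q_starts = [int(x) for x in f[19].rstrip(",").split(",")]
    t_starts = [int(x) for x in f[20].rstrip(",").split(",")]
    return strand, list(zip(q_starts, t_starts, sizes))

def target_gaps(blocks):
    """Gaps on the target between consecutive blocks: (gap_start, gap_end, length)."""
    gaps = []
    for (_, t0, s0), (_, t1, _) in zip(blocks, blocks[1:]):
        gap_start = t0 + s0
        if t1 > gap_start:
            gaps.append((gap_start, t1, t1 - gap_start))
    return gaps

# Hypothetical record: a 100 bp read split 50/50 across a 2000 bp target gap.
example = "\t".join([
    "100", "0", "0", "0",        # matches, misMatches, repMatches, nCount
    "0", "0", "1", "2000",       # qNumInsert, qBaseInsert, tNumInsert, tBaseInsert
    "+", "read1", "100", "0", "100",
    "chr1", "1000000", "10000", "12100",
    "2", "50,50,", "0,50,", "10000,12050,",
])
strand, blocks = psl_blocks(example)
gaps = target_gaps(blocks)
print(strand, blocks, gaps)
```

Comparing the gap coordinates against annotated introns is one way to judge whether the split alignment is more plausible than a contiguous hit with mismatches.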

12.0 years ago

BLAT is very well suited for realigning microarray annotations, where you have to update the locations of 30,000 60-mers. It's a good fit for any situation where you have more than a few but fewer than millions of sequences, each longer than 20 bases, that are expected to align nearly perfectly somewhere in the genome, but you don't know where.
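As a rough sketch of the post-processing for that kind of probe remapping: keep only the PSL hits whose matching bases cover almost the whole query. The 0.98 cutoff, the function name and the two example records are my own assumptions for illustration, not a recommended setting.

```python
def near_perfect_hits(psl_lines, min_fraction=0.98):
    """Yield (qName, tName, tStart, tEnd, strand) for hits covering most of the query."""
    for line in psl_lines:
        f = line.rstrip("\n").split("\t")
        if len(f) < 21 or not f[0].isdigit():
            continue  # skip PSL header lines and blank lines
        matched = int(f[0]) + int(f[2])  # matches + repMatches
        q_size = int(f[10])
        if matched >= min_fraction * q_size:
            yield f[9], f[13], int(f[15]), int(f[16]), f[8]

# Hypothetical hits for one 60-mer probe: one full-length, one partial.
good = ["60", "0", "0", "0", "0", "0", "0", "0", "+", "probe1", "60", "0", "60",
        "chr2", "500000", "1200", "1260", "1", "60,", "0,", "1200,"]
poor = ["45", "3", "0", "0", "0", "0", "0", "0", "+", "probe1", "60", "5", "53",
        "chr7", "800000", "4000", "4048", "1", "48,", "5,", "4000,"]
lines = ["\t".join(good), "\t".join(poor)]

kept = list(near_perfect_hits(lines))
print(kept)
```

Probes that end up with zero or several surviving hits are the ones worth checking by hand.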


Well, several short-read mappers can do a better job than blat for that task. Nonetheless, for only 30k 60-mers, the choice of tool makes little difference.
