Aligning PacBio Reads
12.0 years ago
lin.barnum ▴ 230

What are the best parameters for aligning PacBio reads with bwasw (or any other aligner)? Since PacBio reads have a significantly higher error rate, dominated by indels, I aligned with a larger number of allowed gaps and a lower gap open penalty. At first it looked like I got a good number of alignments, but on closer inspection I found that in almost all cases only a small fraction of the read had been aligned (say 10-20 bases of a 2 kb read).
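
One way to quantify the "only a small fraction aligned" problem is to compute, for each SAM record, how much of the read sits inside the alignment rather than in clipped ends. The following is a minimal sketch, assuming you have the CIGAR string from column 6 of the SAM output; the function name `aligned_fraction` is hypothetical and not part of any aligner.

```python
import re

# CIGAR operations whose bases are part of the read and inside the alignment.
ALIGNED_QUERY_OPS = set("MI=X")
# Clipped read bases: soft clips stay in SEQ, hard clips are removed from SEQ,
# but both count toward the original read length.
CLIP_OPS = set("SH")

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def aligned_fraction(cigar):
    """Return the fraction of the original read covered by the alignment."""
    aligned = 0
    total = 0
    for length, op in CIGAR_RE.findall(cigar):
        n = int(length)
        if op in ALIGNED_QUERY_OPS:
            aligned += n
            total += n
        elif op in CLIP_OPS:
            total += n
        # D, N and P consume no read bases, so they are ignored here.
    return aligned / total if total else 0.0


if __name__ == "__main__":
    # A 2 kb read where only 20 bases were placed and the rest was clipped.
    print(aligned_fraction("1980S15M2I3M"))
```

Records with a very low fraction are the ones where the aligner effectively gave up on most of the read. An unmapped record (CIGAR `*`) yields 0.0 in this sketch.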

alignment • 15k views

Also note that, because the DNA fragments are circularized, you should not expect the whole read to align, only a portion of it (a subread). That is another good reason to use a specialized aligner for PacBio reads.

12.0 years ago
mchaisso ▴ 160

You can download BLASR from GitHub: https://github.com/PacificBiosciences/blasr . You need HDF5 installed to compile it. Use the default alignment parameters, but add the flag "-sam" if you want the output in SAM format.
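
For scripted runs, the invocation described above can be wrapped in a few lines of Python. This is only a sketch under assumptions: the file names are placeholders, the positional order (reads, then reference) and the `-out` flag follow the older single-dash BLASR command line, and newer builds may spell options differently, so check `blasr -h` for your version. The helper names `blasr_command` and `run_blasr` are hypothetical.

```python
import shutil
import subprocess


def blasr_command(reads, reference, out_sam):
    """Build the argument list for a default-parameter BLASR run with SAM output."""
    # "-sam" is the flag mentioned in the answer; "-out" is assumed from the
    # older single-dash BLASR interface.
    return ["blasr", reads, reference, "-sam", "-out", out_sam]


def run_blasr(reads, reference, out_sam):
    """Run BLASR if the executable is on PATH; fail with a clear error otherwise."""
    if shutil.which("blasr") is None:
        raise FileNotFoundError("blasr executable not found on PATH")
    subprocess.run(blasr_command(reads, reference, out_sam), check=True)


if __name__ == "__main__":
    # Placeholder file names; print the command instead of running it.
    print(" ".join(blasr_command("reads.fasta", "reference.fasta", "aligned.sam")))
```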

-mark

BLASR is giving me an error that I cannot resolve: "ERROR, this path has gone awry at ## ## ". Any other suggestions for aligning PacBio data?

The current version on github should have this fixed. E-mail me directly if not, and I'll fix it.

-mark

12.0 years ago
Lee Katz ★ 3.2k

You might start by looking at PacBio's own software. They have an aligner written specifically for PacBio data, called BLASR. Beyond that, I am not sure what the best parameters are; I expect they vary from run to run and between organisms.

http://www.pacificbiosciences.com/products/software/algorithms/

11.1 years ago
Buttonwood ▴ 40

I think you could try bwa mem.

10.1 years ago

After error correction, you can use GMAP/GSNAP or BBMap to align PacBio reads. Note that you need FASTA files as input, and the output will be in SAM format.

BBMap has an upper read-length limit below 10 kb, while the average PacBio read length is approaching or above 10 kb.

That's true: the upper limit is currently 6 kbp, but BBMap will break longer reads into 6 kbp pieces and map those. Most of the current PacBio reads that I've seen are still under 6 kbp. Also, BBMap handles raw reads fine; they don't need correction first.
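
To make the "break longer reads into 6 kbp pieces" idea concrete, here is a minimal sketch of that kind of splitting. It illustrates the general approach only and is not BBMap's actual implementation; the function name `split_read` and the default of 6000 bases are assumptions taken from the limit quoted above.

```python
def split_read(seq, max_len=6000):
    """Split a read into consecutive, non-overlapping pieces of at most max_len bases."""
    if max_len <= 0:
        raise ValueError("max_len must be positive")
    return [seq[i:i + max_len] for i in range(0, len(seq), max_len)]


if __name__ == "__main__":
    # A synthetic 14 kb read splits into pieces of 6000, 6000 and 2000 bases.
    pieces = split_read("A" * 14000)
    print([len(p) for p in pieces])
```

Each piece is then mapped on its own, so a 10 kb read may end up as two separate alignments that have to be reconciled afterwards.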

The latest chemistry, P6-C4, reportedly produces reads averaging more than 10 kb. With size selection, even P5-C3 reads are around 10 kb long. PacBio reads of about 10 kb are common these days.
