What is an appropriate assembler for PacBio long reads?
6.3 years ago

Hi folks,

We have long reads from 10 bacterial isolates, sequenced on the PacBio platform. Five of them have no reference strain, and five have a closely related strain available.

I have to identify antimicrobial resistance genes in these 10 bacteria. This is the first time I am handling PacBio data.

Is there an assembler that handles long reads? As of now I don't know the coverage of the samples. Please point me to a reference article if you have come across one for this kind of task. I found HGAP from PacBio (https://github.com/PacificBiosciences/Bioinformatics-Training/wiki/HGAP#implementations).

The Celera® Assembler link is broken.
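
Since the coverage is not known yet, a rough figure can be worked out once the reads arrive: total sequenced bases divided by the expected genome size. The sketch below assumes the reads are in plain four-line FASTQ (PacBio deliveries are often BAM, which would need converting first) and that an approximate genome size is available, for example from a related strain. The function names and the toy records are made up for illustration.

```python
def fastq_total_bases(lines):
    """Sum the lengths of the sequence lines in four-line FASTQ records."""
    total = 0
    for i, line in enumerate(lines):
        if i % 4 == 1:  # the second line of each record holds the bases
            total += len(line.strip())
    return total


def estimate_coverage(total_bases, genome_size):
    """Approximate fold coverage = sequenced bases / genome length."""
    return total_bases / genome_size


# Tiny in-memory example. For real data, pass an open file handle instead,
# e.g. open("reads.fastq") (hypothetical file name).
example = [
    "@read1", "ACGTACGTAC", "+", "!!!!!!!!!!",
    "@read2", "ACGTA", "+", "!!!!!",
]
bases = fastq_total_bases(example)
coverage = estimate_coverage(bases, genome_size=5)  # toy genome of 5 bp
print(bases, coverage)
```

The same arithmetic applies to real data: the read total in bases divided by the expected genome length in bases gives the approximate fold coverage.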

Assembly alignment • 5.6k views

Check this recent review (Table 2 lists many useful programs).


There are numerous long-read assemblers available. Many are listed here:

https://academic.oup.com/bib/advance-article/doi/10.1093/bib/bbx147/4590140


Do you only have PacBio data? You should get some Illumina data as well:

On stuck records and indel errors; or “stop publishing bad genomes”


For now, I have been told that I will get only the PacBio long reads. Why do you say that I should get some Illumina data?


From the blog post I linked:

If you can’t be bothered reading, then the summary is:

  • BOTH single molecule sequencing technologies (PacBio and Nanopore), their major error mode is insertions / deletions

  • Once a genome is assembled, some of these errors remain in the assembly

  • If they are uncorrected, they inevitably cause a frameshift or premature stop codon in protein-coding regions

  • It’s not that you can’t correct these errors, it’s that mostly, outside of the top assembly groups in the world, people don’t

The main error type in PacBio and Nanopore reads is insertions / deletions. Illumina reads have very few of these, so you can use Illumina reads to correct the errors in a PacBio assembly.
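
To see why a leftover indel matters for gene calls (including resistance genes), here is a small sketch that translates a toy coding sequence before and after a single-base deletion in a homopolymer run. The sequence is made up for illustration and the translation uses the standard genetic code. It only demonstrates the frameshift effect and is not a correction tool.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate from the first base, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        protein.append(aa)
        if aa == "*":
            break
    return "".join(protein)


gene = "ATGGCTGAAAAACTGTAA"        # toy ORF: ATG GCT GAA AAA CTG TAA
mutant = gene[:8] + gene[9:]       # drop one A from the AAAAA run

print(translate(gene))    # MAEKL*
print(translate(mutant))  # MAENC  (frame shifted, stop codon lost)
```

Every codon downstream of the deletion is read in the wrong frame, so the predicted protein changes and the original stop codon is no longer recognized.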

6.3 years ago
gconcepcion ▴ 410

Your best bets are:

HGAP4 (GUI), a pipeline provided in SMRT Link: https://www.pacb.com/support/software-downloads/

FALCON (command line), the bleeding-edge version of HGAP: http://pb-falcon.readthedocs.io/en/latest/quick_start.html#quick-start

or Canu (command line), essentially the new Celera Assembler: https://canu.readthedocs.io/en/latest/quick-start.html https://github.com/marbl/canu
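
For the Canu route, a minimal sketch of building one command line per isolate is shown below. It follows the pattern from the Canu quick start linked above (`-p` prefix, `-d` output directory, `genomeSize=`, then the read file). The `-pacbio-raw` option is the one used by Canu 1.x; option names changed in later releases, so check the quick start for the installed version. The isolate name, file names and the `5m` genome size are placeholders, not values from this thread.

```python
import shlex


def canu_command(prefix, out_dir, genome_size, reads):
    """Build a Canu argument list for one sample of raw PacBio reads."""
    return [
        "canu",
        "-p", prefix,                 # prefix for output file names
        "-d", out_dir,                # assembly directory
        f"genomeSize={genome_size}",  # rough estimate, e.g. "5m" (assumed)
        "-pacbio-raw", reads,         # Canu 1.x option name (see note above)
    ]


# Hypothetical isolate and file names, for illustration only.
cmd = canu_command("isolate01", "isolate01-canu", "5m", "isolate01.fastq")
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` once Canu is installed. Building it as a list rather than a string avoids quoting problems with unusual file names.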


I'd stay away from the PacBio-based assemblers: they're pretty difficult to get working and take FOREVER. Use a third-party assembler, like Canu.
