SNP calling from PacBio long reads
7.2 years ago
guillaume.rbt ★ 1.0k

Hi all,

I understand that PacBio reads, with their error rate of ~15%, are not really suitable for SNP calling.

This may be a naive question, but if we have really high-coverage sequencing of a bacterium (>200X), wouldn't that make it possible to call SNPs anyway?

If so, what would be the cleverest way to do that? Go the "classic" way: align the reads to a reference genome and detect variants (is there any tool that does that)? Or perform a genome assembly and then align the assembly to the reference?

Has anyone tried that already?

Thanks in advance for your input.

pacbio SNP • 5.0k views

With 200x coverage, it might be easiest to simply generate consensus sequences (reads of insert) and then call variants from those. That way you avoid the problems caused by the high error rate.
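To put a rough number on why a consensus sidesteps the per-read error rate, here is a minimal sketch. It assumes that errors are independent between reads and, pessimistically, that every miscall at a site agrees on the same wrong base; real PacBio errors are mostly indels, so this is only a back-of-the-envelope model. The function name `consensus_error_probability` is made up for illustration.

```python
from math import comb


def consensus_error_probability(coverage: int, per_read_error: float) -> float:
    """Probability that wrong reads tie with or outvote correct reads at one site.

    Binomial tail: P(K >= ceil(coverage / 2)) with K ~ Binomial(coverage, per_read_error).
    """
    # Smallest number of wrong reads that ties or beats the correct ones.
    k_min = (coverage + 1) // 2
    return sum(
        comb(coverage, k)
        * per_read_error**k
        * (1 - per_read_error) ** (coverage - k)
        for k in range(k_min, coverage + 1)
    )


# ~15% per-read error, as quoted in the question, at increasing depth.
for cov in (1, 10, 50, 200):
    print(cov, consensus_error_probability(cov, 0.15))
```

Under these assumptions the chance of a wrong majority call drops off very quickly with depth and is vanishingly small at 200x, which is why consensus-based approaches work despite noisy individual reads.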


OK, that was what I thought. Thanks for your help.

7.2 years ago
stolarek.ir ▴ 700

No worries: with high coverage you can do decent SNP calling using PacBio.


OK great, which tools would you recommend for that?


Either check their webpage to see whether they have anything dedicated to that, or use GATK.


GATK is not suitable for calling variants from long reads. Its callers were developed, tested and verified with short reads (such as Illumina) in mind. This GitHub tool, from Pacific Biosciences itself, is worth checking for variant calling: https://github.com/PacificBiosciences/GenomicConsensus

7.0 years ago
tjduncan ▴ 280

If you have 200x coverage of a microbe, you should just make a de novo assembly from your data using a long-read assembler. The two I would recommend are HGAP4 and Canu. Both of these assemblers include a consensus step and will yield an assembly of high enough quality to do SNP-based variant calling with your preferred bioinformatics pipeline.

HGAP4 can be run from the command line after downloading PacBio's SMRT Link analysis suite

or

Canu may be the easier tool to set up and use quickly, as it is available in Bioconda.
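Once the assembly has been aligned to the reference, listing SNPs comes down to walking the alignment columns. A minimal sketch of that step, assuming you already have the two sequences as equal-length gapped strings from a pairwise aligner of your choice; the function name `list_snps` and the toy sequences are made up for illustration, and a real pipeline would use a whole-genome aligner and a proper variant format:

```python
def list_snps(ref_aln: str, asm_aln: str):
    """Return (reference_position, ref_base, assembly_base) for each mismatch column.

    Both inputs are rows of one pairwise alignment, with '-' marking gaps.
    Positions are 1-based on the ungapped reference.
    """
    if len(ref_aln) != len(asm_aln):
        raise ValueError("aligned sequences must have equal length")

    snps = []
    ref_pos = 0
    for ref_base, asm_base in zip(ref_aln.upper(), asm_aln.upper()):
        if ref_base != "-":
            ref_pos += 1  # only reference bases advance the coordinate
        if ref_base == "-" or asm_base == "-":
            continue  # indel column, not a SNP
        if ref_base != asm_base and "N" not in (ref_base, asm_base):
            snps.append((ref_pos, ref_base, asm_base))
    return snps


# Toy alignment: one substitution, one insertion and one deletion in the assembly.
print(list_snps("ACGT-ACGTA", "ACCTTAC-TA"))  # [(3, 'G', 'C')]
```

Indel columns are skipped on purpose here, since the question is about SNPs only.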
