Question

Can anyone suggest a good pipeline for SNP calling in haploid organisms?

7.6 years ago

edoardo.piombo ▴ 20

Hi,

I was wondering if anyone could suggest a good pipeline for SNP calling in haploid organisms.

I am currently using this one:

samtools mpileup -guf reference_genome.fa target_organism.sort.bam |
  bcftools view -cg - |
  vcfutils.pl varFilter -Q 20 - > result_vcf

The problem with this pipeline is that it treats a lot of sequencing errors as SNPs. I end up with many heterozygous SNPs that are caused by a single read carrying an error (for example, 8 reads match the reference and 1 carries the error).
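To make the numbers concrete, here is a minimal Python sketch of the situation described above. The function names and the 0.8 threshold are illustrative assumptions, not part of samtools, bcftools, or vcfutils.pl. The idea is only that in a haploid genome a real SNP would normally be supported by most of the reads covering the site, whatever the depth.

```python
def alt_allele_fraction(ref_reads: int, alt_reads: int) -> float:
    """Fraction of reads at a site that support the non-reference allele."""
    total = ref_reads + alt_reads
    if total == 0:
        raise ValueError("site has no covering reads")
    return alt_reads / total


def looks_like_haploid_snp(ref_reads: int, alt_reads: int,
                           min_fraction: float = 0.8) -> bool:
    """Hypothetical check: keep a site only if most reads carry the variant.

    min_fraction is an assumed cutoff chosen for illustration.
    """
    return alt_allele_fraction(ref_reads, alt_reads) >= min_fraction


# The case from the post: 8 reads match the reference, 1 read carries the error.
print(round(alt_allele_fraction(8, 1), 2))  # 0.11
print(looks_like_haploid_snp(8, 1))         # False

# A low-coverage site where all 3 reads carry the variant still passes.
print(looks_like_haploid_snp(0, 3))         # True
```

A fraction-based check like this does not depend on absolute depth, so sites covered by only 2 or 3 reads are not discarded just for being shallow.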

If I filter out the heterozygous SNPs, I risk losing information for the same reason: a single read with an error can cause a real SNP to be called heterozygous and therefore excluded.

I am considering filtering the SNPs by read depth, but my coverage is not very high, so I would risk losing data for regions covered by only 2 or 3 reads.

Any suggestions?

SNP calling SNP mining comparative genomics • 1.1k views

updated 7.6 years ago by WouterDeCoster 47k • written 7.6 years ago by edoardo.piombo ▴ 20