Calling SNPs with low coverage
7.6 years ago

Does anyone know of a program that can call SNPs with low coverage? I have been using samtools mpileup and bcftools/vcftools to find high-confidence SNPs. I am now trying to identify unknown samples based on the reference SNP panel I generated. I therefore don't need high confidence when calling SNPs in these unknown samples, since I already know the SNPs exist and I will be comparing the unknown samples' calls at hundreds of these SNPs. My coverage is about 3-5x on average. I would really appreciate any suggestions.

Thanks!

vcf mpileup samtools SNP low-coverage • 3.9k views

As long as you didn't prefilter your VCF file, it should contain all the SNPs, from high to low coverage. The depth is reported in the DP sub-field of the INFO field, and you can plot its distribution to better understand what your pipeline is calling (e.g. whether low-coverage SNPs are included).
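To make the DP suggestion concrete, here is a minimal Python sketch that tabulates the INFO/DP values of a VCF. It assumes an uncompressed, tab-separated VCF whose records carry a DP entry in the INFO column; the function name `depth_histogram` and the example records are hypothetical and only serve as illustration.

```python
from collections import Counter

def depth_histogram(vcf_lines):
    """Count how many variant records report each INFO/DP value."""
    counts = Counter()
    for line in vcf_lines:
        if line.startswith("#") or not line.strip():
            continue  # skip header lines and blank lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 8:
            continue  # not a complete VCF record
        # INFO is the 8th column: semicolon-separated KEY=VALUE entries
        for entry in fields[7].split(";"):
            key, _, value = entry.partition("=")
            if key == "DP" and value.isdigit():
                counts[int(value)] += 1
                break
    return counts

# Hypothetical records, for illustration only
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\t.\tDP=3;AF=0.5",
    "chr1\t200\t.\tC\tT\t20\t.\tDP=5",
    "chr2\t300\t.\tG\tA\t12\t.\tAF=0.5;DP=3",
]

hist = depth_histogram(example)
for depth in sorted(hist):
    print(depth, hist[depth])
```

For a real file, the same function can be given an open file handle instead of the example list. If low depths such as 3-5 show up in the table, the low-coverage SNPs were not filtered out.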


Hello all. I want to use BBMap's callvariants to call variants. Where can I get the latest version of the software? Could you give me a link?


You can download the BBMap suite here.


Thank you very much!


Could you give us a little bit more background? Do you have several samples at 3-5X? Are they from the same population? Coding regions?

7.0 years ago

Hi there,

Maybe it's a bit late, but I'd like to highlight the discoSnp approach which might answer this initial question.

discoSnp can predict SNPs and indels from raw NGS reads without a reference genome. It does not depend on read alignment and can find low-coverage variants. It discards all data seen fewer than c times, so calling discoSnp with -c 2 should meet your requirements (although it will miss variants seen only once).

Note that, in a final step, the de novo predicted variants can be mapped to a genome, which produces a VCF file that can be used for downstream analyses.

Best, Pierre

7.6 years ago

BBMap has a variant-caller that is configurable to arbitrary depth or ploidy. You can use it like this:

callvariants.sh in=mapped.sam out=vars.vcf ref=reference.fasta clearfilters

The "clearfilters" flag clears ALL filters and will thus report all variants seen in the reads, regardless of depth or quality. Alternatively, you could use the flags "minreads=1 minscore=15", which would simply lower the minimum number of reads and the minimum score a bit, or set the filters manually after reading the documentation. But for very-low-coverage samples like yours, where you already have a set of known variants you're interested in, "clearfilters" is probably the best choice. BBMap also has another tool, used like this:

comparevcf.sh in=sample.vcf,trusted.vcf out=intersection.vcf intersection

That will yield the lines from sample.vcf for variations contained in trusted.vcf.
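The idea behind that intersection can be sketched in a few lines of Python. This is not how comparevcf.sh is implemented; it is only an illustration that assumes two variants match when their CHROM, POS, REF and ALT columns are identical (multi-allelic ALT entries are not split). The function names and example records are hypothetical.

```python
def variant_key(line):
    """Identify a VCF record by (CHROM, POS, REF, ALT); an assumed matching rule."""
    fields = line.rstrip("\n").split("\t")
    return (fields[0], fields[1], fields[3], fields[4])

def is_record(line):
    """True for data lines, False for headers and blank lines."""
    return bool(line.strip()) and not line.startswith("#")

def keep_trusted(sample_lines, trusted_lines):
    """Return the sample records whose variant also appears in the trusted set."""
    trusted = {variant_key(l) for l in trusted_lines if is_record(l)}
    return [l for l in sample_lines if is_record(l) and variant_key(l) in trusted]

# Hypothetical records, for illustration only
sample = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t12\t.\tDP=3",
    "chr1\t150\t.\tT\tC\t8\t.\tDP=2",
]
trusted = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t200\t.\tDP=60",
    "chr2\t300\t.\tG\tA\t180\t.\tDP=55",
]

kept = keep_trusted(sample, trusted)
for record in kept:
    print(record)
```

Only the chr1:100 A>G record survives here, because it is the only sample call that is also in the trusted panel. The low-confidence call at chr1:150 is dropped even though no quality filter was applied.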


BBMap must have changed since this answer was posted. Do you know what the updated command line for this is in BBMap_36.28.tar.gz?


Oh, that version is too old. CallVariants was not added until v36.55 (but I recommend the latest, 37.61).
