Exclude sites with a certain fraction of heterozygosity from vcf
7.8 years ago
thal ▴ 10

Hello. I recently received a rather messy VCF file and have already done some cleaning with vcftools and bcftools. But there are two more things I want to do, and I cannot find an option for either of them.

First, I want to exclude all sites which have more than 90% heterozygous calls. And second, I want to include only those sites which have at least one homozygous genotype for both alleles (there are only biallelic sites left). I'd be very thankful for any help.

snp • 1.7k views
7.8 years ago

You can do this using vcffilterjs (part of jvarkit) and the following script:

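// Keep a site only if at most 90% of the samples are heterozygous
// and both homozygous genotypes are seen at least once.
function accept(v)
    {
    var nhet=0;
    var nhomref=0;
    var nhomvar=0;
    for(var i=0;i< v.getNSamples();++i)
        {
        var g= v.getGenotype(i);
        if(g.isHet())  nhet++;
        if(g.isHomRef())  nhomref++;
        if(g.isHomVar())  nhomvar++;
        }
    // '<=' so that a site with exactly 90% heterozygous calls is kept;
    // the fraction is taken over all samples, including no-calls.
    return(nhet/v.getNSamples() <= 0.9 && nhomref>0 && nhomvar>0);
    }
accept(variant);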
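For readers without jvarkit, the same two criteria from the question can be sketched in plain JavaScript working directly on a tab-separated VCF data line. This is only an illustration, not part of vcffilterjs: the function names `classifyGenotype` and `keepSite` are made up here, diploid genotypes are assumed, and the heterozygous fraction is taken over all samples (including missing calls), as in the script above.

```javascript
// Hypothetical helper: classify one genotype string such as "0/1", "1|1" or "./.".
// Anything that is not a fully called diploid genotype counts as "missing".
function classifyGenotype(gt) {
  const alleles = gt.split(/[\/|]/);
  if (alleles.length !== 2 || alleles.includes(".")) return "missing";
  if (alleles[0] !== alleles[1]) return "het";
  return alleles[0] === "0" ? "homRef" : "homVar";
}

// Hypothetical helper: decide whether to keep one tab-separated VCF data line.
// A site is kept if at most maxHetFraction of all samples are heterozygous
// and at least one hom-ref and one hom-var genotype are present.
function keepSite(line, maxHetFraction = 0.9) {
  const fields = line.split("\t");
  // Column 9 (index 8) is FORMAT; the sample columns start at index 9.
  const gtIndex = fields[8].split(":").indexOf("GT");
  if (gtIndex < 0) return false; // no genotypes to evaluate
  const counts = { het: 0, homRef: 0, homVar: 0, missing: 0 };
  const samples = fields.slice(9);
  for (const sample of samples) {
    const gt = sample.split(":")[gtIndex];
    counts[classifyGenotype(gt)]++;
  }
  const hetFraction = counts.het / samples.length;
  return hetFraction <= maxHetFraction && counts.homRef > 0 && counts.homVar > 0;
}
```

In a real pipeline, header lines (those starting with `#`) would be passed through unchanged and `keepSite` applied only to the data lines.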

Thanks a lot. Unfortunately I get an error.

ERROR jvarkit - Your input file has a malformed header: We never saw a header line specifying VCF version

I guess the problem is that my VCF does not have a header.


A VCF requires a header...
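For context: the VCF specification requires the first line of the file to be a `##fileformat=VCFv4.x` meta-information line, and a `#CHROM` column header line must precede the data records. The error message above most likely refers to the missing `##fileformat` line. A minimal sketch of a check for both lines follows; the function name `checkVcfHeader` is hypothetical and the check is far from a full validation.

```javascript
// Hypothetical helper: report whether VCF text contains the two header
// lines that parsers generally expect. This is not a full validator.
function checkVcfHeader(text) {
  const lines = text.split(/\r?\n/);
  // The spec requires ##fileformat to be the very first line.
  const hasFileformat = lines[0].startsWith("##fileformat=VCF");
  // The column header line starts with "#CHROM" followed by a tab.
  const hasColumnLine = lines.some((l) => l.startsWith("#CHROM\t"));
  return { hasFileformat, hasColumnLine };
}
```

If either flag is false, the header needs to be restored (for example from the original, unprocessed file) before tools such as vcffilterjs will accept the input.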
