update genotype calls with PyVCF
1
0
Entering edit mode
4.0 years ago
biosol ▴ 170

Hi everyone, I'm parsing my vcf files with PyVCF and I would like to update my genotype calls: for every position in which the DP field is smaller than 10, genotype should be "./.". For the moment I've generated a list with the updated calls as I wish, like this:

vcf_reader = vcf.Reader(filename='myfile.vcf.gz')
new_gt = []
for record in vcf_reader:
    if record.samples[0].data[2] > 10:
        new_gt.append(record.samples[0].data[0])
    elif record.samples[0].data[2] < 10:
        new_gt.append('./.')

However, I've also tried to directly modifying the genotypes in the vcf_reader object, but I get an error message saying 'CallData' object does not support item assignment. In this link they have a similar problem in the FORMAT field. I was wondering: has someone found a way to directly modify genotype calls in PyVCF or to insert a list with the desired output? Thanks a lot in advanced!

pyvcf genotypes • 1.6k views
ADD COMMENT
1
Entering edit mode
4.0 years ago

a one liner using http://lindenb.github.io/jvarkit/VcfFilterJdk.html

 java -jar dist/vcffilterjdk.jar -e 'return new VariantContextBuilder(variant).genotypes(variant.getGenotypes().stream().map(G->G.isCalled() && G.hasDP() && G.getDP()<10?GenotypeBuilder.createMissing(G.getSampleName(),G.getPloidy()):G).collect(Collectors.toList())).make();' --recalc  input.vcf
ADD COMMENT
1
Entering edit mode

Yep! Works super fast, thanks a lot!

ADD REPLY

Login before adding your answer.

Traffic: 1621 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6