5.9 years ago
always_learning
I have a few SNPs with ploidy > 2 in my VCF (human) generated by GATK, with genotypes like "0/1/1". Any idea how I can remove them from my VCF file? Are there any tools that do that?
Hello,
Why did you run your variant calling with a parameter that produces ploidy > 2? Can you please show us the complete command?
Do you really want to remove those sites from your VCF, or should the genotype be fixed in some way?
fin swimmer
I can't run this again. This is a merged VCF of around 6000 samples. I think it will be fine to remove such sites from the VCF.
Should the site be removed even if only one sample has ploidy > 2, or should the genotype be set to unknown?
I think setting the genotype to unknown would be the better approach. What do you think?
Try this sed command. It looks for a tab followed by at least one digit, followed by /, followed by at least one digit, followed by a /, followed by at least one digit, followed by anything other than a : or a tab, and replaces this pattern with ./. preceded by a tab.
fin swimmer
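For anyone who prefers not to rely on a regular expression over the whole line, here is a minimal Python sketch of the same idea. It is not the sed command described above but a swapped-in alternative: it only touches the sample columns and also catches phased genotypes such as 1|1|0. The function names `mask_polyploid_genotypes` and `filter_stream` and the `max_ploidy` parameter are made up for illustration, and the sketch assumes GT is the first FORMAT key, as the VCF specification requires when GT is present. Other per-sample fields (AD, PL, and so on) are left unchanged, which may or may not be acceptable for downstream tools.

```python
import re


def mask_polyploid_genotypes(line, max_ploidy=2):
    """Return a VCF line with genotypes above max_ploidy set to './.'."""
    # Header and column-name lines pass through unchanged.
    if line.startswith("#"):
        return line
    fields = line.rstrip("\n").split("\t")
    # Columns 1-8 are fixed, column 9 is FORMAT, samples start at column 10.
    # Lines without samples, or without GT as the first FORMAT key, are kept as-is.
    if len(fields) < 10 or fields[8].split(":")[0] != "GT":
        return line
    for i in range(9, len(fields)):
        # Split the sample entry into the genotype and the remaining subfields.
        gt, sep, rest = fields[i].partition(":")
        # Count alleles by splitting on unphased (/) or phased (|) separators.
        if len(re.split(r"[/|]", gt)) > max_ploidy:
            fields[i] = "./." + sep + rest
    newline = "\n" if line.endswith("\n") else ""
    return "\t".join(fields) + newline


def filter_stream(src, dst, max_ploidy=2):
    """Copy VCF lines from src to dst, masking genotypes above max_ploidy."""
    for line in src:
        dst.write(mask_polyploid_genotypes(line, max_ploidy))
```

To use it on an uncompressed VCF, one option is to call `filter_stream(sys.stdin, sys.stdout)` from a small script and pipe the file through it. With 6000 samples this will be slower than sed, so it is worth testing on a small region first.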