How to clean this strange VCF file?
5.4 years ago
Shicheng Guo ★ 9.5k

Hi All,

How can I clean this strange VCF file? Three different genotype styles occur: 0, 0/0 and ./.

Thanks.


SNP vcf • 1.8k views

0/0 and ./. actually mean different things (this was a surprise to me at one point too). ./. indicates no-call/no-call, i.e. the variant caller made no determination, whereas 0/0 is homozygous reference. I don't know why you would have haploid genotypes with just "0" in the first column.
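To make the distinction concrete, here is a minimal Python sketch that labels a sample field by its genotype. The function name `classify_gt` and the label strings are made up for illustration, and it assumes GT is the first colon-separated subfield of the sample column, as it is whenever GT is present in a VCF.

```python
import re

def classify_gt(field):
    """Label the GT part of one VCF sample field (hypothetical helper)."""
    gt = field.split(":")[0]            # assumes GT is the first subfield
    alleles = re.split(r"[/|]", gt)     # '/' = unphased, '|' = phased
    if all(a == "." for a in alleles):
        return "missing"                # './.' or '.': no call was made
    if len(alleles) == 1:
        # a single allele such as '0' looks haploid
        return "haploid_ref" if alleles[0] == "0" else "haploid_alt"
    if "." in alleles:
        return "partial_missing"        # e.g. './1'
    if all(a == "0" for a in alleles):
        return "hom_ref"                # '0/0': called, matches the reference
    if len(set(alleles)) == 1:
        return "hom_alt"
    return "het"

for example in ["0", "0/0", "./."]:
    print(example, classify_gt(example))
```

Running something like this over each sample column would show how often each of the three styles occurs, and in which columns.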


First, thanks so much for the help!

Second, I guess I can remove the columns with a single '0'; I am not sure where they come from. Maybe they were caused by a Windows/Mac/Linux file transfer, or by a buggy Perl/Python conversion script?

Third, the first three columns are mutation calls made by three different methods on the same sample: Mut1, Mut2 and VarD. I would like to merge these three columns into a consensus genotype. Do you have any idea how to obtain a consensus genotype from these three columns?

Thanks.

Shicheng


You will just have to create a rule and then filter based on that rule. For example, you can include a call if it is called in at least 1 sample, or specify that it must be called in all 3 [samples].

There are also indirect ways of filtering for this, for example using bcftools to filter on the number of missing (./.) and/or non-missing genotypes at each site.
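As an illustration of such a rule, the following Python sketch keeps a genotype only when at least `min_agree` of the calls agree on it. The function name `consensus_gt`, the `min_agree` parameter, and the choice to ignore both no-calls and the haploid-looking "0" entries are assumptions for this example, not an established method; adjust the rule to whatever you decide is appropriate.

```python
from collections import Counter

def consensus_gt(calls, min_agree=2):
    """Return the genotype reported by at least `min_agree` calls, else './.'."""
    usable = []
    for field in calls:
        gt = field.split(":")[0].replace("|", "/")   # drop phasing, keep GT only
        alleles = gt.split("/")
        if "." in alleles:
            continue        # no-call: carries no vote
        if len(alleles) < 2:
            continue        # assumption: single '0' entries are treated as uninformative
        # sort alleles so that '1/0' and '0/1' count as the same genotype
        usable.append("/".join(sorted(alleles, key=int)))
    if not usable:
        return "./."
    genotype, votes = Counter(usable).most_common(1)[0]
    return genotype if votes >= min_agree else "./."

# Columns in the order Mut1, Mut2, VarD (values invented for illustration)
print(consensus_gt(["0/1", "1/0", "./."]))
print(consensus_gt(["0/0", "0/1", "1/1"]))
```

With `min_agree=3` this becomes the strict "called by all three" rule, and with `min_agree=1` it accepts any single call (ties then go to the first genotype seen).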


Define "clean" - what is the expected output?
