Hi everyone,
I have a VCF file that I want to filter to generate a second VCF file. I want to filter based on the genotypes of the first two samples.
I am looking for variants where sample1 is 0/0 and sample2 is 1/1, or where sample1 is 1/1 and sample2 is 0/0.
The rows of the VCF file look like this. Both sample1 and sample2 are 0/0 in the first example variant.
chr1 376878 . A C 17830.7 . . GT:AD:DP:GQ:PGT:PID:PL:PS 0/0:16,0:16:45:.:.:0,45,675:. 0/0:13,0:13:39:.:.:0,39,504:. 1/1:0,46:46:99:.:.:1855,138,0:. 1/1:0,46:46:99:.:.:1785,138,0:. 1/1:0,63:63:99:.:.:1876,189,0:. 1|1:0,61:61:99:1|1:376878_A_C:2423,183,0:376878 1|1:0,19:19:66:1|1:376878_A_C:914,66,0:376878 1/1:0,9:9:27:.:.:335,27,0:. 1/1:0,26:26:78:.:.:801,78,0:. 1/1:0,61:61:99:.:.:2812,202,0:. 1|1:0,24:24:72:1|1:376878_A_C:1052,72,0:376878
chr1 376881 . T C 6133.93 . . GT:AD:DP:GQ:PGT:PID:PL:PS 0/0:16,0:16:45:.:.:0,45,675:. 0/0:14,0:14:42:.:.:0,42,538:. 0/0:36,0:36:99:.:.:0,99,1453:. 0/0:39,0:39:99:.:.:0,108,1620:. 0/0:53,0:53:99:.:.:0,105,1575:. 1|1:0,47:47:99:1|1:376878_A_C:2314,171,0:376878 1|1:0,22:22:69:1|1:376878_A_C:951,69,0:376878 0/0:9,0:9:24:.:.:0,24,360:. 0/0:25,0:25:66:.:.:0,66,990:. 1/1:0,67:67:99:.:.:1941,204,0:. 1|1:0,22:22:66:1|1:376878_A_C:990,66,0:376878
Please let me know if anyone knows how to do this. Thank you.
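For anyone who wants to see the filter logic spelled out, here is a minimal Python sketch of the rule described in the question. It is not a bcftools or vcftools command, just a plain-text illustration that works on an uncompressed VCF. The names `is_opposite_hom`, `filter_vcf` and `filter_file` are made up for this example. It assumes the file is tab-separated, that sample1 and sample2 are the first two sample columns (columns 10 and 11), that GT is the first FORMAT key, and that phased calls such as 1|1 should count the same as 1/1.

```python
def is_opposite_hom(gt1, gt2):
    """Return True when one genotype is 0/0 and the other is 1/1."""
    # Assumption: phased genotypes (1|1) are treated like unphased ones (1/1).
    pair = (gt1.replace("|", "/"), gt2.replace("|", "/"))
    return pair in {("0/0", "1/1"), ("1/1", "0/0")}


def filter_vcf(lines):
    """Yield header lines unchanged, plus records that pass the genotype rule."""
    for line in lines:
        if line.startswith("#"):
            # Keep all meta-information and the #CHROM header line.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # Indices 9 and 10 are the first two sample columns;
        # the genotype is assumed to be the first colon-separated value.
        gt1 = fields[9].split(":")[0]
        gt2 = fields[10].split(":")[0]
        if is_opposite_hom(gt1, gt2):
            yield line


def filter_file(in_path, out_path):
    """Read an uncompressed VCF and write the filtered records to a new file."""
    with open(in_path) as src, open(out_path, "w") as dst:
        dst.writelines(filter_vcf(src))
```

Calling `filter_file("file.vcf", "filtered.vcf")` would write the header plus the matching variants to `filtered.vcf`. Missing genotypes (./.) and heterozygous calls never match, so those records are dropped.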
Thank you. But my vcf file is not compressed with bgzip. Is there a similar command in vcftools that I can use?
I tested this command with an uncompressed test.vcf file and it worked flawlessly.
You are right. It worked fine. I also compressed and indexed the file using bcftools and then ran your command. That worked well too. Thank you again.
bcftools view file.vcf -Oz -o file.vcf.gz
bcftools index file.vcf.gz
Can you please suggest a tutorial where I can learn how to use basic bioinformatics tools like this?
I'm afraid I haven't come across any tutorial for learning bcftools. All my knowledge of it, as with most other NGS software, comes from solving my daily needs by reading the documentation and good old trial & error.