I have read the post "How bcftools isec works?" to better understand the output generated by bcftools isec.
I'm trying to find variants that are common by location in at least 2 files:
##bcftools_isecCommand=isec -p isec_out -n +2 UNC2FT198_Imtechella_halotolerans.fltq.vcf.gz UNC2FT2114_Imtechella_halotolerans.fltq.vcf.gz UNC2FT4154_Imtechella_halotolerans.fltq.vcf.gz UNC2FT55199_Imtechella_halotolerans.fltq.vcf.gz UNC2FT7534A_Imtechella_halotolerans.fltq.vcf.gz UNC2lu13_Imtechella_halotolerans.fltq.vcf.gz; Date=Wed Jun 23 15:57:36 2021
After generating the intersections of multiple VCF files, I noticed that the last output file contains only one sample name. How can I tell which samples share each variant?
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT UNC2lu13_vs_combined.sorted.bam
Imtechella_halotolerans_length_3113269 169317 . TG TGTAAG 148 PASS INDEL;IDV=10;IMF=0.625;DP=16;VDB=0.999554;SGB=-0.670168;MQSB=0.0871515;MQ0F=0;AC=1;AN=1;DP4=1,5,10,0;MQ=24 GT:PL 1:175,0
Imtechella_halotolerans_length_3113269 1450920 . G T 141 PASS DP=46;VDB=0.0516381;SGB=-0.692562;RPB=0.369907;MQB=0.00470369;MQSB=0.000385694;BQB=0.917285;MQ0F=0;AC=1;AN=1;DP4=17,3,7,15;MQ=12 GT:PL 1:168,0
Imtechella_halotolerans_length_3113269 1455993 . A T 195 PASS DP=30;VDB=0.63518;SGB=-0.693021;RPB=1;MQB=1;MQSB=0.0374987;BQB=1;MQ0F=0;AC=1;AN=1;DP4=0,1,25,2;MQ=28 GT:PL 1:222,0
Imtechella_halotolerans_length_3113269 1633169 . T C 60 PASS DP=34;VDB=0.0688269;SGB=-0.693127;MQSB=0.346427;MQ0F=0;AC=1;AN=1;DP4=0,0,5,28;MQ=9 GT:PL 1:90,0
Imtechella_halotolerans_length_3113269 2005523 . C T 195 PASS DP=64;VDB=0.0037203;SGB=-0.693144;RPB=0.727216;MQB=7.61251e-07;MQSB=9.95686e-06;BQB=0.86319;MQ0F=0;AC=1;AN=1;DP4=2,19,25,14;MQ=17 GT:PL 1:222,0
If I were looking for variants common to all samples, I would run the command with -n =6 in this instance, but that isn't the case.
I'm just wondering if there's a more illuminating way to identify common variants by position across multiple samples. Any insight greatly appreciated, thanks!
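One way to see which inputs share each site, sketched here under some assumptions: when run with -p, bcftools isec writes a sites.txt file into the output directory (isec_out here) alongside the per-input VCFs, and each row ends with a bitmask that has one character per input file, in command-line order ("1" means the site is present in that file). Each numbered VCF in that directory holds records taken from a single input, which would explain why only one sample name shows up in the last file. The short labels in FILES are taken from the command above; the bitmask values in EXAMPLE_SITES are made up for illustration and are not real results, and parse_sites is a hypothetical helper name.

```python
# Short labels for the six inputs, in the same order as on the isec command line.
FILES = [
    "UNC2FT198", "UNC2FT2114", "UNC2FT4154",
    "UNC2FT55199", "UNC2FT7534A", "UNC2lu13",
]

def parse_sites(lines, file_names):
    """Yield (chrom, pos, ref, alt, files_with_site) for each sites.txt row.

    Assumes tab-separated rows of the form: CHROM, POS, REF, ALT, bitmask.
    """
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and any comment lines
        chrom, pos, ref, alt, mask = line.split("\t")[:5]
        if len(mask) != len(file_names):
            raise ValueError(f"bitmask {mask!r} does not match {len(file_names)} files")
        # Keep the label of every input whose bit is set.
        present = [name for name, bit in zip(file_names, mask) if bit == "1"]
        yield chrom, int(pos), ref, alt, present

# Hypothetical rows: positions come from the question, bitmasks are invented.
EXAMPLE_SITES = [
    "Imtechella_halotolerans_length_3113269\t169317\tTG\tTGTAAG\t000011",
    "Imtechella_halotolerans_length_3113269\t1450920\tG\tT\t110001",
]

rows = list(parse_sites(EXAMPLE_SITES, FILES))
for chrom, pos, ref, alt, present in rows:
    print(chrom, pos, ref, alt, ",".join(present), sep="\t")
```

To run this on real output, replace EXAMPLE_SITES with an open file handle, for example `with open("isec_out/sites.txt") as fh: rows = list(parse_sites(fh, FILES))`. The README.txt that isec writes into the same directory should confirm which input each numbered VCF corresponds to.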
Brilliant, thanks! I'm new to SNP calling, so I'm glad someone has had this dilemma before.
Cool, let me know if it works. The code for part 3 is well-tested but is quite an eyesore.