4.8 years ago
goatsrunfaster
I have a VCF file that looks something like below:
POS ID REF ALT QUAL FILTER INFO FORMAT AE017334 Mie_1969_DRR128187 Tokyo_1928_DRR150094 Shiga_1987_DRR128186 Morioka_ND_DRR128181
187 . T C . PASS WT=350;HOM=8;NC=0;AC=8;AN=73 GT 0 0 1 0 0
228 . C T . PASS WT=356;HOM=2;NC=0;AC=2;AN=73 GT 0 0 1 0 0
1981 . A G . PASS WT=355;HOM=2;NC=1;AC=2;AN=73 GT 1 1 1 1 1
2578 . A G . PASS WT=347;HOM=11;NC=0;AC=11;AN=73 GT 0 0 0 0 0
5638 . G A . PASS WT=356;HOM=2;NC=0;AC=2;AN=73 GT 0 0 1 0 0
15763 . C A . PASS WT=357;HOM=1;NC=0;AC=1;AN=73 GT 0 0 0 1 0
16963 . A G . PASS WT=357;HOM=1;NC=0;AC=1;AN=73 GT 1 1 1 1 1
I want to filter it to keep only positions such as 1981 and 16963. How do I filter to keep only positions where all samples carry the same SNP?
Please review your previous questions and flag them as answered or add a comment (green mark on the left). C: How to filter .vcf based on .gbk file to remove SNP calls in non-CDS regions?; C: Identify fixed differences between population in WGS data (VCF Format); A: Merge VCF files with diploid reference calls in different order for same positio ....
Have a look at bcftools view with the -i option.
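If it helps to see the filtering logic spelled out, here is a minimal plain-Python sketch. It is an alternative to the bcftools route, not a translation of it. It assumes whitespace-separated columns as in the excerpt above, that the sample columns start right after FORMAT, and that GT is the first FORMAT key. The function name keep_shared_alt is made up for illustration.

```python
import re


def keep_shared_alt(lines):
    """Yield header lines plus records where every sample carries the
    same non-reference allele (hypothetical helper, not part of bcftools)."""
    format_idx = None
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue
        if line.startswith("##"):
            # VCF meta-information lines pass through unchanged.
            yield line
            continue
        fields = line.split()
        if format_idx is None:
            # First non-meta line is the column header; locate FORMAT.
            format_idx = fields.index("FORMAT")
            yield line
            continue
        # Collect every allele seen across all sample columns.
        alleles = set()
        for sample in fields[format_idx + 1:]:
            gt = sample.split(":")[0]  # assumes GT is the first FORMAT key
            alleles.update(re.split(r"[/|]", gt))
        # Keep the record only if there is exactly one allele overall
        # and it is neither reference ("0") nor missing (".").
        if len(alleles) == 1 and not alleles & {"0", "."}:
            yield line


example = """\
POS ID REF ALT QUAL FILTER INFO FORMAT AE017334 Mie_1969_DRR128187 Tokyo_1928_DRR150094 Shiga_1987_DRR128186 Morioka_ND_DRR128181
187 . T C . PASS WT=350;HOM=8;NC=0;AC=8;AN=73 GT 0 0 1 0 0
1981 . A G . PASS WT=355;HOM=2;NC=1;AC=2;AN=73 GT 1 1 1 1 1
2578 . A G . PASS WT=347;HOM=11;NC=0;AC=11;AN=73 GT 0 0 0 0 0
15763 . C A . PASS WT=357;HOM=1;NC=0;AC=1;AN=73 GT 0 0 0 1 0
16963 . A G . PASS WT=357;HOM=1;NC=0;AC=1;AN=73 GT 1 1 1 1 1
"""

kept = list(keep_shared_alt(example.splitlines()))
positions = [record.split()[0] for record in kept[1:]]  # skip the header line
print(positions)  # ['1981', '16963']
```

Position 2578 is dropped even though all samples agree, because they all carry the reference allele. A real VCF is tab-delimited and may have more samples than the five shown, so check the result against the full file before relying on it.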