6.0 years ago
JGuVa
Hi there,
I am trying to extract fixed sites within the sample from a VCF file. By fixed sites, I mean those that differ from the reference genome but that are fixed within the sample.
SITE  REF  ALT  ind_1  ind_2  ind_3
1     A    C    1/1    1/1    1/1
2     G    T    1/1    0/1    0/0
3     C    G    1/1    1/1    1/0
4     G    C    0/1    1/1    1/0
5     A    G    1/1    1/1    1/1
For instance, this is a simplified version of a VCF file. In this case, sites 1 and 5 belong to the category of sites that contribute to divergence. Is there a tool in vcftools or an R package that I can use for this purpose?
Thanks in advance.
This is not clear to me. Do you want the variants where all the genotypes are homozygous for the ALT allele?
Yes, exactly, that is what I need.
In addition to Pierre: Is your data in this simplified format or a normal vcf?
My file is a normal VCF, I presented it like that just for the sake of the explanation.
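Given the criterion confirmed above (every genotype homozygous for the ALT allele), one possible approach is a small stand-alone filter. The following is only a minimal sketch in plain Python, not a vcftools or R solution. It assumes a tab-delimited, uncompressed VCF with biallelic sites and a GT entry in the FORMAT column; the function names `is_fixed_alt` and `fixed_sites` are made up for illustration. Missing genotypes (./.) are treated as "not fixed", which is an assumption you may want to change.

```python
def is_fixed_alt(vcf_line):
    """Return True if every sample on this VCF data line is homozygous ALT."""
    fields = vcf_line.rstrip("\n").split("\t")
    # Columns 0-8 are CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, FORMAT;
    # sample columns start at index 9.
    samples = fields[9:]
    if not samples:
        return False
    gt_idx = fields[8].split(":").index("GT")
    for sample in samples:
        gt = sample.split(":")[gt_idx]
        # Accept both unphased (1/1) and phased (1|1) genotypes.
        alleles = gt.replace("|", "/").split("/")
        # Assumption: any allele other than "1" (including "." for missing)
        # means the site is not fixed for ALT.
        if any(allele != "1" for allele in alleles):
            return False
    return True


def fixed_sites(lines):
    """Yield the data lines whose genotypes are all homozygous ALT."""
    for line in lines:
        if line.startswith("#"):
            continue  # skip meta-information and header lines
        if is_fixed_alt(line):
            yield line
```

To apply it to a file, open the VCF and pass the file handle to fixed_sites, e.g. `with open("input.vcf") as fh: kept = list(fixed_sites(fh))`, where input.vcf is a placeholder name. On the simplified example in the question this would keep sites 1 and 5.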