I have a VCF file of a human whole genome, and it contains millions of variants. Is there any way to extract only my variants of interest from these millions of variants? Even better if I can extract variants on the basis of their rs IDs.
GATK SelectVariants https://software.broadinstitute.org/gatk/documentation/tooldocs/3.8-0/org_broadinstitute_gatk_tools_walkers_variantutils_SelectVariants.php
"Even better if I can extract variants on the basis of their rs IDs"
--keepIDs / -IDs
List of variant IDs to select
If a file containing a list of IDs is provided to this argument, the tool will only select variants whose ID field is present in this list of IDs. The matching is done by exact string matching. The expected file format is simply plain text with one ID per line.
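To make the exact-matching behaviour described above concrete, here is a rough Python sketch of the same idea. This is not GATK's code: the function names `select_by_ids` and `filter_vcf_file` are made up for illustration, and it assumes an uncompressed VCF and an ID list with one ID per line. It compares the whole ID column as a single string, so a record carrying several IDs in that column would not match a single rs ID in this sketch.

```python
def select_by_ids(vcf_lines, wanted_ids):
    """Yield header lines plus records whose ID column exactly matches a wanted ID."""
    wanted = set(wanted_ids)
    for line in vcf_lines:
        if line.startswith("#"):
            # Keep meta-information and the column header so the output is still a VCF.
            yield line
            continue
        fields = line.split("\t", 3)
        if len(fields) < 3:
            continue  # not a usable record line
        # Column 3 of a VCF record is the ID field; compare it as one exact string.
        if fields[2].strip() in wanted:
            yield line


def filter_vcf_file(vcf_path, ids_path, out_path):
    """Hypothetical helper: stream a VCF from disk and write only the selected records."""
    with open(ids_path) as handle:
        wanted = {line.strip() for line in handle if line.strip()}
    with open(vcf_path) as vcf, open(out_path, "w") as out:
        # Lines are processed one at a time, so millions of variants never sit in memory.
        out.writelines(select_by_ids(vcf, wanted))
```

Something like `filter_vcf_file("input.vcf", "ids.txt", "selected.vcf")` would then write the subset, where all three file names are placeholders.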
In a VCF, the variants are mapped to the reference genome. In some cases there will be rs IDs in the VCF, but it's best to extract based on the coordinates.
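A minimal sketch of coordinate-based selection, assuming you have your sites of interest as (chromosome, position) pairs. The function name `select_by_positions` is hypothetical. Two things to check before relying on it: the chromosome names must match the VCF exactly ("1" versus "chr1"), and the positions must be 1-based coordinates on the same reference build the VCF was called against.

```python
def select_by_positions(vcf_lines, wanted_sites):
    """Yield header lines plus records whose (CHROM, POS) is in wanted_sites."""
    wanted = {(str(chrom), int(pos)) for chrom, pos in wanted_sites}
    for line in vcf_lines:
        if line.startswith("#"):
            yield line  # keep the header so the output is still a VCF
            continue
        fields = line.split("\t", 2)
        if len(fields) < 2:
            continue  # not a usable record line
        chrom, pos = fields[0], fields[1].strip()
        if not pos.isdigit():
            continue  # POS should be a positive integer in a valid record
        if (chrom, int(pos)) in wanted:
            yield line
```

Like any line-by-line filter, this streams through the file once, so it is fine for a one-off extraction from millions of variants.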
You can also use either bcftools view or bcftools filter; see the expressions section of the bcftools manual: https://samtools.github.io/bcftools/bcftools.html
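As one example of what those expressions allow, the manual documents the form ID=@file, which selects records whose ID is present in a file of IDs. The Python wrapper below is only a sketch of how that could be called: the function names and all file names are placeholders, and it assumes bcftools is installed and on your PATH.

```python
import subprocess


def bcftools_id_command(vcf_path, ids_path, out_path):
    """Build a 'bcftools view' command that keeps records whose ID is listed in ids_path."""
    return [
        "bcftools", "view",
        "-i", f"ID=@{ids_path}",  # include expression: ID present in the file
        "-O", "z",                # write compressed VCF
        "-o", out_path,
        vcf_path,
    ]


def extract_ids(vcf_path, ids_path, out_path):
    """Run the command; raises CalledProcessError if bcftools exits with an error."""
    subprocess.run(bcftools_id_command(vcf_path, ids_path, out_path), check=True)
```

Passing the arguments as a list avoids shell quoting problems with the expression. The equivalent on the command line would be bcftools view -i 'ID=@ids.txt' -O z -o out.vcf.gz in.vcf.gz, with ids.txt holding one ID per line.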