Extracting variants of interest from a large set of variants
6.4 years ago
bioinfo355 • 0

I have a whole-genome human VCF file containing millions of variants. Is there a way to extract only my variants of interest from it? Ideally, I would like to extract them by their rs IDs.

Variant Calling WGS Data VCF • 1.5k views

From the expressions section of the bcftools manual (https://samtools.github.io/bcftools/bcftools.html):

ID=@file .. selects lines with ID present in the file

ID!=@~/file .. skip lines with ID present in the ~/file

You can use these expressions with either bcftools view or bcftools filter, through the -i/--include or -e/--exclude options.
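To make the idea behind ID=@file concrete, here is a minimal pure-Python sketch of the same kind of selection: keep the header lines plus every record whose ID column matches a list of wanted IDs. The function name filter_vcf_by_ids and the example records are made up for illustration, and splitting the ID column on ';' is my own assumption that may not match how bcftools treats multi-ID records. For real whole-genome files bcftools will be far faster.

```python
def filter_vcf_by_ids(lines, wanted_ids):
    """Yield header lines and records whose ID column matches wanted_ids.

    Hypothetical helper for illustration, not part of bcftools or GATK.
    """
    wanted = set(wanted_ids)
    for line in lines:
        if line.startswith("#"):
            # Keep meta-information and column header lines unchanged.
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        # The ID is the third VCF column; it may hold several IDs joined by ';'
        # (assumption: a record is kept if any one of them is wanted).
        if wanted.intersection(fields[2].split(";")):
            yield line


# Invented example records, not real variants.
example_vcf = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t1000\trs111\tA\tG\t.\tPASS\t.\n",
    "1\t2000\trs222\tT\tC\t.\tPASS\t.\n",
    "1\t3000\t.\tC\tG\t.\tPASS\t.\n",
]

kept = list(filter_vcf_by_ids(example_vcf, {"rs222"}))
for record in kept:
    print(record, end="")
```

In practice the wanted IDs would be read from a plain text file with one ID per line, which is the same layout the bcftools expression expects.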

6.4 years ago

You can use GATK SelectVariants: https://software.broadinstitute.org/gatk/documentation/tooldocs/3.8-0/org_broadinstitute_gatk_tools_walkers_variantutils_SelectVariants.php

And even good if extracting variants on the basis of their 'rs IDs'

--keepIDs / -IDs

List of variant IDs to select
If a file containing a list of IDs is provided to this argument, the tool will only select variants whose ID field is present in this list of IDs. The matching is done by exact string matching. The expected file format is simply plain text with one ID per line.
6.4 years ago

In a VCF, variants are mapped to a reference genome. In some cases the VCF will contain rs IDs, but it is best to extract variants by their coordinates.
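As a rough sketch of coordinate-based extraction, the Python below keeps records whose (CHROM, POS) pair is in a set of wanted positions. The function name filter_vcf_by_positions and the example records are invented for illustration. It assumes that your coordinates use the same reference build and the same chromosome naming ("1" versus "chr1") as the VCF; a mismatch in either will silently return nothing or the wrong records.

```python
def filter_vcf_by_positions(lines, wanted_positions):
    """Yield header lines and records located at one of the wanted positions.

    wanted_positions is an iterable of (chrom, pos) tuples with 1-based pos,
    as in the VCF POS column. Hypothetical helper for illustration.
    """
    wanted = set(wanted_positions)
    for line in lines:
        if line.startswith("#"):
            # Keep meta-information and column header lines unchanged.
            yield line
            continue
        # Only the first two columns (CHROM, POS) are needed for matching.
        chrom, pos = line.split("\t", 2)[:2]
        if (chrom, int(pos)) in wanted:
            yield line


# Invented example records, not real variants.
example_records = [
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "1\t1000\trs111\tA\tG\t.\tPASS\t.\n",
    "1\t2000\t.\tT\tC\t.\tPASS\t.\n",
    "2\t1000\trs333\tC\tG\t.\tPASS\t.\n",
]

selected = list(filter_vcf_by_positions(example_records, [("1", 2000), ("2", 1000)]))
for record in selected:
    print(record, end="")
```

Note that the record at 1:2000 has no rs ID ('.') and would be missed by an ID-based filter, but it is still found by position.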
