Extract all alleles/variants from BAM or VCF
2.7 years ago
loreta • 0

Hi all, I have two files: a VCF and a BAM from a WGS. The VCF reports only the variants that are different from the reference genome.

I have a list of coordinates (as a BED file) for which I would like to extract the alleles from the sequencing data. At the moment the VCF does not cover all of these regions, because it does not report positions where the sample matches the reference.

I was wondering if there is a way that I could extract the alleles as VCF or similar for these coordinates, reporting also the alleles that are equal to the reference?

I would like to have this because in the VCF I do not know if my regions are absent because they have not been sequenced or if they are equal to the reference.

Let me know if my question is not clear and I could try to rephrase.

Thank you.

vcf bam • 771 views
2.7 years ago

Run GATK HaplotypeCaller with the option --output-mode EMIT_ALL_CONFIDENT_SITES

https://gatk.broadinstitute.org/hc/en-us/articles/360037225632-HaplotypeCaller#--output-mode

This produces calls at variant sites and at confident reference sites.
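As a rough sketch of what such a run could look like, assuming GATK4 is installed as `gatk` and that the reference, BAM and BED are all indexed and use the same contig names. The function name and the file names are placeholders of mine, not anything from the question; `-L` restricts calling to the intervals in the BED file.

```shell
# Sketch only: wraps a HaplotypeCaller run restricted to target intervals.
# Arguments (all hypothetical file names): reference FASTA, BAM, BED, output VCF.
emit_all_confident_sites() {
    local ref="$1" bam="$2" bed="$3" out="$4"
    gatk HaplotypeCaller \
        -R "$ref" \
        -I "$bam" \
        -L "$bed" \
        --output-mode EMIT_ALL_CONFIDENT_SITES \
        -O "$out"
}

# Example call (placeholder names):
# emit_all_confident_sites reference.fasta sample.bam targets.bed all_sites.vcf.gz
```

Target positions that still do not show up in the output are then the ones where the caller could not make a confident call, for instance because of little or no coverage.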

2.7 years ago

What you are describing is genotyping every position (or a set of target positions) in the genome. A VCF file typically lists only the sites where the genotype differs from the reference, but the caller computes a genotype for every base.

For that, you would need to run a variant caller instructed to produce all calls.

For example,

bcftools call

(run on the output of bcftools mpileup, with a calling model such as -m selected) generates genotypes for all sites, whereas

bcftools call -v

outputs the variant sites only (based on a number of filtering parameters that can also be set). bcftools can also take target locations (the -R and -T options), so you would not need to produce a potentially gigantic file covering every base in the reference.
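To make that concrete, here is a minimal sketch of an all-sites call limited to a BED file. The function name and file names are made up for illustration, and I am assuming a reasonably recent bcftools, an indexed BAM, and a target file whose name ends in .bed so that it is read as 0-based BED. Adding the AD and DP annotations is my own suggestion, since read depth helps tell "matches the reference" apart from "not sequenced".

```shell
# Sketch only: genotype every covered position inside the BED regions.
# Arguments (all hypothetical file names): reference FASTA, BAM, BED, output VCF.
call_all_sites() {
    local ref="$1" bam="$2" bed="$3" out="$4"
    # mpileup computes genotype likelihoods at the target regions;
    # -a AD,DP adds per-sample allelic and total depth to the output.
    # call -m WITHOUT -v keeps reference-matching sites as well as variants.
    bcftools mpileup -f "$ref" -R "$bed" -a AD,DP -Ou "$bam" |
        bcftools call -m -Oz -o "$out"
}

# Example call (placeholder names):
# call_all_sites reference.fasta sample.bam targets.bed all_sites.vcf.gz
```

In the resulting file, homozygous-reference sites should appear with a `.` in the ALT column and a 0/0 genotype, and the depth fields show how many reads support each call.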
