How to extract ampliconID from a bed file of each called variant from a vcf?
3
0
Entering edit mode
6.6 years ago
bioinfo89 ▴ 60

Hi All,

I want to extract ampliconIDs for all the variants called in multiple samples.

I have VCFs for each sample (~200) and I have a bed file with chrno,amplicon start and end coordinates and ampliconIDs. So for each variant, I want to extract the corresponding ampliconID in which it falls based on the coordinates and then add the ampliconID next to the variant. This will help in knowing which variant is found in which particular amplicon.

Any suggestions on the same would be helpful.

Thanks!!

SNP next-gen • 1.4k views
ADD COMMENT
1
Entering edit mode
6.6 years ago

Assuming you have a sorted BED file called amplicons.bed, you could use BEDOPS vcf2bed to convert sample.snps.vcf to BED, and BEDOPS bedmap to map SNP IDs to amplicons:

$ bedmap --echo --echo-map-id-uniq --delim '\t' amplicons.bed <(vcf2bed < sample.snps.vcf) > answer.bed

The file answer.bed will have each amplicon region and ID, and the rsIDs of all SNPs that overlap that amplicon's region, specific to that SNP's sample. The amplicon ID and rsIDs will be in adjacent columns, if I understand your file format correctly.

To repeat for all samples, use a for loop or the like:

$ for sampleFn in `ls *.vcf`; do bedmap --echo --echo-map-id-uniq --delim '\t' amplicons.bed <(vcf2bed < ${sampleFn}) > ${sampleFn}.answer.bed; done
ADD COMMENT
1
Entering edit mode
6.6 years ago
Tm ★ 1.1k

After converting .vcf to bed, similar thing as mentioned by Alex can also been done using 'bedtools intersect'

ADD COMMENT
1
Entering edit mode
6.6 years ago

Another way is to use bcftools.

To get it work, you must first bgzip and index your bed file:

$ bgzip -c amplicons.bed > amplicons.bed.gz
$ tabix -p bed amplicons.bed.gz

For annotating all the vcf files I prefer gnu parallel:

$ parallel 'bcftools annotate -c CHROM,FROM,TO,ID -a amplicons.bed.gz -o {.}.ann.vcf {}' ::: *.vcf

fin swimmer

ADD COMMENT

Login before adding your answer.

Traffic: 1880 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6