Subset VCF to exon region but still get intron_variant annotation
1
0
Entering edit mode
5 months ago
tomas4482 ▴ 430

I subset a vcf file using bed that only include exon region.

The bed file is created by the following command:

gtf_file="/home/user/ref/Ensembl/homo_sapiens_112/Homo_sapiens.GRCh38.112.gtf.gz"
agat_convert_sp_gff2bed.pl --gff ${gtf_file} -o ${exon_bed} --sub exon
cut -f 1-3 ${exon_bed} | sort -k1,1 -k2,2n > exon.bed

After that, I used vcftools to subset my vcf file only to exon region.

bed_file="/home/user/data/exon.bed"
vcftools --vcf ${vcf_dir}/all.vcf \
--out ${vcf_dir}/all_exon \
--bed ${bed_file} --recode

Then, I annotated this vcf file with Ensembl-VEP docker.

docker run -t -i --rm --user $(id -u) \
-v ${docker_workdir}:/data \
-v ${ref_dir}:/ref \
-v ${software_dir}:/software \
-v ${cache_dir}:/cache ensemblorg/ensembl-vep \
vep --assembly GRCh38 --species homo_sapiens --cache --offline \
--dir_cache /cache \
--input_file /data/all.vcf --format vcf \
--output_file /data/all.vep.vcf \
--buffer_size 50000 \
--vcf --force_overwrite \
--plugin CADD,snv=/software/CADD-1.7/data/prescored/GRCh38_v1.7/no_anno/whole_genome_SNVs.tsv.gz,indels=/software/CADD-1.7/data/prescored/GRCh38_v1.7/no_anno/gnomad.genomes.r4.0.indel.tsv.gz \
--plugin AlphaMissense,file=/ref/AlphaMissense/AlphaMissense_hg38.tsv.gz

Everything went fine. But when I check the statistics, I found something abnormal.

The total amount of input variants is 40142. Overlapped genes are 13303. Overlapped transcripts are 85208. It reports 24706 intronic variants (61.5%). But how could that be? I have already masked intronic regions.

At first I thought maybe it is something wrong with my annotation file. But I carefully checked every step, the annotation I used for creating BED, masking VCF etc. are correct. I don't know what's going on.

Do you have any suggestion? Thank you.

Ensembl vcf VEP vcftools • 469 views
ADD COMMENT
0
Entering edit mode
--input_file /data/all.vcf

It's this line correct in the VEP command?

ADD REPLY
0
Entering edit mode

Yes. This is a normal argument in VEP docker.

ADD REPLY
1
Entering edit mode
5 months ago

One position overlaping an exon can be the intron of an alternative transcript.....

ADD COMMENT
0
Entering edit mode

Oh...Of course. What am I thinking... :)

ADD REPLY

Login before adding your answer.

Traffic: 1838 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6