How do you know whether an SV is located in an exon, intron, or intergenic region?
2.2 years ago
Maxine ▴ 50

I have a VCF file containing millions of structural variants (SVs). I want to filter those SVs by location (exonic, intronic, intergenic, or mixed). Is there a mature pipeline I can follow? Thanks in advance.

Maxine

chromosome variation structural region • 1.5k views

Get a BED file of exons, then extract the SVs using "bcftools view --regions-file exons.bed SV.indexed.vcf.gz".


As I understand it, that would determine whether the POS from the VCF lies in an exon region. Am I right? The problem is that POS stores only the start position of an SV; its end position is stored in the INFO column.


The VCF should contain the END attribute (INFO/END) in the INFO column.
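To make the POS versus INFO/END point concrete, here is a minimal Python sketch that pulls the full span of an SV out of a single VCF record. The function names `sv_interval` and `to_bed` are hypothetical, and the fallback to the REF length when END is missing is an assumption for illustration; some callers and SV types (breakends, for example) may need different handling.

```python
def sv_interval(vcf_line):
    """Return (chrom, start, end), 1-based inclusive, for one VCF data line."""
    fields = vcf_line.rstrip("\n").split("\t")
    chrom, pos, ref, info = fields[0], int(fields[1]), fields[3], fields[7]

    # Parse the INFO column into a dict; flags without "=" map to True.
    tags = {}
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        tags[key] = value if sep else True

    if "END" in tags:
        end = int(tags["END"])
    else:
        # Assumption: without an END tag, use the span covered by REF.
        end = pos + len(ref) - 1
    return chrom, pos, end


def to_bed(chrom, start, end):
    """Convert a 1-based inclusive interval to 0-based half-open BED coordinates."""
    return chrom, start - 1, end


# Made-up example record, not taken from any real call set.
record = "chr1\t1000\tsv1\tN\t<DEL>\t.\tPASS\tIMPRECISE;SVTYPE=DEL;END=2500"
print(sv_interval(record))  # ('chr1', 1000, 2500)
```

The BED conversion matters only if the intervals are later compared against a BED file of exons, since BED starts are 0-based while VCF positions are 1-based.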


Yes, INFO/END exists in the VCF file. But does "bcftools view --regions-file ..." subset the VCF based only on the POS column? I wonder whether "bcftools view" takes INFO/END into account.


"I wonder whether 'bcftools view' takes INFO/END into account."

Yes, and you can simply try it.


You could try running it through SnpEff (provided your genome is available for it).


As I posted above, for an SV the end position should also be considered. I'm not familiar with SnpEff; can it handle SV data?
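As a rough illustration of why the end position matters, the sketch below classifies an SV by its whole span rather than by POS alone. The function `classify_sv`, the category rules (in particular what counts as "mixed"), and the toy coordinates are all assumptions made for this example, not the behavior of bcftools or SnpEff. It does a plain linear scan over intervals on a single chromosome, so for millions of SVs an indexed or sorted lookup would be needed instead.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True if two closed intervals share at least one position."""
    return a_start <= b_end and b_start <= a_end


def classify_sv(start, end, genes, exons):
    """Classify one SV span against gene and exon intervals.

    genes and exons are lists of (start, end) tuples on the same chromosome,
    using the same closed coordinate system as the SV.
    """
    hit_genes = [g for g in genes if overlaps(start, end, *g)]
    if not hit_genes:
        return "intergenic"

    # An SV that sticks out of every gene it touches also covers intergenic sequence.
    inside_gene = any(g[0] <= start and end <= g[1] for g in hit_genes)
    if not inside_gene:
        return "mixed"

    if any(e[0] <= start and end <= e[1] for e in exons):
        return "exon"      # fully contained in one exon
    if not any(overlaps(start, end, *e) for e in exons):
        return "intron"    # inside a gene, touching no exon
    return "mixed"         # spans an exon boundary


# Toy annotation: one gene with three exons.
genes = [(1000, 5000)]
exons = [(1000, 1200), (3000, 3300), (4800, 5000)]

print(classify_sv(1500, 1500, genes, exons))  # intron (POS only)
print(classify_sv(1500, 3100, genes, exons))  # mixed (POS to END)
```

The same SV is called intronic when only its start is checked and mixed once its end is included, which is the concern raised above.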


As lieven.sterck noted, your most straightforward option is SnpEff. If your VCF follows the standard format, you don't need to manipulate anything. Do you know which variant caller was used to call the SVs?
