Question

Structural Variant annotation

0

7.2 years ago

maheetha.b ▴ 70

Hello,

I'm trying to understand how structural variants are reported by tools, and how they are subsequently annotated in MAF format (if at all). Can someone point me to papers or tutorials?

structural variants Delly • 4.1k views

ADD COMMENT • link updated 7.2 years ago by d-cameron ★ 2.9k • written 7.2 years ago by maheetha.b ▴ 70

1

Did you try Google or Google Scholar?

edit: added How To Ask Good Questions On Technical And Scientific Forums link.

ADD REPLY • link 7.2 years ago by h.mon 35k

3

While Google does wonders in most instances, it cannot always surface the best resource (for one, it can't read your mind, though it seems to be getting closer with each passing day). For a newbie, weeding usable resources out of a search can be a daunting task.

ADD REPLY • link 7.2 years ago by GenoMax 147k

1

I vote for you : best answer 2017 :)

ADD REPLY • link 7.2 years ago by Titus ▴ 910

1

Thank you for the comments.

Papers are papers. They're geared towards audiences that already know the ins and outs of structural variants. I understand what a VCF/BCF format is, but my annotation has columns named split read support and tumor split read support and tumor variant allele count, and perhaps I'm not understanding these definitions properly to understand the real difference between total split read support and tumor split read support. I thought only the tumors would have viable split reads anyway, so wouldn't all the split reads that surface come from the tumor only? I was hoping to find documentation on such annotation to clarify these definitions. The annotation may have been done in house in my lab (I'm still trying to find and contact the person who did it). I haven't been able to find any such documentation; even the DELLY paper is a bit vague on this.

Note that I wouldn't be posting here if I hadn't tried for hours to search online and found nothing. If you searched and found something, just let me know. If you don't have any input, I would rather hear nothing. I post here because it's supposed to be a collaborative community and it has helped me in the past.

ADD REPLY • link 7.2 years ago by maheetha.b ▴ 70

1

I understand what a VCF/BCF format is, but my annotation has columns named split read support and tumor split read support and tumor variant allele count

These are not standard VCF SV fields and are specific to DELLY. The standard VCF fields can be found in the VCF file format specifications.

perhaps I'm not understanding these definitions properly to understand the real difference between total split read support and tumor split read support.

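Somatic SVs are those that appear in the tumour but not in the normal. It appears that DELLY is reporting both the overall total (normal and tumour combined) split read count and the tumour-only split read count. Split reads can also turn up in the normal sample (for example from germline variants or mapping artefacts), so the two numbers need not be equal. Consult the DELLY documentation, paper, or source code for exact details on how DELLY calculates these fields.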
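
To make the total versus tumour-only distinction concrete, here is a minimal Python sketch that sums a per-sample read count over all samples and also reports the tumour's share. It is only an illustration: the FORMAT key `RV`, the sample names, and the example record are assumptions, and an in-house annotation may compute its columns differently. Check the `##FORMAT` header lines of your own VCF to see which key holds split read counts.

```python
def split_read_support(vcf_line, sample_names, tumour_name, field="RV"):
    """Return (total, tumour_only) for one per-sample FORMAT field.

    `field` is an assumed FORMAT key holding a per-sample read count;
    look it up in the ##FORMAT header lines of your own file.
    """
    cols = vcf_line.rstrip("\n").split("\t")
    format_keys = cols[8].split(":")      # column 9 lists the FORMAT keys
    idx = format_keys.index(field)
    counts = {}
    for name, sample_col in zip(sample_names, cols[9:]):
        value = sample_col.split(":")[idx]
        counts[name] = 0 if value == "." else int(value)
    return sum(counts.values()), counts[tumour_name]


# Made-up record: 7 supporting reads in the tumour, 1 in the normal.
line = "\t".join([
    "chr1", "1000", "DEL0001", "N", "<DEL>", ".", "PASS",
    "SVTYPE=DEL;END=2000", "GT:RR:RV", "0/1:20:7", "0/0:25:1",
])
total, tumour = split_read_support(line, ["TUMOUR", "NORMAL"], "TUMOUR")
print(total, tumour)  # 8 7
```

In this made-up record the normal sample contributes one read, which is why the total (8) differs from the tumour-only count (7).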

ADD REPLY • link 7.2 years ago by d-cameron ★ 2.9k

0

This detail is what should have been in the original post. With this hopefully you will get an answer soon. You should add this info in the original post.

ADD REPLY • link 7.2 years ago by GenoMax 147k

0

I appreciate that it should have been detailed, but I was wondering if there was something for structural variants in general outside of DELLY.

ADD REPLY • link 7.2 years ago by maheetha.b ▴ 70

1

The only 'standard' we have is the VCF file format specification. Unfortunately, the standardized fields do not include those required for a breakdown of SV support by type (split read, discordant read pair, one-end anchored read, assembled breakend contig, assembled breakpoint contig, and so on), so each caller has to define its own fields (e.g. my caller, GRIDSS, reports split read counts using an "SR" field). Note that counts can differ between callers because of differences in the algorithms and filtering steps applied (e.g. some callers do not include split reads with a MAPQ of 0 in their counts).
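
Since every caller defines its own fields, the quickest way to find out what a given column means is usually to read the `##INFO` and `##FORMAT` lines in the VCF header. The sketch below pulls out each field ID with its description. The header lines are invented examples rather than output from any particular caller, and the regular expression assumes descriptions without embedded quotes.

```python
import re

# Matches lines such as:
# ##INFO=<ID=SR,Number=1,Type=Integer,Description="...">
HEADER_RE = re.compile(r'^##(INFO|FORMAT)=<ID=([^,]+),.*Description="([^"]*)"')


def field_definitions(header_lines):
    """Map (kind, field ID) to the description given in the VCF header."""
    defs = {}
    for line in header_lines:
        match = HEADER_RE.match(line)
        if match:
            kind, field_id, description = match.groups()
            defs[(kind, field_id)] = description
    return defs


# Invented header lines for illustration only.
header = [
    '##fileformat=VCFv4.2',
    '##INFO=<ID=SVTYPE,Number=1,Type=String,Description="Type of structural variant">',
    '##INFO=<ID=SR,Number=1,Type=Integer,Description="Example: split-read support">',
    '##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">',
]
for (kind, field_id), description in field_definitions(header).items():
    print(kind, field_id, description, sep="\t")
```

Running this on the header of two different callers' output makes it easy to see where their field names and definitions diverge.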

ADD REPLY • link 7.2 years ago by d-cameron ★ 2.9k

score 1 · Answer 1 · 2017-09-06

Here are a couple of good reviews of SV detection methods, if you haven't already seen them:

For understanding information about reads, the documentation for IGV might be a good place to start. Try loading an example tumor BAM file with a known fusion breakpoint, and see if you can sort out all of the information IGV is displaying in the reads around that site.

score 1 · Answer 2 · 2017-09-07

1

7.2 years ago

trausch ★ 1.9k

SVs are usually reported in VCF format. You can then annotate such a VCF with SnpEff, VEP or Annovar. Conversion of a tumor-normal VCF to MAF should be possible with vcf2maf.

ADD COMMENT • link 7.2 years ago by trausch ★ 1.9k

score 1 · Answer 3 · 2017-09-10

The 'standard' format for reporting SVs is the VCF file format, with somatic calls marked by the SOMATIC flag (note that not all callers actually write that flag). Converting VCF SVs into MAF is problematic since the MAF format appears to support only single-locus indel-like SV events. I recommend not converting to MAF, as you are likely to lose important SV events (such as driver gene fusion events) in the conversion.
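
Before deciding on a conversion, it can help to see which kinds of somatic SV a file actually contains. The following is a rough sketch, not a validated tool: it keeps records whose INFO column carries the SOMATIC flag and tallies them by SVTYPE. The three records are made up, and a caller that does not write SOMATIC would need a different filter.

```python
from collections import Counter


def parse_info(info):
    """Split a VCF INFO column into a dict; bare flags map to True."""
    out = {}
    for item in info.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        out[key] = value if sep else True
    return out


def somatic_svtypes(vcf_lines):
    """Count SOMATIC-flagged records by their SVTYPE value."""
    tally = Counter()
    for line in vcf_lines:
        if line.startswith("#"):
            continue  # skip header lines
        info = parse_info(line.rstrip("\n").split("\t")[7])
        if info.get("SOMATIC") is True:
            tally[info.get("SVTYPE", "unknown")] += 1
    return tally


# Made-up records: a somatic deletion, a somatic breakend, a non-somatic dup.
records = [
    "chr1\t1000\tsv1\tN\t<DEL>\t.\tPASS\tSVTYPE=DEL;END=2000;SOMATIC",
    "chr2\t500\tsv2\tN\tN[chr9:7000[\t.\tPASS\tSVTYPE=BND;SOMATIC",
    "chr3\t900\tsv3\tN\t<DUP>\t.\tPASS\tSVTYPE=DUP;END=1500",
]
print(dict(somatic_svtypes(records)))  # {'DEL': 1, 'BND': 1}
```

Breakend (BND) records link two loci, so they are the kind of event most at risk of being dropped or flattened by a single-locus format.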