gVCF: what is a "break-end replacement"?
5.4 years ago
LayneSadler ▴ 90

As seen in the gVCF documentation by Illumina - what are break-ends in the ALT column?

ALT: Comma-separated list of alternate non-reference alleles called on at least one of the samples.

Options are:

• Base strings made up of the bases A,C,G,T,N

• Angle-bracketed ID string ("<id>")

• Break-end replacement string as described in the section on break-ends.

https://support.illumina.com/help/BaseSpace_App_WGS_BWA_help/Content/Vault/Informatics/Sequencing_Analysis/BS/swSEQ_mBS_gVCF.htm

I can't find this section on "break-ends" anywhere in that documentation.

gvcf • 1.1k views
5.4 years ago
Ram 44k

Here's the VCF format specification document. On page 17, there is a description of how break-ends are represented.

    These 3 elements are combined in 4 possible ways to create the ALT. In each of the 4 cases, the assertion is that s
    is replaced with t, and then some piece starting at position p is joined to t. The cases are:

    REF  ALT   Meaning
    s    t[p[  piece extending to the right of p is joined after t
    s    t]p]  reverse comp piece extending left of p is joined after t
    s    ]p]t  piece extending to the left of p is joined before t
    s    [p[t  reverse comp piece extending right of p is joined before t
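
To make the four cases concrete, here is a rough Python sketch that splits a break-end ALT string into t, p, and the join orientation. It is not taken from any variant caller: the `Breakend` tuple and `parse_breakend` function are hypothetical names, and the sketch assumes p is written as `chrom:pos` and that t consists only of the bases A, C, G, T, N.

```python
import re
from typing import NamedTuple, Optional


class Breakend(NamedTuple):
    replacement: str     # t: the bases that replace REF (s)
    mate_chrom: str      # contig part of p
    mate_pos: int        # position part of p
    joined_after: bool   # True for t[p[ and t]p]; False for ]p]t and [p[t
    extends_right: bool  # True when the brackets are '[', False when ']'
    reverse_comp: bool   # True for t]p] and [p[t


# Assumption: p looks like "chrom:pos" and both brackets are the same character.
_BND = re.compile(
    r"^(?P<pre>[ACGTNacgtn]*)"
    r"(?P<bracket>[\[\]])(?P<chrom>[^\[\]]+):(?P<pos>\d+)(?P=bracket)"
    r"(?P<post>[ACGTNacgtn]*)$"
)


def parse_breakend(alt: str) -> Optional[Breakend]:
    """Return the parsed break-end, or None if alt is not one of the 4 forms."""
    m = _BND.match(alt)
    if m is None:
        return None
    pre, post = m["pre"], m["post"]
    # Exactly one side of the bracketed position must carry t.
    if bool(pre) == bool(post):
        return None
    joined_after = bool(pre)
    extends_right = m["bracket"] == "["
    return Breakend(
        replacement=pre or post,
        mate_chrom=m["chrom"],
        mate_pos=int(m["pos"]),
        joined_after=joined_after,
        extends_right=extends_right,
        # Per the table: t]p] and [p[t are the reverse-complemented joins.
        reverse_comp=joined_after != extends_right,
    )
```

For example, `parse_breakend("G]17:198982]")` is the `t]p]` case (t is `G`, p is `17:198982`), while plain base strings such as `A` or symbolic alleles such as `<DEL>` return `None`.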

Thanks @RamRS. Do you know if this is actually put into practice in variant calling pipelines?


I'm sorry, I don't really know. It depends on how new this convention is, as well as which tools are used in the pipeline. If GATK 3.7+ says it follows this convention, most pipelines would have it. If only GATK 4 adopted this, then it might not be as widespread. In conclusion, it all depends.
