Question

Reality check: insertion vs. duplication

7.5 years ago

andrewl ▴ 10

Quick reality check - I have been normalizing VCFs and annotation files according to this methodology:

Tan A, Abecasis GR, Kang HM. Unified representation of genetic variants.

An example implementation is here: https://github.com/ericminikel/minimal_representation/blob/master/normalize.py

A consequence of this is that all duplications get converted to insertions post normalization.

For example, ref: C, alt: CC would be normalized to ref: A, alt: AC (assuming A is the base preceding the ref position), and ref: CAC, alt: CACCAC would be normalized to ref: G, alt: GCAC (assuming G is the base preceding the ref position).
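To make the two examples above concrete, here is a minimal sketch of the trim-and-extend procedure described by Tan et al. for a single biallelic variant. It is not the code from the linked script; the function name `normalize_variant` and the use of an in-memory reference string `ref_seq` are assumptions made for illustration.

```python
def normalize_variant(ref_seq, pos, ref, alt):
    """Left-align and trim one biallelic variant (illustrative sketch).

    ref_seq: reference sequence as a plain string (assumption: fits in memory)
    pos:     1-based position of the first base of `ref` in `ref_seq`
    Returns (pos, ref, alt) after normalization.
    """
    if ref == alt:
        raise ValueError("ref and alt are identical; nothing to normalize")

    changed = True
    while changed:
        changed = False
        # Drop the last base while both alleles end with the same base.
        if ref and alt and ref[-1] == alt[-1]:
            ref, alt = ref[:-1], alt[:-1]
            changed = True
        # If an allele became empty, extend both one base to the left.
        if not ref or not alt:
            if pos <= 1:
                raise ValueError("cannot extend left of the reference start")
            pos -= 1
            base = ref_seq[pos - 1]
            ref, alt = base + ref, base + alt
            changed = True

    # Drop shared leading bases while both alleles keep at least one base.
    while len(ref) >= 2 and len(alt) >= 2 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1

    return pos, ref, alt


# Toy references chosen so that A (or G) precedes the ref position.
print(normalize_variant("GACT", 3, "C", "CC"))          # (2, 'A', 'AC')
print(normalize_variant("TGCACT", 3, "CAC", "CACCAC"))  # (2, 'G', 'GCAC')
```

If the preceding base is itself a C, as in a homopolymer run, the loop keeps shifting left until the alleles no longer end in the same base, which is why the normalized record can sit several bases upstream of the original call.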

Does this make sense? Other than the label "insertion" vs. "duplication", should any importance be given, from a biological/clinical POV, to the fact that these variants were duplications before normalization?

normalization DNA • 2.8k views

score 2 · Accepted Answer · 2017-06-13

7.5 years ago

harold.smith.tarheel ★ 5.0k

The duplication vs. insertion distinction certainly has biological and clinical relevance; trinucleotide repeat expansion in Huntington's disease is one example. Duplications are metastable and subject to copy number changes during replication, while non-duplicated insertions are not. And, depending on their size and orientation, duplications are also prone to intra- and inter-molecular recombination, whereas non-duplicated insertions can actually suppress recombination.
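Since normalization drops the "duplication" label, one way to keep the distinction is to re-derive it from the normalized record. The sketch below is a purely sequence-based check, assuming a left-aligned insertion with a single anchor base and a reference available as a plain string. The function name `is_tandem_duplication` is hypothetical, and the check says nothing about the mutational mechanism.

```python
def is_tandem_duplication(ref_seq, pos, ref, alt):
    """Return True if a normalized insertion copies the adjacent reference.

    ref_seq: reference sequence as a plain string (assumption)
    pos:     1-based position of the anchor base (the single base in `ref`)
    """
    # Only handle simple insertions of the form ref = X, alt = X + inserted.
    if len(ref) != 1 or len(alt) < 2 or not alt.startswith(ref):
        return False
    inserted = alt[1:]
    # Reference bases immediately after the anchor (0-based index == pos).
    following = ref_seq[pos:pos + len(inserted)]
    return inserted == following


print(is_tandem_duplication("GACT", 2, "A", "AC"))      # True: C duplicated
print(is_tandem_duplication("TGCACT", 2, "G", "GCAC"))  # True: CAC duplicated
print(is_tandem_duplication("GACT", 2, "A", "AT"))      # False: novel base
```

A flag like this could be stored alongside the normalized variant so that downstream annotation still knows which insertions are duplications.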

Wow, thanks. I was expecting this to be a silly question to ask; now I'm glad I did.

Reply • 7.5 years ago by andrewl ▴ 10