Question

I didn't understand a term about non-template DNA sequence


10.7 years ago

mangfu100 ▴ 810

Hi.

I have a question about non-template DNA sequence and several other things.

To better explain what I didn't understand, I have quoted a sentence below. Please read it first.

It has difficulty with analyzing repetitive DNA sequence regions, rearrangements that occur within or next to germline polymorphic SVs, and rearrangements that contain nontemplate DNA sequences that are inserted at the breakpoints and are of similar or longer length than the next-generation sequencing reads.

In the sentence above, I can't understand the bold part, including the term "nontemplate DNA sequence".

I know that the nontemplate DNA strand is the one in 3'->5' orientation, but why are rearrangements in nontemplate DNA harder to analyze?

Furthermore, the part of the sentence after "nontemplate DNA sequences" confuses me.

Why are rearrangements hard to detect when the nontemplate sequence is of similar or longer length than the NGS reads?

Can anybody explain the bold sentence to me?

genome sequence next-gen • 7.5k views

updated 3.4 years ago by Ram 45k • written 10.7 years ago by mangfu100 ▴ 810


Can you provide us with the context of where this sentence comes from? Cancer genome breakage? Is this sentence from this paper -- I don't have access, but Google guides me here. That might help us more.

updated 3.4 years ago by Ram 45k • written 10.7 years ago by Josh Herr 5.8k


Yes. That's right.

Here is the full paragraph.

Although CREST performs better than standard paired-end approaches in identifying SVs, it has difficulty with analyzing repetitive DNA sequence regions, rearrangements that occur within or next to germline polymorphic SVs, and rearrangements that contain nontemplate DNA sequences that are inserted at the breakpoints and are of similar or longer length than the next-generation sequencing reads. CREST, like all mapping methods, demands high-quality DNA reads of sufficient coverage to accurately define the DNA sequence (Supplementary Discussion). The method provides base-pair resolution of breakpoints and can therefore be used not only for identifying the number and type of SVs in a tumor genome but should also allow an analysis of the breakpoint DNA sequence as a way to gain insights into the mechanism responsible for the generation of the structural rearrangement.

The paper you mentioned is correct :)

updated 3.4 years ago by Ram 45k • written 10.7 years ago by mangfu100 ▴ 810

Ram · Accepted Answer · 2014-09-25

Good question.

I'll try to answer it with examples. Let's say we have a DNA sequence:

ACGTACGTACGTACGTACGTACGTACGAGACTACGT

Now, for some unknown reason, a DNA break occurs:

ACGTACGTACGTACGTA | CGTACGTACGAGACTACGT

The DNA ends can be joined back together:

ACGTACGTACGTACGTACGTACGTACGAGACTACGT

In this example, no evidence of breakage is present.

ACGTACGTACGTACGTATACGTACGTACGTACGAGACTACGT
_________________^----^___________________

In this example, the breakage site has additional DNA inserted (TACGTA) that is a duplicate of sequence from one of the DNA ends.

ACGTACGTACGTACGTATTTTTAAAAACCCCGGGCGTACGTACGAGACTACGT
_________________^---------------^___________________

In this example, the breakage site has additional DNA inserted, and it is unclear where it came from.

This "unclear where it came from" DNA is called "non-template DNA". It has nothing to do with the template/non-template strands used in transcription.
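The distinction between the two inserts above can be sketched in Python. This is only a toy check under stated assumptions: the helper names `reverse_complement` and `insert_has_template` are made up for illustration, and an exact substring search stands in for the alignment a real tool would use. A very short insert can also match the reference purely by chance, so treat the result as an illustration, not a test of origin.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    comp = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(comp[base] for base in reversed(seq))


def insert_has_template(reference, insert):
    """True if the insert, or its reverse complement, occurs in the reference.

    Hypothetical helper: exact matching is an assumption made for simplicity.
    """
    return insert in reference or reverse_complement(insert) in reference


reference = "ACGTACGTACGTACGTACGTACGTACGAGACTACGT"

# First insert: a copy of sequence that already exists near the break.
print(insert_has_template(reference, "TACGTA"))

# Second insert: no source can be found in the reference (non-template).
print(insert_has_template(reference, "TTTTTAAAAACCCCGGG"))
```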

The DNA end-joining mechanism determines whether such extra DNA is added at the break site (see http://en.wikipedia.org/wiki/Non-homologous_end_joining).

In breakpoint sequencing, problems arise when the inserted DNA is as long as or longer than the reads, because it is hard (maybe even impossible) to map the breakpoint.

DNA:   ACGTACGTACGTACGTATACGTACGTACGTACGAGACTACGT
#                       ^^^^^^
Reads: ----------- ----------- ----------- ----------- ----------- -----------
             -----------   -----------  -------------   -----------  -----------

In this example, single reads cover the whole breakage site (DNA ends + insert): CGTATACGTACGT. Now we know that there are changes in our DNA, and we can still trace back where those changes came from.

DNA:          ACGTACGTACGTACGTATTTTTAAAAACCCCGGGCGTACGTACGAGACTACGT
#                              ^^^^^^^^^^^^^^^^^
Short reads:  ---- ---- ---- ---- ---- ---- ---- ---- ---- ---- 
                ----  ----  ---- ----  ---- ----  ---- ----  ----

In this example, the non-template insertion is much longer than the reads, so no single read contains both DNA ends and it is hard to trace back to the native DNA site.
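The effect of insert length versus read length can be sketched with a small Python example. It is a simplification, not how CREST works: it assumes error-free reads taken at every position, and it requires one read to contain the whole insert plus at least `min_anchor` bases from each DNA end. The function name `spanning_reads` and the `min_anchor` parameter are hypothetical.

```python
def spanning_reads(left, insert, right, read_len, min_anchor=1):
    """Return every read of length read_len that contains the whole insert
    plus at least min_anchor bases of both flanking DNA ends."""
    seq = left + insert + right
    insert_start = len(left)
    insert_end = insert_start + len(insert)
    reads = []
    # Slide a window of read_len over the rearranged sequence.
    for start in range(len(seq) - read_len + 1):
        end = start + read_len
        anchored_left = start <= insert_start - min_anchor
        anchored_right = end >= insert_end + min_anchor
        if anchored_left and anchored_right:
            reads.append(seq[start:end])
    return reads


left = "ACGTACGTACGTACGTA"      # DNA end before the break
right = "CGTACGTACGAGACTACGT"   # DNA end after the break

# 6 bp insert, 13 bp reads: some reads span both ends and the insert.
short_insert_reads = spanning_reads(left, "TACGTA", right, read_len=13)
print(len(short_insert_reads), short_insert_reads)

# 17 bp non-template insert, 13 bp reads: no read can span the junction.
long_insert_reads = spanning_reads(left, "TTTTTAAAAACCCCGGG", right, read_len=13)
print(len(long_insert_reads), long_insert_reads)
```

Under these assumptions, a spanning read can exist only when `read_len` is at least the insert length plus `2 * min_anchor`, which is why an insert of similar or longer length than the reads leaves nothing that links the two original DNA ends.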

Edit

To solve this problem, Sanger sequencing is used at the suspected breakpoint site.

See Figure 4A in http://www.sciencedirect.com/science/article/pii/S0092867411008853. In breakpoint 6 (3105 strand), there is a 64 bp insertion of (probably) non-template DNA.