Question

How is the DNA between paired-end reads sequenced?

0

9.3 years ago

adamselwithadulcimer • 0

OK, this question may rank in the "dumb-sounding question" section, but it's not a hugely well tackled one I'd venture. Apology over.

So, having understood what paired-end reads are (a fragment is taken from a DNA library and read from each end), the dumb question is: how is the DNA between the reads actually sequenced? That is, since the number of nucleotides between the paired-end reads is often called the insert size, how is the insert itself actually sequenced?

The question depends on assuming that there will be only one fragment in the DNA library representing that part of the transcriptome. I expect however, that that is not the case. I expect that there are numerous fragments representing a certain region, starting and ending at different points so that overlapping occurs. That would offer an answer to the question, in which case, I'd like to know what is the terminology for the number of fragments that "cover" a certain region of the transcriptome?

It can't be the usual coverage value, because that refers to the average number of short reads covering a locus, not the DNA library fragments themselves, as far as I know.

Hope I've explained myself properly. Thanks in advance / Adam.

RNA-seq • 2.3k views

ADD COMMENT • link updated 2.8 years ago by Ram 45k • written 9.3 years ago by adamselwithadulcimer • 0

0

Read depth or fragment depth at a locus? :-o

ADD REPLY • link updated 5.4 years ago by Ram 45k • written 9.3 years ago by GouthamAtla 12k

0

not read depth, but certainly the concept of fragment depth is interesting, thanks.

ADD REPLY • link 9.3 years ago by adamselwithadulcimer • 0

0

The coverage by DNA, regardless of whether it has been sequenced, is usually called "physical coverage". BBMap will calculate this if you output coverage and use the physicalcoverage flag.
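To make the distinction concrete, here is a minimal Python sketch of per-base read coverage versus physical coverage. It is not BBMap's implementation; the `coverage` function, the 40-base reference length, and the read coordinates are all made up for illustration. Each pair is given as two half-open intervals on the reference, and the two mates of a pair are assumed not to overlap each other.

```python
def coverage(length, pairs):
    """Return (read_cov, phys_cov): per-base coverage lists for a reference.

    Each pair is ((fwd_start, fwd_end), (rev_start, rev_end)) in half-open
    reference coordinates. These inputs are hypothetical toy data.
    """
    read_cov = [0] * length
    phys_cov = [0] * length
    for (fs, fe), (rs, re_) in pairs:
        # Read coverage: only bases that were actually sequenced count.
        for start, end in ((fs, fe), (rs, re_)):
            for i in range(start, end):
                read_cov[i] += 1
        # Physical coverage: the whole fragment counts, including the
        # unsequenced stretch between the two mates.
        for i in range(min(fs, rs), max(fe, re_)):
            phys_cov[i] += 1
    return read_cov, phys_cov


# Three toy fragments, each sequenced 10-12 bases inward from both ends.
pairs = [
    ((0, 10), (20, 30)),
    ((5, 17), (26, 36)),
    ((12, 24), (30, 40)),
]
read_cov, phys_cov = coverage(40, pairs)

# Position 18 lies in the unsequenced middle of fragments 1 and 2,
# but inside the forward read of fragment 3.
print(read_cov[18], phys_cov[18])  # 1 3
```

In this toy data, position 18 is sequenced once but is spanned by all three fragments, which is the difference between the two notions of coverage.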

ADD REPLY • link updated 5.4 years ago by Ram 45k • written 9.3 years ago by Brian Bushnell 20k

Ram · Answer 1 · 2016-01-17

1

9.3 years ago

Pierre Lindenbaum 166k

The DNA between the two reads of a pair is not sequenced, but if the sequencing depth/coverage is high enough, there is a high probability that reads from other fragments overlap those bases of the reference.

REF      ###############################
READ1F   -------->
READ1R                      <-----
READ2F       ---------->
READ2R                          <--------
READ3F             ---------->
READ3R                              <--------

ADD COMMENT • link updated 5.4 years ago by Ram 45k • written 9.3 years ago by Pierre Lindenbaum 166k

0

Thanks for the response and diagram, Pierre. However, the relationship of the reads to the reference is not really my main concern. What I'm interested in is the intermediate stage, the DNA fragments, and how they (might) relate to the reference.

ADD REPLY • link 9.3 years ago by adamselwithadulcimer • 0

0

De novo assemblers do not use a reference when constructing contigs. Contigs are built from the overlaps between reads, found either via k-mer methods or by simple overlap detection. So if you ignore the REF line in Pierre's diagram, an assembler will see that READ1F and READ2F overlap, and so the gap between READ1F and READ1R gets smaller. If you add READ3F to the mix, the gap is completely closed. Meanwhile, READ2R and READ3R extend the sequence beyond READ1R.
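The gap-closing idea can be sketched with a toy greedy overlap merger in Python. This is only an illustration under loud assumptions: the 40-base `genome`, the read coordinates, and the helper names `overlap` and `greedy_merge` are invented here, all reads are assumed to be already oriented on the same strand (real reverse reads would need reverse-complementing), and production assemblers use far more robust k-mer or overlap-graph methods.

```python
def overlap(a, b, min_len=4):
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0


def greedy_merge(reads, min_len=4):
    """Repeatedly merge the two sequences with the longest overlap."""
    contigs = list(reads)
    while True:
        best = (0, None, None)
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    n = overlap(a, b, min_len)
                    if n > best[0]:
                        best = (n, i, j)
        n, i, j = best
        if n == 0:
            return contigs  # nothing left that overlaps
        merged = contigs[i] + contigs[j][n:]
        contigs = [c for k, c in enumerate(contigs) if k not in (i, j)]
        contigs.append(merged)


# Hypothetical toy genome; it is used only to cut reads from, never by
# the merger itself.
genome = "ACGTTGCATCCGGATAGCTTACCGAACTGTCAAGCGATGG"
spans = {
    "1F": (0, 10), "1R": (20, 30),
    "2F": (5, 17), "2R": (26, 36),
    "3F": (12, 24), "3R": (30, 40),
}
reads = {name: genome[s:e] for name, (s, e) in spans.items()}

# Pair 1 alone leaves a gap: two separate contigs.
pair1_only = greedy_merge([reads["1F"], reads["1R"]])
print(len(pair1_only))  # 2

# Reads from the other fragments bridge the gap without any reference.
all_pairs = greedy_merge(list(reads.values()))
print(all_pairs == [genome])  # True
```

With only pair 1, the bases between the two reads stay unknown; once reads from the overlapping fragments are added, the forward reads of pairs 2 and 3 fill the gap and the merger recovers one contig.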

ADD REPLY • link 9.3 years ago by Adrian Pelin ★ 2.7k