OK, this question may rank in the "dumb-sounding question" section, but it's not a hugely well tackled one I'd venture. Apology over.
So having understood what paired-end reads are, how they take a fragment from a DNA library and read from each end, the dumb question is: how is the DNA between the reads actually sequenced? I.e., as the number of nucleotides between the paired-end reads is often called the insert size, how is the insert itself actually sequenced?
The question depends on assuming that there will be only one fragment in the DNA library representing that part of the transcriptome. I expect, however, that this is not the case. I expect that there are numerous fragments representing a given region, starting and ending at different points so that they overlap. That would answer the question, in which case I'd like to know: what is the terminology for the number of fragments that "cover" a certain region of the transcriptome?
It can't be the usual coverage value, because that refers to the average number of short reads covering a locus, not the DNA library fragments themselves, as far as I know.
Hope I've explained myself properly. Thanks in advance / Adam.
read depth or fragment depth at a locus ? :-o
not read depth, but certainly the concept of fragment depth is interesting, thanks.
The coverage by DNA, regardless of whether it has been sequenced, is usually called "physical coverage". BBMap will calculate this if you output coverage and use the physicalcoverage flag.
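To make the distinction concrete, here is a minimal Python sketch of the two kinds of coverage for a handful of fragments. It is only an illustration and not how BBMap computes it: the function name coverage_tracks, the fragment coordinates, and the read length are all made up for the example. Fragments are half-open intervals [start, end), and each fragment is assumed to yield one read from each end.

```python
def coverage_tracks(fragments, read_len, region_len):
    """Return (sequence_cov, physical_cov) as per-position lists.

    fragments  -- hypothetical list of (start, end) half-open intervals
    read_len   -- assumed length of each read in a pair
    region_len -- length of the region being examined
    """
    seq_cov = [0] * region_len
    phys_cov = [0] * region_len
    for start, end in fragments:
        # Physical coverage: the whole fragment counts, sequenced or not.
        for pos in range(start, end):
            phys_cov[pos] += 1
        # Sequence coverage: only the bases inside the two end reads count.
        read1 = (start, min(start + read_len, end))
        read2 = (max(end - read_len, start), end)
        for r_start, r_end in (read1, read2):
            for pos in range(r_start, r_end):
                seq_cov[pos] += 1
    return seq_cov, phys_cov


# Three overlapping made-up fragments in a 20 bp region, 3 bp reads.
seq_cov, phys_cov = coverage_tracks([(0, 10), (4, 14), (8, 20)], 3, 20)
print("sequence:", seq_cov)
print("physical:", phys_cov)
```

Position 15 in this toy example lies inside one fragment but inside none of its reads, so it has physical coverage without sequence coverage. Position 5 is in the unsequenced middle of the first fragment but is still read, because a read from the second fragment starts there.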