How does the DNA-seq machine know which strand is forward/reverse?
3.0 years ago

Hi!

As the title says, how does the sequencing machine know which strand is the forward/reverse?

Let's say I have a FASTQ file from a WGS run. Are the reads presented as a single strand, with the forward strand followed by the reverse complement of the reverse strand?

Help, I'm confused!

Best, Jonas

dna-seq • 2.4k views
3.0 years ago

The sequencer does not know, or care, whether the DNA fragment being sequenced comes from the forward or reverse strand. Therefore, the reads in a FASTQ file are single-stranded and could come from either strand. It is only after mapping the reads to the reference genome that you can infer which strand each read came from.
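To make the "only after mapping" point concrete, here is a minimal sketch in Python. It is a toy, not how a real aligner works: it assumes exact substring matching against a tiny made-up reference, and `infer_strand` is a hypothetical helper name, not part of any library.

```python
# Toy illustration only: real aligners use indexed, mismatch-tolerant search
# rather than exact substring matching. The reference and reads are made up.

_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def infer_strand(read: str, reference: str) -> str:
    """Hypothetical helper: '+' if the read matches the reference as written,
    '-' if only its reverse complement matches, '?' if neither matches."""
    if read in reference:
        return "+"
    if reverse_complement(read) in reference:
        return "-"
    return "?"


reference = "ATGGCGTACGTTAGCCTAGGATCCA"

# A read copied directly from the reference (forward strand).
read_a = "GCGTACGTTAG"
# A read as it would come off the sequencer if the fragment was read
# from the opposite strand.
read_b = reverse_complement("CCTAGGATCCA")

print(read_a, infer_strand(read_a, reference))
print(read_b, infer_strand(read_b, reference))
```

Looking at `read_a` and `read_b` alone, nothing marks one as "forward" and the other as "reverse"; the label only appears once each read is compared against the reference.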


Thanks for your answer! But when they decided on the reference genome, how did they know which was the forward and which the reverse strand to begin with? Was that just arbitrarily decided by humans?


It is mostly arbitrary, although by convention the forward strand (a.k.a. the Watson strand or plus strand) is the strand of a chromosome that has its 5'-end at the short-arm telomere and its 3'-end at the long-arm telomere. Obviously, this is only relevant for linear genomes.

[Image: forward and reverse strands]

More details on how reference DNA strands are named can be found in this paper: The multiple personalities of Watson and Crick strands


Great, thank you so much! Exactly the information I was lacking :) Now I understand!


Just one last question that I hope you can answer :) The WGS I have been involved in uses Illumina, and the DNA is fragmented by sonication. What technique was first used to determine the forward/reverse strands when, e.g., the human reference genome was established? Obviously, the library can't have been prepared with sonication, because then it would be impossible to distinguish which fragment belongs to which strand. So how was this done?


Haaaa, interesting question! You should read about the Human Genome Project; it is a very interesting part of the recent history of biology and bioinformatics.

Actually, two methods were used for the first human whole-genome sequencing: hierarchical shotgun sequencing by the publicly funded Human Genome Project (an international consortium) and the more "brute force" whole-genome shotgun sequencing by their competitor, the private firm Celera Genomics, led by Craig Venter (who tried to patent the human genome). Fortunately, politics stepped in and declared the human genome public (so that nobody could patent it), which ended the race and led the consortium and Celera to cooperate in the end.

In both methods, the DNA is sheared into small fragments (as with sonication) and sequencing occurs from both ends of each fragment (so you get partial information for each strand). At that point, we still don't know which strand is which. It is only after careful assembly of all reads, like a really big jigsaw puzzle, that strands are assigned. Note that one strand carries exactly the same information as the other (it is its reverse complement), and that fragments are made of both strands.
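As a rough illustration of "sequencing from both ends" and "one strand is the reverse complement of the other", here is a small Python sketch. It makes simplifying assumptions: the fragment is a made-up sequence, reads are error-free, and `paired_end_reads` is a hypothetical helper name rather than a function from any real tool.

```python
# Toy model of a double-stranded fragment being read from both ends.
# Each read is produced 5'->3', one from each strand.

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def paired_end_reads(top_strand: str, read_length: int) -> tuple[str, str]:
    """Hypothetical helper: simulate one read from each end of a fragment.

    read_1 is the 5' end of the top strand; read_2 is the 5' end of the
    bottom strand, i.e. the start of the top strand's reverse complement.
    """
    bottom_strand = reverse_complement(top_strand)
    return top_strand[:read_length], bottom_strand[:read_length]


fragment = "ATGGCGTACGTTAGCCTAGGATCCA"  # made-up top strand, written 5'->3'
read_1, read_2 = paired_end_reads(fragment, 8)

print("read 1:", read_1)
print("read 2:", read_2)

# Both strands carry the same information: either can be rebuilt from the other.
print(reverse_complement(reverse_complement(fragment)) == fragment)
# read_2 covers the far end of the fragment, seen from the other strand.
print(reverse_complement(read_2) == fragment[-8:])
```

Which strand gets called "top" here is purely a choice made when writing the fragment down, which is the same arbitrariness that applies to the assembled reference.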


The alignment file (e.g., SAM/BAM) would have the strand information.
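For example, in the SAM format the strand is encoded in the FLAG column: bit 0x10 means the read aligned as its reverse complement, and bit 0x4 means the read is unmapped. A minimal sketch that reads the flag from plain-text SAM lines follows. The three records are made up for illustration, and `strand_from_sam_line` is a hypothetical helper name; for real BAM files you would more likely use a library such as pysam.

```python
# Minimal sketch: read the strand from the FLAG column of a SAM record.
# SAM columns: QNAME FLAG RNAME POS MAPQ CIGAR RNEXT PNEXT TLEN SEQ QUAL

UNMAPPED_FLAG = 0x4   # read did not align anywhere
REVERSE_FLAG = 0x10   # read aligned as its reverse complement


def strand_from_sam_line(line: str) -> str:
    """Hypothetical helper: return '+', '-', or '.' (unmapped) for one record."""
    fields = line.rstrip("\n").split("\t")
    flag = int(fields[1])
    if flag & UNMAPPED_FLAG:
        return "."
    return "-" if flag & REVERSE_FLAG else "+"


# Made-up example records, not output from a real aligner.
sam_lines = [
    "read1\t0\tchr1\t100\t60\t8M\t*\t0\t0\tATGGCGTA\tIIIIIIII",
    "read2\t16\tchr1\t200\t60\t8M\t*\t0\t0\tAGGATCCA\tIIIIIIII",
    "read3\t4\t*\t0\t0\t*\t*\t0\t0\tTTTTTTTT\tIIIIIIII",
]

for line in sam_lines:
    name = line.split("\t")[0]
    print(name, strand_from_sam_line(line))
```

Note that for reverse-strand alignments the SEQ column stores the sequence as it appears on the forward reference strand, so it is the reverse complement of what is in the FASTQ file.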
