Question

Some questions on RAD paired-end sequencing

22 months ago

john • 0

Hi.

I'm working with RAD-sequencing where I'm specifically doing paired-end sequencing. I'm a bit confused, and I would greatly appreciate any answers to my questions.

According to the protocol I'm using, my understanding is that, prior to Illumina sequencing, the fragments produced by PCR are size-selected, i.e., cut down, to a uniform size of 500 bp. These 500 bp fragments are then sequenced from both ends. The resulting reads are 150 bp each.

1. How does the sequencing initiate at the "sheared" end (reverse read, R2) of the fragment? For the forward read (R1), the P1 adapter, which I assume is necessary for initiating sequencing, can bind to the restriction site, because the restriction site "overhang" sequence is known. But the "sheared" end, i.e., where the reverse read starts, doesn't have a known sequence. How then is the "P2" adapter able to bind to the sheared (reverse) end of the fragment to allow for sequencing?
2. The result of the sequencing, to my understanding, is that 150 bp are sequenced from each end of a 500 bp fragment. This leaves a gap of 200 bp between the two reads. How then is it possible to match the forward read with the reverse read?
3. This question is related to question 2. When I finally have the reads, they are to be processed with Trimmomatic. Here, I should use the paired option. But what's the point of it, if the reads can't be aligned to each other anyway (see question 2)? Wouldn't it be more logical to just concatenate the forward and reverse reads of each sample into one file, and treat them as single reads?

Thank you!

RAD-sequencing • 1.3k views

ADD COMMENT • link updated 22 months ago by Ram 44k • written 22 months ago by john • 0

score 0 · Answer 1 · 2023-01-19

22 months ago

noodle ▴ 590

I think if you watch some Illumina videos, everything will become much clearer. They put out many resources, for example: https://www.youtube.com/watch?v=fCd6B5HRaZ8&ab_channel=Illumina

ADD COMMENT • link 22 months ago by noodle ▴ 590

score 0 · Answer 2 · 2023-01-19

22 months ago

benformatics 4.0k

1. No, the fragment ends become agnostic to how they were created (restriction digest or shearing) once the library is sequenced. The sequenceable ends are generated by the sequencing library kit (probably from Illumina), which adds (i.e. ligates) the sequencing adapters.
2. A major facet of next-generation sequencing is that the read/fragment information is maintained during the sequencing procedure. You should probably read up on how that is done, as it is far outside the scope of this website. More importantly, this is part of the due diligence you should be undertaking as a scientist. https://www.floragenex.com/rad-seq
3. The reads will likely be delivered to you in FASTQ format, in which the read-pairing information is conserved (both reads of a pair have the same read name).
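
The point about FASTQ read names can be sketched in a few lines of Python. This is only an illustration, assuming the usual four-lines-per-record FASTQ layout and that the R1 and R2 files list fragments in the same order; the function names (`read_fastq`, `paired_reads`) and the example records are made up for this sketch and do not come from any particular tool.

```python
import io
from itertools import islice


def read_fastq(handle):
    """Yield (name, sequence, quality) from a 4-line-per-record FASTQ stream."""
    while True:
        record = list(islice(handle, 4))
        if not record:
            return
        if len(record) < 4:
            raise ValueError("truncated FASTQ record")
        header, seq, _plus, qual = (line.rstrip("\n") for line in record)
        # The read name is the first whitespace-delimited token after '@'.
        name = header[1:].split()[0]
        # Older headers mark the mate with a trailing /1 or /2; drop it.
        if name.endswith(("/1", "/2")):
            name = name[:-2]
        yield name, seq, qual


def paired_reads(r1_handle, r2_handle):
    """Walk R1 and R2 in lockstep and check that the names agree.

    Note: zip() stops at the shorter file, so a length mismatch is not
    detected by this sketch.
    """
    for (n1, s1, _), (n2, s2, _) in zip(read_fastq(r1_handle), read_fastq(r2_handle)):
        if n1 != n2:
            raise ValueError(f"files out of sync: {n1!r} vs {n2!r}")
        yield n1, s1, s2


# Made-up example records: two fragments, each with a forward and a reverse read.
R1_TEXT = (
    "@frag1 1:N:0:ACGT\nTGCAGGAA\n+\nIIIIIIII\n"
    "@frag2 1:N:0:ACGT\nTGCAGGCC\n+\nIIIIIIII\n"
)
R2_TEXT = (
    "@frag1 2:N:0:ACGT\nCCGATTAC\n+\nIIIIIIII\n"
    "@frag2 2:N:0:ACGT\nGGTTAACC\n+\nIIIIIIII\n"
)

pairs = list(paired_reads(io.StringIO(R1_TEXT), io.StringIO(R2_TEXT)))
for name, fwd, rev in pairs:
    print(name, fwd, rev)
```

The pairing never relies on the two reads overlapping in sequence; it relies only on the shared name and the matching record order. That is also why paired-mode trimming matters: if a read is dropped from one file, its mate has to be moved out of the paired output too, or the two files fall out of sync.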

ADD COMMENT • link 22 months ago by benformatics 4.0k

Thanks for your reply, but it didn't really answer any of my questions. I have been trying to look up this information, but can't find it explained anywhere, which is why I'm guessing I'm missing something fundamental.

1. What do you mean by "ends become agnostic"? Do you mean that the shearing process produces specific ends?
2. In that link, the diagram at the bottom, "How RAD-seq works", has a paired-end assembly part, and that's what my question is about. If there's no overlap (although in this diagram there seems to be one?), how are forward reads paired with reverse reads?
3. Each individual sample consists of two files, one with the forward reads and one with the reverse reads. But again, how are the reverse reads within any sample connected to the forward reads?

ADD REPLY • link 22 months ago by john • 0

You are definitely missing something fundamental. You need to learn how next-generation sequencing works, but again, how the sequencing works is way outside the scope of the forum. You should watch the video linked in noodle's answer and look for similar ones describing the Illumina technology.

All of your questions are related to the sequencing methods and have nothing to do with RAD-seq. The barcode information and the way the sequencer stores/saves the base-calling/cluster information are how the read-pairing information is conserved. I strongly recommend you talk to somebody in your lab/work, who you think has strong knowledge of this topic, and ask them. If you are unaware of how the technology works you need to have a significant discussion that is much longer than is possible to type out.

1. No, nothing about RAD-seq matters: any DNA fragments could be used and the result would be exactly the same.
2. See below.
3. Again, see the paragraph above on how the read-pairing information is conserved. Each read is given a name by the sequencer. Specifically, each dot/cluster on the flow cell has its own "ID", and thus both ends of the fragment have the same "ID".
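
The shared cluster "ID" is visible in the read header itself. The following Python sketch assumes the common Illumina header layout of seven colon-separated fields (instrument, run, flowcell, lane, tile, x, y) followed by a space and a comment starting with the read number; headers from some pipelines carry extra fields, which this sketch does not handle. The `ClusterId` type, the `parse_header` function, and the example header values are invented for illustration.

```python
from typing import NamedTuple, Optional, Tuple


class ClusterId(NamedTuple):
    instrument: str
    run: str
    flowcell: str
    lane: int
    tile: int
    x: int
    y: int


def parse_header(header: str) -> Tuple[ClusterId, Optional[int]]:
    """Split a FASTQ header into its cluster identity and read number (1 or 2)."""
    name, _, comment = header.lstrip("@").partition(" ")
    fields = name.split(":")
    if len(fields) != 7:
        raise ValueError(f"expected 7 colon-separated fields, got {len(fields)}")
    instrument, run, flowcell, lane, tile, x, y = fields
    cluster = ClusterId(instrument, run, flowcell, int(lane), int(tile), int(x), int(y))
    # The comment looks like "1:N:0:ATCACG"; its first field is the read number.
    read_number = int(comment.split(":")[0]) if comment else None
    return cluster, read_number


# Made-up headers for the forward and reverse read of one cluster.
r1_header = "@M01234:56:000000000-ABCDE:1:1101:15589:1332 1:N:0:ATCACG"
r2_header = "@M01234:56:000000000-ABCDE:1:1101:15589:1332 2:N:0:ATCACG"

cluster_r1, read_r1 = parse_header(r1_header)
cluster_r2, read_r2 = parse_header(r2_header)

# Same lane, tile and x/y position: both reads come from the same cluster,
# i.e. from the two ends of the same DNA fragment.
print(cluster_r1 == cluster_r2, read_r1, read_r2)
```

The lane, tile, and x/y coordinates identify one physical spot on the flow cell, so two reads with identical coordinates are mates regardless of what their sequences look like.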

ADD REPLY • link 22 months ago by benformatics 4.0k

Thank you. OK, I'll brush up on my NGS and come back, or go somewhere else if I can't figure it out.

ADD REPLY • link 22 months ago by john • 0