In paired-end RNA sequencing, can it definitively be known which two paired reads come from the same fragment, or is it inferred based on the distance between the reads?
When you prepare a paired-end library from RNA fragments and sequence it, you get two files: one holds the reads from the forward sequencing pass (usually denoted R1) and the other holds the reads from the reverse pass (usually denoted R2).
That means that for each fragment generated during library preparation there is one forward read in R1 and its corresponding reverse read in R2. So all the reads coming off the sequencer are already properly paired, and the two mates of a pair share the same read name/header.
Here, you can see two reads, one from the R1 file and one from the R2 file. They represent the same RNA fragment, so they have the same read name except for the part highlighted in red and blue, which starts with 1 and 2 respectively, indicating that the first read is from the R1 file and the second read is from the R2 file.
You can tell from the shared read name.
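If you want to check this on your own files, here is a minimal sketch in Python. It assumes plain 4-line FASTQ records and headers where the mate number is either in the comment after a space (`... 1:N:0:...`) or an older `/1`, `/2` suffix. The function names (`read_id`, `fastq_headers`, `first_mismatch`) and the example reads are made up for illustration; they are not from any particular tool.

```python
import io
from itertools import zip_longest

def read_id(header: str) -> str:
    """Return the part of a FASTQ header that both mates share."""
    name = header.lstrip("@").split()[0]   # drop the '1:N:0:..' / '2:N:0:..' comment
    if name.endswith(("/1", "/2")):        # older naming style (assumption)
        name = name[:-2]
    return name

def fastq_headers(handle):
    """Yield the header line of every 4-line FASTQ record."""
    for i, line in enumerate(handle):
        if i % 4 == 0:
            yield line.rstrip("\n")

def first_mismatch(r1_handle, r2_handle):
    """Index of the first record whose names differ (or is missing), else None."""
    pairs = zip_longest(fastq_headers(r1_handle), fastq_headers(r2_handle))
    for i, (h1, h2) in enumerate(pairs):
        if h1 is None or h2 is None or read_id(h1) != read_id(h2):
            return i
    return None

# Made-up example records, not real data.
R1_TEXT = ("@SIM:1:FCX:1:15:6329:1045 1:N:0:2\nACGT\n+\nIIII\n"
           "@SIM:1:FCX:1:15:6773:1047 1:N:0:2\nTTGA\n+\nIIII\n")
R2_TEXT = ("@SIM:1:FCX:1:15:6329:1045 2:N:0:2\nGGCA\n+\nIIII\n"
           "@SIM:1:FCX:1:15:6773:1047 2:N:0:2\nCATG\n+\nIIII\n")

result = first_mismatch(io.StringIO(R1_TEXT), io.StringIO(R2_TEXT))
print("files in sync" if result is None else f"first mismatch at record {result}")
```

For real files you would pass open file handles (for example from `gzip.open(path, "rt")`) instead of the `io.StringIO` objects. A mismatch usually means one of the two files was filtered or sorted on its own after sequencing, not that the sequencer mispaired anything.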
Thanks for the answer. To clarify my (perhaps poorly-worded) question, how is it determined which reads are paired? Through a bioinformatics algorithm, or perhaps through some feature of the adaptor/flowcell? Is it possible that two reads are incorrectly paired? Apologies for my ignorance - I had trouble finding the answer elsewhere.
The instrument knows the xy coordinates of every cluster. Reads from the same xy coordinates are paired. If two clusters overlap such that the software can't distinguish them, it will throw them out. So no, the software can't mix up read pairs. And of course, the software knows the reads are paired long before any mapping coordinates are known.
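Those coordinates are also written into the read name, so you can read them back yourself. A small sketch in Python, assuming the common Illumina header layout `@instrument:run:flowcell:lane:tile:x:y` followed by a space and the mate comment; other naming schemes will not parse. The `Cluster` type, the `cluster_of` function and the two headers are hypothetical, made up for this example. This only inspects names after the fact; the actual pairing is done by the instrument software.

```python
from typing import NamedTuple

class Cluster(NamedTuple):
    flowcell: str
    lane: int
    tile: int
    x: int
    y: int

def cluster_of(header: str) -> Cluster:
    """Extract the flow cell position encoded in a FASTQ header."""
    name = header.lstrip("@").split()[0]       # ignore the mate comment
    fields = name.split(":")
    if len(fields) != 7:                       # layout assumed above
        raise ValueError(f"unexpected header layout: {header!r}")
    _instrument, _run, flowcell, lane, tile, x, y = fields
    return Cluster(flowcell, int(lane), int(tile), int(x), int(y))

# Made-up headers for the two mates of one pair.
R1_HEADER = "@SIM:1:FCX:1:15:6329:1045 1:N:0:2"
R2_HEADER = "@SIM:1:FCX:1:15:6329:1045 2:N:0:2"

print(cluster_of(R1_HEADER))
print(cluster_of(R1_HEADER) == cluster_of(R2_HEADER))
```

Both mates resolve to the same lane, tile, x and y, which is just another way of saying they were read from the same cluster.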
In addition to that, from a technical side: both the forward and the reverse read are detected from the exact same spot on the flow cell, which is why they are assigned the same name. Check this video for details on the Illumina process.
Thank you - this is exactly what I wanted to know.
You're very welcome ;-)
Hey, may I ask a follow-up question to your answer? Should these two reads be reverse complements of each other? When I looked at them, they did not appear to be reverse complementary. Could you help me with this?