Hi, I have a couple of questions regarding adapters and barcodes for paired-end mRNA-Seq. As I understand it, after mRNA fragmentation and first- and second-strand synthesis, you ligate the adapters to both ends of the fragment. I thought that the barcodes sit on both ends of the fragment, between the adapter and the insert sequence, like this:
Adapter1, barcode, fragment, barcode, Adapter2
Sequencing is then performed twice, once reading from the Adapter1 end and once from the Adapter2 end. My understanding is that the barcode is the same on both ends.
My questions are:
1) Even if the barcodes are the same, how does the sequencer know which reads are mates, i.e. how does it identify that a particular pair comes from the same fragment?
2) Is it possible to sequence a paired-end read with a barcode on only one end? That is,
Adapter1, barcode, fragment, Adapter2.
How would this be possible/work?
Thank you.
Hi Sean, thanks for the link. I get the procedure, except for this line where they mention the use of the index primer: "At the end of the first read, the extended sequencing primer is removed and the Index Sequencing Primer, provided in the Multiplexing Sequencing Primers and PhiX Control Kit, is annealed to the same strand. This approach leverages the Paired-End Module to avoid the loss of high-quality sequencing data from the unknown sample that would occur if the index sequence had been included at the start of an application read."
As for the first question, it makes sense if both reads come from the same location on the flowcell. Thank you. But then, why would we need barcodes on both sides? Could you point me to a manual for paired-end reads with barcodes, similar to the other one you linked? I tried searching the Illumina website, but without success. Thanks again!
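To make the "same location on the flowcell" point concrete: in the FASTQ files that Illumina pipelines write, both mates of a pair carry the same read name, which encodes the lane, tile and x/y position of the cluster. Here is a minimal Python sketch of checking that two headers come from the same cluster. The instrument name, flowcell ID and coordinates are made up for illustration, and the helper names are hypothetical, not part of any Illumina tool.

```python
def cluster_key(header):
    """Return the part of a FASTQ header that identifies the cluster."""
    # Drop the leading "@" and any comment field such as "1:N:0:ATCACG".
    name = header.lstrip("@").split()[0]
    # Older pipelines mark the mate with a trailing /1 or /2; strip it.
    if name.endswith("/1") or name.endswith("/2"):
        name = name[:-2]
    return name


def same_cluster(header_r1, header_r2):
    """True if both headers point at the same lane, tile and x/y position."""
    return cluster_key(header_r1) == cluster_key(header_r2)


# Hypothetical headers: instrument, run, flowcell, lane, tile, x, y.
r1 = "@HYPO01:12:FC0001:1:1101:1234:5678 1:N:0:ATCACG"
r2 = "@HYPO01:12:FC0001:1:1101:1234:5678 2:N:0:ATCACG"
other = "@HYPO01:12:FC0001:1:1101:9999:5678 2:N:0:ATCACG"

print(same_cluster(r1, r2))
print(same_cluster(r1, other))
```

So the pairing does not depend on the barcode at all. The barcode only says which sample a cluster belongs to.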
@Arun: There aren't barcodes on both sides -- look at the picture closely. Look at figure A. Let's assume that the adapter ligated "at the top" of the read (the red/orange bit) is called adapter 1, and the adapter at the bottom (blue/green) is adapter 2. The multiplexed barcode is in adapter 2 there. In C, the read is now "flipped" so adapter 2 is at the "top", but it's still the same adapter + multiplex barcode -- note that adapter 1 is now at the bottom, hybridized to the flowcell. Look at Figure 3 in that PDF, it should be clear.
@Arun: Just to be clear, the link above does describe the process for paired-end reads. Steve's explanation provides some detail.
Sean, I follow the procedure from the PDF you linked, except for the line I pasted above. I still don't understand that part. Also, I think this procedure is an older one and is no longer used.
My next point about barcodes was not connected to the PDF. We have RNA-Seq data where the library layout was Adapter1, barcode, Fragment, barcode, Adapter2. I suppose this came into use after the procedure in the PDF. But now there's another way they do it: with the TruSeq kit, the barcode (yes, barcode) sits inside Adapter1, that is, Adapter1-half1, Barcode (6 bases), Adapter1-half2, Fragment, Adapter2.
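For the 6-base barcode layout described above, the barcode is read on its own and then used to sort clusters into samples. Below is a rough Python sketch of that sorting step, assuming the 6-base index read is already available as a string. The index-to-sample table is invented for illustration, and allowing one mismatch is an assumption of this sketch, not a documented kit setting.

```python
def hamming(a, b):
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("index sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def assign_sample(index_read, sample_by_index, max_mismatches=1):
    """Return the sample whose index matches the read, or None.

    A read is assigned only if exactly one known index lies within
    max_mismatches of it; ambiguous or unmatched reads return None.
    """
    hits = [
        sample
        for index, sample in sample_by_index.items()
        if hamming(index_read, index) <= max_mismatches
    ]
    return hits[0] if len(hits) == 1 else None


# Illustrative 6-base indexes mapped to hypothetical sample names.
samples = {"ATCACG": "sample_A", "CGATGT": "sample_B"}

print(assign_sample("ATCACG", samples))  # exact match
print(assign_sample("ATCACC", samples))  # one mismatch to ATCACG
print(assign_sample("GGGGGG", samples))  # no index within one mismatch
```

Because the index is read separately, one barcode per fragment is enough to label both mates of the pair.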