Hello ~,
I am analyzing raw sequencing data from an Illumina MiSeq, and I am puzzled about the direction in which reads are reported. My impression was that the sequencer outputs the newly synthesized strand from 5' to 3'. That would be consistent with the lower quality at the 3' end of reads, because noise accumulates as sequencing runs over more cycles.
However, I noticed the following read pair in my paired-end raw data. Read1 contains the "TruSeq Adapter, Index 1" sequence, "GATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCTTG", at the very start of the read (see the reads below). I do not understand how this can happen, so I now doubt whether my original understanding is right. The YouTube video from Illumina puzzles me even more: at 2:35, it reverses the newly synthesized strand without complementing it.
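To show exactly what I mean by "at the start of the read", here is a minimal Python sketch that locates the adapter inside Read1. The two strings are copied from this post; the helper name `adapter_span` is my own and not part of any tool.

```python
# Adapter sequence and Read1 bases copied from this post.
ADAPTER = "GATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCTTG"
READ1 = (
    "GATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCTTG"
    "AAAAAAAAAAATCACGACTGAATCTTTCATCACATCGCTACAATGCAGCCATGTCAGGGGCGAGGG"
    "TTAGACATCATTCTTCGTGTTT"
)


def adapter_span(read, adapter):
    """Return (start, end) of the first exact adapter match, or None if absent."""
    start = read.find(adapter)
    if start == -1:
        return None
    return start, start + len(adapter)


if __name__ == "__main__":
    # Exact matching only; real adapter hits may contain sequencing errors.
    print(adapter_span(READ1, ADAPTER))
```

For this Read1 the full 63-base adapter matches exactly, starting at the first base of the read, with no insert sequence in front of it.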
I think this is easier to discuss with the following simplified sequence (for example, the P5 adapter is AAA). If this DNA molecule is sequenced on a MiSeq, what will read1 and read2 be? If the number of sequencing cycles is larger than the length of the DNA of interest (I guess this is called read-through, but I am not sure), where does the adapter/primer sequence appear?
AAA(P5) TTT(read1 sequencing primer) AATTCCGG(DNA of interest) GGG(read2 primer) ATAT(index) CCC(P7)
read1 result?
read2 result?
What do the results look like if they contain any adapter?
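To make the toy question concrete, here is a minimal Python sketch of the model I currently have in mind. The assumptions are mine and may be exactly where I go wrong: read1 reports the bases that follow the read1 primer site on the strand as written above, read2 reports the reverse complement starting from the other end of the DNA of interest, and each read simply continues into whatever comes next when there are more cycles than insert bases. The function and variable names are hypothetical.

```python
# Toy construct from the post, written 5'->3' on the strand as given.
P5, R1_PRIMER, INSERT, R2_PRIMER, INDEX, P7 = (
    "AAA", "TTT", "AATTCCGG", "GGG", "ATAT", "CCC"
)

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def simulate_reads(cycles, insert=INSERT):
    """Return (read1, read2) for the toy construct under the assumed model."""
    top = P5 + R1_PRIMER + insert + R2_PRIMER + INDEX + P7
    # Assumption: read1 begins right after the read1 primer site.
    r1_start = len(P5) + len(R1_PRIMER)
    read1 = top[r1_start:r1_start + cycles]
    # Assumption: read2 begins at the far end of the insert and runs back
    # toward P5, so it is the reverse complement of everything before
    # the read2 primer site.
    r2_end = r1_start + len(insert)
    read2 = revcomp(top[:r2_end])[:cycles]
    return read1, read2


if __name__ == "__main__":
    print(simulate_reads(8))              # cycles equal to the insert length
    print(simulate_reads(12))             # more cycles than insert bases
    print(simulate_reads(4, insert=""))   # no insert at all
```

In this model, adapter bases only ever appear after the DNA of interest, toward the 3' end of a read. They would sit at the very start of a read only if the insert were empty, and I am not sure whether that is what happened with the Read1 below.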
In addition, I would greatly appreciate it if anyone could tell me how to specify a custom adapter.fa in Trimmomatic, since the built-in adapter sequences only cover "TruSeq 2 PE" and "TruSeq 3 PE", not the "TruSeq Adapter, Index 1" from the TruSeq LT kits.
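In case it helps frame the question, this is roughly what I imagine a custom adapter file would look like, as a minimal Python sketch. The record name TruSeq_Adapter_Index_1 and the file name custom_adapters.fa are my own placeholders, and the 2:30:10 values are only the thresholds commonly shown in Trimmomatic examples, not settings I have validated. I have not confirmed that a single plain FASTA record like this is the right way to handle paired-end data.

```python
# Hypothetical record name; the sequence is the adapter quoted in this post.
ADAPTER_NAME = "TruSeq_Adapter_Index_1"
ADAPTER_SEQ = "GATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCTTG"


def fasta_record(name, seq):
    """Format one FASTA record with the sequence on a single line."""
    return f">{name}\n{seq}\n"


def illuminaclip_step(fasta_path, seed_mismatches=2,
                      palindrome_threshold=30, simple_threshold=10):
    """Build the ILLUMINACLIP step string for a Trimmomatic command line."""
    return (
        f"ILLUMINACLIP:{fasta_path}:{seed_mismatches}:"
        f"{palindrome_threshold}:{simple_threshold}"
    )


if __name__ == "__main__":
    # Text that would be saved as the custom adapter file.
    print(fasta_record(ADAPTER_NAME, ADAPTER_SEQ), end="")
    # Step to append to the Trimmomatic command; the file name is a placeholder.
    print(illuminaclip_step("custom_adapters.fa"))
```

The printed FASTA text would be saved as the adapter file, and the printed step would replace the ILLUMINACLIP step that points at a built-in adapter file.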
Please correct me if I have said anything wrong. Thanks in advance!
Roden
Read1
@M03891:12:000000000-AL2FN:1:1101:11541:2121 1:N:0:1
GATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCTTGAAAAAAAAAAATCACGACTGAATCTTTCATCACATCGCTACAATGCAGCCATGTCAGGGGCGAGGGTTAGACATCATTCTTCGTGTTT
+
BABBBBBAACFFGGGFCGEGGGHHHHHHHFFGHGFHHFGE22BFFFAFG5FEEEHFGGHHHHECGFGAEFCF?/133311//>3344444B4444BB4B//B/B223B2332/0322>>220///<---<<..00000000000=.;.:.:
Read2
@M03891:12:000000000-AL2FN:1:1101:11541:2121 2:N:0:1
GATCCGGAAAGCGTTCTGGTGGGGAAGAGTGTCAAAATCAGGTGTCCCCCCATCCTTTAAAAAAAAAATTCTCTCCTACACCCATCACCTGCCAGCACTTTAGCTATCTTCCTTATTTTACCAACCTGCATTACACGCTCCTTTCTCATA
+
11111>1>1111111001110A000A0B0D11222111B1B11B1111A/A/B001B212111>/>E/0112@2B01111>0>/BE1<0100011/001B1211B22012B1122122221111??@000?11111//...1><1111=<