6.8 years ago
Arindam Ghosh
I was reading some papers based on RNA-seq. I found these details regarding the adaptor sequences used in 3 papers:
- TruSeq RNA Sample Preparation Kit v2, TruSeq Universal Adapter*
- ScriptSeq™ v2 RNA-Seq Library Preparation Kit
- Illumina TruSeq kit
For Illumina TruSeq I saw that there are different types: TruSeq Single Indexes, TruSeq CD Indexes, TruSeq Targeted RNA Expression. How do I know which of these they used? Also, under each category there is something called a Universal adaptor plus some indexes. Can anyone please explain this or point me to a paper/article about it?
I also could not find much information about the ScriptSeq adaptor.
Most read trimming/cleaning software uses a default (built-in) list of adaptors, which allows removal of most (if not all) commonly used adaptor sequences. So as long as you do not have something really exotic you should be fine.
Some quality-checking software (e.g. FastQC) will also report, to some extent, which kinds of adaptors are found in your dataset, but the only real answer will come from the people who made the libraries (i.e. your wet-lab partners or sequencing facility).
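If you want a quick check of your own before trimming, here is a minimal sketch of the kind of adaptor scan described above: it counts how many reads contain a short, known adaptor prefix. The three prefixes are the ones commonly used for adaptor auto-detection (for example by Trim Galore); treat them as assumptions and confirm against the kit documentation. The function name `adapter_fractions` and the example reads are made up for illustration, and real tools also allow mismatches and partial matches at the read end, which this does not.

```python
# Short adaptor prefixes often used for auto-detection.
# Assumption: verify these against the documentation of the kit actually used.
ADAPTERS = {
    "illumina_truseq": "AGATCGGAAGAGC",
    "nextera": "CTGTCTCTTATA",
    "small_rna": "TGGAATTCTCGG",
}


def adapter_fractions(reads, adapters=ADAPTERS):
    """Return, per adaptor, the fraction of reads containing an exact match."""
    reads = list(reads)
    if not reads:
        return {name: 0.0 for name in adapters}
    return {
        name: sum(seq in read for read in reads) / len(reads)
        for name, seq in adapters.items()
    }


# Hypothetical reads, only to show how the function is called.
example_reads = [
    "ACGTACGTACGTAGATCGGAAGAGCACACG",
    "TTTTGGGGCCCCAAAATTTTGGGGCCCCAA",
    "GGGGAGATCGGAAGAGCGTCGTGTAGGGAA",
    "ACGTACGTCTGTCTCTTATACACATCTCCG",
]
fractions = adapter_fractions(example_reads)
for name, frac in fractions.items():
    print(f"{name}: {frac:.2f}")
```

On real data you would feed in the sequence lines of a FASTQ file (a subsample of a few hundred thousand reads is usually enough). The adaptor with the highest fraction is a reasonable guess, but it does not replace asking whoever prepared the library.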
Thanks for replying, Lieven. I am new to RNA-seq data analysis and am trying my hand with existing SRA data. Since you mentioned trimming/cleaning, I would like to clear up a doubt. How do I judge how much trimming should be done? In FastQC there is a plot that gives the average quality score for each position in the reads. The score usually degrades towards the end. Should I blindly trim the bases that score too low? What other criteria should I look at?
Yes, that's what people normally do: they set a certain threshold and trim any base that falls below it. However, the process is a bit more complex than that: often it's a weighted score in a sliding window or similar, not just a single-base score. The plot might give you some clues as to which threshold to apply.
However, nowadays (and certainly depending on the analysis you want to perform) it might no longer be necessary to trim on base quality. You can/should check the software you intend to apply to your data to confirm whether it expects trimmed or untrimmed data.
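To make the sliding-window idea concrete, here is a simplified sketch, assuming Phred+33 encoded qualities (the usual encoding for recent Illumina data). It scans from the 5' end and cuts the read at the first window whose mean quality drops below the threshold. The function names and the defaults (`window=4`, `min_mean_q=20`) are illustrative choices, not the behaviour of any particular tool; real trimmers differ in the details.

```python
def phred33(qual):
    """Decode a Phred+33 quality string into a list of integer scores."""
    return [ord(c) - 33 for c in qual]


def sliding_window_trim(seq, qual, window=4, min_mean_q=20):
    """Cut the read at the start of the first window whose mean quality
    is below min_mean_q. Simplified illustration, not a tool's exact logic."""
    scores = phred33(qual)
    if not scores:
        return seq, qual
    # Reads shorter than the window are evaluated as a single short window.
    last_start = max(len(scores) - window + 1, 1)
    for start in range(last_start):
        win = scores[start:start + window]
        if sum(win) / len(win) < min_mean_q:
            return seq[:start], qual[:start]
    return seq, qual


# Hypothetical read: six high-quality bases (Q40) followed by four poor ones (Q2).
trimmed_seq, trimmed_qual = sliding_window_trim("ACGTACGTAC", "IIIIII####")
print(trimmed_seq, trimmed_qual)
```

Averaging over a window means a single bad base in an otherwise good stretch does not end the read, which is the main advantage over a per-base cutoff.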