19 months ago
hamarillo
Hi!
I was wondering if anyone knows what the current best practices are for QC and trimming of bulk RNA-seq reads before aligning them to a reference genome.
I was motivated by reading about how STAR works (finding a seed within each read that matches the reference and extending it as far as possible, penalizing gaps, etc.). In my mind, maybe this means that it's not necessary to trim reads, but I don't really know.
Thanks!
You don't need to trim RNA-seq reads unless you have a unique/specific use-case that requires it. The aligner will soft-clip parts of the read as needed.
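To make "soft-clip" concrete: in SAM/BAM output, clipped bases show up as `S` operations at the ends of the CIGAR string. Below is a minimal sketch that reads the leading and trailing soft-clip lengths from one CIGAR string. The function name `soft_clip_lengths` and the example CIGAR strings are made up for illustration and are not taken from any STAR output.

```python
import re

# A CIGAR string is a run of <length><op> pairs (SAM specification).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def soft_clip_lengths(cigar):
    """Return (leading, trailing) soft-clipped base counts for one CIGAR string.

    Hypothetical helper for illustration only.
    """
    # Hard clips (H) can only sit at the very ends, outside any soft clip,
    # so dropping them leaves the soft clips (if any) as first/last op.
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar) if op != "H"]
    if not ops:
        return (0, 0)  # e.g. "*" for an unmapped read
    lead = ops[0][0] if ops[0][1] == "S" else 0
    # Require more than one op so a lone "S" is not counted at both ends.
    trail = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return (lead, trail)

# Made-up examples: a read clipped at both ends, a spliced read clipped
# at the start only, and a read aligned end to end.
for cigar in ("5S90M5S", "10S50M1000N40M", "100M"):
    print(cigar, soft_clip_lengths(cigar))
```

If a large share of reads carry long clips at the same end, that is usually the point where it is worth checking for leftover adapter or low-quality tails.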
Yup. STAR is robust to having wrong parts hanging off the ends. The aligner will generally find the right place to put the read even if something wrong is stuck on.
Yep, most aligners do a pretty good job of trimming. If you'd like an overview of exactly where and what your aligner typically soft-clips, and the reference genome sequence context of the clipping sites, I've written that capability into trimViz (see example output for BAM files here), alongside the usual analysis of raw read trimming results.
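For a quick look at what is being clipped without any extra tooling, here is a rough sketch that pulls the soft-clipped bases out of plain SAM text lines and tallies them. This is not how trimViz works; the helper name `clipped_sequences` and the three SAM records are invented purely for illustration.

```python
import re
from collections import Counter

CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def clipped_sequences(sam_line):
    """Return (leading, trailing) soft-clipped bases from one SAM record.

    Hypothetical helper. Note that SAM stores SEQ on the reference forward
    strand, so clips from reverse-strand reads appear reverse-complemented.
    """
    fields = sam_line.rstrip("\n").split("\t")
    cigar, seq = fields[5], fields[9]  # columns 6 (CIGAR) and 10 (SEQ)
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar) if op != "H"]
    if not ops or seq == "*":
        return ("", "")
    lead = ops[0][0] if ops[0][1] == "S" else 0
    trail = ops[-1][0] if len(ops) > 1 and ops[-1][1] == "S" else 0
    return (seq[:lead], seq[len(seq) - trail:])

# Made-up records: QNAME FLAG RNAME POS MAPQ CIGAR RNEXT PNEXT TLEN SEQ QUAL
records = [
    "r1\t0\tchr1\t100\t255\t4S6M\t*\t0\t0\tAGATACGTAC\tIIIIIIIIII",
    "r2\t0\tchr1\t200\t255\t6M4S\t*\t0\t0\tACGTACAGAT\tIIIIIIIIII",
    "r3\t0\tchr1\t300\t255\t10M\t*\t0\t0\tACGTACGTAC\tIIIIIIIIII",
]

# Count each distinct clipped sequence, ignoring reads with no clip.
tally = Counter()
for rec in records:
    for clip in clipped_sequences(rec):
        if clip:
            tally[clip] += 1

print(tally.most_common())
```

On real data you would feed this the alignment lines from `samtools view`; a clipped sequence that recurs across many reads is a hint that an adapter or similar artifact is present.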