Hello there,
It may sound naive, but some people I know tend to skip the trimming step (e.g. with Trimmomatic) when
- the software that converts the raw BCL data already removes the adapters, and
- the alignment rate is more than 50%
What do you think?
Thanks
I rarely trim RNA-seq data, since the aligners will just soft-clip adapters anyway.
Note that (1) will only trim adapters, so if you have quality issues that are actually affecting alignment then you'd still need to trim. As for (2), 50% alignment is VERY low. I start to wonder what went wrong if alignment is <90% (at least with common model organisms).
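If you want to see how much soft-clipping the aligner is actually doing on untrimmed reads, one rough way is to look at the CIGAR strings in the alignment file. Here is a minimal Python sketch, assuming you have already pulled the CIGAR column out of the SAM records; the function name `soft_clipped_fraction` is made up for illustration and is not part of any aligner or library.

```python
import re

# One CIGAR operation: a length followed by a SAM operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def soft_clipped_fraction(cigar):
    """Return the fraction of read bases that are soft-clipped (S) in a CIGAR string."""
    read_len = 0
    clipped = 0
    for length, op in CIGAR_OP.findall(cigar):
        n = int(length)
        # M, I, S, = and X consume bases of the read; D, N, H and P do not.
        if op in "MIS=X":
            read_len += n
        if op == "S":
            clipped += n
    return clipped / read_len if read_len else 0.0

# A 100 bp read whose last 10 bases were soft-clipped.
print(soft_clipped_fraction("90M10S"))
```

A consistently large clipped fraction at the read ends would suggest leftover adapter or low-quality tails, which is when trimming is more likely to pay off.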
Another anecdote...my lab does Lexogen RNA-Seq preps, I align with STAR, then I use RSEM to count reads assigned to genes.
What I observed is that Lexogen preps, which seem to involve a poly-T, sometimes generate reads with poly-A stretches at their ends. This is fine for STAR, which doesn't mind soft-clipping, but the transcriptome alignment file it prepares, which I use as input to RSEM, can't handle soft-clipping. Those reads ended up dropped from the transcriptome alignment file, so RSEM of course could not count them. So I always trim poly-A's from the ends of RNA-Seq reads, to help them end up in the RSEM-friendly transcriptome alignment file.
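To make the poly-A trimming idea concrete, here is a small Python sketch. It is only an illustration, not what any particular trimmer does: it removes an exact run of A's at the end of the read, with no tolerance for mismatches, and the `min_run` threshold of 10 is an arbitrary assumption. The function name `trim_poly_a` is made up.

```python
def trim_poly_a(seq, qual, min_run=10):
    """Strip a trailing run of A's from a read if the run is at least min_run long.

    Returns the (sequence, quality) pair, trimmed to the same length.
    """
    stripped = seq.rstrip("Aa")
    run_length = len(seq) - len(stripped)
    if run_length < min_run:
        # Too short to be confident it is a poly-A tail; leave the read alone.
        return seq, qual
    return stripped, qual[:len(stripped)]

# A read with 12 trailing A's gets trimmed; 3 trailing A's are left in place.
print(trim_poly_a("ACGTACGT" + "A" * 12, "I" * 20))
print(trim_poly_a("ACGTAAA", "IIIIIII"))
```

In practice a dedicated trimming tool is the better choice, since it can also handle sequencing errors inside the tail and reads that become too short after trimming.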
If you are working with smallRNA data then you will almost certainly have to trim your data. There may be kit-specific adapters/procedures to deal with. A small number of bases may be left over (even if you use bcl2fastq v2 to trim data). bbduk.sh from the BBMap suite will remove those by doing a trim by overlap.
When alignment falls below a reasonable rate, it is time to grab a few non-aligning reads and head to BLAST.
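For the "grab a few reads" step, something like the following Python sketch would do. It assumes the non-aligning reads have already been written to a FASTQ file (how you get that depends on your aligner) and simply converts the first few records to FASTA so they can be pasted into BLAST. The function name `fastq_to_fasta_records` and the file name in the usage comment are hypothetical.

```python
import itertools

def fastq_to_fasta_records(lines, max_reads=5):
    """Yield FASTA-formatted strings for the first max_reads FASTQ records."""
    it = iter(lines)
    count = 0
    while count < max_reads:
        # A FASTQ record is four lines: header, sequence, '+', qualities.
        record = list(itertools.islice(it, 4))
        if len(record) < 4:
            break
        header = record[0].rstrip("\n")
        seq = record[1].rstrip("\n")
        # Swap the leading '@' for '>' and drop the quality lines.
        yield ">" + header[1:] + "\n" + seq
        count += 1

# Usage, with a hypothetical file name:
# with open("unaligned_reads.fastq") as handle:
#     for fasta in fastq_to_fasta_records(handle, max_reads=5):
#         print(fasta)
```

Taking the first few records is the simplest option; picking reads from different parts of the file may give a more representative sample.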
Sorry for resurrecting this old post, but I would like to clarify a couple of things here.
I am also trying to avoid the trimming step, because when I look at the untrimmed FASTQ files there is no adapter contamination and the Sequence Quality Histograms show a green check for all of them.
The percentage of pseudo-alignment is between 82% and 85%. What do you recommend?
Thanks
If you think results look reasonable you don't have to trim as @Devon already indicated (as long as you are not working with smallRNA, which you are not).
Thanks a lot for clarifying this @genomax. I will keep this in mind for future experiments.