Hi,
Some FastQC tests fail when processing TruSeq Small RNA samples. Since FastQC wasn't optimized for RNA-seq, some failed tests are expected, and I can explain most of the results (unusual GC distribution, overrepresented sequences), but I don't have an explanation for the uneven per base sequence content I found in my samples.
According to some resources I found, this is fairly normal for RNA-seq, although it is mostly attributed to a library preparation bias and only affects the beginning of the reads. My results are quite different, as the unevenness spans the entire sequence length. I think the presence of overrepresented sequences may also be contributing (some sequences account for up to 5% of the reads), but I'm not sure.
So how normal is this in my particular case? And if I shouldn't worry about it, how can I explain this result to my team?
Small RNAs are likely going to be in the first 22-25 bp of each read, so most of the rest is likely adapter. There are kit-specific instructions for trimming those adapter bases off. Don't worry too much about FastQC. Pre-process your data and then align. If there is a problem downstream of that, you can backtrack at that point.
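To make the trimming step concrete, here is a minimal sketch of what 3' adapter clipping does to a read. It is only an illustration: the function name `trim_3p_adapter` and the `min_overlap` parameter are made up for this example, and the adapter string is the sequence commonly cited for the TruSeq Small RNA 3' adapter, which you should confirm against your kit's documentation. For real data you would normally use a dedicated trimmer (cutadapt, for example), which also handles mismatches and quality.

```python
# Assumption: commonly cited TruSeq Small RNA 3' adapter; verify against your kit docs.
ADAPTER = "TGGAATTCTCGGGTGCCAAGG"


def trim_3p_adapter(read, adapter=ADAPTER, min_overlap=5):
    """Return the read with the 3' adapter (and anything after it) removed.

    Hypothetical helper for illustration: exact matching only, no mismatches.
    """
    # Case 1: the full adapter occurs somewhere inside the read.
    idx = read.find(adapter)
    if idx != -1:
        return read[:idx]

    # Case 2: the read ends partway through the adapter, so only an
    # adapter prefix is present at the 3' end. Try the longest prefix first.
    for k in range(min(len(adapter), len(read)), min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:-k]

    # No adapter found: leave the read unchanged.
    return read


# Example with a 22 nt insert followed by adapter and a few extra bases.
insert = "TAGCTTATCAGACTGATGTTGA"
raw_read = insert + ADAPTER + "AACTCC"
trimmed = trim_3p_adapter(raw_read)
print(len(raw_read), "->", len(trimmed))
```

After trimming, most reads from a small RNA library should collapse to roughly 20-25 nt, and rerunning FastQC on the trimmed reads shows how much of the per base content pattern was adapter.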