Heyya,
I am processing publicly available SRA data. I developed the pipeline but skipped the FastQC step, so I have to go back and add it. I tried to figure out a quick way to do it. Correct me if I am wrong:
Processing publicly available data has some downsides: you often don't know much about the RNA isolation protocol or the sequencing step, so it is hard to know which adapters to look for in the quality check. I looked into some tools:
Trimmomatic has ILLUMINACLIP. Does it work? I still don't properly understand how it identifies adapters. Additionally, I am processing data from a large number of papers, and often the sequencing technology is not even mentioned.
Cutadapt: for that one I need to supply the adapter sequence myself.
...and there are loads of other tools as well.
But I figured it would be much easier to do the following:
- use the Trimmomatic operation SLIDINGWINDOW (which cuts the read once the average quality within the window falls below a threshold)
- for paired-end (PE) and single-end (SE) reads, the adapters are sequenced towards one end, 3' or 5' (usually the 3' end, where the quality starts to drop)
Isn't that drop in quality an indicator of the adapters? Do I really need to know more about them? It looks like an easy, uncomplicated way of rerunning my whole batch. Is this enough for the quality step (maybe plus some filtering, depending on the data)? Am I missing something?
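To check that I understand the trimming step, here is a minimal Python sketch of how I picture SLIDINGWINDOW working. It is my own simplification, not Trimmomatic's actual code: the function names, the Phred+33 encoding and the defaults (window of 4 bases, average quality 15) are assumptions for illustration only.

```python
def phred33_scores(quality_string):
    """Convert a FASTQ quality string to integer scores (assumes Phred+33)."""
    return [ord(ch) - 33 for ch in quality_string]


def sliding_window_trim(sequence, quality_string, window_size=4, min_avg_quality=15):
    """Scan from the 5' end and cut the read at the start of the first
    window whose average quality is below min_avg_quality.

    Simplified illustration only; the real tool may handle the read end
    and short reads differently.
    """
    scores = phred33_scores(quality_string)
    n = len(scores)
    if n == 0:
        return sequence, quality_string

    # For reads shorter than the window, treat the whole read as one window.
    w = min(window_size, n)

    for start in range(0, n - w + 1):
        window = scores[start:start + w]
        if sum(window) / w < min_avg_quality:
            # Drop everything from the start of the failing window onwards.
            return sequence[:start], quality_string[:start]

    # No window failed: keep the read unchanged.
    return sequence, quality_string


if __name__ == "__main__":
    # 'I' is quality 40 and '#' is quality 2 in Phred+33.
    trimmed_seq, trimmed_qual = sliding_window_trim("ACGTACGTAC", "IIIIII####")
    print(trimmed_seq, trimmed_qual)
```

With these defaults the sketch should roughly correspond to `SLIDINGWINDOW:4:15` in Trimmomatic. Note that it only looks at quality scores and never at the bases themselves, which is exactly why I am unsure whether it can stand in for adapter removal.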
Cheers!