Dear all,
As I am working with public data, I would like to verify the stated library information and estimate whether the RNA-seq data come from a total RNA protocol or one with a polyA enrichment step. How would you recommend estimating this? Is there a tool available for this? I was thinking about using the proportion of exonic and intronic reads (Qualimap output) as a measure, but that probably varies quite a bit between datasets. Any other suggestions? Thanks for your input.
Best,
The approach sounds reasonable. I would, though, try to get some published data generated with one or the other method to serve as "positive controls", and then see whether this gives you enough confidence to call your sample polyA-enriched or rRNA-depleted.
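As a rough sketch of that comparison (not an established tool): compute the intronic fraction per sample from the exonic and intronic read counts you already have, then check whether it falls within the range seen in your controls. The function names and all numbers below are hypothetical placeholders, not measurements. The only assumption about the biology is the general expectation that total RNA / rRNA-depleted libraries carry more intronic (pre-mRNA) reads than polyA-enriched ones.

```python
def intronic_fraction(exonic_reads, intronic_reads):
    """Fraction of exonic+intronic reads that fall in introns."""
    total = exonic_reads + intronic_reads
    if total == 0:
        raise ValueError("no exonic or intronic reads")
    return intronic_reads / total


def classify_by_controls(sample_fraction, polya_controls, total_controls):
    """Label a sample by which control group's observed range contains it.

    Returns "ambiguous" if the sample lies in both ranges or in neither,
    which is the signal that this measure alone is not enough.
    """
    if not polya_controls or not total_controls:
        raise ValueError("need at least one control per group")
    in_polya = min(polya_controls) <= sample_fraction <= max(polya_controls)
    in_total = min(total_controls) <= sample_fraction <= max(total_controls)
    if in_polya and not in_total:
        return "polyA-like"
    if in_total and not in_polya:
        return "total-RNA-like"
    return "ambiguous"


# Placeholder control values for illustration only; replace them with
# fractions computed from real datasets of known protocol.
polya_controls = [0.05, 0.08, 0.12]
total_controls = [0.30, 0.42, 0.55]

# Placeholder read counts; take the real ones from your own QC output.
sample = intronic_fraction(exonic_reads=800_000, intronic_reads=200_000)
print(classify_by_controls(sample, polya_controls, total_controls))
```

Controls from the same tissue and species as your samples are preferable, since the intronic fraction also depends on things other than the library protocol. If the two control ranges overlap, most samples will come back as "ambiguous", which tells you the measure is not discriminating enough on its own.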
If the data are public, you could try to look the information up in the associated publication or write to the submitter and ask.
Thanks for your input. I extracted the information from the associated publications, but it is not always well described. I will try to contact the submitters; however, I am in general looking for a way to confirm the information I extracted.