Hi biostars,
In RNA-seq tutorials, I have learned that if the assignment rate of reads to genes in the counting step is below 50%, we should check where the reads are mapped: are they in exons or in introns? I guess this is to detect contamination. If so, detecting contamination would be useful only when we are analyzing RNA-seq data that we produced ourselves and can redo the experiment without contamination. But when we aim to do a meta-analysis of multiple datasets from public databases such as GEO and the assignment rates are below 50%, should we look for contamination or move on to the next steps of the analysis?
For example: suppose we have downloaded an RNA-seq dataset with 30 samples from GEO, and fewer than 50% of the reads are assigned to genes in 10 of those samples. Should we continue with the next steps, or can we do something to enhance the assignment rate of those samples?
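To make the 50% check concrete, here is a minimal sketch of how the per-sample assignment rate could be computed. It assumes the counting was done with featureCounts and that its tab-delimited `.summary` table is available (one `Status` column, then one column per BAM file); the thread itself does not name a counting tool. The function names `assignment_rates` and `low_assignment`, the sample names, and the numbers are all made up for illustration.

```python
import csv
import io


def assignment_rates(summary_text):
    """Return {sample: assigned_fraction} from a featureCounts-style summary.

    Assumption: the first row is a header ("Status", then one column per
    sample) and every other row is a status category with integer counts.
    """
    reader = csv.reader(io.StringIO(summary_text), delimiter="\t")
    samples = next(reader)[1:]
    assigned = dict.fromkeys(samples, 0)
    total = dict.fromkeys(samples, 0)
    for row in reader:
        if not row:
            continue  # skip blank lines
        status, counts = row[0], row[1:]
        for sample, count in zip(samples, counts):
            n = int(count)
            total[sample] += n
            if status == "Assigned":
                assigned[sample] += n
    # Rate = assigned / sum of all status categories for that sample.
    return {s: (assigned[s] / total[s] if total[s] else 0.0) for s in samples}


def low_assignment(rates, threshold=0.5):
    """Return the sorted sample names whose assigned fraction is below threshold."""
    return sorted(s for s, r in rates.items() if r < threshold)


# Hypothetical two-sample summary, not real data.
example = (
    "Status\tsampleA.bam\tsampleB.bam\n"
    "Assigned\t800\t300\n"
    "Unassigned_NoFeatures\t150\t600\n"
    "Unassigned_Ambiguity\t50\t100\n"
)
rates = assignment_rates(example)
flagged = low_assignment(rates)
print(rates)
print(flagged)
```

In a real summary table, the breakdown of the unassigned categories (for example, how many reads fall under `Unassigned_NoFeatures`) is what would hint at whether reads are landing outside annotated exons.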
You can't "enhance" read assignment, just as you can't enhance alignments, assuming the analysis itself was done correctly.
So, in the example I mentioned, should I just move on to the next steps?