Hi Biostars.
According to previously answered questions here, when human or mouse RNA-seq samples have a mapping percentage lower than 70%, we should BLAST some of the unmapped reads to find the source of contamination.
1- I want to know the reason for that. Is this only useful when we have produced the data ourselves and have to redo the experiment while preventing that type of contamination? Or is knowing the source of contamination essential even when we analyze other people's data from GEO, for the sake of eliminating genes related to the contamination?
2- If I have a dataset with 30 samples and 15 of them have less than 70% of reads aligned, is it essential to eliminate the gene counts related to the contamination, or should I simply remove those 15 samples because there is nothing I can do to rescue them? What if fewer samples (e.g. 3 samples) have less than 70% of reads aligned?
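For the BLAST step mentioned above, here is a minimal sketch of how one might pull a small random subset of unmapped reads and convert them to FASTA for a BLAST search. It assumes the unmapped reads are already available as FASTQ lines (for example written out by the aligner, or extracted with something like `samtools fastq -f 4`). The function name `sample_fastq_to_fasta` and the example reads are hypothetical, and this is only one possible way to do it.

```python
import random


def sample_fastq_to_fasta(fastq_lines, n, seed=0):
    """Reservoir-sample up to n reads from an iterable of FASTQ lines.

    Returns the sampled reads as FASTA-formatted text. Assumes plain
    four-line FASTQ records (header, sequence, '+', qualities).
    """
    rng = random.Random(seed)
    reservoir = []   # holds at most n (name, sequence) pairs
    record = []      # lines of the FASTQ record currently being read
    count = 0        # number of complete records seen so far
    for line in fastq_lines:
        record.append(line.rstrip("\n"))
        if len(record) == 4:
            name, seq = record[0][1:], record[1]  # drop the leading '@'
            record = []
            count += 1
            if len(reservoir) < n:
                reservoir.append((name, seq))
            else:
                # Keep each new read with probability n / count
                j = rng.randrange(count)
                if j < n:
                    reservoir[j] = (name, seq)
    return "".join(f">{name}\n{seq}\n" for name, seq in reservoir)


# Tiny in-memory example with made-up reads; a real run would pass an
# open file handle for the unmapped-reads FASTQ instead.
example = [
    "@read1\n", "ACGTACGT\n", "+\n", "IIIIIIII\n",
    "@read2\n", "GGCCTTAA\n", "+\n", "IIIIIIII\n",
]
print(sample_fastq_to_fasta(example, n=2), end="")
```

A few hundred to a few thousand sampled reads are usually enough to see which organism dominates the BLAST hits; the resulting FASTA can be pasted into web BLAST or passed to a local `blastn`.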
Thanks GenoMax. I'm trying to do a meta-analysis of my own.
So, what exactly should we do when we are analyzing a GEO dataset with 30 samples in which some samples map below 70%? Should we skip that dataset and find another one suitable for our purpose?
Follow your normal analysis protocol. Your own data could just as easily have looked like this. While it is desirable to have upwards of 80-90% of reads aligned, that is not always feasible (e.g. if you have FFPE/degraded samples).
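To get a quick overview of how many samples in a dataset fall under the 70% mark discussed in this thread, a small sketch like the following could help. The sample names and percentages are made-up placeholders, `flag_low_mapping` is a hypothetical helper, and the 70% cutoff is only the rule of thumb from the question, not a hard limit.

```python
def flag_low_mapping(mapping_pct, threshold=70.0):
    """Return the sorted names of samples whose mapping % is below threshold."""
    return sorted(name for name, pct in mapping_pct.items() if pct < threshold)


# Made-up values for illustration; in practice these would come from the
# aligner's summary output for each sample.
rates = {"sample_A": 91.2, "sample_B": 64.5, "sample_C": 70.0, "sample_D": 48.9}

low = flag_low_mapping(rates)
print(f"{len(low)} of {len(rates)} samples below 70%: {low}")
```

Flagged samples are not necessarily unusable; the list is just a starting point for checking whether they behave as outliers in the normal downstream QC.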