What is the benefit of knowing the source of contamination when we have RNAseq data with less than 70% mapping?
18 months ago
Sib ▴ 60

Hi Biostars.

According to previously answered questions here, when human or mouse RNAseq samples have a mapping percentage lower than 70%, we should BLAST some of the unmapped reads to find the source of contamination.

1- I want to know the reason for that. Is this useful only when we have produced the data ourselves and have to redo the experiment and prevent that type of contamination? Or is knowing the source of contamination essential even when we want to analyze others' data on GEO, for the sake of eliminating genes related to the contamination?

2- If I have a dataset with 30 samples and 15 of them align at less than 70%, is it essential to eliminate gene counts related to the contamination, or should I simply remove those 15 samples because there is nothing I can do to rescue them? What if a smaller number of samples (e.g. 3 samples) align at less than 70%?
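For reference, the BLAST step mentioned above could be prepared roughly as follows. This is only a minimal sketch: it assumes the unmapped reads are available as a plain 4-line-per-record FASTQ file (for example what STAR writes with `--outReadsUnmapped Fastx`), and the function name `sample_fastq_to_fasta` is made up for illustration. It draws a random subset of reads and returns them as FASTA text that could then be submitted to BLAST.

```python
import random
from itertools import islice


def sample_fastq_to_fasta(fastq_lines, n_reads, seed=0):
    """Reservoir-sample up to n_reads records from FASTQ lines; return FASTA text.

    Assumes strict 4-line FASTQ records (header, sequence, '+', qualities).
    """
    rng = random.Random(seed)
    reservoir = []
    lines = iter(fastq_lines)
    seen = 0
    while True:
        record = list(islice(lines, 4))
        if len(record) < 4:
            break  # end of input (or a truncated final record)
        # Keep only the read ID: drop the leading '@' and anything after whitespace.
        name = record[0].strip()[1:].split()[0]
        seq = record[1].strip()
        seen += 1
        if len(reservoir) < n_reads:
            reservoir.append((name, seq))
        else:
            # Replace an existing entry with probability n_reads / seen.
            j = rng.randrange(seen)
            if j < n_reads:
                reservoir[j] = (name, seq)
    return "".join(f">{name}\n{seq}\n" for name, seq in reservoir)


if __name__ == "__main__":
    demo = [
        "@read1 0:N\n", "ACGTACGT\n", "+\n", "IIIIIIII\n",
        "@read2 0:N\n", "GGCCGGCC\n", "+\n", "IIIIIIII\n",
    ]
    print(sample_fastq_to_fasta(demo, n_reads=1000), end="")
```

With a real file, the same function can be called as `sample_fastq_to_fasta(open(path), 1000)`, where `path` is a hypothetical location of the unmapped-reads file. Reservoir sampling is used so the whole file never has to be held in memory.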

alignment mapping STAR contamination RNAseq • 896 views
18 months ago
GenoMax 147k

When you use public data for analysis, you are likely to run into this type of situation.

The answer to your question depends on what you are trying to achieve. Are you trying to reproduce a published analysis or hoping to do a meta-analysis of your own? Since you had no control over the data generation, there is not much you can achieve by finding out what the contamination is (if it exists). There is nothing to be done in terms of "eliminating genes related to contamination" since you would not be counting them: that data will not be present in your alignment file. If the contamination is related to rRNA, you could ignore those counts. You can't "rescue" samples informatically, since you would be changing the original data. If you choose to select only some samples, you run the risk of biasing your analysis.

Also please don't ask about the same content in multiple threads: Low mapping percentage


Thanks GenoMax. I'm trying to do a meta-analysis of my own.

So, what exactly should we do when we are analyzing a GEO dataset with 30 samples in which some samples map below 70%? Should we skip the analysis of that dataset and find another dataset suitable for our purpose?


Follow your normal analysis protocol. It is possible that your own data could have looked like this. While it is desirable to have upwards of 80-90% alignment, that is not always feasible (e.g. if you had FFPE/degraded samples).
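To see which samples fall below a given alignment rate before deciding how to proceed, the per-sample rates can be tabulated. The sketch below is only illustrative: it assumes STAR was the aligner (as the thread tags suggest), that each sample has a `Log.final.out` with a `Uniquely mapped reads %` line, and that the uniquely mapped percentage is the number being compared against 70%. The function names are hypothetical.

```python
def parse_star_log(text):
    """Collect the percentage fields of a STAR Log.final.out into a dict."""
    stats = {}
    for line in text.splitlines():
        if "|" not in line:
            continue  # section headers and blank lines carry no value
        key, value = (part.strip() for part in line.split("|", 1))
        if value.endswith("%"):
            stats[key] = float(value[:-1])
    return stats


def flag_low_mapping(logs, threshold=70.0):
    """Return sorted (sample, percent) pairs whose unique mapping rate is below threshold.

    logs maps a sample name to the text of its Log.final.out.
    """
    flagged = []
    for sample, text in logs.items():
        pct = parse_star_log(text).get("Uniquely mapped reads %")
        if pct is not None and pct < threshold:
            flagged.append((sample, pct))
    return sorted(flagged)


if __name__ == "__main__":
    # Invented example values, not taken from any real dataset.
    example_logs = {
        "sampleA": "  Uniquely mapped reads % |\t65.20%\n",
        "sampleB": "  Uniquely mapped reads % |\t91.00%\n",
    }
    for sample, pct in flag_low_mapping(example_logs):
        print(f"{sample}\t{pct:.2f}")
```

This only reports the rates; whether a sample below the threshold should be kept is a judgment call, along the lines discussed above.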
