Remove ribosomal genes in single-cell RNAseq analysis?
2.4 years ago
shuaizh117 ▴ 10

After running the standard Seurat workflow, I found that the top expressed markers in two clusters of my dataset are ribosomal genes.

My question is: should I remove these two cell clusters and then rerun normalization, scaling, etc. on the remaining cells? Or should I go back to the very beginning, grep the ribosomal genes out of my top VariableFeatures, and then run dimensionality reduction and clustering on the remaining VariableFeatures? Which approach do you think makes more sense?
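For readers weighing the second option, here is a minimal sketch of what "grep ribosomal genes out of the variable features" amounts to. It is plain Python on a made-up gene list, not Seurat code. The `^RP[SL]` pattern is an assumption based on human gene symbols, and `drop_ribosomal` is a hypothetical helper name.

```python
import re

# Assumed pattern for human cytosolic ribosomal protein genes (RPS*/RPL*).
# Caveat: it also catches non-ribosomal symbols such as RPS6KA1, so the
# matched list is worth inspecting by eye before dropping anything.
RIBO_PATTERN = re.compile(r"^RP[SL]")


def drop_ribosomal(genes):
    """Return the gene list with ribosomal matches removed, order preserved."""
    return [g for g in genes if not RIBO_PATTERN.match(g)]


# Toy variable-feature list (illustrative only, not from the poster's data).
variable_features = ["RPS27", "EPCAM", "RPL13A", "HBB", "MRPL12", "RPLP0", "KRT8"]
kept = drop_ribosomal(variable_features)
print(kept)
```

Note that mitochondrial ribosomal genes (MRPS*/MRPL*) do not match this pattern and would need their own rule if they should go too.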

quality Seurat control single-cell • 6.0k views
2.4 years ago
V ▴ 410

Personally, I would not remove the ribosomal genes but regress them out, if what you are seeing is distinct clusters that are high in them.

I'm not sure if you have done this or not, but the ribosomal genes you are detecting may be a byproduct of ambient RNA present in your sample. If this is your own sequencing experiment and you have access to the empty droplets, I would run something like SoupX to check whether that is the case. If it is, then you should remove them; if it's not, then regress them out, as they may be obscuring more interesting sources of variation (assuming you aren't working on ribosomal genes).
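To make "regress out" concrete, here is a rough sketch of the idea in plain Python: compute a per-cell ribosomal percentage, fit a gene's expression against it by ordinary least squares, and keep the residuals. In Seurat the same idea is usually expressed by storing the percentage with `PercentageFeatureSet` and passing it to `ScaleData` via `vars.to.regress`. The helper names, the `^RP[SL]` pattern, and the toy counts below are all assumptions for illustration.

```python
import re

RIBO = re.compile(r"^RP[SL]")  # assumed pattern for human ribosomal genes


def percent_ribo(genes, counts_per_cell):
    """Percentage of each cell's counts that come from ribosomal genes."""
    ribo_idx = [i for i, g in enumerate(genes) if RIBO.match(g)]
    result = []
    for counts in counts_per_cell:
        total = sum(counts)
        ribo = sum(counts[i] for i in ribo_idx)
        result.append(100.0 * ribo / total if total else 0.0)
    return result


def regress_out(y, x):
    """Residuals of y after an ordinary least squares fit y ~ a + b * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx if sxx else 0.0
    intercept = mean_y - slope * mean_x
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]


# Toy data: rows are cells, columns follow `genes` (made up for illustration).
genes = ["RPS27", "RPL13A", "EPCAM", "KRT8"]
cells = [[30, 20, 25, 25], [5, 5, 45, 45], [40, 40, 10, 10], [10, 10, 40, 40]]

pct = percent_ribo(genes, cells)
epcam = [c[2] for c in cells]
residuals = regress_out(epcam, pct)
print(pct)
print(residuals)
```

The residuals are what would go into scaling and PCA in place of the original values, so variation that tracks ribosomal content no longer drives the clustering. Real pipelines fit normalized rather than raw counts; raw counts are used here only to keep the arithmetic easy to follow.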


Hello! Thanks for the suggestion! It is our own sample, and it's actually a PDX tumor sample. Interestingly, the two cell clusters with high ribosomal gene expression also show relatively high expression of cancer markers. In addition, there is a third cell cluster with high expression of HBB and HBD that also expresses cancer markers. Could this indicate that all three clusters come from ambient RNA contamination?


It's hard to tell intuitively whether these clusters are an 'artifact' or genuine clusters. Generally, it would be rare for ambient RNA to give you distinct clusters. It mostly just contaminates everything, which ends up obscuring or dampening a cluster's more 'interesting' features and makes them more difficult to detect in differential expression testing with FindAllMarkers (or similar).
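A toy illustration of why ambient RNA tends to contaminate every cell rather than create its own cluster: if the "soup" profile is estimated from empty droplets and each cell carries a fraction `rho` of it, the expected ambient counts scale with the cell's total UMIs across all genes. This is a deliberately simplified sketch and not SoupX's actual algorithm. `rho` is simply assumed here (SoupX estimates it), and the function names and counts are hypothetical.

```python
def ambient_profile(empty_droplets):
    """Per-gene fraction of ambient counts, pooled over empty droplets."""
    n_genes = len(empty_droplets[0])
    totals = [sum(d[g] for d in empty_droplets) for g in range(n_genes)]
    grand_total = sum(totals)
    return [t / grand_total for t in totals]


def subtract_ambient(cell_counts, profile, rho):
    """Remove the expected ambient counts from one cell, flooring at zero."""
    n_umi = sum(cell_counts)
    return [max(c - rho * n_umi * p, 0.0) for c, p in zip(cell_counts, profile)]


# Columns: RPS27, HBB, EPCAM (made-up counts for illustration).
empty = [[6, 3, 1], [4, 2, 4]]
profile = ambient_profile(empty)

cell = [40, 10, 50]
corrected = subtract_ambient(cell, profile, rho=0.1)  # rho is an assumed value
print(profile)
print(corrected)
```

Every gene in every cell loses a share proportional to the soup profile, so genes that dominate the soup (often ribosomal or haemoglobin genes) are adjusted most, while cell-specific markers are largely left intact.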

Since these are your samples, I would suggest you attempt SoupX and see how the resulting clustering looks. If you’re lucky the ambient RNA genes will adjust and you’ll still keep your 3 interesting clusters. Even if you don’t follow through with that analysis it will most definitely be something reviewers ask for since these are human tumour samples, which tend to be of ‘lower’ quality due to how they are extracted. So, better be informed early on about how your samples look.

P.S. If the above answered your question, please upvote it 😊


Did it! I'll try SoupX and see where it leads! Thanks!
