8 weeks ago
carolofharvest
Is it okay to include genes whose names start with "Gm" in the variable features, or should I remove them from the variable features?
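For reference, here is a minimal sketch of how "Gm" genes might be picked out of a feature list by name. It is plain Python on a made-up list, not Seurat code; the function name `split_gm_genes` and the example list are hypothetical. It assumes predicted-gene symbols are "Gm" followed only by digits, because a bare "starts with Gm" test would also catch named genes such as Gmnn, Gmfb, or Gm2a.

```python
import re

# Assumption: predicted-gene symbols look like "Gm" + digits only (e.g. Gm42418).
# Named genes such as Gmnn, Gmfb, and Gm2a also start with "Gm" and should be kept.
GM_PREDICTED = re.compile(r"^Gm\d+$")


def split_gm_genes(gene_names):
    """Split gene names into (predicted Gm genes, everything else), keeping order."""
    predicted = [g for g in gene_names if GM_PREDICTED.match(g)]
    others = [g for g in gene_names if not GM_PREDICTED.match(g)]
    return predicted, others


# Hypothetical variable-feature list, for illustration only.
variable_features = ["Gm42418", "Gmnn", "Cd3e", "Gm26917", "Gm2a", "Gmfb", "AY036118"]

gm_genes, kept = split_gm_genes(variable_features)
print("Predicted Gm genes:", gm_genes)
print("Other features:", kept)
```

Note that AY036118 does not match the pattern, so a name-based filter alone would not catch every gene discussed in this thread.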
Which genome are you referring to?
Hi. I only have a count matrix without empty droplets. I really don't know what was done in the previous steps in cellranger etc., but it is mouse data.
Single-cell RNA-seq is a lot of "exploratory" work; you can explore both approaches and see.
If a strong signal is being driven by a noncoding RNA, it might be something interesting to look into further. On the other hand, many ncRNAs are poorly annotated and/or contain repetitive elements, which cause reads to misalign to them.
If you're using the default cellranger genome, you can just use all those genes and proceed forward with your analysis like everyone else does, unless you encounter something funky downstream.
Hi,
I used the genes Gm42418, Gm26917, and AY036118 in the PercentageFeatureSet() function. In some cells, those genes make up 60% of the counts, and those cells form a single cluster in the UMAP. I can't find a way to annotate that cluster. In the DEG results, in which I compared this cluster to the other clusters, pct1 and pct2 are almost the same, but this cluster expresses those genes highly.
What do you mean by "funky"?
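The per-cell percentage described above can be sketched in a few lines. This is a plain-Python illustration of the idea (counts from a gene set divided by total counts per cell), not Seurat's implementation; the toy count matrix, the function name `percent_feature_set`, and the 50% cutoff are all assumptions made up for the example.

```python
def percent_feature_set(counts, features):
    """Percentage of each cell's total counts that comes from `features`.

    counts: dict mapping gene name -> list of counts, one entry per cell.
    """
    n_cells = len(next(iter(counts.values())))
    totals = [sum(row[i] for row in counts.values()) for i in range(n_cells)]
    subset = [
        sum(counts[g][i] for g in features if g in counts) for i in range(n_cells)
    ]
    # Guard against empty cells to avoid division by zero.
    return [100.0 * s / t if t > 0 else 0.0 for s, t in zip(subset, totals)]


# Toy matrix: 4 genes x 3 cells (hypothetical numbers).
counts = {
    "Gm42418": [60, 2, 0],
    "Gm26917": [10, 1, 0],
    "AY036118": [5, 0, 1],
    "Cd3e": [25, 97, 99],
}

pct = percent_feature_set(counts, ["Gm42418", "Gm26917", "AY036118"])

# Hypothetical cutoff, only to show how high-percentage cells could be listed.
flagged_cells = [i for i, p in enumerate(pct) if p > 50]
print(pct, flagged_cells)
```

Looking at which cells are flagged, and whether they coincide with the unannotatable cluster, is one way to check that the cluster is driven by these genes.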
OK, good exploratory work -- yeah, you encountered the "funky" stuff I was talking about. Those are genes that have repetitive elements in them, so ribosomal RNAs present in your data will misalign to them.
I would remove them.
Hi, would you remove the genes or the cells?
Remove genes. After all, it’s those genes that are problematic.
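As a rough sketch of what "remove genes, not cells" means for a count matrix: the rows for the problematic genes are dropped while every cell (column) is kept. The example below is plain Python on a made-up matrix, not Seurat code; `drop_genes` and the toy numbers are hypothetical.

```python
# Genes named in this thread as problematic.
PROBLEM_GENES = {"Gm42418", "Gm26917", "AY036118"}


def drop_genes(counts, genes_to_drop):
    """Return a new gene -> per-cell counts mapping without the listed genes.

    Cells are untouched: every remaining row keeps the same number of entries.
    """
    return {g: row for g, row in counts.items() if g not in genes_to_drop}


# Toy matrix: 5 genes x 3 cells (hypothetical numbers).
counts = {
    "Gm42418": [60, 2, 0],
    "Gm26917": [10, 1, 0],
    "AY036118": [5, 0, 1],
    "Cd3e": [25, 97, 99],
    "Actb": [40, 55, 61],
}

filtered = drop_genes(counts, PROBLEM_GENES)
print(sorted(filtered))          # remaining genes
print(len(filtered["Cd3e"]))     # number of cells is unchanged
```

After a filter like this, normalization, variable-feature selection, and clustering would need to be rerun on the filtered matrix so the removed genes no longer influence the result.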