Hi,
I am analyzing a single-cell RNA-seq data set, and most of the top highly variable genes are mitochondrial genes. These are not genes of interest; I am primarily interested in nuclear genes. Is it OK to remove the mitochondrial genes from the gene expression matrix before analysis? I am also curious whether this is commonly observed.
Thanks,
- Pankaj
In my opinion mitochondrial genes are interesting, but I don't know what your research question is of course... By your name I would guess it is cancer research. In cancer mitochondria seem to play an important role.
It is also unclear whether you mean nuclear genes encoding mitochondrial proteins or only genes in the mitochondrial genome.
Generally, this is expected: genes encoded in the mitochondrial genome are often so-called housekeeping genes that are of high importance to the cells' basic functions and are often transcribed both ubiquitously and at high levels.
It becomes an issue if mitochondrial genes contribute the majority of the reads, and significantly more so than, for example, other housekeeping genes that are not encoded in the mitochondrial genome. It is a practical problem, because it means you have few reads left for the remainder of the transcriptome, but it has also been suggested that it is a sign of increased cell death which leads to increased accessibility and release of mitochondrial transcripts.
I usually look at the distribution of % mitochondrial reads per cell; most of the time I see that the bulk of the cells will have low to moderate levels of mitochondrial reads (below 5-10%). I will exclude those cells that show much higher % mito reads because these will presumably have a very skewed read count distribution. I would not exclude the actual genes though.
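To make the filtering step concrete, here is a minimal sketch in Python/NumPy. It is only an illustration, not a specific pipeline: it assumes a cells x genes raw count matrix, that genes from the mitochondrial genome can be recognized by an "MT-" symbol prefix (human convention; mouse symbols use "mt-", which the case-insensitive match also catches), and a 10% cutoff picked from the range mentioned above. The function names `percent_mito` and `filter_cells` are made up for this example; in practice you would choose the cutoff after looking at the distribution in your own data.

```python
import numpy as np

def percent_mito(counts, gene_names, prefix="MT-"):
    """Return the percentage of each cell's counts that come from mito genes.

    counts: array of shape (n_cells, n_genes) with raw counts.
    gene_names: gene symbols, one per column of counts.
    prefix: assumed naming convention for mitochondrial-genome genes.
    """
    counts = np.asarray(counts, dtype=float)
    # Case-insensitive match so both "MT-CO1" (human) and "mt-Co1" (mouse) hit.
    is_mito = np.array([g.upper().startswith(prefix.upper()) for g in gene_names])
    total = counts.sum(axis=1)
    mito = counts[:, is_mito].sum(axis=1)
    pct = np.zeros_like(total)
    nonzero = total > 0  # cells with zero counts keep 0% instead of dividing by zero
    pct[nonzero] = 100.0 * mito[nonzero] / total[nonzero]
    return pct

def filter_cells(counts, gene_names, max_pct=10.0):
    """Drop cells above max_pct mito reads; all genes (columns) are kept."""
    pct = percent_mito(counts, gene_names)
    keep = pct <= max_pct
    return np.asarray(counts)[keep], keep

# Toy data: three cells, four genes (two mitochondrial, two nuclear).
genes = ["MT-CO1", "MT-ND1", "ACTB", "GAPDH"]
counts = np.array([
    [5, 5, 50, 40],    # 10 of 100 counts are mitochondrial
    [40, 40, 10, 10],  # 80 of 100 counts are mitochondrial
    [1, 1, 60, 38],    # 2 of 100 counts are mitochondrial
])

pct = percent_mito(counts, genes)
filtered, keep = filter_cells(counts, genes, max_pct=10.0)
print(pct)
print(keep)
```

Note that only cells are removed here; the mitochondrial gene columns stay in the matrix, in line with the advice above.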