Hello BioStars Community,
I'm working on single-cell RNA sequencing data analysis using Seurat for rat kidney samples. I've encountered an issue during the mitochondrial gene identification step.
Here's a brief outline of my workflow:
- Generated count matrices using CellRanger count for 3 male samples.
- Used CellRanger count and aggregate for 3 female samples, which were sequenced across 4 lanes.
- I'm using the mRatBN7.2 reference transcriptome for my matrices.
The problem arises when I try to identify mitochondrial genes. Running `grep("^mt-", rownames(all.merged@assays$RNA$counts), value = TRUE)` returns only two genes: `mt-Rnr1` and `mt-Rnr2`. In contrast, running `grep("^Mt-", rownames(all.merged@assays$RNA$counts), value = TRUE)` on data from a previous reference (downloaded from the paper below) yields a more complete list of mitochondrial genes ("Mt-nd1" "Mt-nd2" "Mt-co1" "Mt-co2" "Mt-atp8" "Mt-atp6" "Mt-co3" "Mt-nd3" "Mt-nd4l" "Mt-nd4" "Mt-nd5" "Mt-nd6"). I also couldn't find any genes starting with `Mt-` in the current analysis.
I'm looking for insights into why there's a discrepancy in the mitochondrial gene list when using the mRatBN7.2 reference, and how I can resolve this to get a complete list of mitochondrial genes. Any advice on troubleshooting this issue or suggestions for alternative approaches would be greatly appreciated.
For context, the analysis is similar to what was done in the paper titled "Caloric Restriction Reprograms the Single-Cell Transcriptional Landscape of Rattus Norvegicus Aging".
Thank you in advance for your help!
P.S. I am a beginner and am learning this for a real project, so please correct me if I am naming anything wrong.
Can't you simply get the GTF file and filter for genes on the mitochondrial chromosome? That's independent of messy gene names.
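A minimal base R sketch of that idea follows. It is an illustration under stated assumptions, not a tested pipeline. The GTF rows and gene IDs are invented toy data, and the mitochondrial sequence names in `mito_chr` are guesses that should be checked against the first column of your own GTF. `get_attr` is a hypothetical helper written for this example. The fallback from `gene_name` to `gene_id` reflects my understanding that Cell Ranger labels a feature with its gene ID when the annotation has no gene name, which would also hide such genes from a `^mt-` pattern.

```r
# Toy GTF written to a temp file; the rows and IDs are invented for illustration.
gtf_lines <- c(
  "#!genome-build toy",
  paste("MT", "toy", "gene", "1", "100", ".", "+", ".",
        'gene_id "GENE0001"; gene_name "mt-Rnr1";', sep = "\t"),
  paste("MT", "toy", "gene", "200", "300", ".", "+", ".",
        'gene_id "GENE0002";', sep = "\t"),
  paste("MT", "toy", "exon", "200", "300", ".", "+", ".",
        'gene_id "GENE0002";', sep = "\t"),
  paste("1", "toy", "gene", "1", "500", ".", "-", ".",
        'gene_id "GENE0003"; gene_name "Actb";', sep = "\t")
)
gtf_path <- tempfile(fileext = ".gtf")
writeLines(gtf_lines, gtf_path)

# Read the nine standard GTF columns; quote = "" keeps the quoted attributes intact.
gtf <- read.delim(gtf_path, header = FALSE, comment.char = "#", quote = "",
                  colClasses = "character",
                  col.names = c("seqname", "source", "feature", "start", "end",
                                "score", "strand", "frame", "attribute"))

# Assumption: the mitochondrial sequence is named one of these in your GTF.
mito_chr <- c("MT", "chrM", "M")
mito <- gtf[gtf$feature == "gene" & gtf$seqname %in% mito_chr, ]

# Hypothetical helper: pull one quoted attribute value, or NA if the key is absent.
get_attr <- function(attr, key) {
  has_key <- grepl(paste0(key, ' "'), attr, fixed = TRUE)
  value <- sub(paste0('.*', key, ' "([^"]+)".*'), "\\1", attr)
  ifelse(has_key, value, NA_character_)
}

gene_name <- get_attr(mito$attribute, "gene_name")
gene_id <- get_attr(mito$attribute, "gene_id")

# Use the gene name where present, otherwise fall back to the gene ID.
mito_genes <- ifelse(is.na(gene_name), gene_id, gene_name)
print(mito_genes)
```

With Seurat, a vector like this could then be intersected with `rownames(all.merged)` and passed to `PercentageFeatureSet()` through its `features` argument instead of `pattern`. The intersection matters because feature names can be altered on import, for example when duplicates are made unique.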