I've integrated some single-nucleus RNA-seq data after the standard Seurat preprocessing workflow. I normalized with SCTransform, regressed out the percentage of mitochondrial reads (percent.mt), and removed cells with >5% mitochondrial reads (not in this order).
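# Build the Seurat object from the raw count matrix
obj <- CreateSeuratObject(counts = obj.data)
# Percentage of counts per nucleus coming from genes prefixed "mt-"
obj <- PercentageFeatureSet(object = obj, pattern = "^mt-", col.name = "percent.mt")
# SCTransform normalization, regressing out percent.mt
obj <- SCTransform(object = obj, vars.to.regress = "percent.mt", verbose = FALSE)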
Then I integrated with FindIntegrationAnchors, etc.
When I identify markers between my treated and untreated data (using PrepSCTFindMarkers and then FindAllMarkers with the DESeq2 test), mitochondrial genes come up as markers (avg_log2FC of ~0.4). This is unexpected, considering that this is nuclear sequencing data and that I've regressed out percent.mt.
Am I missing something? Did I do something incorrectly, or is there something wrong with the data, or is this totally normal?
Thanks!
Did you also filter your "cells" based on mtRNA levels (i.e. remove cells with >X% mitochondrial reads)? I don't know how single-nucleus RNA-seq works, but it could be that empty wells still contain extracellular mitochondria.
I did. I removed any cells with more than 5% mitochondrial reads, but certain mitochondrial genes (e.g. mt-Nd3) are still coming up as differentially expressed.
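One thing worth keeping in mind here: a per-cell cutoff only caps each cell's mitochondrial fraction; it does not make the two groups equal below that cap. Here is a toy sketch in base R with made-up numbers (not the data from this thread; `pct_mt`, `group`, and `keep` are hypothetical names used only for illustration), showing that a between-group difference can survive a 5% filter:

```r
# Toy example with made-up numbers: per-cell mitochondrial read percentages
# for two hypothetical groups of five cells each.
pct_mt <- c(0.5, 0.8, 1.0, 0.7, 6.0,   # untreated
            2.0, 2.5, 3.0, 2.2, 7.5)   # treated
group <- rep(c("untreated", "treated"), each = 5)

# Same kind of filter as described above: drop cells with >5% mitochondrial reads.
keep <- pct_mt <= 5

# Mean mitochondrial percentage per group among the retained cells only.
means <- tapply(pct_mt[keep], group[keep], mean)
print(means)

# Log2 ratio of treated vs untreated means among retained cells.
# It is not zero, even though every retained cell passed the filter.
print(log2(means[["treated"]] / means[["untreated"]]))
```

This says nothing about whether the difference in the real data is biological or technical; it only shows that passing the filter and having no group difference are separate questions.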