Comparative analyses like this are not my field, but a naive approach that uses existing resources and simple filtering is to query Ensembl with biomaRt. Ensembl provides homology tables between mouse and human. So if you retrieve all human genes and the human-to-mouse homology table, the human-only genes would be those human genes not found in the homology table, right? Again, this is a naive solution from outside my field and I cannot guarantee the integrity of the results, but it might be a start.
library(biomaRt)

# Connect to the human and mouse gene datasets, pinned to Ensembl release 100
human_mart <- biomaRt::useEnsembl("genes", dataset="hsapiens_gene_ensembl", version=100)
mouse_mart <- biomaRt::useEnsembl("genes", dataset="mmusculus_gene_ensembl", version=100)

# Retrieve all human genes with their symbol and biotype
all_human <- biomaRt::getBM(attributes=c("ensembl_gene_id", "hgnc_symbol", "gene_biotype"),
                            mart=human_mart)

# Retrieve the human-to-mouse homology table by linking the two datasets
human2mouse_homologs <-
  biomaRt::getLDS(attributes=c("ensembl_gene_id", "hgnc_symbol"),
                  attributesL=c("ensembl_gene_id", "mgi_symbol"),
                  mart=human_mart,
                  martL=mouse_mart,
                  uniqueRows=TRUE)
colnames(human2mouse_homologs) <- c("human_id", "human_name", "mouse_id", "mouse_name")

# Keep the human genes whose Ensembl ID never appears in the homology table
only_human <-
  all_human[!all_human$ensembl_gene_id %in% human2mouse_homologs$human_id,]

# Inspect the result and count the human-only genes per biotype
head(only_human)
table(only_human$gene_biotype)
Created on 2022-11-29 with reprex v2.0.2
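To make the filtering step easier to check without querying Ensembl, here is a minimal sketch of the same logic on toy data. The data frames `toy_human` and `toy_homologs` and all the IDs in them are made up for illustration; only the `%in%` filter mirrors the code above. The last step, restricting to protein-coding genes, is my own assumption about what might be useful, since the biotype table will likely contain many non-coding categories.

```r
# Toy stand-ins for the biomaRt results; every ID here is invented.
toy_human <- data.frame(
  ensembl_gene_id = c("ENSG_A", "ENSG_B", "ENSG_C", "ENSG_D"),
  gene_biotype    = c("protein_coding", "lncRNA", "protein_coding", "protein_coding"),
  stringsAsFactors = FALSE
)

# A human gene with several mouse homologs appears on several rows.
toy_homologs <- data.frame(
  human_id = c("ENSG_A", "ENSG_A", "ENSG_D"),
  mouse_id = c("ENSMUSG_1", "ENSMUSG_2", "ENSMUSG_3"),
  stringsAsFactors = FALSE
)

# Keep the human genes whose ID never appears in the homology table.
toy_only_human <- toy_human[!toy_human$ensembl_gene_id %in% toy_homologs$human_id, ]

# Optional extra step (an assumption, not part of the original approach):
# restrict the human-only genes to protein-coding ones.
toy_only_human_coding <- toy_only_human[toy_only_human$gene_biotype == "protein_coding", ]

print(toy_only_human$ensembl_gene_id)
print(toy_only_human_coding$ensembl_gene_id)
```

Duplicate human IDs in the homology table do not matter here, because `%in%` only asks whether an ID occurs at all.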