Hello,
I took a 10x matrix from a collaborator and created a Seurat object. The issue, which I only realised when attempting to visualise my favourite genes, is that the original matrix has gene names in the format "gene name-Ensembl ID".
Example:
Dcn-ENSMUSG00000019929.16, Inmt-ENSMUSG00000003477.5, Mfap4-ENSMUSG00000042436.12
So I cannot search for my genes of interest unless I know the corresponding ENSMUSG ID that follows each one. Is there a way to remove, at this stage, everything after (and including) the "-" so that I am left with only the gene name?
So convert "Dcn-ENSMUSG00000019929.16" to simply "Dcn".
Thank you :)
Simply grep for it, anchoring the pattern so that you don't get false partial matches:
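A minimal sketch of such an anchored match in base R, assuming the feature names look like the examples in the question. The vector `genes` and the entry "Dcn2-ENSMUSG00000000001.1" are made up for illustration, only to show a partial match being avoided:

```r
# Illustrative feature names in the "symbol-ENSEMBL" format from the question.
# "Dcn2-ENSMUSG00000000001.1" is a made-up entry, not a real annotation.
genes <- c("Dcn-ENSMUSG00000019929.16",
           "Dcn2-ENSMUSG00000000001.1",
           "Inmt-ENSMUSG00000003477.5",
           "Mfap4-ENSMUSG00000042436.12")

# "^" anchors the match at the start of the name, and including "-ENSMUSG"
# stops "Dcn" from also matching "Dcn2".
hit <- grep("^Dcn-ENSMUSG", genes)
print(hit)
```

By default `grep()` returns the position of the match rather than the name itself.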
I've tried this, and it only returns the row number where my gene is :(
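That is the default behaviour of `grep()`: it returns indices unless `value = TRUE` is set. To actually strip the suffix, `sub()` may be a better fit. The following is only a sketch in base R, assuming every name ends in "-ENSMUSG", digits, and an optional version number; `genes` is an illustrative vector, not the real matrix:

```r
# Illustrative feature names taken from the question
genes <- c("Dcn-ENSMUSG00000019929.16",
           "Inmt-ENSMUSG00000003477.5",
           "Mfap4-ENSMUSG00000042436.12")

# value = TRUE returns the matching name instead of its row number
full_name <- grep("^Dcn-ENSMUSG", genes, value = TRUE)

# Remove the trailing "-ENSMUSG<digits>[.<version>]" only. Anchoring on the
# Ensembl part (rather than the first "-") keeps symbols that contain a
# hyphen themselves intact.
symbols <- sub("-ENSMUSG[0-9]+(\\.[0-9]+)?$", "", genes)

# Several Ensembl IDs can share one symbol; make.unique() appends ".1", ".2",
# and so on, so the result can still be used as row names.
symbols <- make.unique(symbols)
print(symbols)
```

If you still have the raw count matrix, assigning the cleaned names with `rownames()` before creating the Seurat object is probably the least error-prone route, since the names then stay consistent across the whole object.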
Hi, I am running into a similar problem in Seurat. My genes have the format ENSEMBL_gene_biotype (e.g. ENSG00000000003_TSPAN6_ProteinCoding), so I hit the same problem whenever I need a reference (cell cycle scoring, label transfer, ...). I would like to remove everything before and after the gene name (including the two "_"). Ideally, I would also like to keep the original format stored somewhere so that I can map back to it later (to look up the biotype).
Thanks so much for your help!
Best, Julia
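For the three-part format, one possible approach is to split the names with a regular expression and keep the pieces in a lookup table. This is a sketch in base R, assuming the Ensembl ID and the biotype never contain "_"; `features` and `feature_map` are hypothetical names, and the second entry is only there for illustration:

```r
# Illustrative feature names in the ENSEMBL_symbol_biotype format
features <- c("ENSG00000000003_TSPAN6_ProteinCoding",
              "ENSG00000000005_TNMD_ProteinCoding")

# Three capture groups: Ensembl ID, gene symbol, biotype.
# The middle group is greedy, so a symbol containing "_" would stay whole.
pattern <- "^([^_]+)_(.+)_([^_]+)$"

# Lookup table that preserves the original names next to their parts
feature_map <- data.frame(
  original = features,
  ensembl  = sub(pattern, "\\1", features),
  symbol   = sub(pattern, "\\2", features),
  biotype  = sub(pattern, "\\3", features),
  stringsAsFactors = FALSE
)

# Unique symbols, suitable as new row names for the count matrix
feature_map$new_name <- make.unique(feature_map$symbol)

# Later: recover the biotype or the original name from the new name
tspan6_biotype  <- feature_map$biotype[match("TSPAN6", feature_map$new_name)]
tspan6_original <- feature_map$original[match("TSPAN6", feature_map$new_name)]
print(feature_map)
```

Keeping `feature_map` alongside the object (for example saved as a CSV) means the renaming can be reversed at any point with `match()`, without depending on how the names are stored inside Seurat.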