Sorry for the basic question; it stems from a bigger issue I'm having with matching my RNAseq DE genes with the full list of annotated genes.
In this setting, I have pulled out the top 10 DE genes, which I have stored as an object "top10". This is a list of Ensembl identifiers, e.g. ENSMUSGxxxx. I have then created an annotation data frame "ann" covering the 47,729 genes in the mouse genome, which contains a "gene_id" column (e.g. ENSMUSGxxxx) and a gene name column "gene_name" (e.g. Gnai3).
What I'm trying to do is to match my "top10" with the "gene_id" column in my object ann but also show the corresponding gene name. I'm sorry if this is simple, but I can't figure out the code.
I've tried many things, the last of which was trying to create a new object "keepgene":
keepgene <- (ann$gene_id == top10)
But I get the message "longer object length is not a multiple of shorter object length".
I'd appreciate any help as I seem to have reached a sticking point!
Thanks very much,
Morag
Try using
ann$gene_id %in% top10
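To show why == complains while %in% does not, here is a minimal sketch on toy data. The identifiers below are invented for illustration and are not real Ensembl IDs; ann_ids and top stand in for ann$gene_id and top10.

```r
# Toy stand-ins (invented for illustration, not real Ensembl identifiers)
ann_ids <- c("id1", "id2", "id3", "id4", "id5")  # plays the role of ann$gene_id
top     <- c("id4", "id2")                       # plays the role of top10

# == compares element by element and recycles the shorter vector,
# so it really tests id1 vs id4, id2 vs id2, id3 vs id4, id4 vs id2, id5 vs id4.
# Because 5 is not a multiple of 2, R also emits the recycling warning.
eq <- suppressWarnings(ann_ids == top)
print(eq)

# %in% asks, for every element on the left, whether it occurs anywhere on the right.
in_top <- ann_ids %in% top
print(in_top)
```

With ==, "id4" is missed because it happens to be lined up against "id2"; with %in%, position does not matter.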
Thank you! I tried this and it returned many FALSEs (47,719) and TRUE for the 10 genes of interest. There may be a way to take the TRUEs identified by that piece of code and show their matching gene_name. I did it a different way: I used the merge function to merge the top10 object with a modified annotation object (ann), matching them by their row names. This seemed to do the trick!
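For reference, a minimal sketch of both routes on toy data. The data frame contents are invented, and the merge here joins on the gene_id column rather than on row names, which is an assumption and may differ from what was actually run.

```r
# Toy annotation table (invented values, not real gene IDs or names)
ann <- data.frame(
  gene_id   = c("id1", "id2", "id3", "id4", "id5"),
  gene_name = c("nameA", "nameB", "nameC", "nameD", "nameE"),
  stringsAsFactors = FALSE
)
top10 <- c("id4", "id2")  # only two entries here, to keep the example small

# Route 1: use the TRUE/FALSE vector as a row index into ann.
# Rows come back in the order they appear in ann.
hits <- ann[ann$gene_id %in% top10, c("gene_id", "gene_name")]
print(hits)

# Route 2: merge a one-column data frame of IDs with ann on gene_id.
# By default merge() sorts the result by the key column.
top_df <- data.frame(gene_id = top10, stringsAsFactors = FALSE)
merged <- merge(top_df, ann, by = "gene_id")
print(merged)
```

Both return the same two rows here; the logical index avoids building an intermediate data frame, while merge is handy when top10 already carries other columns (e.g. fold changes) that should stay attached.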
Find yourself a basic R tutorial. I can't think of a good one off the top of my head, but
?which
will tell you how to get the positions of the TRUE values.
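A minimal sketch of that idea on toy data (all values invented for illustration). It also shows match(), which is not mentioned above but is a closely related base R function that keeps the order of top10.

```r
# Toy annotation table (invented values, not real gene IDs or names)
ann <- data.frame(
  gene_id   = c("id1", "id2", "id3", "id4", "id5"),
  gene_name = c("nameA", "nameB", "nameC", "nameD", "nameE"),
  stringsAsFactors = FALSE
)
top10 <- c("id4", "id2")

# which() turns the logical vector into the row numbers that are TRUE
idx <- which(ann$gene_id %in% top10)
print(ann$gene_name[idx])   # names in annotation order

# match() returns, for each ID in top10, its row number in ann,
# so the result follows the order of top10 (e.g. ranked by significance)
pos <- match(top10, ann$gene_id)
ordered <- ann[pos, c("gene_id", "gene_name")]
print(ordered)
```

If an ID in top10 is missing from ann, match() gives NA for it, which makes dropped genes easy to spot.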