Hello everyone,
I am new to the GO database and R.
I have a data table with a protein list (gene symbols), which I got by filtering a bigger table by fold-change and p-value. See below for the head of my table.
> head(table)
         X    log2FC       FC      pvalue      padj
45   DDX21 0.8358637 1.784925 0.021688905 0.2737480
82  PDCD11 0.7647240 1.699045 0.037086918 0.2947572
104 RSL1D1 0.7923346 1.731875 0.034387111 0.2938215
202   TBL3 0.7217412 1.649171 0.004074165 0.1638396
228   NOP2 0.8724764 1.830803 0.005316531 0.1786989
I now need to filter this table by the organelle location of these proteins and by whether or not they are transmembrane proteins.
How can I get GO terms for location and transmembrane status so that I can filter my proteins? I was planning on using the packages org.Hs.eg.db and GO.db to get GO terms and add them as columns to the table, to ultimately filter my proteins, but what I tried is not working.
Could you please help me find suitable code for this purpose?
Thank you!
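One possible starting point, as a minimal sketch rather than a definitive answer: look up the GO terms for each gene symbol with `AnnotationDbi::select()` on `org.Hs.eg.db`, keep only the cellular component (CC) ontology, and attach the term names from `GO.db`. This assumes the Bioconductor packages `org.Hs.eg.db` and `GO.db` are installed; the object names `tbl`, `go_map`, `go_cc`, `terms` and `cc_by_symbol` are made up for illustration, and `tbl` only reproduces the symbols and log2FC values shown in the question.

```r
# Assumption: org.Hs.eg.db and GO.db are installed from Bioconductor.
library(org.Hs.eg.db)
library(GO.db)

# First rows of the table from the question (symbols in column X).
tbl <- data.frame(
  X      = c("DDX21", "PDCD11", "RSL1D1", "TBL3", "NOP2"),
  log2FC = c(0.8358637, 0.7647240, 0.7923346, 0.7217412, 0.8724764),
  stringsAsFactors = FALSE
)

# One row per symbol/GO-term pair; a symbol usually maps to many terms.
# AnnotationDbi::select is spelled out because dplyr also exports select().
go_map <- AnnotationDbi::select(org.Hs.eg.db,
                                keys    = tbl$X,
                                keytype = "SYMBOL",
                                columns = c("GO", "ONTOLOGY"))

# Keep only cellular component annotations (drop unmapped symbols, which are NA).
go_cc <- go_map[!is.na(go_map$ONTOLOGY) & go_map$ONTOLOGY == "CC", ]

# Translate GO IDs into human-readable term names.
terms <- AnnotationDbi::select(GO.db,
                               keys    = unique(go_cc$GO),
                               keytype = "GOID",
                               columns = "TERM")
go_cc <- merge(go_cc, terms, by.x = "GO", by.y = "GOID")

# Collapse to one row per symbol and add the terms as a column of the table.
cc_by_symbol <- aggregate(TERM ~ SYMBOL, data = go_cc,
                          FUN = function(x) paste(unique(x), collapse = "; "))
tbl <- merge(tbl, cc_by_symbol, by.x = "X", by.y = "SYMBOL", all.x = TRUE)
```

The long table `go_cc` (one row per symbol and term) is usually easier to filter on than the collapsed `TERM` column, which is mainly convenient for reading.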
Thank you so much for your help, I tried reproducing your code with my whole list of proteins and got this warning:
Does that mean there are any incorrect gene symbols in my list?
Do you know how I can filter my table to remove proteins that are located in a specific organelle or cell compartment? For example, removing all proteins that are located in the cytosol.
Check these posts: https://support.bioconductor.org/p/132388/ and "98.21% of input gene IDs are fail to map".
Once you have identified which proteins are associated with, say, the cytosol, subset your data frame to remove those proteins. The steps would be something like this:
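A minimal sketch of the first step, assuming the GO results sit in a data frame with a "Description" column (term name) and a "geneID" column (genes joined by "/"). The data frame `go_res`, the vector `unwanted`, and the placeholder gene names are hypothetical, and the gene-to-compartment assignments are invented purely for illustration.

```r
# Hypothetical GO result table: one row per cellular component term.
# GENE_A ... GENE_E are placeholders, not real annotations.
go_res <- data.frame(
  ID          = c("GO:0005829", "GO:0005730", "GO:0005654"),
  Description = c("cytosol", "nucleolus", "nucleoplasm"),
  geneID      = c("GENE_A/GENE_B", "GENE_B/GENE_C/GENE_D", "GENE_D/GENE_E"),
  stringsAsFactors = FALSE
)

# Compartments whose proteins should be removed; edit this vector as needed.
unwanted <- c("cytosol")

# Rows of the GO table that describe the unwanted compartments.
to_remove <- go_res[go_res$Description %in% unwanted, ]
```

Matching with `%in%` requires the term names to be spelled exactly as they appear in "Description".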
Thank you. The post you recommended was useful!
I tried what you suggested to filter my data.
With this code I selected the cell compartments I wanted to remove from my table, as you suggested (I just changed the items in "Description").
Now, how can I remove the genes listed in the "geneID" column from my original data frame?
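One way this could be done, as a sketch under a few assumptions: the "geneID" column stores the genes of each term as a single string separated by "/", and it uses the same identifiers as the symbol column (X) of the original table. The objects `tbl`, `to_remove`, `genes_to_drop` and `tbl_filtered`, the placeholder gene names, and the numbers are all hypothetical.

```r
# Hypothetical original table; X holds the gene symbols.
tbl <- data.frame(
  X      = c("GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"),
  log2FC = c(0.8, 0.7, 0.9, 0.6, 0.5),   # invented values
  stringsAsFactors = FALSE
)

# Hypothetical rows selected by "Description" (the compartments to remove).
to_remove <- data.frame(
  Description = "cytosol",
  geneID      = "GENE_A/GENE_B",
  stringsAsFactors = FALSE
)

# Split every geneID string on "/" and pool the genes into one unique vector.
genes_to_drop <- unique(unlist(strsplit(to_remove$geneID, "/", fixed = TRUE)))

# Keep only the rows whose symbol is NOT among the genes to drop.
tbl_filtered <- tbl[!tbl$X %in% genes_to_drop, ]
```

If "geneID" holds a different identifier type than the table (for example Entrez IDs instead of symbols), the IDs would need to be converted before the `%in%` comparison, otherwise nothing will match.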