2.2 years ago
chemokine-1
I have a list of genes that I ran through DAVID. I got many GO terms, and I want to see which genes take part in the most GO terms. The results are in a CSV file. I am also looking for a way to find the GO terms in which specific genes appear. How can I do this? Thanks in advance!
Can you show a couple of lines from your CSV file?
Of course, it looks like this:
I played around with the table you provided. The main thing you are looking for, i.e. which genes take part in the most GO terms, can be done by first filtering your results for significant hits, then combining all the genes from the significant GO terms and counting how many times each gene occurs in this combined list.
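The filter-then-count steps above could look roughly like the sketch below, written in plain Python. The table from the question is not reproduced here, so the column names (Term, PValue, Genes), the comma-separated gene lists, the 0.05 cutoff and the toy rows are all assumptions; adjust them to match the real DAVID export.

```python
import csv
import io
from collections import Counter

# Made-up toy rows standing in for the DAVID export (assumed column names;
# gene names are placeholders and the gene/term pairings are not real results).
SAMPLE_CSV = """Term,PValue,Genes
GO:0006955~immune response,0.001,"GENE_A, GENE_B, GENE_C"
GO:0006954~inflammatory response,0.01,"GENE_A, GENE_C"
GO:0007165~signal transduction,0.2,"GENE_A, GENE_D"
"""

def count_genes_in_significant_terms(handle, p_column="PValue", cutoff=0.05):
    """Count, for each gene, how many significant GO terms list it."""
    counts = Counter()
    for row in csv.DictReader(handle):
        # Skip GO terms that do not pass the (assumed) significance cutoff.
        if float(row[p_column]) >= cutoff:
            continue
        # Split the gene string; a set avoids counting a gene twice per term.
        genes = {g.strip() for g in row["Genes"].split(",") if g.strip()}
        counts.update(genes)
    return counts

gene_counts = count_genes_in_significant_terms(io.StringIO(SAMPLE_CSV))

# Genes sorted from most to fewest significant GO terms.
for gene, n_terms in gene_counts.most_common():
    print(gene, n_terms)
```

To run it on a real file, replace io.StringIO(SAMPLE_CSV) with open("your_results.csv", newline="") and pass whichever p-value column you want to filter on.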
If you have genes of interest and want to know which GO terms they are associated with, there are several web-based platforms for that, such as http://geneontology.org/. If you specifically want to search your table of GO results, you'll benefit from reformatting the data so that each gene in each GO term gets its own line in the table, as opposed to a single line per GO term with the genes listed as one string.
R tidyverse packages such as dplyr and tidyr will be useful for the data wrangling suggested above: https://www.tidyverse.org/
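For illustration, the one-line-per-gene-per-GO-term reshaping can also be sketched in plain Python rather than the tidyverse. As before, the column names (Term, Genes), the comma separator and the toy rows are assumptions, since the original table is not shown here.

```python
import csv
import io
from collections import defaultdict

# Made-up toy rows standing in for the DAVID export (assumed column names;
# gene names are placeholders and the gene/term pairings are not real results).
SAMPLE_CSV = """Term,PValue,Genes
GO:0006955~immune response,0.001,"GENE_A, GENE_B, GENE_C"
GO:0006954~inflammatory response,0.01,"GENE_A, GENE_C"
GO:0007165~signal transduction,0.2,"GENE_A, GENE_D"
"""

def to_long_format(handle):
    """Yield one (gene, term) pair for every gene listed in every GO term row."""
    for row in csv.DictReader(handle):
        for gene in row["Genes"].split(","):
            gene = gene.strip()
            if gene:
                yield gene, row["Term"]

long_rows = list(to_long_format(io.StringIO(SAMPLE_CSV)))

# Index the long table by gene so the GO terms of any gene can be looked up.
terms_by_gene = defaultdict(list)
for gene, term in long_rows:
    terms_by_gene[gene].append(term)

print(terms_by_gene["GENE_C"])
```

Once the data is in this long form, looking up a gene of interest is a plain dictionary access, and the same (gene, term) pairs can be written back out with csv.writer if a table is needed.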