I have a gene expression matrix, and I would like to extract the submatrix of genes annotated with a given GO term. I use the following R code to get the list of genes annotated with this term:
mart <- biomaRt::useMart(biomart = "plants_mart", dataset = "athaliana_eg_gene", host = 'plants.ensembl.org')
GTOGO <- biomaRt::getBM(attributes = c( "ensembl_gene_id", "go_id"), mart = mart)
head(GTOGO)
geneList <- biomaRt::getBM(attributes = c( "ensembl_gene_id", "go_id"), filters = "go", values = "GO:......", mart = mart)
How can I extract from my gene expression matrix (genes x conditions) the submatrix of these annotated genes? Can I use the R function merge for this? Thank you.
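One possible approach, as a minimal sketch: if the row names of the expression matrix use the same kind of identifier as `ensembl_gene_id`, the submatrix can be taken by indexing on row names, without merge. The matrix `expr` and its values below are made-up toy data, and `geneList` is a hand-built stand-in for the data frame returned by getBM() above (the GO ID is left elided as in the question).

```r
# Toy expression matrix (hypothetical values): rows = genes, columns = conditions
expr <- matrix(
  c(5.1, 6.2, 7.3, 8.4,
    2.0, 2.1, 1.9, 2.2,
    9.5, 9.1, 8.8, 9.9),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("AT1G01010", "AT1G01020", "AT1G01030"),
                  c("cond1", "cond2", "cond3", "cond4"))
)

# Stand-in for the getBM() result (hypothetical content, assumed column names
# match the attributes requested in the question)
geneList <- data.frame(
  ensembl_gene_id = c("AT1G01010", "AT1G01030", "AT5G99999"),
  go_id = "GO:......",
  stringsAsFactors = FALSE
)

# Keep only the rows whose names appear in the annotated gene list;
# drop = FALSE keeps the result a matrix even if only one gene matches
keep <- rownames(expr) %in% unique(geneList$ensembl_gene_id)
subExpr <- expr[keep, , drop = FALSE]
```

merge() should also work if the matrix is first converted to a data frame with the row names in a column, but it returns a data frame and sorts the rows by the key by default, so plain indexing is probably simpler here.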
Unless I misunderstood the OP, I think you should do it the other way round: first subset the genes of interest from the expression matrix, then do the GO analysis, and then build a final matrix (with expression values and GO terms).
Hi, thank you. But how do I subset the genes of interest? My idea is to extract the gene list according to the GO term.
Please post a few lines (10) from your expression matrix. In general, the statistically significant genes (those with the lowest adjusted p-values) are taken for analysis, and many people also apply a fold-change cutoff for differential expression. To select the top represented GO terms, you may need to filter your list on some criterion that reduces noise.
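The filtering described above could be sketched as follows. Everything in this example is an assumption for illustration: the results table `res` is toy data with hypothetical probe IDs, the column names `logFC` and `adj.P.Val` follow the limma topTable() convention, and the cutoffs (adjusted p-value below 0.05, absolute log fold change of at least 1) are common choices rather than fixed rules.

```r
# Toy differential-expression results (hypothetical values and probe IDs)
res <- data.frame(
  logFC     = c(2.3, -0.4, -1.8, 0.9),
  adj.P.Val = c(0.001, 0.20, 0.03, 0.04),
  row.names = c("probeA", "probeB", "probeC", "probeD")
)

# Assumed cutoffs; adjust to the experiment
padjCutoff <- 0.05
lfcCutoff  <- 1

# Keep rows that pass both the significance and the fold-change filter
sig <- res[res$adj.P.Val < padjCutoff & abs(res$logFC) >= lfcCutoff, ]

# IDs to use for subsetting the expression matrix or for GO analysis
sigIds <- rownames(sig)
```

The resulting `sigIds` vector can then be used to index the rows of the expression matrix.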
OK, my matrix is large, so I am posting only 10 lines (genes) and 4 conditions here:
I have added a solution to my original answer (below).
They are not genes, they are Affymetrix probes. I guess you have already extracted the corresponding genes using either Affymetrix tools or biomaRt (or any other tool). For the probe definition, look here: http://www.affymetrix.com/support/help/IVT_glossary/index.affx. If you have the probes in a list, you can simply use the bash solution, or you can follow the R solution proposed in the post below by Kevin.
significant probes:
Expression matrix:
command and output: