I got a list of Ensembl gene IDs from a differential expression analysis with DESeq2. I want to perform GO enrichment analysis, but almost half of the IDs aren't recognized by DAVID. Some people said I could use BioMart in Ensembl to get the corresponding GO terms for each gene, but what should I do next?
Give GeneSCF a try.
Correction: I originally wrote that it supports Ensembl IDs. Sorry, GeneSCF does not support Ensembl IDs directly. But you can convert them into gene symbols or Entrez IDs and use those in GeneSCF.
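As a rough illustration of the conversion step, here is a minimal Python sketch that maps Ensembl gene IDs to gene symbols using a tab-separated table, such as one exported from BioMart. The column names `Gene stable ID` and `Gene name`, the helper names `load_id_map` and `convert_ids`, and the placeholder IDs are all assumptions made for the example; check the header of your own export before using it.

```python
import csv
import io


def load_id_map(tsv_text, id_col="Gene stable ID", symbol_col="Gene name"):
    """Build {Ensembl gene ID: gene symbol} from a tab-separated table.

    The default column names are assumptions; adjust them to your export.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    id_map = {}
    for row in reader:
        symbol = (row.get(symbol_col) or "").strip()
        if symbol:  # skip rows where the gene has no symbol
            id_map[row[id_col].strip()] = symbol
    return id_map


def convert_ids(ensembl_ids, id_map):
    """Return (symbols for mapped IDs, IDs that could not be mapped)."""
    converted, unmapped = [], []
    for gene_id in ensembl_ids:
        if gene_id in id_map:
            converted.append(id_map[gene_id])
        else:
            unmapped.append(gene_id)
    return converted, unmapped


# Tiny inline table standing in for a real export file.
# Only the KCNQ1OT1 row comes from this thread; the other ID is a placeholder.
example_tsv = (
    "Gene stable ID\tGene name\n"
    "ENSG00000269821\tKCNQ1OT1\n"
    "ENSG_PLACEHOLDER_A\t\n"
)

id_map = load_id_map(example_tsv)
symbols, unmapped = convert_ids(
    ["ENSG00000269821", "ENSG_PLACEHOLDER_A", "ENSG_PLACEHOLDER_B"], id_map
)
print(symbols)   # symbols that can be passed on to a symbol-based tool
print(unmapped)  # IDs to inspect by hand
```

Keeping the unmapped IDs in a separate list makes it easy to see how many genes are lost in the conversion before running the enrichment.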
It's a pity that it doesn't work with EnsEMBL. In my work I find EnsEMBL a much better resource than NCBI.
I ran into a problem when I tried to implement Ensembl support in GeneSCF: for some gene symbols, the Ensembl gene ID (ENSG) varies depending on the Ensembl version.
For example, KCNQ1OT1 has a different ENSG ID in an old Ensembl release (ENSG00000258492.1, GRCh37.66, GENCODE v11) than in a newer one (ENSG00000269821.1, GRCh37.74-75, GENCODE v19). The only things that stayed constant for this gene were its gene symbol and its Entrez ID.
If I at least have something constant (fixed), like gene symbols (I can easily deal with multiple aliases) or Entrez IDs, I can use it confidently. Otherwise, the results might be misleading.
Don't use the .x version number of EnsEMBL IDs; they should be more stable this way. Gene symbols are also not stable (although I must say they change less often than they did a few years ago). Also, the whole problem is to define what a gene is and to work with this definition in a consistent way. It seems that for you a gene is defined by whatever shares the same symbol. This is reasonable, as it is more or less the definition used by biologists, but as you've already experienced, it can create computational problems. It is also not always the best definition to use, especially when the underlying genome matters. The problem with Entrez is that it is unclear what a gene is. From this paper:
And from the RefSeq book section on curation:
This looks very circular and ad hoc to me.
Notice the quotation marks around the word gene, which I take to indicate there's no formal definition of the term.
Anyway, the conclusion is that there are different definitions of what a gene is and that one should pick a reference and stick to it for the duration of a project or risk inconsistent results.
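For the suggestion earlier in the thread to drop the `.x` version suffix, a minimal Python sketch could look like the following. The helper name `strip_version` is made up for the example, and the two IDs are the KCNQ1OT1 ones quoted above. Note that in this particular case the IDs differ in more than the suffix, so stripping the version alone would not reconcile them, which is one more reason to fix a single reference release for the whole project.

```python
def strip_version(ensembl_id):
    """Return the ID without a trailing '.N' version suffix, if one is present."""
    base, sep, version = ensembl_id.rpartition(".")
    if sep and version.isdigit():
        return base
    return ensembl_id  # no numeric suffix found, leave the ID unchanged


# The two KCNQ1OT1 IDs quoted in this thread.
old_id = "ENSG00000258492.1"  # GRCh37.66, GENCODE v11
new_id = "ENSG00000269821.1"  # GRCh37.74-75, GENCODE v19

print(strip_version(old_id))
print(strip_version(new_id))
# The unversioned IDs are still different for this gene.
print(strip_version(old_id) == strip_version(new_id))
```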