Does it really matter?
I took a look at the annotation files, and they looked very different...
It's not surprising that the annotation files are different. They come from independent projects that annotate the same assembly (e.g. GRCh38 in human, GRCm38 in mouse) but rely on different methodologies for calling genes and transcripts. In Ensembl, an annotation needs to be supported by biological evidence (mRNA, EST, protein, RNA-seq reads). Ab initio predictions are not listed in the Ensembl annotation file, whereas the RefSeq set may contain some predicted transcripts (those based on XM or XP entries). The Ensembl annotation is the Gencode annotation: a merge of automatically annotated genes and genes manually annotated by HAVANA. HAVANA collaborates with labs and other groups to experimentally validate some of its transcripts (see Howald et al) and to include pseudogene annotation from Yale (see these slides).
Choose whichever is the most up to date for the organism that you're studying. For the popular model organisms it won't matter, but for more rarely studied organisms it will depend on which database is the most up to date for that organism.
Yes, it matters, but only for the reduced set of genes that are annotated differently between the two resources.
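If you want to see which genes those are for your own files, here is a minimal sketch in Python. It assumes both annotations are in GTF format with a `transcript_id` attribute, and that genes can be matched by symbol (for example `gene_name` in the Ensembl file and `gene_id` in the RefSeq file). Those attribute choices and the function names are my assumptions, so check them against your files. Matching by symbol is crude and will miss genes that are named differently in the two sets.

```python
import re
from collections import defaultdict

# Matches GTF attribute pairs such as: gene_id "ENSG00000141510";
ATTR_RE = re.compile(r'(\S+) "([^"]*)"')


def transcripts_per_gene(gtf_lines, gene_key="gene_name"):
    """Map each gene label to the set of transcript IDs annotated for it.

    gene_key is the attribute used as the gene label (an assumption:
    pick whichever attribute holds the gene symbol in your file).
    """
    genes = defaultdict(set)
    for line in gtf_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and header comments
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GTF record
        attrs = dict(ATTR_RE.findall(fields[8]))
        gene = attrs.get(gene_key)
        transcript = attrs.get("transcript_id")
        if gene and transcript:
            genes[gene].add(transcript)
    return genes


def compare(a, b):
    """Return genes only in a, genes only in b, and shared genes whose
    transcript counts differ between the two annotations."""
    only_a = sorted(set(a) - set(b))
    only_b = sorted(set(b) - set(a))
    differ = sorted(g for g in set(a) & set(b) if len(a[g]) != len(b[g]))
    return only_a, only_b, differ
```

To use it, pass open file handles, e.g. `transcripts_per_gene(open("ensembl.gtf"), "gene_name")` and `transcripts_per_gene(open("refseq.gtf"), "gene_id")` (both file names are placeholders), then call `compare` on the two results. A differing transcript count is only a rough signal; genes with the same count can still have different exon structures.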
A paper about this was published recently, but I cannot find it right now. Sorry.
Which organism are you working with?
If they looked very different, beyond the naming conventions for genes and chromosomes, don't you think it matters?