I could not find any information on 1) how the annotation files available in the UCSC database are built, and 2) what the differences are between the various annotation files they offer.
Does anyone know whether UCSC has its own pipeline for building an annotation? Is there documentation on how these files are built?
For example, to download chicken annotation data, I go to https://hgdownload.soe.ucsc.edu/goldenPath/galGal6/bigZips/ and open the "genes/" folder. There I have a choice of 3 annotations:
- galGal6.ensGene.gtf.gz
- galGal6.ncbiRefSeq.gtf.gz
- galGal6.refGene.gtf.gz
I am not sure which one I should use. They are very different: even the number of features (the number of lines in each file) varies widely:
- 833601 galGal6.ensGene.gtf
- 1768359 galGal6.ncbiRefSeq.gtf
- 163989 galGal6.refGene.gtf
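Raw line counts mix every feature type together, so a breakdown per feature type and per gene/transcript may make the comparison easier to read. Below is a minimal sketch, assuming the files follow the standard 9-column tab-separated GTF layout with `gene_id` and `transcript_id` attributes in the last column. The function names `summarize_gtf` and `summarize_gtf_file` are my own, not part of any UCSC tool.

```python
import gzip
import re
from collections import Counter

# Attribute patterns for the 9th GTF column (assumed standard GTF syntax).
GENE_ID = re.compile(r'gene_id "([^"]+)"')
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]+)"')


def summarize_gtf(lines):
    """Count features by type and distinct gene/transcript IDs."""
    features = Counter()
    genes, transcripts = set(), set()
    for line in lines:
        # Skip comment/header lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 9:
            continue  # not a well-formed GTF record
        features[fields[2]] += 1  # column 3 holds the feature type
        gene = GENE_ID.search(fields[8])
        if gene:
            genes.add(gene.group(1))
        transcript = TRANSCRIPT_ID.search(fields[8])
        if transcript:
            transcripts.add(transcript.group(1))
    return {
        "features": dict(features),
        "genes": len(genes),
        "transcripts": len(transcripts),
    }


def summarize_gtf_file(path):
    """Open a plain or gzip-compressed GTF file and summarize it."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return summarize_gtf(handle)
```

Calling `summarize_gtf_file("galGal6.ensGene.gtf.gz")` on each of the three downloads should show whether the differences come from the number of genes and transcripts or from which feature types each file includes.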
Any advice on that would be welcome.
Thanks a lot, very helpful!