I would like to know whether the genome build FASTA files from UCSC differ from those from Gencode/Ensembl. For example, is there any difference between GRCh38/hg38 from UCSC and from Gencode/Ensembl, and similarly between mm10/GRCm38 from UCSC and from Gencode/Ensembl?
If not, why do the genomic coordinates of genes differ? For example, why are the MECP2 coordinates chrX:154,021,813-154,097,731 in UCSC (GRCh38/hg38) but Chromosome X: 154,021,573-154,137,103 in Ensembl?
Can I use a FASTA file downloaded from NCBI/UCSC together with an annotation file downloaded from Gencode for alignment and other downstream bioinformatics analyses?
Thanks!
See these posts for the human genome:
GRCh37/38(NCBI) vs hg19/hg38(UCSC)
Is there any differences between Human Genome downloaded from UCSC website and the on from Ensembl
Resources for converting between UCSC <-> Gencode <-> Ensembl chromosome names
The problem is that they are rather old, and there have been new releases since.
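Whichever releases you end up with, a practical check before combining a FASTA from one source with an annotation from another is whether the sequence names match (for example `chrX` versus `X`). Here is a minimal sketch in Python, assuming an uncompressed FASTA and GTF; the function names are my own and not part of any library.

```python
def fasta_names(lines):
    """Collect sequence names from FASTA headers (text after '>' up to the first whitespace)."""
    return {
        line[1:].split()[0]
        for line in lines
        if line.startswith(">") and len(line.strip()) > 1
    }


def gtf_names(lines):
    """Collect the first column (sequence name) of every non-comment GTF line."""
    return {
        line.rstrip("\n").split("\t", 1)[0]
        for line in lines
        if line.strip() and not line.startswith("#")
    }


def compare_names(fasta_lines, gtf_lines):
    """Report which sequence names are shared and which appear in only one file."""
    fa = fasta_names(fasta_lines)
    gtf = gtf_names(gtf_lines)
    return {
        "shared": sorted(fa & gtf),
        "gtf_only": sorted(gtf - fa),
        "fasta_only": sorted(fa - gtf),
    }


if __name__ == "__main__":
    # Toy input; for real files pass open("genome.fa") and open("annotation.gtf")
    # (both paths are placeholders).
    fasta = [">chr1 toy\n", "ACGT\n", ">chrX\n", "ACGT\n"]
    gtf = ['chr1\tHAVANA\tgene\t1\t4\t.\t+\t.\tgene_id "g1";\n',
           'X\tensembl\tgene\t1\t4\t.\t+\t.\tgene_id "g2";\n']
    print(compare_names(fasta, gtf))
```

Any name listed under `gtf_only` means features on that sequence cannot be placed on the FASTA you are aligning to, so it needs renaming or dropping first. Matching names alone do not prove the underlying sequences are identical, so this is only a first sanity check.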
Ensembl and UCSC each do some de novo gene prediction, so the longest transcript identified for a gene may differ slightly between them. See this and this.
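To illustrate how a different transcript set changes the reported gene coordinates even on the same assembly, here is a small Python sketch. It assumes the gene coordinates are taken as the span from the leftmost transcript start to the rightmost transcript end; the transcript coordinates are made up for illustration and are not real annotation records.

```python
def gene_span(transcripts):
    """Return (min start, max end) over a list of (start, end) transcript intervals."""
    starts, ends = zip(*transcripts)
    return min(starts), max(ends)


# Hypothetical transcript models for one gene in two annotation sets.
annotation_a = [(1000, 5000), (1200, 4800)]
# Same models plus one longer predicted transcript.
annotation_b = annotation_a + [(900, 7000)]

print(gene_span(annotation_a))
print(gene_span(annotation_b))
```

The underlying sequence is the same in both cases; only the set of transcript models attached to the gene differs, and with it the reported start and end.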
Thanks a lot. That was helpful!