Hi, I have a question about the differences between the FASTA files that can be downloaded from the Ensembl FTP (ftp://ftp.ensembl.org/pub/release-75/fasta/homo_sapiens/dna/Homo_sapiens.GRCh37.75.dna.primary_assembly.fa.gz) and the NCBI FTP (ftp://ftp.ncbi.nlm.nih.gov/refseq/H_sapiens/annotation/GRCh37_latest/refseq_identifiers/GRCh37_latest_genomic.fna.gz).
As far as I could tell, both are GRCh37 versions, so I was curious: are the references identical or not? If they are, could I use the FASTA file downloaded from the Ensembl FTP together with the gene annotation file downloaded from the NCBI FTP?
I know UCSC differs in chromosome naming, and I know there are tools that can convert from one naming scheme to another, which is why I download the UCSC FASTA and GTF and use them together. Up until now I have also been using the Ensembl FASTA and GTF together. But I was just curious: if I want to use the NCBI GTF, do I need to download the FASTA from the NCBI FTP, or will the Ensembl one do the job? From what I understood, they should be identical; I just couldn't confirm this...
For reference: chromosome coordinates remain unchanged by patches.
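If you want to confirm the sequences yourself rather than take it on trust, one option is to compare per-sequence checksums, which sidesteps the naming differences between the two files. Below is a minimal sketch, not a vetted tool: the function names (`open_text`, `sequence_checksums`, `match_by_content`) are made up for illustration, and it assumes that case differences (soft-masking) should be ignored, which is why every line is uppercased before hashing.

```python
import gzip
import hashlib


def open_text(path):
    """Open a plain or gzip-compressed FASTA file for text reading."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "rt")


def sequence_checksums(path):
    """Map each record name (first word of the header) to (length, MD5 hex digest)."""
    result = {}
    name, digest, length = None, None, 0
    with open_text(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if name is not None:
                    result[name] = (length, digest.hexdigest())
                parts = line[1:].split()
                name = parts[0] if parts else ""
                digest, length = hashlib.md5(), 0
            elif name is not None and line:
                # Assumption: masking (lowercase) is irrelevant, so uppercase first.
                chunk = line.upper()
                digest.update(chunk.encode("ascii"))
                length += len(chunk)
    if name is not None:
        result[name] = (length, digest.hexdigest())
    return result


def match_by_content(path_a, path_b):
    """For each record in path_a, list the records in path_b with identical sequence."""
    by_checksum = {}
    for name, key in sequence_checksums(path_b).items():
        by_checksum.setdefault(key, []).append(name)
    return {
        name: by_checksum.get(key, [])
        for name, key in sequence_checksums(path_a).items()
    }
```

Calling `match_by_content("Homo_sapiens.GRCh37.75.dna.primary_assembly.fa.gz", "GRCh37_latest_genomic.fna.gz")` would return, for each Ensembl sequence name, the NCBI accessions whose sequence is identical. An empty list means no identical sequence was found in the NCBI file. Any name pairs it finds are also what you would need in order to rename chromosomes in either the FASTA or the GTF before using the two together.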
Thanks, I wasn't sure about that.