The cDNA file will contain all mRNAs of the human genome. Each sequence is CDS + UTR (if available), so the file represents the transcribed part of the genome that will eventually be translated into proteins (the UTRs themselves are transcribed but not translated).
It is not exactly clear to me why you need this or what you want to do with it, but on the same page you linked, https://www.ensembl.org/info/data/ftp/index.html, the "Gene sets" column offers GTF and GFF3 annotation files that list all exons (as coordinates).
Just to show the difference between exons, mRNA, and CDS, here is the info from such an annotation file for the mouse genome. Let's have a look at the transcript ENSMUST00000130201:
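If it helps, here is a minimal sketch of how exon and CDS coordinates could be pulled out of such a GTF file with plain Python. It assumes the usual nine tab-separated GTF columns with `key "value";` attributes in the last one. The records in `EXAMPLE` and the IDs `G1`/`T1` are made up for illustration and are not taken from any Ensembl file; the function names are my own as well.

```python
import re
from collections import defaultdict


def parse_attributes(field):
    """Turn the 9th GTF column (key "value"; pairs) into a dict."""
    return dict(re.findall(r'(\S+) "([^"]*)"', field))


def features_by_transcript(lines, wanted=("exon", "CDS")):
    """Collect (start, end) intervals per transcript and feature type."""
    out = defaultdict(lambda: defaultdict(list))
    for line in lines:
        # Skip header/comment lines and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) != 9 or cols[2] not in wanted:
            continue
        transcript_id = parse_attributes(cols[8]).get("transcript_id")
        if transcript_id:
            out[transcript_id][cols[2]].append((int(cols[3]), int(cols[4])))
    return out


def total_length(intervals):
    """Sum interval lengths; GTF coordinates are 1-based and inclusive."""
    return sum(end - start + 1 for start, end in intervals)


# Made-up records (hypothetical gene G1 / transcript T1), for illustration only.
ATTRS = 'gene_id "G1"; transcript_id "T1";'
EXAMPLE = [
    "#!made-up header line",
    f"1\texample\texon\t100\t199\t.\t+\t.\t{ATTRS}",
    f"1\texample\tCDS\t150\t199\t.\t+\t0\t{ATTRS}",
    f"1\texample\texon\t300\t449\t.\t+\t.\t{ATTRS}",
    f"1\texample\tCDS\t300\t399\t.\t+\t1\t{ATTRS}",
]

feats = features_by_transcript(EXAMPLE)
exon_len = total_length(feats["T1"]["exon"])  # spliced transcript length
cds_len = total_length(feats["T1"]["CDS"])    # coding part only
utr_len = exon_len - cds_len                  # exonic but non-coding
print(exon_len, cds_len, utr_len)
```

For a real file you would pass an open file handle instead of `EXAMPLE` (wrapping it with `gzip.open(path, "rt")` if it is compressed). The gap between the exon total and the CDS total is the UTR, which is exactly the difference between "all exons" and "coding sequence".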
To be more precise, cDNA consists of all transcribed RNAs: mRNA as you say, but also ncRNAs, pseudogenes, rRNAs, etc.
Was thinking that as well, but since they also offer an ncRNA FASTA file I would assume the cDNA one focuses on protein-coding transcripts. It is indeed possible, though, that it contains everything that is transcribed.
is what's written on their site but does not give much additional info
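One way to settle this empirically would be to tally the biotypes in the FASTA headers. A rough sketch, assuming the headers carry a `transcript_biotype:...` field as Ensembl cDNA headers normally do; the headers in `EXAMPLE_HEADERS` are made up, and the function names are hypothetical.

```python
import gzip
import re
from collections import Counter

# Assumption: each header line contains a "transcript_biotype:<value>" token.
BIOTYPE = re.compile(r"\btranscript_biotype:(\S+)")


def count_biotypes(lines):
    """Count transcript biotypes over the FASTA header lines in `lines`."""
    counts = Counter()
    for line in lines:
        if line.startswith(">"):
            match = BIOTYPE.search(line)
            counts[match.group(1) if match else "unknown"] += 1
    return counts


def count_biotypes_in_file(path):
    """Same as count_biotypes, reading a plain or gzip-compressed FASTA file."""
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt") as handle:
        return count_biotypes(handle)


# Made-up headers, for illustration only.
EXAMPLE_HEADERS = [
    ">T1 cdna chromosome:X:1:100:1 gene:G1 gene_biotype:protein_coding transcript_biotype:protein_coding",
    "ACGTACGT",
    ">T2 cdna chromosome:X:200:300:1 gene:G2 gene_biotype:processed_pseudogene transcript_biotype:processed_pseudogene",
    "ACGT",
    ">T3 cdna chromosome:X:400:500:-1 gene:G1 gene_biotype:protein_coding transcript_biotype:protein_coding",
    "ACGT",
]

example_counts = count_biotypes(EXAMPLE_HEADERS)
print(example_counts.most_common())
```

Running `count_biotypes_in_file(...)` on the downloaded cDNA file (the path is whatever you saved it as) would show directly whether anything other than protein-coding transcripts is in there.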
So, technically this is the exome, i.e., the set of all (known) exons?
I would say yes indeed.
It depends, however, on how you define 'exons': it might be mainly/only the ones that are part of an mRNA, and thus not include the non-translated ones (not sure whether you're interested in those as well).
Currently I am not, so this seems to hold! Thanks.