Hi to all,
Where can I download all exons of the human genome in FASTA format (one big file!)?
Thanks
Use Ensembl BioMart.
The download took a while, but eventually I got a file named 'mart_export.txt.gz'. Its contents look like this:
>ENSG00000264715|ENST00000578347|ENSE00002715610
AGGAGTGACCAGAAGACAAGAGTGCGAGCCTTCTGTTATGCCCAGACAGGGCCACCAGAGGGCTCCTTGGTCTAGTGGTAACGCCA
>ENSG00000265161|ENST00000580394|ENSE00002716056
AGTAGAGATGGGGTTTCACCATGTTGGCCAGGCTGGTCTCAAACTCCTGACCTCAGGTGATCCATCCACC
>ENSG00000200917|ENST00000364047|ENSE00001438810
GCACATACATATACTAAAATTGGAACAATACAGAGAAGATTAGCATAGCCCCTGCGCAAGGATGACATGCAAATTCGTGAAGTGTTCCATATTAAA
...
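For anyone who wants to post-process that export, here is a minimal Python sketch that reads the gzipped FASTA and splits each header into its three IDs. It assumes the headers follow the gene|transcript|exon layout shown above (this depends on which attributes you ticked in BioMart); the function names are my own, not part of any Ensembl tool.

```python
import gzip


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA text handle."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines may be wrapped, so collect and join them.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def split_header(header):
    """Split 'gene|transcript|exon' into a tuple of three IDs.

    Assumes the attribute order seen in the example output above.
    """
    gene_id, transcript_id, exon_id = header.split("|")[:3]
    return gene_id, transcript_id, exon_id


def load_unique_exons(path):
    """Return {exon_id: sequence}, keeping the first copy of each exon.

    The same exon can be listed once per transcript, so the raw
    export may contain repeated exon IDs.
    """
    exons = {}
    with gzip.open(path, "rt") as handle:
        for header, seq in read_fasta(handle):
            exon_id = split_header(header)[2]
            exons.setdefault(exon_id, seq)
    return exons
```

Calling `load_unique_exons("mart_export.txt.gz")` should give one entry per exon ID, which you can then write back out as a single FASTA file.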
First, get the exon coordinates:
Exon coordinates of hg19 genome download
Then fetch the sequences for those coordinates:
Have a look at the UCSC Table Browser: position base query
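If you already have the coordinates and a local copy of the genome, the second step can also be done by hand. The sketch below assumes BED-style input (tab-separated, 0-based start, end-exclusive, strand in column 6) and a genome held in memory as a dict of chromosome name to sequence string; both are assumptions for illustration, and the function names are hypothetical. For a whole human genome you would more likely use an indexed FASTA reader than a plain dict.

```python
# Translation table for complementing DNA, preserving case.
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_exons(bed_lines, genome):
    """Yield (name, sequence) for each BED interval.

    bed_lines: iterable of tab-separated BED lines
               (chrom, start, end, [name, score, strand]).
    genome:    dict mapping chromosome name -> sequence string.
    """
    for line in bed_lines:
        line = line.strip()
        # Skip blank lines and BED header/comment lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        name = fields[3] if len(fields) > 3 else f"{chrom}:{start}-{end}"
        strand = fields[5] if len(fields) > 5 else "+"
        # BED is 0-based and end-exclusive, which matches Python slicing.
        seq = genome[chrom][start:end]
        if strand == "-":
            seq = reverse_complement(seq)
        yield name, seq
```

Each yielded pair can be written out as `>name` followed by the sequence to build the one big FASTA file.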
Here - ftp://ftp.ebi.ac.uk/pub/databases/astd/current_release/human/
If anyone is looking for the same but for GRCh37 here is the link.
Thank you very much to all. Thank you Malachi, this is what I am looking for.
Not an answer... This should go as a comment. Thanks!