the latest human reference genome fasta file
10.0 years ago
hana ▴ 190

Hi all

I would like to download the latest human reference genome (GRCh38) in FASTA and GTF format for my RNA-seq analysis. I would like to know which database is the best: GenBank version 21 or Ensembl?

Where can I get the whole-genome FASTA file for the Ensembl version?

Does the link below contain this file?

ftp://ftp.ensembl.org/pub/release-77/fasta/homo_sapiens/dna/Homo_sapiens.GRCh38.dna.toplevel.fa.gz

Is there any difference in the alignment and annotation results between GenBank and Ensembl?

Thanks in advance

RNA-Seq • 24k views

You can download it from UCSC database: http://hgdownload.cse.ucsc.edu/downloads.html#human



10.0 years ago
Manvendra Singh ★ 2.2k

Which database is the best?

Yes, it's the one from Ensembl.

You can download it from here, the same way you previously downloaded hg19 from UCSC:

Whole-genome FASTA:

http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/

GTFs from the Table Browser:

http://genome.ucsc.edu/cgi-bin/hgTables?


Which file contains the whole genome? Is it hg38.2bit?

Thank you


Just directly download the fasta file. There's no need to deal with 2bit.
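
If you want to confirm that a downloaded FASTA really covers the whole genome, one option is to list its sequence names. This is only a minimal sketch: the function name `fasta_sequence_names` is made up for illustration, and the file name in the usage comment is an assumption about what you downloaded.

```python
import gzip


def fasta_sequence_names(path):
    """Return the first word of every '>' header line in a FASTA file."""
    # Read gzip-compressed files transparently, plain text otherwise.
    opener = gzip.open if str(path).endswith(".gz") else open
    names = []
    with opener(path, "rt") as handle:
        for line in handle:
            if line.startswith(">"):
                fields = line[1:].split()
                names.append(fields[0] if fields else "")
    return names


# Hypothetical usage (file name is an assumption):
# for name in fasta_sequence_names("hg38.fa.gz"):
#     print(name)
```

A whole-genome file should list every chromosome you expect, plus any unplaced or alternate sequences the assembly ships with.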


Exactly. hana was asking about .2bit, so I wrote that it can be converted as well.


Sorry @Devon, where can I find a FASTA file for each individual chromosome of hs37d5.fa?


https://www.gencodegenes.org/human/release_5.html assuming you are referring to release 5. All releases can be found on this page at GENCODE.
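
If you already have the combined FASTA, another route is to split it into one file per sequence yourself. A rough sketch, assuming an uncompressed multi-FASTA whose sequence names are safe to use as file names; `split_fasta_by_sequence` and the paths in the usage comment are hypothetical.

```python
import os


def split_fasta_by_sequence(fasta_path, out_dir):
    """Write each sequence to out_dir/<name>.fa and return the paths written."""
    os.makedirs(out_dir, exist_ok=True)
    written = []
    out = None
    try:
        with open(fasta_path) as handle:
            for line in handle:
                if line.startswith(">"):
                    # A new header starts a new output file.
                    if out is not None:
                        out.close()
                    name = line[1:].split()[0]
                    path = os.path.join(out_dir, name + ".fa")
                    out = open(path, "w")
                    written.append(path)
                if out is not None:
                    out.write(line)
    finally:
        if out is not None:
            out.close()
    return written


# Hypothetical usage (paths are assumptions):
# split_fasta_by_sequence("hs37d5.fa", "hs37d5_by_chrom")
```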


Yes, you need to convert it to FASTA.

You can get the utility program twoBitToFa from here.

Once you have downloaded it, you must first change its permissions so that it can be executed as a program.

Then run it from a terminal without arguments to see the options:

$ /path/to/twoBitToFa

twoBitToFa - Convert all or part of .2bit file to fasta
usage:
twoBitToFa input.2bit output.fa
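
The permission change and the conversion can also be scripted. This is a minimal sketch, assuming twoBitToFa has already been downloaded; the helper names and the output file name `hg38.fa` are made up for illustration, and only the `twoBitToFa input.2bit output.fa` form shown in the usage text above is used.

```python
import os
import stat
import subprocess


def make_executable(path):
    """Add execute permission for user, group and others (like `chmod a+x`)."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


def two_bit_to_fa(tool, two_bit, fasta):
    """Run `twoBitToFa input.2bit output.fa`; raises if the tool exits non-zero."""
    make_executable(tool)
    subprocess.run([tool, two_bit, fasta], check=True)


# Hypothetical usage (paths are assumptions):
# two_bit_to_fa("/path/to/twoBitToFa", "hg38.2bit", "hg38.fa")
```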

thank you very much


Hi

I have already downloaded the FASTA and GTF files for hg38 from the UCSC database and run TopHat.

But I have a problem running Cufflinks. I got the error below:

cufflinks  --GTF   genome.gtf   -o   /home/ra/cufflinks    /home/ra/accepted_hits.bam

[20:55:32] Loading reference annotation.
Error parsing strand (1) from GFF line:
uc001aaa.3    chr1    +    11873    14409    11873    11873    3    11873,12612,13220,    12227,12721,14409,        uc001aaa.3

Can you please tell me what it means and how I can solve it?

Thank you


That's not a GTF file. You have to explicitly set the output format to GTF, otherwise you'll get all of the table columns as is.
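
A quick way to spot this before running Cufflinks is to check that each line has the GTF shape: nine tab-separated fields, with numeric start and end in columns 4 and 5 and the strand in column 7. The sketch below is a rough sanity check, not a full validator; the function name is made up, and the GTF line in the usage comment is only an illustrative example.

```python
def looks_like_gtf_line(line):
    """Rough check: 9 tab-separated fields, numeric start/end, strand in column 7."""
    if line.startswith("#") or not line.strip():
        return True  # comment and blank lines are tolerated
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return False
    start, end, strand = fields[3], fields[4], fields[6]
    return start.isdigit() and end.isdigit() and strand in {"+", "-", "."}


# Illustrative GTF-shaped line (made up for this example):
# looks_like_gtf_line('chr1\tknownGene\texon\t11874\t12227\t.\t+\t.\tgene_id "uc001aaa.3"; transcript_id "uc001aaa.3";\n')
```

The line in the error message above has the transcript name first and the strand in the third column, so it would fail this check.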
