6.8 years ago
Abiterkuile
Hi there, I'm trying to find the right RefSeq reference transcriptome link to use as an index for kallisto. I've been looking on the NCBI FTP site, but I'm not sure where to go from ftp://ftp.ncbi.nlm.nih.gov/refseq/H_sapiens/. Can anyone help?
GENCODE releases are the official location of human genome data. If you only need the transcripts then look for that fasta file.
Unless you have a particular reason for using RefSeq, I would not recommend using it with Kallisto, since Kallisto obtains its accuracy from transcript-level quantification and RefSeq has a very limited set of transcripts. Use GENCODE as suggested by Genomax. Otherwise, you can find RefSeq via the table browser.
If you must have RefSeq versions then take a look at the last column of this file. It contains the accession numbers of the genes you want. You could then use NCBI eutils to get the fasta sequence.
You can get the sequences of RefSeqGenes as FASTA using NCBI E-utilities.
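For anyone who wants to script this, here is a minimal sketch of fetching FASTA records through the E-utilities `efetch` endpoint, assuming you already have a list of accession numbers. The function names and the two example accessions are my own choices for illustration, not something from the posts above, and I have not tuned this for large batches.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Public E-utilities efetch endpoint.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accessions, db="nuccore"):
    """Build an efetch URL that requests FASTA text for the given accessions.

    `accessions` is a list of accession strings; `db` defaults to the
    nucleotide database. Both names are illustrative, not an official API.
    """
    params = {
        "db": db,
        "id": ",".join(accessions),  # efetch accepts a comma-separated ID list
        "rettype": "fasta",
        "retmode": "text",
    }
    return EFETCH + "?" + urlencode(params)


def fetch_fasta(accessions, db="nuccore"):
    """Download the FASTA records and return them as one string (needs network)."""
    with urlopen(build_efetch_url(accessions, db)) as response:
        return response.read().decode("utf-8")


# Example call (not run here because it needs network access):
# fasta = fetch_fasta(["NM_000546", "NM_007294"])
# open("refseq_subset.fa", "w").write(fasta)
```

For more than a handful of accessions you would probably want to send them in batches and pause between requests, since NCBI rate-limits E-utilities traffic.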
If you have no particular reason to go for RefSeq, I would suggest the same thing that a Kallisto author suggested to me in June 2017: