RefSeq human reference transcriptome link for RNA-seq differential expression analysis
6.8 years ago
Abiterkuile ▴ 30

Hi there, I'm trying to find the right RefSeq reference transcriptome link to use as an index for kallisto. I've been looking on the NCBI FTP site, but I'm not sure where to go from ftp://ftp.ncbi.nlm.nih.gov/refseq/H_sapiens/. Can anyone help?

RNA-Seq alignment • 2.7k views

GENCODE releases are the standard source for human gene annotation. If you only need the transcripts, look for the transcript FASTA file there.
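
Once the transcript FASTA is downloaded, building the kallisto index is a single command. The sketch below wraps it in a small helper; `build_kallisto_index` and the file names in the usage line are hypothetical, and only the `kallisto index -i <index> <fasta>` call itself comes from kallisto.

```shell
# Sketch: build a kallisto index from a local transcript FASTA.
# build_kallisto_index is a hypothetical helper name; the FASTA and
# index paths are whatever you downloaded and wish to write.
build_kallisto_index() {
    fasta=$1
    index=$2
    # Refuse to run on a missing or empty FASTA file.
    if [ ! -s "$fasta" ]; then
        echo "transcript FASTA not found or empty: $fasta" >&2
        return 1
    fi
    kallisto index -i "$index" "$fasta"
}

# Example call (file names are placeholders):
# build_kallisto_index transcripts.fa.gz transcripts.idx
```

The resulting index file is what `kallisto quant` takes through its `-i` option.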


Unless you have a particular reason for using RefSeq, I would not recommend it with kallisto: kallisto gets its accuracy from transcript-level quantification, and RefSeq contains a comparatively limited set of transcripts. Use GENCODE as suggested by Genomax. Otherwise, you can get RefSeq via the UCSC Table Browser.


If you must have RefSeq versions then take a look at the last column of this file. It contains the accession numbers of the genes you want. You could then use NCBI eutils to get the fasta sequence.


To get the RefSeqGene sequences as FASTA using NCBI E-utilities (EDirect):

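# Take the accession from column 4; NR > 1 skips the header row.
awk -F '\t' 'NR > 1 {print $4}' gene_RefSeqGene > genes
# Fetch each accession from the nuccore database as FASTA (printed to stdout).
for id in $(cat genes); do efetch -db nuccore -id "$id" -format fasta; done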
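
If the table repeats an accession across several rows, the loop above would fetch the same record more than once. A minimal sketch of a de-duplicating extraction step follows; `extract_accessions` is a hypothetical helper, and it assumes the same tab-delimited layout with the accession in column 4.

```shell
# Sketch: list each accession from column 4 exactly once.
# extract_accessions is a hypothetical helper name.
extract_accessions() {
    # NR > 1 skips the header row; empty cells are ignored;
    # sort -u collapses repeated accessions.
    awk -F '\t' 'NR > 1 && $4 != "" {print $4}' "$1" | sort -u
}

# Example call:
# extract_accessions gene_RefSeqGene > genes
```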

If you have no particular reason to go for RefSeq, I would suggest the same thing a kallisto author suggested to me in June 2017:

Anyway, our links would simply point you to the ensembl transcriptome when it exists. We typically use this as it is fairly complete (by comparison with others) and the transcript to gene name mapping is pretty nice (also by comparison with others).
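
For the transcript-to-gene mapping mentioned above, here is a rough sketch that pulls it straight from the FASTA headers. It assumes Ensembl cDNA headers of the form `>ENST... cdna ... gene:ENSG... ... gene_symbol:NAME ...`; `tx2gene` is a hypothetical helper name, and the input file name is a placeholder.

```shell
# Sketch: transcript ID, gene ID and gene symbol from Ensembl-style
# cDNA FASTA headers, one tab-separated row per transcript.
# tx2gene is a hypothetical helper name.
tx2gene() {
    awk '/^>/ {
        tx = substr($1, 2)          # drop the leading ">"
        gene = ""; symbol = ""
        for (i = 2; i <= NF; i++) {
            if ($i ~ /^gene:/)        gene = substr($i, 6)
            if ($i ~ /^gene_symbol:/) symbol = substr($i, 13)
        }
        print tx "\t" gene "\t" symbol
    }' "$1"
}

# Example call (file name is a placeholder):
# tx2gene transcripts.cdna.fa > tx2gene.tsv
```

A table like this is what downstream tools typically need to summarise transcript-level estimates to genes.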
