Data file to convert from Ensembl Transcript Identifiers (ENST) to RefSeq transcript Identifiers (NM_*)
10.0 years ago
pwg46 ▴ 540

Hello, I am looking for a data file that I can parse locally to map between ENST and RefSeq transcript identifiers. Please don't link me to BioMart or give me an SQL query. As I said, I want the actual raw data file, which I can keep locally and parse on my own. For example, I found that human.protein.faa in RefSeq's database is good for UniProt <--> RefSeq protein conversions. If any of you know of a good data file for ENSG <--> RefSeq gene conversions, that would be great as well.

refseq ensembl • 4.9k views

Just save the biomart output for your organism of interest to a file. Then you can parse everything locally as many times as you like without the network delay.
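For what it's worth, the local parsing step could look something like the sketch below. It assumes the saved export is a tab-separated file with a header row; the function name `load_mapping` and the column names in the usage comment are assumptions for illustration, so check them against the header of your own file.

```python
import csv
from collections import defaultdict


def load_mapping(path, key_col, value_col, delimiter="\t"):
    """Read a delimited file with a header row into a one-to-many mapping.

    key_col and value_col are the header names of the two ID columns.
    Rows where either ID is empty are skipped, since many transcripts
    have no counterpart in the other database.
    """
    mapping = defaultdict(set)
    with open(path, newline="") as handle:
        reader = csv.DictReader(handle, delimiter=delimiter)
        for row in reader:
            key = (row.get(key_col) or "").strip()
            value = (row.get(value_col) or "").strip()
            if key and value:
                # One transcript can map to several IDs, so collect a set.
                mapping[key].add(value)
    return dict(mapping)


# Hypothetical usage; the file name and column headers are assumptions:
# enst_to_refseq = load_mapping(
#     "mart_export.txt", "Transcript stable ID", "RefSeq mRNA ID"
# )
```

Keeping the values as sets matters because the mapping is not one-to-one in either direction; swapping `key_col` and `value_col` gives the reverse lookup from the same file.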


Is this what you are asking for, the transcript file from Ensembl?


No, that's not what pwg46 was asking for. The RefSeq-to-Ensembl mappings are in one of the database tables on the FTP site, but I haven't a clue which one.


Is it readily available like that? I always thought I needed to do it on my own using some ID converter.


Yes, one can just download the database table, if you can find out which table you need (this is the case for UCSC too). The Ensembl database is large enough that it's simpler to just use BioMart and save the results to a file. You then have a flat file with all of the conversions for a species. Why the OP doesn't just want to do that (it's the quick and easy route) is beyond me.
