Downloading cDNA Sequences of the TrEMBL and nr Databases
10.8 years ago
Pappu ★ 2.1k

Could you please tell me how to download all the cDNA sequences for the entries in the TrEMBL and nr databases?

python • 2.2k views
10.8 years ago

Hi,

You could use BioMart: choose the Ensembl Genes database and then the dataset for the particular species. In the Filters section on the left side, go to Gene and select "Limit to genes..." with the option "With UniProtKB/TrEMBL Accession(s)".

Then select the attributes you want to download. The Sequences radio button there has an option for cDNA sequences.

This is the quick and easy way. You could also use the Ensembl Perl API if you would like to customize the query or batch download for multiple species.
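For scripted access without the Perl API, the same filter and attribute choices can also be sent to the BioMart web service as an XML query. The following is only a rough Python sketch of how such a query could be assembled. The endpoint URL, the dataset name `hsapiens_gene_ensembl`, the filter name `with_uniprotsptrembl`, and the attribute names are my assumptions, not something confirmed in this thread, so check them against the XML that the BioMart web interface generates for your own selection.

```python
from urllib.parse import urlencode
from xml.sax.saxutils import quoteattr

# Assumed endpoint of the Ensembl BioMart web service; verify before use.
MARTSERVICE_URL = "https://www.ensembl.org/biomart/martservice"


def build_cdna_query(dataset="hsapiens_gene_ensembl",
                     trembl_filter="with_uniprotsptrembl"):
    """Build a BioMart XML query for cDNA FASTA of genes with a TrEMBL accession.

    The default dataset and filter names are assumptions; replace them with
    the names shown for your species in the BioMart interface.
    """
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<!DOCTYPE Query>'
        '<Query virtualSchemaName="default" formatter="FASTA" header="0" '
        'uniqueRows="0" datasetConfigVersion="0.6">'
        f'<Dataset name={quoteattr(dataset)} interface="default">'
        # Boolean filter: keep only genes that have a TrEMBL accession.
        f'<Filter name={quoteattr(trembl_filter)} excluded="0"/>'
        # The transcript ID ends up in the FASTA header, cDNA is the sequence.
        '<Attribute name="ensembl_transcript_id"/>'
        '<Attribute name="cdna"/>'
        '</Dataset>'
        '</Query>'
    )


def build_request_url(xml_query):
    """Return the GET URL that carries the XML in the 'query' parameter."""
    return MARTSERVICE_URL + "?" + urlencode({"query": xml_query})
```

The resulting URL could then be fetched with, for example, `urllib.request.urlretrieve(build_request_url(build_cdna_query()), "trembl_cdna.fasta")`, where the output filename is just a placeholder. Looping over dataset names would give a simple batch download for several species.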

PS: This is a targeted search of the Ensembl database, so it may not be fully up to date with the most recent records in UniProtKB/TrEMBL.

10.8 years ago
hpmcwill ★ 1.2k

For UniProtKB (UniProtKB/Swiss-Prot + UniProtKB/TrEMBL) the set of source coding sequences is equivalent to all the CDS features in EMBL-Bank.

The European Nucleotide Archive (ENA) provide a set of data files for ENA Coding sequences (formerly known as EMBLCDS) which is available from the EMBL-EBI FTP site:

For what it is worth, ENA also provide an equivalent dataset for non-coding RNA features appearing in EMBL-Bank entries:
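
Once one of these files has been downloaded, the records can be streamed without loading the whole file into memory. This is a minimal Python sketch, assuming you picked a gzip-compressed FASTA file. The function names are hypothetical, and the path passed to `count_records` is whatever local file you saved.

```python
import gzip


def read_fasta(handle):
    """Yield (header, sequence) pairs from an iterable of FASTA text lines."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    # Emit the final record.
    if header is not None:
        yield header, "".join(chunks)


def count_records(path):
    """Count records in a gzip-compressed FASTA file at a local path."""
    with gzip.open(path, "rt") as handle:
        return sum(1 for _ in read_fasta(handle))
```

Because `read_fasta` is a generator, it can also be used to filter records by header or write a subset to a new file while reading.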
