Retrieving CDS sequences from NCBI
9.5 years ago

I have a list of GI numbers corresponding to nucleotide sequences in NCBI. I need to compile a file of the CDS sequences for those GIs. I tried Batch Entrez, but it returned whole genome sequences.

Can anyone suggest a method to retrieve CDS sequences from NCBI, given a list of the corresponding GIs?

ncbi fasta batch-entrez cds • 5.6k views

You can try the NCBI E-utilities, for example EFetch.

A brute-force solution is to download the GenBank record for each GI with EFetch, using get from LWP::Simple. Something like this in Perl:

use LWP::Simple;
my $gbk = get("https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nucleotide&id=$id&retmode=text&rettype=gb");

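Then parse the GenBank record with BioPerl:

use Bio::SeqIO;

# $gbk holds the record text, not a file name, so read it through an in-memory filehandle.
open(my $gbk_fh, '<', \$gbk) or die "Cannot read GenBank text: $!";
my $gbk_stream = Bio::SeqIO->new(-fh     => $gbk_fh,
                                 -format => 'GenBank');
my $seq_obj = $gbk_stream->next_seq();

my @cds;
if (defined $seq_obj) {
  for my $feat_object ($seq_obj->get_SeqFeatures) {
    # CDS is the feature's primary tag, not a qualifier.
    if ($feat_object->primary_tag eq 'CDS') {
      # spliced_seq joins the parts of a multi-exon location.
      push @cds, $feat_object->spliced_seq->seq;
    }
  }
}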

If you only need the CDS, this may be overkill. But if you are going to need other information from the GenBank file, it could be useful.

There is probably a simpler and more elegant solution.
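One candidate for a simpler route is to ask EFetch for the CDS features directly. It documents a rettype of fasta_cds_na for the nuccore database. The Python sketch below uses only the standard library. The helper names (cds_fasta_url, batches, parse_fasta, fetch_cds), the batch size of 100, and the pause between requests are my own illustrative choices, and I have not checked how every record type behaves with this rettype, so treat it as a starting point rather than a tested pipeline.

```python
import time
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def cds_fasta_url(ids):
    """Build an EFetch URL requesting CDS features as nucleotide FASTA."""
    params = {
        "db": "nuccore",
        "id": ",".join(str(i) for i in ids),
        "rettype": "fasta_cds_na",  # assumption: supported for these records
        "retmode": "text",
    }
    # Keep commas literal so the id list stays readable in the URL.
    return EFETCH + "?" + urlencode(params, safe=",")


def batches(ids, size=100):
    """Yield successive chunks of at most `size` ids (size is illustrative)."""
    for start in range(0, len(ids), size):
        yield ids[start:start + size]


def parse_fasta(text):
    """Return {header: sequence} from FASTA text, joining wrapped lines."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {h: "".join(parts) for h, parts in records.items()}


def fetch_cds(ids, size=100, pause=0.5):
    """Fetch CDS sequences for all ids, one request per batch."""
    cds = {}
    for chunk in batches(ids, size):
        with urlopen(cds_fasta_url(chunk)) as response:
            cds.update(parse_fasta(response.read().decode("utf-8")))
        time.sleep(pause)  # stay well under NCBI's request rate limit
    return cds
```

Calling fetch_cds(gi_list) would return a dict mapping each FASTA header to its CDS sequence, and writing that out as a FASTA file is a short loop. Batching keeps the number of requests low for long GI lists.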

9.5 years ago
roy.granit ▴ 890

I believe this can be done with the UCSC Table Browser: select the type of identifier, paste in the list of IDs, and choose CDS output.

9.5 years ago
biocyberman ▴ 870

I haven't tried UCSC's Table Browser for this, but my favorite tool is Ensembl's BioMart: http://www.ensembl.org/biomart

It's a very useful tool for this kind of task and for many other kinds of queries, for example finding homologous genes across species or converting one type of ID to another. It's worth familiarizing yourself with it.

When it comes to coordinates, be sure to choose the correct assembly version. For example, if you are working with GRCh37/hg19, it is a good idea to use this site instead: http://grch37.ensembl.org/biomart/
