Question

Retrieve Fasta Sequences From Kegg By Keyword

0

12.1 years ago

marcelolaia ▴ 10

Hi, I would like to search the KEGG database for the organism 'xac' (T00084) using the keyword 'hypothetical' and retrieve all matching sequences in FASTA format.

Here is a search example

I need a tab delimited text to do a downstream analysis. Thanks!

kegg fasta dna protein search • 4.4k views

ADD COMMENT • link updated 12.1 years ago by Neilfws 49k • written 12.1 years ago by marcelolaia ▴ 10

1

What do you mean by "retrieve all fasta printed out"? Fasta is a sequence format. And then you say "I need a tab delimited text." Please give an example of the output that you want.

ADD REPLY • link 12.1 years ago by Neilfws 49k

0

Sorry! I need all the FASTA sequences in a flat text file.

ADD REPLY • link 12.1 years ago by marcelolaia ▴ 10

1

(1) see http://www.genome.jp/kegg/catalog/org_list.html (2) download all sequences from Xanthomonas axonopodis and then (3) use your favorite programming language to retrieve all sequences annotated as hypothetical.

ADD REPLY • link 12.1 years ago by Andrzej Zielezinski 11k

0

Which organism? That link lists all organisms.

ADD REPLY • link 12.1 years ago by Neilfws 49k

1

xac Xanthomonas axonopodis pv. citri 306

ADD REPLY • link 12.1 years ago by Andrzej Zielezinski 11k

Neilfws · Answer 1 · 2012-10-31

1

12.1 years ago

Neilfws 49k

I find interacting with KEGG using dbget via the Web extremely painful. So I'd go for a different approach.

Approach 1

Based on an answer to a previous question, "Is There Any Way To Retrieve Genes' Sequences In Fasta Format Using The Kegg Orthology Code?", you could use the BioRuby Bio::KEGG::API to search and retrieve, something like this:

#!/usr/bin/ruby
require 'rubygems'
require 'bio'

serv = Bio::KEGG::API.new

# search for xac + hypothetical
xac = serv.bfind("T00084 hypothetical")
# get the IDS into an array
ids = xac.map { |gene| $1 if gene =~/^(.*?)\s+/ }
# retrieve fasta and print
ids.each { |id| puts serv.bget("-f -n 1 #{id}") }

This retrieves protein sequences; you'd need to adjust the parameters to bget for other options.
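
For readers trying this later: the SOAP-based KEGG API that Bio::KEGG::API wraps was subsequently retired in favor of a REST interface. The sketch below shows roughly how the same search-then-fetch idea might look against that interface. It is a minimal illustration, not the method used above. The endpoint shapes (`/find/<org>/<keyword>` and `/get/<ids>/aaseq`), the limit of 10 entries per `get` request, and the tab-separated `find` output are assumptions taken from the KEGG REST documentation and should be checked against it. All function names are hypothetical.

```ruby
require 'net/http'
require 'uri'

# Base URL of the KEGG REST service (assumption: check the current KEGG docs).
KEGG_REST = "https://rest.kegg.jp"

# Build the URL for a keyword search within one organism (single-word keyword assumed).
def find_url(org, keyword)
  "#{KEGG_REST}/find/#{org}/#{keyword}"
end

# Build the URL that fetches sequences for a batch of gene IDs.
# option is "aaseq" for protein or "ntseq" for nucleotide sequence.
def get_url(ids, option = "aaseq")
  "#{KEGG_REST}/get/#{ids.join('+')}/#{option}"
end

# Extract gene IDs from find output, assumed to be "id<TAB>description" per line.
def parse_find(text)
  text.split("\n").map { |line| line.split("\t", 2).first }.compact
end

# Search, then fetch FASTA in batches of 10 IDs (assumed per-request limit).
# This performs network requests, so it is defined here but not called.
def fetch_fasta(org, keyword, option = "aaseq")
  ids = parse_find(Net::HTTP.get(URI(find_url(org, keyword))))
  ids.each_slice(10).map { |batch| Net::HTTP.get(URI(get_url(batch, option))) }.join
end

# Hypothetical usage:
#   puts fetch_fasta("xac", "hypothetical")
```

Passing "ntseq" instead of "aaseq" would request nucleotide rather than protein sequences, which corresponds to adjusting the bget parameters mentioned above.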

Approach 2

Download the fasta files from the NCBI (e.g. the *.faa files for protein sequence) and parse the header for the word "hypothetical" using one of the many tools available to parse fasta files.
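
Approach 2 can be sketched in plain Ruby without any extra gems. This is only an illustration under stated assumptions: the function names are hypothetical, the keyword match is a case-insensitive substring test on the header line, and the tab-delimited layout (ID, description, sequence) is one guess at what the downstream analysis might want.

```ruby
# Parse FASTA text into an array of [header, sequence] pairs.
def parse_fasta(text)
  records = []
  text.each_line do |line|
    line = line.strip
    next if line.empty?
    if line.start_with?(">")
      # New record: header without the leading ">", empty mutable sequence.
      records << [line[1..-1], String.new]
    elsif !records.empty?
      # Sequence line: append to the most recent record.
      records.last[1] << line
    end
  end
  records
end

# Keep only records whose header contains the keyword (case-insensitive).
def hypothetical_records(records, keyword = "hypothetical")
  records.select { |header, _seq| header.downcase.include?(keyword.downcase) }
end

# Render records as tab-delimited lines: id, description, sequence.
def to_tsv(records)
  records.map do |header, seq|
    id, description = header.split(/\s+/, 2)
    [id, description.to_s, seq].join("\t")
  end.join("\n")
end

# Hypothetical usage, with "proteins.faa" standing in for a downloaded file:
#   puts to_tsv(hypothetical_records(parse_fasta(File.read("proteins.faa"))))
```

Keeping parsing, filtering and output as separate steps means the same parser can be reused if FASTA output is wanted instead of a tab-delimited table.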

ADD COMMENT • link 12.1 years ago by Neilfws 49k

0

Thank you very much! I love your approach 1. It gives me the chance to learn more. I am a biologist. However, it printed out an error:

> $ get_fasta4  
/usr/lib/ruby/vendor_ruby/bio/io/soapwsdl.rb:63:in `create_driver': uninitialized constant Bio::SOAPWSDL::SOAP (NameError)
from /usr/lib/ruby/vendor_ruby/bio/io/keggapi.rb:201:in `initialize'  
from /home/marcelo/bin/scripts/get_fasta4:5:in `new'  
from /home/marcelo/bin/scripts/get_fasta4:5:in `<main>'

ADD REPLY • link updated 12.1 years ago by Neilfws 49k • written 12.1 years ago by marcelolaia ▴ 10

0

I'm impressed that you tried this solution. I do not see that error; I'm using ruby 1.8.7. Perhaps you are using ruby 1.9? Try "ruby -v" to find out. If so, you may need to run "gem install soap4r-ruby1.9".

ADD REPLY • link 12.1 years ago by Neilfws 49k

0

ruby 1.9.3p194 (2012-04-20 revision 35410) [i486-linux]

ADD REPLY • link 12.1 years ago by marcelolaia ▴ 10

0

# gem install soap4r-ruby1.9
Fetching: soap4r-ruby1.9-2.0.5.gem (100%)
Successfully installed soap4r-ruby1.9-2.0.5
1 gem installed
Installing ri documentation for soap4r-ruby1.9-2.0.5...
Installing RDoc documentation for soap4r-ruby1.9-2.0.5...

$ get_fasta4
/usr/lib/ruby/1.9.1/rubygems/custom_require.rb:36:in `require': iconv will be deprecated in the future, use String#encode instead.
/home/marcelo/bin/scripts/get_fasta4:10:in `<main>': undefined method `map' for #<String:0x99bd1fc> (NoMethodError)

ADD REPLY • link updated 12.1 years ago by Neilfws 49k • written 12.1 years ago by marcelolaia ▴ 10

0

OK, installation was successful but for some reason, map is not working as expected. I'm afraid that as I do not use ruby 1.9, I don't have time to troubleshoot this. My best suggestion is to use 1.8.7 if possible (perhaps under RVM - https://rvm.io/) since I know the code works in that case.
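
A likely explanation for the NoMethodError: the error message shows that bfind returned a String, and from Ruby 1.9 onward String no longer mixes in Enumerable, so `map` is not defined on it. Under 1.8.7, calling `map` on a String iterates over its lines. A minimal sketch of an ID-extraction step that avoids `String#map` is given below. The helper name is hypothetical, the sample IDs in the usage comment are made up, and it assumes the search result is newline-separated text with the gene ID as the first whitespace-delimited token on each line.

```ruby
# Extract the first whitespace-delimited token from each line of a
# search result. Splitting explicitly on newlines avoids relying on
# String#map, which exists in Ruby 1.8 but not in 1.9 and later.
def extract_ids(result)
  result.to_s.split("\n").map { |line| line[/\A(\S+)\s/, 1] }.compact
end

# Hypothetical usage with the script above (serv is the Bio::KEGG::API object):
#   ids = extract_ids(serv.bfind("T00084 hypothetical"))
#   ids.each { |id| puts serv.bget("-f -n 1 #{id}") }
```

Lines with no ID followed by whitespace yield nil and are dropped by `compact`, so blank lines in the result are harmless.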

ADD REPLY • link 12.1 years ago by Neilfws 49k