Bash Script To Bp_Fetch A List Of CDS Accession Numbers
11.0 years ago
eddie.im ▴ 140

Hello,

I'm trying to write a bash script that uses the BioPerl command bp_fetch to return a list of CDSs in FASTA format.

I have installed BioPerl, and bp_fetch works as intended:

xxxx@xxxx:~/Documents$ bp_fetch net::embl:AAA24658.1

>AAA24658 Escherichia coli hypothetical protein
ATGGAACGTTGCGGCTGGGTGAGTCAGGACCCGCTTTATATTGCCTACCATGATAATGAG
TGGGGCGTGCCTGAAACTGACAGTAAAAAACTGTTCGAAATGATCTGCCTTGAAGGGCAG
CAGGCTGGATTATCGTGGATCACCGTCCTCAAAAAACGCGAAAACTATCGCGCCTGCTTT
CATCAGTTCGATCCGGTGAAGGTCGCAGCAATGCAGGAAGAGGATGTCGAAAGACTGGTA
CAGGACGCCGGGATTATCCGCCATCGAGGGAAAATTCAGGCAATTATTGGTAATGCGCGG
GCGTACCTGCAAATGGAACAGAACGGCGAACCGTTTGTCGACTTTGTCTGGTCGTTTGTA
AATCATCAGCCACAGGTGACACAAGCCACAACGTTGAGCGAAATTCCCACATCTACGTCC
GCCTCCGACGCCCTATCTAAGGCACTGAAAAAACGTGGTTTTAAGTTTGTCGGCACCACA
ATCTGTTACTCCTTTATGCAGGCATGTGGGCTGGTGAATGATCATGTGGTTGGCTGCTGT
TGCTATCCGGGAAATAAACCATGA

The problem is that I have a file with a long list of accession numbers, one per line. I'm trying to write a bash script that runs "bp_fetch net::" on every line of the file and writes the resulting FASTA CDS records to a text file. I have tried something like:

for line in $file
do
    bp_fetch net::$line

But I'm having no success. Can I get some help?

Thanks in advance.

cds bioperl bash • 3.1k views
11.0 years ago

You can try something like this (assuming the accessions are listed, one per line, in the file accessions.list):

while read -r accession_number ; do bp_fetch "net::${accession_number}" ; done < accessions.list > results.txt

Note that the list of accessions is fed into the loop by the redirection at the end (< accessions.list), and the combined output of all the bp_fetch calls is collected in a new file (> results.txt).
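
If the list may contain blank lines, or if some lookups may fail, a slightly more defensive variant of the same loop could look like the sketch below. The function name fetch_accessions and the FETCH_CMD variable are made up for illustration (they are not part of bp_fetch), and the script assumes one accession per line.

```shell
#!/usr/bin/env bash
# FETCH_CMD is a hypothetical override hook (handy for testing); it defaults to bp_fetch.
FETCH_CMD="${FETCH_CMD:-bp_fetch}"

fetch_accessions() {
    local list_file="$1"
    local accession
    # IFS= and -r keep each line verbatim; the || clause also handles
    # a final line that lacks a trailing newline.
    while IFS= read -r accession || [ -n "$accession" ]; do
        # Skip empty lines so an empty "net::" lookup is never issued.
        [ -z "$accession" ] && continue
        # Redirect stdin so the fetch command cannot consume the rest of the list;
        # report failures on stderr and carry on with the next accession.
        "$FETCH_CMD" "net::${accession}" < /dev/null \
            || echo "failed to fetch: ${accession}" >&2
    done < "$list_file"
}
```

It would be called as: fetch_accessions accessions.list > results.txt 2> failed.log, which keeps the FASTA records and the list of failed accessions in separate files.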


Hey Frédéric, thanks a lot! Your script worked flawlessly! :)

11.0 years ago
lh3 33k
cat accessions.list | xargs -I{} echo bp_fetch net::{} | sh > results.txt

I use the above command line more often because I can check that the generated commands are correct (by leaving off the final "| sh") before feeding them to sh. The shorter version is:

xargs -I{} bp_fetch net::{} < accessions.list > results.txt
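
The "print the commands first, then pipe them to sh" idea can also be wrapped in a small helper. This is only a sketch: build_fetch_commands is a hypothetical name, and it assumes one accession per line with no single quotes in the accessions.

```shell
#!/usr/bin/env bash
# Print one bp_fetch command per accession without running anything.
build_fetch_commands() {
    # Drop blank lines, then wrap each accession in single quotes so that
    # sh treats it as one literal argument.
    sed -e '/^[[:space:]]*$/d' -e "s/.*/bp_fetch 'net::&'/" "$1"
}
```

Inspect the output with: build_fetch_commands accessions.list | head, and once it looks right, run it with: build_fetch_commands accessions.list | sh > results.txt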

I believe you can also give bp_fetch multiple accessions in the command line. In that case:

sed 's,^,net::,' accessions.list | xargs bp_fetch > results.txt
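
For a very long list it may be gentler to pass the accessions in fixed-size batches rather than all at once. The sketch below relies on the same unverified assumption that bp_fetch accepts several accessions per call; fetch_in_batches, FETCH_CMD, and the default batch size of 50 are arbitrary choices for illustration.

```shell
#!/usr/bin/env bash
# FETCH_CMD is a hypothetical override hook; it defaults to bp_fetch.
FETCH_CMD="${FETCH_CMD:-bp_fetch}"

fetch_in_batches() {
    local list_file="$1"
    local batch_size="${2:-50}"   # arbitrary default, adjust as needed
    # Drop blank lines, prefix each accession with net::, then let xargs
    # call the fetch command with at most batch_size arguments per call.
    sed -e '/^[[:space:]]*$/d' -e 's/^/net::/' "$list_file" \
        | xargs -n "$batch_size" "$FETCH_CMD"
}
```

Usage: fetch_in_batches accessions.list 20 > results.txt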