Extract The ORGANISM Line From A GenBank File Using A List Of Accession Numbers
12.3 years ago
Dph ▴ 10

I have a file with numerous accession numbers (listid.txt) and would like to extract the whole ORGANISM line (see the example record below) from a GenBank file for each accession.

LOCUS AB000106 1343 bp rRNA linear BCT 05-FEB-1999

DEFINITION Sphingomonas sp. 16S ribosomal RNA.

ACCESSION AB000106

VERSION AB000106.1 GI:1754587

KEYWORDS 16S rRNA.

SOURCE Sphingomonas sp.

ORGANISM Sphingomonas sp.

Bacteria; Proteobacteria; Alphaproteobacteria; Sphingomonadales; Sphingomonadaceae; Sphingomonas.

.....

Does somebody have an idea for a script to do this? (I'm a beginner in Python.) Thanks in advance, David

parsing genbank • 8.1k views

@David: What format is the genbank file in? I mean, do you have one file containing all the genbank records? Or do you have several genbank files? Also, do you have genbank files that are not in your listid.txt? What is the format of this txt file?

12.3 years ago

Say you have your GenBank records stored in a single file, all.genbank. Assuming you have Biopython installed, here's one solution:

from Bio import SeqIO

# Read all accession numbers into a set (fast lookups, blank lines skipped).
with open('listid.txt') as id_file:
    accession_numbers = {line.strip() for line in id_file if line.strip()}

# Iterate over each GenBank record and print the ones named in listid.txt.
with open('all.genbank') as fh:
    for gb_record in SeqIO.parse(fh, 'genbank'):
        acc = gb_record.annotations['accessions'][0]
        if acc in accession_numbers:
            organism = gb_record.annotations['organism']
            tax_line = '; '.join(gb_record.annotations['taxonomy'])
            print(acc, organism, tax_line)
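
If Biopython isn't available, the same idea can be sketched with the standard library only. This is not the Biopython approach above, just a rough line-by-line reader. It assumes the usual GenBank flat-file layout: `ACCESSION` at the start of a line, `ORGANISM` indented by two spaces, taxonomy on continuation lines indented past column 12, and `//` ending each record. The function name `extract_organisms` and the column spacing in the sample record are my own, made up for illustration (the sample is rebuilt from the record in the question). Organism names that wrap over several lines are not handled.

```python
def extract_organisms(lines, wanted):
    """Return (accession, organism, taxonomy) for each record whose
    primary accession is in `wanted` (hypothetical helper, not Biopython)."""
    results = []
    acc, organism, taxonomy, in_tax = None, None, [], False
    for line in lines:
        stripped = line.strip()
        if line.startswith("ACCESSION"):
            parts = stripped.split()
            acc = parts[1] if len(parts) > 1 else None
            in_tax = False
        elif line.startswith("  ORGANISM"):
            organism = stripped[len("ORGANISM"):].strip()
            in_tax = True
        elif line.startswith("//"):
            # End of record: keep it only if the accession was requested.
            if acc in wanted:
                results.append((acc, organism, " ".join(taxonomy)))
            acc, organism, taxonomy, in_tax = None, None, [], False
        elif in_tax:
            # Taxonomy continuation lines have an empty keyword column.
            if stripped and not line[:12].strip():
                taxonomy.append(stripped)
            else:
                in_tax = False
    return results


# Sample record rebuilt from the question; column spacing is assumed.
sample = """\
LOCUS       AB000106                1343 bp    rRNA    linear   BCT 05-FEB-1999
DEFINITION  Sphingomonas sp. 16S ribosomal RNA.
ACCESSION   AB000106
VERSION     AB000106.1  GI:1754587
KEYWORDS    16S rRNA.
SOURCE      Sphingomonas sp.
  ORGANISM  Sphingomonas sp.
            Bacteria; Proteobacteria; Alphaproteobacteria;
            Sphingomonadales; Sphingomonadaceae; Sphingomonas.
//
"""

for acc, organism, tax in extract_organisms(sample.splitlines(), {"AB000106"}):
    print(acc, organism, tax)
```

To run it on real files, pass an open file handle for all.genbank as `lines` and a set built from listid.txt as `wanted`. For anything beyond a quick check, Biopython's parser is the safer choice, since it deals with the format's edge cases.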

I'm currently trying to get this to work. Can you help? I'm also a newbie at python! Thanks!

12.2 years ago
Dph ▴ 10

Sorry for the belated answer, I was on holiday. Happy to get back and find a little script that is exactly what I needed. Thanks a lot for this, Andrzej!
