Get locus_tag list from gene list using genbank file and Biopython
10.3 years ago
spezza ▴ 50

I have a list of more than 100 gene names and want to get the corresponding locus_tags. Is there a way to do this using the genome's GenBank file and Biopython?

For example, the genbank file has

 /gene="murE"
 /locus_tag="BSUW23_07815"

If my list contains only murE, I'd like the script to print the corresponding locus_tag, BSUW23_07815.

genome annotation biopython genbank • 9.4k views

I'm having a similar problem.

If I have a list of gene names, how can I get all the information for each gene that is contained in the CDS and gene features of the .gbk file?

10.3 years ago
Peter 6.0k

Something like this should work:

from Bio import SeqIO

genbank_file = "example.gbk"  # insert your filename here
wanted = ["murE", ...]  # or load all your 100 genes from a file
for record in SeqIO.parse(genbank_file, "genbank"):
    for f in record.features:
        # only CDS features that carry a /gene qualifier are considered
        if f.type == "CDS" and "gene" in f.qualifiers:
            gene = f.qualifiers["gene"][0]  # qualifier values are lists
            if gene in wanted:
                print(gene, f.qualifiers["locus_tag"][0])

See also http://www.warwick.ac.uk/go/peter_cock/python/genbank/
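
For a longer gene list, the same loop can be wrapped in a function that returns a dictionary instead of printing. This is only a sketch: the function names locus_tags_for_genes and locus_tags_from_file are made up for illustration, and it assumes a CDS without a /gene or /locus_tag qualifier should simply be skipped.

```python
def locus_tags_for_genes(records, wanted):
    """Map each wanted gene name to the locus_tags found on its CDS features."""
    wanted = set(wanted)  # set membership tests are fast for long gene lists
    found = {}
    for record in records:
        for f in record.features:
            if f.type != "CDS":
                continue
            # Qualifier values are lists of strings; either key may be absent.
            gene = f.qualifiers.get("gene", [None])[0]
            locus = f.qualifiers.get("locus_tag", [None])[0]
            if gene in wanted and locus is not None:
                found.setdefault(gene, []).append(locus)
    return found


def locus_tags_from_file(genbank_file, wanted):
    """Hypothetical convenience wrapper that parses a GenBank file with Biopython."""
    from Bio import SeqIO  # imported here so the helper above has no hard dependency
    return locus_tags_for_genes(SeqIO.parse(genbank_file, "genbank"), wanted)
```

Calling locus_tags_from_file("example.gbk", ["murE"]) on the file from the question should give a dictionary with murE as key. Gene names that were not found are those in set(wanted) - set(result), which helps catch spelling differences between the list and the annotation.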


Thanks! I've come across your site before and found it quite helpful. I only wish there were more examples.


Hello, Peter. If I have a list of gene names and want to get all the information contained in the CDS and gene features of the .gbk file, how could I do that?
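
One possible approach, sketched along the lines of the answer above: keep the whole feature instead of only the locus_tag, then print every qualifier it carries. The function names features_for_genes and print_features are hypothetical, and the sketch assumes both the gene and CDS features carry a /gene qualifier with the name in your list.

```python
def features_for_genes(records, wanted, types=("gene", "CDS")):
    """Collect the features of the given types whose /gene qualifier is wanted."""
    wanted = set(wanted)
    hits = {}
    for record in records:
        for f in record.features:
            if f.type not in types:
                continue
            # Features without a /gene qualifier cannot be matched by name.
            name = f.qualifiers.get("gene", [None])[0]
            if name in wanted:
                hits.setdefault(name, []).append(f)
    return hits


def print_features(hits):
    """Print the type, location and every qualifier of each collected feature."""
    for name, feats in hits.items():
        for f in feats:
            print(name, f.type, f.location)
            for key, values in f.qualifiers.items():
                print("   ", key, "=", "; ".join(str(v) for v in values))


# Usage with Biopython (the filename is the placeholder from the answer above):
# from Bio import SeqIO
# hits = features_for_genes(SeqIO.parse("example.gbk", "genbank"), ["murE"])
# print_features(hits)
```

For a CDS, the printed qualifiers typically include product, protein_id and translation when the annotation provides them.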
