Get location based on protein IDs from a GenBank file
8.7 years ago
Tony ▴ 10

Hello everyone. I am quite new here, and also new to Biopython.

I have a set of protein IDs, and I used them to extract locus_tags from a GenBank file. Is there a way to also extract the locations of those genes, i.e. the start and end positions, using this existing information?

For example, I have two files:

  1. GenBank file
  2. text file containing protein_ids

Using the protein ID file, I need to get the start and end positions of the corresponding genes from the .gbk file.

Many thanks, I really appreciate this service. :)

genome gene R python • 2.5k views

But I am looking for the gene location on the genome. In a GenBank file, start and end coordinates are given for each gene. Using the information I already have, i.e. the gene ID or locus tag, I would like to get the start and end coordinates of the corresponding genes.


Please add an example.

8.7 years ago
skbrimer ▴ 740

You should be able to write a script that reads the protein ID file and the GenBank file and either appends the coordinates to the protein file or writes a new one. To loop through the GenBank file, you can use this loop from one of my scripts:

from Bio import SeqIO

# gb_file and bed_file are placeholder paths; set them to your own files
with open(bed_file, "w") as out_bedfile:
    for record in SeqIO.parse(gb_file, "genbank"):
        for feature in record.features:
            if feature.type == 'CDS':
                # Biopython positions are 0-based and end-exclusive, like BED
                start = int(feature.location.start)
                stop = int(feature.location.end)
                try:
                    name = feature.qualifiers['gene'][0]
                except KeyError:
                    # some features only have locus tags
                    name = feature.qualifiers['locus_tag'][0]
                if feature.location.strand == -1:
                    strand = "-"
                else:
                    strand = "+"
                bed_line = record.id + "\t{0}\t{1}\t{2}\t500\t{3}\t{0}\t{1}\t50,205,50\n".format(start, stop, name, strand)
                out_bedfile.write(bed_line)

This should get you what you want. You can find the whole script here: More file parsing :) EDIT how do I make a fast and bed file from Genbank file - SOLVED
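To tie this back to the original question, here is a minimal sketch of how the same loop could be restricted to the protein IDs in the text file. It assumes the text file has one protein ID per line and that the CDS features carry a protein_id qualifier (pseudogenes often do not). The function names read_protein_ids, locate_proteins and main are made up for illustration; only SeqIO.parse and the feature attributes come from Biopython. Note that Biopython stores a 0-based start, so the sketch adds 1 to match the coordinates printed in the GenBank file.

```python
import csv
import sys


def read_protein_ids(path):
    """Return the set of protein IDs listed one per line in a text file."""
    with open(path) as handle:
        return {line.strip() for line in handle if line.strip()}


def locate_proteins(records, wanted_ids):
    """Yield (protein_id, locus_tag, record_id, start, end, strand) per matching CDS.

    start is converted to 1-based so that start and end match the
    coordinates shown in the GenBank file itself.
    """
    for record in records:
        for feature in record.features:
            if feature.type != "CDS":
                continue
            # CDS features without a protein_id (e.g. pseudogenes) are skipped
            protein_id = feature.qualifiers.get("protein_id", [None])[0]
            if protein_id not in wanted_ids:
                continue
            locus_tag = feature.qualifiers.get("locus_tag", [""])[0]
            start = int(feature.location.start) + 1  # Biopython start is 0-based
            end = int(feature.location.end)
            strand = "-" if feature.location.strand == -1 else "+"
            yield protein_id, locus_tag, record.id, start, end, strand


def main(gb_path, ids_path):
    """Print a tab-separated table of locations for the requested proteins."""
    # Imported here so locate_proteins can be reused without Biopython
    from Bio import SeqIO

    wanted = read_protein_ids(ids_path)
    writer = csv.writer(sys.stdout, delimiter="\t")
    writer.writerow(["protein_id", "locus_tag", "record", "start", "end", "strand"])
    for row in locate_proteins(SeqIO.parse(gb_path, "genbank"), wanted):
        writer.writerow(row)
```

You would call it as main("genome.gbk", "protein_ids.txt"), where both file names are placeholders. For a CDS split across several segments (a join(...) location), start and end here are the outermost coordinates of the whole feature.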
