Question

Beginner in Python- translating DNA given in GenBank file format into its six reading frames as output

8.4 years ago

oki4 ▴ 10

Goal: write a program that translates a DNA sequence, given in a GenBank-format file called sequence.gb, into all six reading frames as output. We are given a template (starting code) to work with.

GenBank input file: http://web.njit.edu/~kapleau/teach/current/bnfo135/sequence.gb

My code:

from urllib.request import urlopen

## The dna2rna function converts a sequence of DNA, given as a
## parameter, and returns an RNA sequence.
def dna2rna(sequence):
    ## the codon2aa keys are lowercase, so work in lowercase
    ## and replace 't' with 'u' rather than 'T' with 'U'
    rna_seq = sequence.lower().replace('t', 'u')
    return rna_seq

codon2aa = {'aaa': 'K', 'aac': 'N', 'aag': 'K', 'aau': 'N',
            'aca': 'T', 'acc': 'T', 'acg': 'T', 'acu': 'T',
            'aga': 'R', 'agc': 'S', 'agg': 'R', 'agu': 'S',
            'aua': 'I', 'auc': 'I', 'aug': 'M', 'auu': 'I',

            'caa': 'Q', 'cac': 'H', 'cag': 'Q', 'cau': 'H',
            'cca': 'P', 'ccc': 'P', 'ccg': 'P', 'ccu': 'P',
            'cga': 'R', 'cgc': 'R', 'cgg': 'R', 'cgu': 'R',
            'cua': 'L', 'cuc': 'L', 'cug': 'L', 'cuu': 'L',

            'gaa': 'E', 'gac': 'D', 'gag': 'E', 'gau': 'D',
            'gca': 'A', 'gcc': 'A', 'gcg': 'A', 'gcu': 'A',
            'gga': 'G', 'ggc': 'G', 'ggg': 'G', 'ggu': 'G',
            'gua': 'V', 'guc': 'V', 'gug': 'V', 'guu': 'V',

            'uaa': '_', 'uac': 'Y', 'uag': '_', 'uau': 'Y',
            'uca': 'S', 'ucc': 'S', 'ucg': 'S', 'ucu': 'S',
            'uga': '_', 'ugc': 'C', 'ugg': 'W', 'ugu': 'C',
            'uua': 'L', 'uuc': 'F', 'uug': 'L', 'uuu': 'F'}
if __name__ == '__main__':
    with urlopen('https://web.njit.edu/~kapleau/teach/current/bnfo135/sequence.gb') as conn:
        data = conn.readlines()
    lines = [line.strip() for line in [datum.decode() for datum in data]]
    flag = False
    dna = ''

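for line in lines:
    ## if the flag is 'True', append the line to 'dna'.
    ## (a str has no append(); concatenate with += instead)
    if flag:
        dna += line
    ## if the word "ORIGIN" is in the line, set 'flag' to 'True'
    ## (this check must sit outside the 'if flag' block, or it never runs)
    if 'ORIGIN' in line:
        flag = True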

## gets rid of any non-dna character.
dna = dna.translate(str.maketrans('acgt', 'acgt', '0123456789 /'))

## calls the dna2rna function
rna = dna2rna(dna)

**## process the first 3 reading frames
for i in range(3):
    if rna[0:3] in codon2aa:**

    ## create a variable 'seq' and assign it the rna to process
    seq = ''
    amino = ''
    while len(seq) >= 3:
        ## use the codon2aa table to append an amino acid to 'amino'
        ## update 'seq' to the next codon
        pass
    print('--- Reading Frame %i ---' % (i+1), amino, sep='\n')
##
##    ## compute the reverse complement of 'rna' and assign the result
##    ## back into the 'rna' variable
##
##    ## process the next 3 reading frames. hint: just like the first 3
##    for i in range(3):
##        ## same as the first 3
##        print('--- Reading Frame %i ---' % (i+4), amino, sep='\n')
##

I would like to know if I'm on the right track so far. I'm also having trouble processing the first 3 reading frames (the bolded section) and would like some input. Thanks.

python GenBank • 4.6k views

updated 7.0 years ago by connor.cattafe21 • 0 • written 8.4 years ago by oki4 ▴ 10

Have you been instructed not to use a library like Biopython?

This can be accomplished pretty easily with Biopython: SeqIO to read the file and the built-in translate() method to translate the sequence.

8.4 years ago by Eric Lim ★ 2.2k

No, unfortunately I can't use Biopython.

8.4 years ago by oki4 ▴ 10
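
Without Biopython, the sequence can still be pulled out of the GenBank text by hand, since the nucleotides sit between the ORIGIN line and the closing // line. Here is a minimal sketch, assuming the file follows that standard layout. The function name extract_origin and the sample record are made up for illustration and do not come from the assignment template:

```python
def extract_origin(lines):
    """Return the sequence found between the ORIGIN line and the '//' terminator."""
    in_origin = False
    chunks = []
    for line in lines:
        stripped = line.strip()
        if stripped.startswith('//'):
            # '//' marks the end of the record
            break
        if in_origin:
            # keep letters only: drops the leading position numbers and the spaces
            chunks.append(''.join(ch for ch in stripped if ch.isalpha()))
        elif stripped.startswith('ORIGIN'):
            in_origin = True
    return ''.join(chunks).lower()


# hypothetical, made-up record fragment used only to exercise the function
sample = [
    'LOCUS       EXAMPLE        24 bp    DNA',
    'ORIGIN',
    '        1 atggccattg taatgggccg ctga',
    '//',
]
print(extract_origin(sample))
```

The ORIGIN check comes after the "in_origin" branch, so the ORIGIN line itself is never added to the sequence.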

Hi,

Did you ever find an answer to the project?

7.0 years ago by connor.cattafe21 • 0

The answer is probably here: C: Beginner in Python- translating DNA given in GenBank file format into its six re

7.0 years ago by WouterDeCoster 48k

Is that hyperlink meant to take me back to this page?

7.0 years ago by connor.cattafe21 • 0

It takes you to the comment made by Eric Lim, which suggests using translate() from Biopython.

7.0 years ago by WouterDeCoster 48k

The Biopython cookbook actually shows how to do this; instead of calling translate(), you could just look each codon up in your own table.

7.0 years ago by skbrimer ▴ 740
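
The idea in the last reply, stepping through the sequence three bases at a time and looking each codon up in your own table instead of calling translate(), might look roughly like this. It is only a sketch under a few assumptions. The function names reverse_complement, translate_frame and six_frames are hypothetical. The table is generated from a compact string of the standard genetic code rather than typed out, but uses the same lowercase RNA keys and '_' stop symbol as the question. Codons missing from the table (for example ones containing 'n') are shown as 'X', which is an arbitrary choice.

```python
from itertools import product

BASES = 'ucag'
# standard genetic code, one letter per codon in u/c/a/g order; '_' marks a stop
AMINO_ACIDS = 'FFLLSSSSYY__CC_WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG'
codon2aa = {''.join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def reverse_complement(rna):
    """Reverse the strand and swap each base for its partner (a<->u, c<->g)."""
    return rna[::-1].translate(str.maketrans('acgu', 'ugca'))


def translate_frame(rna, offset):
    """Translate 'rna' codon by codon, starting 'offset' bases in (0, 1 or 2)."""
    amino = ''
    # the range stops early enough that a trailing partial codon is skipped
    for start in range(offset, len(rna) - 2, 3):
        # 'X' for codons missing from the table is an arbitrary choice
        amino += codon2aa.get(rna[start:start + 3], 'X')
    return amino


def six_frames(dna):
    """Return six translations: three forward, then three on the reverse complement."""
    rna = dna.lower().replace('t', 'u')
    forward = [translate_frame(rna, i) for i in range(3)]
    rev = reverse_complement(rna)
    return forward + [translate_frame(rev, i) for i in range(3)]


# made-up example sequence, not taken from sequence.gb
for number, protein in enumerate(six_frames('atggccattgtaatgggccgctga'), start=1):
    print('--- Reading Frame %i ---' % number, protein, sep='\n')
```

The printing follows the '--- Reading Frame N ---' format from the question's template, with frames 4 to 6 read from the reverse complement.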