Hello everyone,
I am supposed to write a function that takes the name of a FASTA file as an argument. When passed the file name, the function should read the file, discard the header, and return the sequence as a string. I am also being asked to raise a predefined exception (a subclass? it is defined before my code) if the sequence part of the file contains any character other than A, C, T, G, or U. In addition, all U nucleotides should be replaced by T in the returned string. I think I am on the right track, but I have no idea how to incorporate this subclass into my code so that it is raised when a letter is not A, C, T, G, or U. I am working with a small file before defining the function, and this is what I have so far:
This is defined before my code:
# Run this cell to define the exception
class BadSequenceException(Exception):
    pass
#my code:
file = open("sequence1.fasta")
all_lines = file.readlines()
sequences = []
with open('sequence1.fasta', 'r') as seq:
    sequence = ''
    for line in seq:
        if line.startswith('>'):
            sequences.append(sequence)
            sequence = ''
        else:
            sequence += line.strip()
def check (sequence, code="ATGCU"):
    for x in sequence:
        if x not in code:
            return False
    return sequence.replace("U","T")
check(sequence)
I presume that the exception must be raised where the return False is?
Also, BadSequenceException is a subclass of the class Exception and inherits all its functionality, right? Any guidance on this would be very much appreciated. Thank you so much.
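On the inheritance question, a quick way to see it is the small sketch below. It repeats the class definition from the assignment so that it runs on its own; the error message text is made up for illustration.

```python
# Same definition as in the assignment cell, repeated so this runs standalone.
class BadSequenceException(Exception):
    pass

# The class adds nothing of its own, so it behaves like Exception
# but can be caught by its own, more specific name.
print(issubclass(BadSequenceException, Exception))  # True

try:
    # The message here is only an example.
    raise BadSequenceException("X is not a valid nucleobase")
except BadSequenceException as err:
    message = str(err)

print(message)  # X is not a valid nucleobase
```

Because it inherits from Exception, it can be raised with a message and caught with try/except like any built-in exception.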
Hi! Is this the script that you use? If so, the def check part should be moved to the top.
Also, if you're running check after reading the entire file, I think you should run it as you read each line (before sequence += line.strip()).
https://stackoverflow.com/questions/23657545/classes-with-exception
Thank you so much for your help. I will look at it carefully once I get back home after work :)
That was very helpful thank you so much.
Indeed, instead of the return False you'd raise BadSequenceException(x + " is not a valid nucleobase") (or something like that).
In addition to that, do you really want to add the empty sequence (upon encountering the first sequence header >) to the set of sequences?
That makes sense and helped me a lot, thank you very much :)