Hi group,
I'm trying to write a Python script that converts a file from FASTQ to FASTA, which I have:
from Bio import SeqIO
import sys
# grabbing the file and the name
seq_file = sys.argv[1]
labels = seq_file.split(".")
# converting the file from fastq to fasta
SeqIO.convert(seq_file,"fastq",labels[0]+".fasta","fasta")
That works with no problem, but now I would like to change the header of the FASTA file in the same script, and I'm stuck. When I add the SeqIO.parse function like this:
for seq_record in SeqIO.parse(labels[0]+".fasta","fasta"):
    seq_record.id = labels[0]  # renaming the pseudogene with the lab id
SeqIO.write(seq_record,labels[0]+".fasta","fasta")
I get an error saying seq_record is not defined, although I thought I had defined it, and the script fails. My understanding was that the script would first convert the file, creating the new FASTA file (which it does when the parse loop is not in there), and then parse that file.
So now I'm wondering whether that file is actually being produced, since the conversion is no longer the last step of the script. Do I need to make a temp file in order to do both actions in one script?
EDIT
Well, it works now, so if anyone would like to do something similar, here is my solution:
# this script is used to convert fastq files to fasta files
# then to rename the fasta ID with the sample ID from the lab
from Bio import SeqIO
import sys
# grabbing the file and the name
seq_file = sys.argv[1]
labels = seq_file.split(".")
# converting the file from fastq to fasta
SeqIO.convert(seq_file,"fastq",labels[0]+".fasta","fasta")
# taking the converted file and then changing the fasta header
for seq_record in SeqIO.parse(labels[0]+".fasta","fasta"):
    seq_record.id = labels[0]  # renaming the pseudogene with the lab id
SeqIO.write(seq_record, labels[0]+".fasta","fasta")
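On the temp-file question, one way to sidestep it is to rename the records while they stream out of the FASTQ file, so the FASTA file is written once and never read back. What follows is a minimal sketch, not the script above: the function names `sample_label` and `fastq_to_renamed_fasta` are made up for illustration, and it assumes every record in the file should receive the same lab ID.

```python
from pathlib import Path

from Bio import SeqIO


def sample_label(fastq_path):
    """Return the file name up to its first dot, e.g. 'sample7' for 'runs/sample7.fastq'."""
    return Path(fastq_path).name.split(".")[0]


def fastq_to_renamed_fasta(fastq_path, fasta_path, new_id):
    """Convert FASTQ to FASTA in one pass, replacing each record's ID.

    Returns the number of records written.
    """
    def renamed(records):
        for record in records:
            # Only the ID changes here; the description is left as parsed.
            record.id = new_id
            yield record

    # SeqIO.parse is lazy, so records are renamed and written one at a time.
    return SeqIO.write(renamed(SeqIO.parse(fastq_path, "fastq")), fasta_path, "fasta")
```

With the names from the script above, this could be called as `fastq_to_renamed_fasta(seq_file, labels[0] + ".fasta", labels[0])`. Taking the label from the file name rather than the whole path (as `sample_label` does) also avoids surprises when the input is given as something like `./runs/sample7.fastq`.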
No worries! Thank you for your help, and you are correct if you are using an older version. The latest version of Biopython, at least for the SeqIO.parse function, doesn't require the handle anymore. Now I'm just trying to figure out why it's renaming the header but keeping the old header as well.
Thank you again.
I also added your suggestion; it makes for better file control. I'm really bad about remembering to open and close files explicitly since my scripts are small. I just need to be more vigilant, and the handle helps with that.
I also found a post that said that, to completely remove the old header, I need to edit the old description as well. I will paste my final code below.
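For reference, the handle-based style can be sketched roughly as follows. This is an illustration with a hypothetical helper name (`rename_fasta_in_place`), not the code from the thread. It loads all records into memory inside a `with` block, so the input file is closed before the same path is reopened for writing.

```python
from Bio import SeqIO


def rename_fasta_in_place(fasta_path, new_id):
    """Give every record in a FASTA file the same ID, rewriting the file.

    Returns the number of records written.
    """
    # Read everything first; the with block closes the file when it ends.
    with open(fasta_path) as in_handle:
        records = list(SeqIO.parse(in_handle, "fasta"))

    for record in records:
        record.id = new_id

    # Only now reopen the same path for writing.
    with open(fasta_path, "w") as out_handle:
        return SeqIO.write(records, out_handle, "fasta")
```

Holding the whole file in a list is fine for small inputs like the ones described here; for large files, writing to a different output path while streaming would likely be the better fit.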
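The description step can be illustrated with a small in-memory sketch. The FASTQ text and the ID `LAB42` are made-up examples, and this is not necessarily the final code referred to above. When Biopython writes a FASTA header, it builds it from the record's `id` and `description`; the description parsed from the file still begins with the old ID, so the old text survives until the description is changed as well.

```python
from io import StringIO

from Bio import SeqIO

# Hypothetical single-read FASTQ input, held in memory instead of a file.
fastq_text = "@read1 instrument info\nACGT\n+\nIIII\n"
record = next(SeqIO.parse(StringIO(fastq_text), "fastq"))

# Changing only the ID leaves the old header text in the description.
record.id = "LAB42"
header_with_old_text = record.format("fasta").splitlines()[0]

# Clearing the description removes the text carried over from the old header.
record.description = ""
header_clean = record.format("fasta").splitlines()[0]

print(header_with_old_text)  # new ID followed by the old header text
print(header_clean)          # new ID only
```

Setting `record.description` to some other string (a sample note, for example) instead of the empty string should work the same way if extra text after the ID is wanted.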
Thank you again.