When I use BioPython to create a sequence iterator, I find that any characters after the first space (" ") in the header are ignored. For instance:
If file.fasta is:
>A Header With Spaces
ATCGATCGATGC
The following code:
from Bio import SeqIO
for sequence in SeqIO.parse("file.fasta", "fasta"):
    print(sequence.id)
Will print:
A
Is there a way to get the full sequence id while still taking advantage of BioPython's utility?
By convention, the ID in a FASTA header is everything that comes before the first whitespace character, so this behaviour is expected.
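If you need the full header, Biopython keeps the entire title line (minus the leading ">") in each record's `description` attribute. Here is a minimal sketch, assuming the same content as file.fasta in the question. It reads from an in-memory string so it runs on its own, and `fasta_text` is just an illustrative name:

```python
from io import StringIO

from Bio import SeqIO

# Same content as file.fasta in the question, held in memory
# so the sketch does not depend on a file on disk.
fasta_text = ">A Header With Spaces\nATCGATCGATGC\n"

records = list(SeqIO.parse(StringIO(fasta_text), "fasta"))
for record in records:
    print(record.id)           # first whitespace-delimited word of the header
    print(record.description)  # whole header line, without the leading ">"
```

With a real file you can pass the filename straight to the parser, e.g. `SeqIO.parse("file.fasta", "fasta")`, and read `record.description` the same way.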