Which Alphabet Type Should I Use With Fasta Files In Biopython?
11.7 years ago
sameer ▴ 10

If I'm using the FASTA files from the link below, what Alphabet type should I use in Biopython? Would it be IUPAC.unambiguous_dna?

link to FASTA files: http://hgdownload.cse.ucsc.edu/goldenPath/hg19/chromosomes/?C=S;O=A

fasta biopython • 2.7k views
11.7 years ago
Peter 6.0k

This is a duplicate of your question to the Biopython mailing list: http://lists.open-bio.org/pipermail/biopython/2013-March/008415.html

My answer is here: http://lists.open-bio.org/pipermail/biopython/2013-March/008416.html

Essentially, I would use generic_dna rather than unambiguous_dna for now. The IUPAC alphabet object has a whitelist of expected letters, but current versions of Biopython do not enforce this in the sequence objects. That may change.

from Bio.Alphabet import generic_dna
from Bio.Alphabet.IUPAC import unambiguous_dna

You don't actually need to specify an alphabet at all, but telling Biopython the sequence is DNA will prevent some user errors.
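
To see why the stricter alphabet is a poor fit, it can help to look at which letters a file actually contains. The sketch below is plain Python, not a Biopython API: the helper names letter_counts and unexpected_letters and the tiny example record are made up for illustration. It assumes the unambiguous DNA whitelist is the four uppercase letters GATC; UCSC chromosome files typically also contain N (gaps) and lowercase letters (soft-masked repeats), which fall outside that whitelist.

```python
from collections import Counter

# Assumption: the IUPAC unambiguous DNA whitelist is the uppercase letters GATC.
UNAMBIGUOUS_DNA = set("GATC")


def letter_counts(fasta_lines):
    """Count sequence letters in FASTA text, skipping '>' header lines."""
    counts = Counter()
    for line in fasta_lines:
        line = line.strip()
        if line and not line.startswith(">"):
            counts.update(line)
    return counts


def unexpected_letters(counts, allowed=UNAMBIGUOUS_DNA):
    """Return the sorted letters that fall outside the allowed set."""
    return sorted(set(counts) - allowed)


# Hypothetical three-line record standing in for a real chromosome file.
example = [">chrTest (made-up record)", "ACGTNNNN", "acgtACGT"]
print(unexpected_letters(letter_counts(example)))  # ['N', 'a', 'c', 'g', 't']
```

For a real file, pass an open file handle instead of the list, for example `letter_counts(open("chr21.fa"))`, where the filename is only a placeholder.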


And as of Biopython 1.78, you can't specify the alphabet - Bio.Alphabet was removed.
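
For anyone on Biopython 1.78 or later, a minimal sketch of reading FASTA without Bio.Alphabet might look like the following. The record name and sequence are made up, and the text is parsed from an in-memory string only to keep the example self-contained. In these versions the molecule type is stored as an annotation on the record rather than as an alphabet on the sequence.

```python
from io import StringIO

from Bio import SeqIO  # requires Biopython to be installed

# Hypothetical FASTA text standing in for a downloaded chromosome file.
fasta_text = ">chrTest\nACGTNNacgt\n"

# No alphabet argument is passed; the sequence letters are read as-is.
records = list(SeqIO.parse(StringIO(fasta_text), "fasta"))

for record in records:
    # Record that this is DNA; some output formats (e.g. GenBank) expect this.
    record.annotations["molecule_type"] = "DNA"
    print(record.id, len(record.seq))
```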
