I would like to know whether it is possible to download the FASTA sequence for a PDB entry using Biopython.
Kind of a hacky solution (since it technically downloads the PDB file first), but here's something you can use as a one-liner:
$ wget -O - https://files.rcsb.org/download/1A80.pdb 2>/dev/null \
| python -c "import sys; from Bio import SeqIO; SeqIO.convert(sys.stdin, 'pdb-atom', sys.stdout, 'fasta')"
>1A80:A
TVPSIVLNDGNSIPQLGYGVFKVPPADTQRAVEEALEVGYRHIDTAAIYGNEEGVGAAIA
ASGIARDDLFITTKLWNDRHDGDEPAAAIAESLAKLALDQVDLYLVHWPTPAADNYVHAW
EKMIELRAAGLTRSIGVSNHLVPHLERIVAATGVVPAVNQIELHPAYQQREITDWAAAHD
VKIESWGPLGQGKYDLFGAEPVTAAAAAHGKTPAQAVLRWHLQKGFVVFPKSVRRERLEE
NLDVFDFDLTDTEIAAIDAMDPGDGSGRVSAHPDEVD
Just replace 1A80 in the wget link with whatever PDB ID you're interested in. SeqIO doesn't download the data itself, so you need to pass it the file somehow. I've elected to do this in the shell; you could also do it natively in Python, but it's more complicated (IMO).
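For what it's worth, here is a rough sketch of what the native Python route could look like. It assumes Biopython is installed and that the same files.rcsb.org URL pattern from the wget command keeps working. The function names pdb_url and pdb_to_fasta are made up for this example and are not part of Biopython.

```python
import io
import sys
import urllib.request


def pdb_url(pdb_id):
    """Build the same RCSB download URL that the wget one-liner uses."""
    return f"https://files.rcsb.org/download/{pdb_id}.pdb"


def pdb_to_fasta(pdb_id, out=sys.stdout):
    """Fetch a PDB file and write its chain sequences to `out` as FASTA.

    Hypothetical helper for illustration; returns the number of records written.
    """
    # Imported here so that pdb_url() is usable even without Biopython installed.
    from Bio import SeqIO

    # Download the PDB file into memory as text.
    with urllib.request.urlopen(pdb_url(pdb_id)) as response:
        text = response.read().decode("utf-8")

    # 'pdb-atom' builds one record per chain from the ATOM coordinates,
    # the same format name used in the shell one-liner.
    return SeqIO.convert(io.StringIO(text), "pdb-atom", out, "fasta")
```

Calling pdb_to_fasta("1A80") should print the same kind of output as the shell version. To write to a file instead, pass an open file handle as out.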
If you want to save it as a file, stick a redirect to a file at the end of the command:
(previous command)... > pdbsequence.fa
You can use NCBI's Unix E-utilities (Entrez Direct).
There was a post some time ago:
How download a sequence fasta from PDB using biopython / python?