Sequence extraction using Biopython
9.2 years ago

Hi.

I am a new user of Biopython. I need to extract some sequences from a FASTA file based on their coordinates. The file contains 10 chromosomes, and I need to extract, for example, the sequence at coordinates chr3:101456..105689. How do I do it? Thanks in advance!

genome • 3.5k views

People, this might be an old post, but from Bio import SeqIO does not seem to work. Is it deprecated?


Please use ADD COMMENT or ADD REPLY to respond to earlier posts, so that the thread remains logically structured and easy to follow. I have now moved your post, but as you can see the result is not optimal. Adding an answer should only be used for providing a solution to the question asked.

This probably would have been more appropriate as a separate question - since it's not the same as the initial question.

You wrote: "does not seem to work."

At the very minimum, provide the error message. We can't sit in front of your PC and see what is going on. Biopython certainly isn't deprecated; I suspect you simply haven't installed it. Read the installation instructions.

9.2 years ago
Asaf 10k

Read the fasta file:

from Bio import SeqIO

chrs = {}
for seq in SeqIO.parse("filename.fa", "fasta"):
    chrs[seq.id] = seq.seq

Extract:

chrs[chrname][from_pos:to_pos]

e.g., chrs['chr3'][101455:105689]

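Be aware that Python slices are 0-based and end-exclusive, like all indexing in Python: the start index is one less than the 1-based start coordinate, while the end index equals the inclusive end coordinate. That is why chr3:101456..105689 becomes [101455:105689].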
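To make the coordinate conversion concrete, here is a minimal sketch that turns a 1-based, inclusive region string into a Python slice. The helper names parse_region and extract_region are hypothetical (they are not part of Biopython), and the sketch assumes the region is written exactly in the name:start..end form used in the question. It works on any sliceable sequence, including the Seq objects stored in a dictionary like chrs.

```python
def parse_region(region):
    """Split 'name:start..end' (1-based, inclusive) into (name, start0, end).

    The returned pair (start0, end) can be used directly as a Python slice.
    """
    name, _, span = region.rpartition(":")
    start_text, _, end_text = span.partition("..")
    start, end = int(start_text), int(end_text)
    if not name or start < 1 or end < start:
        raise ValueError(f"invalid region: {region!r}")
    # The 1-based start shifts down by one; the inclusive end is already
    # the correct exclusive end for a slice.
    return name, start - 1, end


def extract_region(chrs, region):
    """Look up the chromosome by name and slice out the requested region."""
    name, start0, end = parse_region(region)
    return chrs[name][start0:end]
```

With a toy dictionary such as {"chr3": "ACGTACGTAC"}, extract_region(toy, "chr3:2..5") returns the second through fifth bases, "CGTA".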

9.2 years ago

Thanks, the program does work, but it prints the same sequence repeatedly. What should I do? Here's what I wrote:

from Bio import SeqIO
chrs={}
for seq in SeqIO.parse("brapav5.fa","fasta"):
    chrs[seq.id]=seq.seq
    print chrs['A01'][101455:105689]

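The print statement is inside the for loop, so it runs once for every record in the file. You should read the sequences only once and then retrieve regions as many times as you want after the loop has finished.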
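A minimal sketch of that fix, written for Python 3 (where print is a function, unlike in the snippet above): the loop only fills the dictionary, and any lookups happen after it has finished. The function name load_chromosomes is hypothetical; SeqIO.parse is used the same way as in the answer above.

```python
from Bio import SeqIO


def load_chromosomes(fasta_path):
    """Read the FASTA file once and map each record id to its sequence."""
    chrs = {}
    for record in SeqIO.parse(fasta_path, "fasta"):
        # Only store the sequence here; no printing inside the loop.
        chrs[record.id] = record.seq
    return chrs
```

With the file from the question, usage would look like chrs = load_chromosomes("brapav5.fa") followed by a single print(chrs["A01"][101455:105689]), which prints the region once. For large genomes, SeqIO.index("brapav5.fa", "fasta") gives a dictionary-like view of the records without loading every sequence into memory.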
