7.5 years ago
s.i.lipworth
I have a list of positions of interest, e.g.:
10
20
1000
4000000
I want to extract the base call at each of these positions from a FASTA file using Biopython. This is what I have tried:
query_dic = {}
with open(line) as pos_file:
    for x in pos_file:
        for seq_record in SeqIO.parse(query_file, "fasta"):
            nuc = seq_record[x]
            query_dic[x] = nuc
The error message says 'invalid index' - what is wrong?
Steps: iterate over the FASTA records. First find the record for the right chromosome, then extract the base from that record's sequence.
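The two steps above could be sketched roughly as follows. The function names, the `chrom` argument and the 1-based position convention are assumptions for illustration, not something stated in the question. Note that the position is an int; a line read from a text file is a string and would have to be converted with int() first.

```python
def base_from_records(records, chrom, position):
    """Find the record whose id equals chrom and return its base at a 1-based position."""
    if position < 1:
        raise ValueError("positions are assumed to be 1-based")
    for record in records:
        if record.id == chrom:
            # Python indexing is 0-based, so shift the 1-based position down by one.
            return record.seq[position - 1]
    raise KeyError(f"no record with id {chrom!r}")


def base_from_fasta(fasta_path, chrom, position):
    """Hypothetical wrapper: stream the FASTA records with Biopython and look up one base."""
    from Bio import SeqIO  # Biopython; imported here so the helper above works without it

    return base_from_records(SeqIO.parse(fasta_path, "fasta"), chrom, position)
```

With a file on disk this might be called as base_from_fasta("genome.fasta", "chr1", 1000), where both the file name and the record id are placeholders.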
Does your FASTA file have one sequence in it, or many?
If one, you only need to open the FASTA file once, and you should use SeqIO.read for that. If many, you need to know which sequence each of the values x refers to. Perhaps SeqIO.index would be useful here for loading the relevant record from a multiple sequence FASTA file?
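For the single-sequence case, a minimal sketch might look like the following. It assumes the positions file has one integer per line and that the positions are 1-based; the function names are made up for illustration. Each line is converted with int() before indexing, since a line read from a file is a string such as "10\n" and cannot be used as an index directly.

```python
def parse_positions(lines):
    """Convert text lines such as '10\n' into integers, skipping blank lines."""
    return [int(line) for line in lines if line.strip()]


def bases_at(sequence, positions, one_based=True):
    """Return {position: base} for each position in an indexable sequence."""
    offset = 1 if one_based else 0  # assumption: the listed positions are 1-based
    result = {}
    for pos in positions:
        if pos - offset < 0:
            raise ValueError(f"position {pos} is out of range")
        result[pos] = sequence[pos - offset]
    return result


def bases_from_fasta(fasta_path, positions_path):
    """Read a single-record FASTA file once and look up every listed position."""
    from Bio import SeqIO  # Biopython; imported here so the helpers above work without it

    # SeqIO.read raises ValueError unless the file holds exactly one record.
    record = SeqIO.read(fasta_path, "fasta")
    with open(positions_path) as pos_file:
        positions = parse_positions(pos_file)
    return bases_at(record.seq, positions)
```

For a multi-sequence file, SeqIO.index(fasta_path, "fasta") returns a dictionary-like object keyed by record id, so the SeqIO.read call could be swapped for a lookup by id, provided each position comes with the id of the sequence it refers to.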