How to use Python to align a 30 kb genome to a 30 kb reference, repeated 10 times?

4.0 years ago

d.s.account ▴ 10

The problem is as stated in the title. The two genomes are SARS-CoV-2 sequences from NCBI. I also want to obtain the variants. The sequences are highly similar, which I think should make the task easier.
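For the variant part, here is a minimal sketch that compares two already-aligned (gapped) sequences column by column. It does not depend on which aligner produced the alignment. The function name `call_variants`, the tuple layout of the output, and the choice to skip columns containing `N` are all assumptions made for illustration, not part of any library.

```python
def call_variants(ref_aln, qry_aln):
    """Compare two aligned (gapped) sequences column by column.

    Returns a list of (ref_pos, ref_base, qry_base) tuples. ref_pos is the
    1-based position in the ungapped reference. For insertions relative to
    the reference (ref_base == "-"), ref_pos is the last reference position
    before the insertion (0 if it precedes the first reference base).
    """
    if len(ref_aln) != len(qry_aln):
        raise ValueError("aligned sequences must have the same length")

    variants = []
    ref_pos = 0
    for r, q in zip(ref_aln.upper(), qry_aln.upper()):
        # Only non-gap reference columns advance the reference coordinate.
        if r != "-":
            ref_pos += 1
        if r == q:
            continue
        # Assumption: columns with an ambiguous base (N) are not reported.
        if "N" in (r, q):
            continue
        variants.append((ref_pos, r, q))
    return variants


# Toy example: one substitution, one insertion, one deletion.
print(call_variants("ACGT-ACGT", "ACCTGAC-T"))
```

Each gap column is reported on its own, so a multi-base indel shows up as several consecutive entries. Memory use is linear in the alignment length, so two 30 kb sequences should not be a problem.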

I had almost finished my script using Biopython, but an out-of-memory error appeared after I switched from my 20 bp toy sequence to the 30 kb genome. I then found that Biopython is more of a tool for gluing other programs together.

I tried MAFFT, and it runs, but I get a weird error (ApplicationError: Non-zero return code 1 from 'mafft Ref_andOneMore.fasta', message 'nthread = 0'). I am still experimenting with MAFFT, but I doubt that it is the right tool. Can somebody tell me which pipeline is best suited for this task?
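One way to see what MAFFT is actually complaining about is to call it through `subprocess` and print its whole stderr. The 'nthread = 0' text looks like the first line of MAFFT's normal progress log rather than the real error, so the true cause is probably further down in that output. The sketch below assumes `mafft` is on the PATH; the helper names `build_mafft_command`, `run_mafft` and `parse_fasta` are made up for this example.

```python
import subprocess


def build_mafft_command(fasta_path, threads=1):
    # --auto lets MAFFT pick a strategy; --thread sets the number of threads.
    return ["mafft", "--auto", "--thread", str(threads), fasta_path]


def run_mafft(fasta_path, threads=1):
    """Run MAFFT on a multi-FASTA file and return the aligned FASTA text."""
    result = subprocess.run(
        build_mafft_command(fasta_path, threads),
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        # MAFFT writes its log and error messages to stderr; show all of it.
        raise RuntimeError(
            f"mafft failed with exit code {result.returncode}:\n{result.stderr}"
        )
    return result.stdout  # the alignment is written to stdout


def parse_fasta(text):
    """Parse FASTA text into a dict mapping record name to sequence."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else ""
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


# Usage (requires MAFFT to be installed):
# aligned = parse_fasta(run_mafft("Ref_andOneMore.fasta"))
```

Repeating this for ten genomes is then a plain loop over ten input files, each containing the reference plus one genome.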

Alignment reference-alignment python • 927 views

updated 4.0 years ago by WouterDeCoster 48k • written 4.0 years ago by d.s.account ▴ 10

Why do you voluntarily make things harder by using these tools through Python? Can you not just use MAFFT or any other tool directly?

4.0 years ago by WouterDeCoster 48k
