9.2 years ago
dmenning
I have a working Python script for EMBOSS Needle that takes its input from two files and outputs the expected pairwise alignment.
from Bio.Emboss.Applications import NeedleCommandline
from Bio import AlignIO
needle_fname = open("0needleout.txt", "a")
needle_cline = NeedleCommandline(asequence="2for.fasta", bsequence="2rev.fasta", gapopen=10, gapextend=0.5, outfile="0needle.fasta")
print(needle_cline)
needle_fname.write('\n' + str(needle_cline))
stdout, stderr = needle_cline()
print(stdout + stderr)
align = AlignIO.read("0needle.fasta", "emboss")
print(align)
needle_fname.close()
However, instead of the current output, which shows both sequences and where they overlap, I would like to output a single complete consensus sequence that also includes the non-overlapping ends.
Current output:
UAF 0 -------------------------------------------------- 0
UAR 1 TCCCTTCATTATTATCGGACAACTAGCCTCCATTCTCTACTTTACAATCC 50
UAF 0 -------------------------------------------------- 0
UAR 51 TCCTAGTACTTATACCTATCGCTGGAATTATTGAAAACAGCCTCTTAAAG 100
UAF 1 ------------GTAGTATAGCAATTACCTTGGTCTTGTAAGCCAAAAAC 38
UAR 101 TGGAGAGTCTTTGTAGTATAGCAATTACCTTGGTCTTGTAAGCCAAAAAC 150
UAF 39 GGAGAATACCTACTCTCCCTAAGACTCAAGGAAGAAGCAACAGCTCCACT 88
UAR 151 GGAGAATACCTACTCTCCCTAAGACTCAAGGAAGAAGCAACAGCTCCACT 200
UAF 89 ACCAGCACCCAAAGCTAATGTTCTATTTAAACTATTCCCTGGTACATACT 138
UAR 201 ACCAGCACCCAAAGCTAATGTTCTATTTAAACTATTCCCTGGTACATACT 250
UAF 139 ACTATTTTACCCCATGTCCTATTCATTTCATATATACCATCTTATGTGCT 188
UAR 251 ACTATTTTACCCCATGTCCTATTCATTTCATATATACCATCTTATGTGCT 300
UAF 189 GTGCCATCGCAGTATGTCCTCGAATACCTTTCCCCCCCTATGTATATCGT 238
UAR 301 GTGCCATCGCAGTATGTCCTCGAATACCTTTCCCCCCCTATGTATATCGT 350
UAF 239 GCATTAATGGTGTGCCCCATGCATATAAGCATGTACATATTACGCTTGGT 288
UAR 351 GCATTAATGGTGTGCCCCATGCATATAAGCATGTACATATTACGCTTGGT 400
UAF 289 CTTACATAAGGACTTACGTTCCGAAAGCTTATTTCAGGTGTATGGTCTGT 338
UAR 401 CTTACATAAGGACTTACGTTCCGAAAGCTTATTTCAGGTGTATGGTCTGT 450
UAF 339 GAGCATGTATTTCACTTAGTCCGAGAGCTTAATCACCGGGCCTCGAGAAA 388
UAR 451 GAGCATGTATTTCACTTAGTCCGAGAGCTTAATCACCGGGCCTCGAGAAA 500
UAF 389 CCAGCAACCCTTGCGAGTACGTGTACCTCTTCTCGCTCCGGGCCCATGGG 438
UAR 501 CCAGCAACCCTTGCGAGTACGTGTACCTCTTCTCGCTCCGGGCCCATGGG 550
UAF 439 GTGTGGGGGTTTCTATGTTGAAACTATACCTGGCATCTG 477
UAR 551 GTGTGGGGGTTTCTATGTTGAAACTATACCTG------- 582
Desired output, ideally in .fasta format:
UAC 1 TCCCTTCATTATTATCGGACAACTAGCCTCCATTCTCTACTTTACAATCC 50
UAC 51 TCCTAGTACTTATACCTATCGCTGGAATTATTGAAAACAGCCTCTTAAAG 100
UAC 101 TGGAGAGTCTTTGTAGTATAGCAATTACCTTGGTCTTGTAAGCCAAAAAC 150
UAC 151 GGAGAATACCTACTCTCCCTAAGACTCAAGGAAGAAGCAACAGCTCCACT 200
UAC 201 ACCAGCACCCAAAGCTAATGTTCTATTTAAACTATTCCCTGGTACATACT 250
UAC 251 ACTATTTTACCCCATGTCCTATTCATTTCATATATACCATCTTATGTGCT 300
UAC 301 GTGCCATCGCAGTATGTCCTCGAATACCTTTCCCCCCCTATGTATATCGT 350
UAC 351 GCATTAATGGTGTGCCCCATGCATATAAGCATGTACATATTACGCTTGGT 400
UAC 401 CTTACATAAGGACTTACGTTCCGAAAGCTTATTTCAGGTGTATGGTCTGT 450
UAC 451 GAGCATGTATTTCACTTAGTCCGAGAGCTTAATCACCGGGCCTCGAGAAA 500
UAC 501 CCAGCAACCCTTGCGAGTACGTGTACCTCTTCTCGCTCCGGGCCCATGGG 550
UAC 551 GTGTGGGGGTTTCTATGTTGAAACTATACCTGGCATCTG 589
Any suggestions?
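One possible approach, shown here only as a minimal sketch: walk the two gapped rows of the alignment column by column, take whichever row has a base where the other has a gap, and keep the shared base where both agree. The helper names `merge_alignment` and `format_fasta` are hypothetical (they are not part of Biopython or EMBOSS), and writing `N` at columns where the two reads disagree is an assumption; a quality-aware choice may be more appropriate for real reads.

```python
def merge_alignment(seq_a, seq_b, gap="-", mismatch="N"):
    """Merge two gapped, equal-length aligned rows into one ungapped string.

    Hypothetical helper: 'N' at mismatching columns is an assumption.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    merged = []
    for a, b in zip(seq_a, seq_b):
        if a == gap and b == gap:
            continue              # gap in both rows: nothing to emit
        elif a == gap:
            merged.append(b)      # only the second row covers this column
        elif b == gap:
            merged.append(a)      # only the first row covers this column
        elif a.upper() == b.upper():
            merged.append(a)      # both rows agree
        else:
            merged.append(mismatch)  # rows disagree: mark as ambiguous
    return "".join(merged)


def format_fasta(name, sequence, width=60):
    """Return a FASTA record as a string, wrapping the sequence at `width`."""
    lines = [sequence[i:i + width] for i in range(0, len(sequence), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


# Toy rows that mimic the overhanging ends in the alignment above.
forward = "----GTAGTATAGCATCTG"
reverse = "TCCCGTAGTATAG------"
consensus = merge_alignment(forward, reverse)
print(format_fasta("UAC", consensus))
```

With the script above, the two rows could be taken from the parsed alignment, for example `merge_alignment(str(align[0].seq), str(align[1].seq))`, and the result of `format_fasta` written to a file.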