Output EMBOSS Needle consensus sequence Biopython/Python 2.7
9.2 years ago
dmenning • 0

I have a working Python script that runs EMBOSS Needle on two input files and outputs the expected pairwise alignment.

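from Bio.Emboss.Applications import NeedleCommandline
from Bio import AlignIO

# Build the EMBOSS needle command line for the two input FASTA files.
needle_cline = NeedleCommandline(asequence="2for.fasta", bsequence="2rev.fasta", gapopen=10, gapextend=0.5, outfile="0needle.fasta")
print(needle_cline)

# Append the command line to a log file.
with open("0needleout.txt", "a") as needle_fname:
    needle_fname.write('\n' + str(needle_cline))

# Run needle and show whatever it printed.
stdout, stderr = needle_cline()
print(stdout + stderr)

# Read the pairwise alignment that needle wrote (EMBOSS pair format, despite the .fasta name).
align = AlignIO.read("0needle.fasta", "emboss")
print(align)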

However, instead of the current output, which shows both sequences and where they overlap, I would like to output the complete consensus sequence, including the non-overlapping ends.

Current output:

UAF                0 --------------------------------------------------      0
UAR                1 TCCCTTCATTATTATCGGACAACTAGCCTCCATTCTCTACTTTACAATCC     50

UAF                0 --------------------------------------------------      0
UAR               51 TCCTAGTACTTATACCTATCGCTGGAATTATTGAAAACAGCCTCTTAAAG    100

UAF                1 ------------GTAGTATAGCAATTACCTTGGTCTTGTAAGCCAAAAAC     38
UAR              101 TGGAGAGTCTTTGTAGTATAGCAATTACCTTGGTCTTGTAAGCCAAAAAC    150

UAF               39 GGAGAATACCTACTCTCCCTAAGACTCAAGGAAGAAGCAACAGCTCCACT     88
UAR              151 GGAGAATACCTACTCTCCCTAAGACTCAAGGAAGAAGCAACAGCTCCACT    200

UAF               89 ACCAGCACCCAAAGCTAATGTTCTATTTAAACTATTCCCTGGTACATACT    138
UAR              201 ACCAGCACCCAAAGCTAATGTTCTATTTAAACTATTCCCTGGTACATACT    250

UAF              139 ACTATTTTACCCCATGTCCTATTCATTTCATATATACCATCTTATGTGCT    188
UAR              251 ACTATTTTACCCCATGTCCTATTCATTTCATATATACCATCTTATGTGCT    300

UAF              189 GTGCCATCGCAGTATGTCCTCGAATACCTTTCCCCCCCTATGTATATCGT    238
UAR              301 GTGCCATCGCAGTATGTCCTCGAATACCTTTCCCCCCCTATGTATATCGT    350

UAF              239 GCATTAATGGTGTGCCCCATGCATATAAGCATGTACATATTACGCTTGGT    288
UAR              351 GCATTAATGGTGTGCCCCATGCATATAAGCATGTACATATTACGCTTGGT    400

UAF              289 CTTACATAAGGACTTACGTTCCGAAAGCTTATTTCAGGTGTATGGTCTGT    338
UAR              401 CTTACATAAGGACTTACGTTCCGAAAGCTTATTTCAGGTGTATGGTCTGT    450

UAF              339 GAGCATGTATTTCACTTAGTCCGAGAGCTTAATCACCGGGCCTCGAGAAA    388
UAR              451 GAGCATGTATTTCACTTAGTCCGAGAGCTTAATCACCGGGCCTCGAGAAA    500

UAF              389 CCAGCAACCCTTGCGAGTACGTGTACCTCTTCTCGCTCCGGGCCCATGGG    438
UAR              501 CCAGCAACCCTTGCGAGTACGTGTACCTCTTCTCGCTCCGGGCCCATGGG    550

UAF              439 GTGTGGGGGTTTCTATGTTGAAACTATACCTGGCATCTG    477
UAR              551 GTGTGGGGGTTTCTATGTTGAAACTATACCTG-------    582

Desired output, ideally in .fasta format:

UAC                1 TCCCTTCATTATTATCGGACAACTAGCCTCCATTCTCTACTTTACAATCC     50
UAC               51 TCCTAGTACTTATACCTATCGCTGGAATTATTGAAAACAGCCTCTTAAAG    100
UAC              101 TGGAGAGTCTTTGTAGTATAGCAATTACCTTGGTCTTGTAAGCCAAAAAC    150
UAC              151 GGAGAATACCTACTCTCCCTAAGACTCAAGGAAGAAGCAACAGCTCCACT    200
UAC              201 ACCAGCACCCAAAGCTAATGTTCTATTTAAACTATTCCCTGGTACATACT    250
UAC              251 ACTATTTTACCCCATGTCCTATTCATTTCATATATACCATCTTATGTGCT    300
UAC              301 GTGCCATCGCAGTATGTCCTCGAATACCTTTCCCCCCCTATGTATATCGT    350
UAC              351 GCATTAATGGTGTGCCCCATGCATATAAGCATGTACATATTACGCTTGGT    400
UAC              401 CTTACATAAGGACTTACGTTCCGAAAGCTTATTTCAGGTGTATGGTCTGT    450
UAC              451 GAGCATGTATTTCACTTAGTCCGAGAGCTTAATCACCGGGCCTCGAGAAA    500
UAC              501 CCAGCAACCCTTGCGAGTACGTGTACCTCTTCTCGCTCCGGGCCCATGGG    550
UAC              551 GTGTGGGGGTTTCTATGTTGAAACTATACCTGGCATCTG    589

Any suggestions?
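
For reference, here is a rough sketch of the kind of merge being asked about. It is not an EMBOSS or Biopython feature. It walks the two gapped rows of the pairwise alignment column by column, keeps whichever base is not a gap, and writes the result as FASTA. The function names `merge_pairwise` and `to_fasta`, the record name "UAC", and the choice to write "N" where the two reads disagree are all assumptions made for illustration. The gapped rows would presumably come from `str(align[0].seq)` and `str(align[1].seq)` after the `AlignIO.read` call above.

```python
def merge_pairwise(row_a, row_b, gap="-", ambiguous="N"):
    """Merge two gapped alignment rows of equal length into one sequence.

    Hypothetical helper: where only one row has a base, that base is kept;
    where both agree, the shared base is kept; where they disagree, the
    `ambiguous` symbol is written (an assumption, not an EMBOSS rule).
    """
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have the same length")
    merged = []
    for a, b in zip(row_a, row_b):
        if a == gap and b == gap:
            continue              # column is empty in both rows, skip it
        elif a == gap:
            merged.append(b)      # only the second sequence covers this column
        elif b == gap:
            merged.append(a)      # only the first sequence covers this column
        elif a.upper() == b.upper():
            merged.append(a)      # both sequences agree
        else:
            merged.append(ambiguous)  # mismatch inside the overlap
    return "".join(merged)


def to_fasta(name, seq, width=60):
    """Format a sequence as a FASTA record wrapped at `width` columns."""
    lines = [">" + name]
    for i in range(0, len(seq), width):
        lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Toy rows standing in for str(align[0].seq) and str(align[1].seq).
    forward = "----ACGTACGT"
    reverse = "TTGGACGTAC--"
    print(to_fasta("UAC", merge_pairwise(forward, reverse)))
```

With the real alignment, the merged string could be written out with something like `open("0consensus.fasta", "w").write(to_fasta("UAC", merged))`, where the file name is again just a placeholder. EMBOSS also ships a `merger` program intended for merging two overlapping sequences, which may be worth a look before rolling your own.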

sequence alignment • 2.4k views