Python - Alignment highlighting mismatch characters
10.4 years ago
st.ph.n ★ 2.7k

Greetings, I have some protein sequences that I want to align, highlighting the mismatches, using Python. Here's what I have so far (after creating a list of the AA seqs):

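from termcolor import colored  # third-party package: pip install termcolor

# Pad sequences with "X" so that every sequence has len(VJ) characters
new = []
for seq1 in allkeeps_sorted:
    if len(seq1) != len(VJ):
        xlen = (len(VJ) - len(seq1)) * "X"
        print(xlen)
        newseq = seq1 + xlen
        new.append(newseq)
    else:
        new.append(seq1)

# Color each residue: red if it differs from VJ at that position, white otherwise
align = []
for seq1 in new:
    for x in range(len(VJ)):
        if seq1[x] != VJ[x]:
            match = colored(seq1[x], "red")
        else:
            match = colored(seq1[x], "white")
        align.append(match)

print(align)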

In the list "seqs", some of the AA sequences are shorter than the "VJ" sequence, so I attempted to add "X"s to the end of those sequences to make every string in the list the same length as VJ, so that mismatches can be highlighted position by position. However, this did not work. I'm doing this in a Linux terminal, and I want the AAs that mismatch the VJ sequence to be colored red.

All help is appreciated.

mismatch python colored-text alignment • 4.5k views

I should probably stop asking this question: Why reinvent the wheel? Why not use existing MSA software (like Clustal) that can be customized in a multitude of ways?


Also, I'd ideally check if len(seq1)<len(VJ) and not use a !=. Just erring on the side of caution.
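
A minimal sketch of that guard, assuming the goal is to right-pad shorter sequences with "X" up to the length of the reference. The helper name pad_to_reference and the example reference string are made up for illustration; sequences that are already as long as (or longer than) the reference are returned unchanged.

```python
def pad_to_reference(seq, reference, pad_char="X"):
    """Right-pad seq with pad_char up to len(reference); never shorten it."""
    if len(seq) < len(reference):
        return seq.ljust(len(reference), pad_char)
    return seq


VJ = "YNQKFKGKATLT"  # made-up reference sequence, for illustration only
print(pad_to_reference("YNQKF", VJ))  # YNQKFXXXXXXX
```

With != instead of <, a sequence longer than VJ also enters the padding branch; multiplying "X" by a negative number yields an empty string, so nothing breaks, but the < check states the intent more clearly.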

8.3 years ago
mg0010 ▴ 10

ResCons does this. It is cross-platform software that can highlight mismatching characters in an alignment and show the results in any modern browser. It is available for download from https://github.com/ManavalanG/ResCons/releases


That's a nice tool - and a smart plug :)

Maybe include a few screenshots on your page?

10.4 years ago
Ram 44k

The code looks good. Are you sure the Terminal is not suppressing the colors?


I know I can run Clustal from within Python via Biopython and get FASTA output. I just wanted to do it with Python code alone, since it may serve future purposes.

When the line "print align" is executed, I get a list containing way too many of these:

'\x1b[37mY\x1b[0m', '\x1b[37mN\x1b[0m', '\x1b[37mQ\x1b[0m', '\x1b[37mK\x1b[0m', '\x1b[37mF\x1b[0m', '\x1b[37mK\x1b[0m', '\x1b[31mD\x1b[0m', '\x1b[37mK\x1b[0m', '\x1b[37mA\x1b[0m', '\x1b[37mT\x1b[0m', '\x1b[37mL\x1b[0m', '\x1b[37mT\x1b[0m', '\x1b[31mI\x1b[0m', '\x1b[37mD\x1b[0m', '\x1b[37mK\x1b[0m', '\x1b[37mS\x1b[0m', '\x1b[37mS\x1b[0m', '\x1b[31mC\x1b[0m', '\x1b[37mT\x1b[0m', '\x1b[37mA\x1b[0m', '\x1b[37mH\x1b[0m', '\x1b[37mM\x1b[0m', '\x1b[37mE\x1b[0m', '\x1b[37mL\x1b[0m', '\x1b[37mR\x1b[0m', '\x1b[37mS\x1b[0m', '\x1b[37mL\x1b[0m', '\x1b[37mT\x1b[0m', '\x1b[37mS\x1b[0m', '\x1b[37mE\x1b[0m', '\x1b[37mD\x1b[0m',

So I'm not even getting the list in white. (way too many = more items than in seqs list). I tested the termcolor module out on the python shell. It does work.
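
The output above is what Python shows when a list is printed: each element is displayed with its repr, so the escape sequences appear as literal \x1b[...m text instead of being interpreted by the terminal. The list also likely has more items than "seqs" because every residue of every sequence is appended to the same list. A minimal sketch of one way around both, assuming one colored line per sequence is wanted. It writes raw ANSI codes rather than calling termcolor, so it has no third-party dependency; highlight_mismatches and the example sequences are hypothetical.

```python
RED = "\x1b[31m"   # ANSI code: red foreground
RESET = "\x1b[0m"  # ANSI code: reset all attributes


def highlight_mismatches(seq, reference):
    """Return seq as a single string with mismatching residues wrapped in red."""
    pieces = []
    for i, residue in enumerate(seq):
        # Positions beyond the end of the reference count as mismatches.
        if i >= len(reference) or residue != reference[i]:
            pieces.append(RED + residue + RESET)
        else:
            pieces.append(residue)
    return "".join(pieces)


VJ = "YNQKFKGKATLT"                      # made-up reference
seqs = ["YNQKFKDKATLT", "YNQKFXXXXXXX"]  # made-up, already padded to len(VJ)
for seq in seqs:
    # Printing a string (not a list) lets the terminal interpret the codes.
    print(highlight_mismatches(seq, VJ))
```

Matching residues are left in the terminal's default color here instead of being forced to white, which should also behave better on light backgrounds.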


Python is encountering some characters that it's unable to handle. That would be my most educated guess. Does the sequence go "YNQK...."?


They are all normal AA seqs: mostly caps, some lowercase chars.


Check the header line, maybe?


There are no headers in the input file, no FASTA. Just a list of seqs.


Which module is the "colored" method located in?


So, I checked. The colored() method of termcolor seems to output those weird characters. Lemme try and solve it.


The termcolor package is crap. Doesn't work at all. :(


The \x1b is the ESC (escape) control character (ASCII 27). It's non-printable, so I guess that's why it shows up escaped when the list is printed.


It appears to be ANSI code
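
To make that concrete: in an item such as '\x1b[31mD\x1b[0m', \x1b[31m switches the foreground to red, D is the residue, and \x1b[0m resets the attributes (37 in place of 31 would be white). A small sketch showing the difference between the repr and the printed string, plus a hypothetical strip_ansi helper for recovering the plain residues. The regular expression only covers simple color codes of this form, which is an assumption about the input.

```python
import re

# Matches simple ANSI color/attribute sequences such as "\x1b[31m" or "\x1b[0m".
ANSI_SGR = re.compile(r"\x1b\[[0-9;]*m")


def strip_ansi(text):
    """Remove ANSI color codes, leaving only the visible characters."""
    return ANSI_SGR.sub("", text)


colored_d = "\x1b[31mD\x1b[0m"
print(repr(colored_d))        # the escaped form, as seen inside a printed list
print(colored_d)              # a red "D" on a terminal that supports ANSI colors
print(strip_ansi(colored_d))  # the bare residue
print(ord("\x1b"))            # 27, the ESC control character
```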
