Automated sliding window analysis of % identity on a Clustal Omega DNA alignment
8.4 years ago
c.merrick • 0

I would like to align ~15 DNA sequences of 1-2 kb, then assess the degree of homology across the alignment using a sliding window. Doing this manually would be very laborious, even with quite large jumps between windows. Is there a facility to do such an analysis automatically with the Clustal Omega tool (or another tool)? Many thanks, Catherine Merrick

alignment • 3.0k views

Sorry, just a pet peeve, but "degree of homology" makes no sense. Sequences are either homologous or they are not, i.e. either they share a common ancestor or they do not! Check here for a way to get you started.


Thank you. (I take the point re semantics, but it doesn't stop biologists in common parlance saying things like 'highly homologous'!) I think I now have various ways of getting the measures I need (entropy scores), including this http://evolve.zoo.ox.ac.uk/Evolve/SHiAT.html and this https://www.ebi.ac.uk/thornton-srv/databases/cgi-bin/valdar/scorecons_server.pl, so I now need to bin them appropriately across the alignment.
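For the binning step, a minimal sketch in plain Python might look like the following. It does not use the output of either server linked above; as a stand-in it computes a per-column Shannon entropy itself and then averages the scores in fixed-size bins. The function names, the choice to ignore gap characters, and the bin size are all assumptions made for illustration.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of one alignment column.

    Assumption: gap characters ('-') are ignored rather than
    treated as a fifth symbol.
    """
    residues = [c for c in column.upper() if c != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    return -sum((k / total) * math.log2(k / total) for k in counts.values())


def binned_entropy(seqs, bin_size=50):
    """Average per-column entropy over consecutive bins of bin_size columns.

    seqs is a list of aligned sequences (equal-length strings).
    Returns a list of (bin_start_column, mean_entropy) tuples.
    """
    # zip(*seqs) walks the alignment column by column.
    scores = [column_entropy("".join(col)) for col in zip(*seqs)]
    bins = []
    for start in range(0, len(scores), bin_size):
        chunk = scores[start:start + bin_size]
        bins.append((start, sum(chunk) / len(chunk)))
    return bins
```

If the per-column scores come from another tool instead, the same loop over `scores` should apply; only the first line of `binned_entropy` would change.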

8.4 years ago
Whetting ★ 1.6k

Too many biologists do indeed say % homology, but that does not mean you should. It is wrong, not just a semantics issue...

Not tested and written on my tiny phone screen, but assuming you can get Biopython running, something like this should generate the input files for your favorite program.
It should write out a series of sub-alignments of length window, each starting step columns after the previous one.

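from Bio import AlignIO

window = 10  # number of alignment columns per window
step = 5     # number of columns to advance between windows

# Read the whole multiple alignment (FASTA format, gaps included)
alignment = AlignIO.read("alignment_in_fasta_format.fas", "fasta")
ncols = alignment.get_alignment_length()

for r in range(0, ncols, step):
    # Slice all rows, columns r..r+window, and write them as a FASTA file
    with open("window_%d.fas" % r, "w") as out:
        AlignIO.write(alignment[:, r:r + window], out, "fasta")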
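If the goal is a % identity value per window rather than a set of files, a rough sketch along the same lines could compute the mean pairwise identity directly. This is only one possible definition: it assumes columns where both sequences have a gap are skipped, and a gap opposite a base counts as a mismatch. The function names and default window sizes are made up for illustration, and the code works on plain strings so it does not depend on Biopython.

```python
from itertools import combinations


def pairwise_identity(a, b):
    """Percent identity between two aligned strings of equal length.

    Assumptions: columns where both sequences have a gap are skipped;
    a gap opposite a base counts as a mismatch.
    Returns None if no columns could be compared.
    """
    matches = compared = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" and y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else None


def sliding_identity(seqs, window=100, step=50):
    """Mean pairwise % identity in each full-length window.

    seqs is a list of aligned sequences (equal-length strings).
    Returns a list of (window_start_column, mean_identity) tuples.
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must all have the alignment length")
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in seqs]
        scores = [pairwise_identity(a, b) for a, b in combinations(chunk, 2)]
        scores = [s for s in scores if s is not None]
        mean = sum(scores) / len(scores) if scores else None
        results.append((start, mean))
    return results
```

With an alignment loaded through AlignIO as above, the input list could be built with `[str(rec.seq) for rec in alignment]`. For ~15 sequences there are 105 pairs per window, which should be fast at 1-2 kb.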