Passing Alignment Parameters Through ClustalwCommandline
14.5 years ago
Thaman ★ 3.3k

As I do more work on multiple sequence alignment using the ClustalW wrapper in Biopython, I am wondering how to pass parameters through ClustalwCommandline, which I have repeatedly failed to do.

The parameters in question:

  • Gap open penalty, gap extension penalty, no end gap (yes, no), gap distance, weight matrix (BLOSUM, PAM, etc.), type (DNA, protein), and other optional parameters.

Right now I am working with the default alignment settings provided by ClustalW. Since the defaults do not meet my needs, I would like to add parameters to the lines below.

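 import sys, subprocess
 from Bio import AlignIO
 from Bio.Align.Applications import ClustalwCommandline
 # Build the command line; only the input file is set, everything else is default
 cline = ClustalwCommandline("clustalw",
       infile="opuntia.fasta")
 # Run ClustalW; subprocess.call returns the program's exit code
 child = subprocess.call(str(cline),
       shell=(sys.platform!="win32"))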

Moreover, I would be grateful if you could explain how I can tell whether the input FASTA file contains nucleotide (DNA) or protein sequences.

Thanks for your interest

multiple python biopython clustalw • 6.0k views
14.5 years ago

I haven't tested this, but it all gets a lot clearer if you look at the source of Bio/Align/Applications/_Clustalw.py:

  • Gap open penalty: -gapopen
  • Gap extension penalty: -gapext
  • No end gap (yes, no): -endgaps
  • Gap distance: -gapdist
  • Weight matrix (BLOSUM, PAM, etc.): -matrix ["BLOSUM", "PAM", "GONNET", "ID"]
  • Type (DNA, protein): -type

Note: if you use IPython, you can easily view the source of a function or class; just type ClustalwCommandline??

In any case, note that all these options are available as attributes once you have created a wrapper object for the clustalw command line. For example:

>>> c = ClustalwCommandline(type='dna')
>>> dir(c)
['__class__', '__delattr__', '__dict__', '__doc__', '__getattribute__', '__hash__', 
'__init__', '__module__', '__new__', '__reduce__', '__reduce_ex__', '__repr__', 
'__setattr__', '__str__', '__weakref__', '_check_value', '_clear_parameter', 
'_get_parameter', '_validate', 'align', 'bootlabels', 'bootstrap', 'case', 'check', 
'clustering', 'convert', 'dnamatrix', 'endgaps', 'fullhelp', 'gapdist', 'gapext', 'gapopen', 
'helixendin', 'helixendout', 'helixgap', 'help', 'hgapresidues', 'infile', 'iteration', 
'kimura', 'ktuple', 'loopgap', 'matrix', 'maxdiv', 'maxseqlen', 'negative', 'newtree', 
'newtree1', 'newtree2', 'nohgap', 'nopgap', 'nosecstr1', 'nosecstr2', 'noweights', 'numiter', 
'options', 'outfile', 'outorder', 'output', 'outputtree', 'pairgap', 'parameters', 'profile', 
'profile1', 'profile2', 'program_name', 'pwdnamatrix', 'pwgapext', 'pwgapopen', 'pwmatrix', 
'quicktree', 'quiet', 'range', 'score', 'secstrout', 'seed', 'seqno_range', 'seqnos', 
'sequences', 'set_parameter', 'stats', 'strandendin', 'strandendout', 'strandgap', 
'terminalgap', 'topdiags', 'tossgaps', 'transweight', 'tree', 'type', 'usetree', 'usetree1', 
'usetree2', 'window']
>>> c.gapopen = -2
>>> print(c)
clustalw -type=dna -gapopen=-2
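
To make the mapping concrete, here is a minimal sketch in plain Python (no Biopython needed) of how keyword options turn into the -name=value form shown in the output above. build_clustalw_args is a hypothetical helper written for illustration, not part of Biopython, and the penalty values are arbitrary examples, not recommendations. It only builds the command; it does not run clustalw.

```python
import shlex


def build_clustalw_args(program="clustalw", **options):
    """Build an argument list for ClustalW from keyword options.

    Hypothetical helper for illustration only:
    - True becomes a bare switch, e.g. -endgaps
    - None and False are skipped
    - anything else is rendered as -name=value
    """
    args = [program]
    for name, value in options.items():
        if value is None or value is False:
            continue
        if value is True:
            args.append(f"-{name}")
        else:
            args.append(f"-{name}={value}")
    return args


# Illustrative values only; choose penalties that suit your own data.
args = build_clustalw_args(
    infile="opuntia.fasta",
    type="dna",
    gapopen=10,
    gapext=0.5,
    endgaps=True,
)

# Show the command as it would be typed in a shell.
print(shlex.join(args))
```

With the Biopython wrapper itself, the same names can be given as keyword arguments when creating the object, in the same way as infile= and type= in the examples above.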

Also try typing help(c) at the python prompt to find out more about the command line wrapper object you've just created.


OK, let me try it and see whether I have understood correctly. If not, I will ping you again. Thanks.


You are welcome... I don't want to be silly, but please consider voting up the answers that you find useful, even if you don't want to accept them :-)


Can you also answer my other query, about the type of the input FASTA file?


I think you should ask that as a separate question. In principle, neither clustalw nor Bio.Align.Applications from Biopython has tools to determine whether a sequence is protein or DNA.
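
As a rough starting point for that separate question, one common heuristic is to look at the fraction of letters that are nucleotide codes. The sketch below is an assumption-laden illustration, not a Biopython or clustalw feature: guess_sequence_type and read_fasta_sequences are hypothetical helpers, and the 0.9 threshold is an arbitrary choice. It will misclassify, for example, a short peptide made up only of A, C, G and T.

```python
NUCLEOTIDE_LETTERS = set("ACGTUN")


def read_fasta_sequences(lines):
    """Yield one sequence string per FASTA record (header lines start with '>')."""
    chunks = []
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            if chunks:
                yield "".join(chunks)
            chunks = []
        elif line:
            chunks.append(line)
    if chunks:
        yield "".join(chunks)


def guess_sequence_type(sequence, threshold=0.9):
    """Guess 'dna' or 'protein' from the fraction of nucleotide letters.

    Heuristic only: the threshold is an assumption, and gap or other
    non-letter characters are ignored.
    """
    residues = [ch for ch in sequence.upper() if ch.isalpha()]
    if not residues:
        raise ValueError("sequence contains no letters")
    nucleotide_count = sum(ch in NUCLEOTIDE_LETTERS for ch in residues)
    fraction = nucleotide_count / len(residues)
    return "dna" if fraction >= threshold else "protein"


# Made-up example records, not taken from a real file.
example = """>seq1
ATGCGTACGTTAGC
>seq2
MKVLAAGIVGLLLAQ
""".splitlines()

for seq in read_fasta_sequences(example):
    print(seq, guess_sequence_type(seq))
```

The returned string could then be passed as the type option discussed above, after checking by eye that the guess is sensible for your file.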


Okay, I will do that.
