Assign PWM to a Variable in Biopython
11.3 years ago
bioLife ▴ 50

I have a PWM (a position weight matrix with site-specific frequencies) for a motif. How can I use it in Biopython? The Bio.Motif module in Biopython does not seem to parse this PWM format directly.

Update: Here it is:

>consec1 
A [ 0.0726 0.3307 0.9284 0.4731 -0.0761 1.9941 -0.8980 -0.8980 ]
C [ 0.7140 0.9354 -0.0167 1.0279 1.1967 -0.7772 -0.8743 -0.8743 ]
G [ 0.5377 0.3913 0.7350 -0.0072 0.3856 -0.8254 -0.8254 1.6783 ]
T [ 0.3675 -0.1780 -0.4293 0.1086 -0.3954 -0.9879 2.3190 -0.9879 ]

I tried several approaches and got different errors. One was about the header; when I removed the header, I got: "UnboundLocalError: local variable 'inst' referenced before assignment"

The code I used:

>from Bio import Motif
>motif = Motif.read(open("consec1.pfm"),"jaspar-sites")

and I get the following error:

---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
/mnt/XI/home/png/ant/<ipython-input-25-d60b25d64c48> in <module>()
----> 1 m = Motif.read(open("consec1.pfm"),"jaspar-sites")

/usr/prog/python/2.6.6_gnu/lib/python2.6/site-packages/Bio/Motif/__init__.pyc in read(handle, format)
    121     iterator = parse(handle, format)
    122     try:
--> 123         first = iterator.next()
    124     except StopIteration:
    125         first = None

/usr/prog/python/2.6.6_gnu/lib/python2.6/site-packages/Bio/Motif/__init__.pyc in parse(handle, format)
     74             raise ValueError("Wrong parser format")
     75         else: #we have a proper reader
---> 76             yield reader(handle)
     77     else: # we have a proper reader
     78         for m in parser(handle).motifs:

/usr/prog/python/2.6.6_gnu/lib/python2.6/site-packages/Bio/Motif/__init__.pyc in _from_sites(handle)
     23
     24 def _from_sites(handle):
---> 25     return Motif()._from_jaspar_sites(handle)
     26
     27 _readers={"jaspar-pfm": _from_pfm,

/usr/prog/python/2.6.6_gnu/lib/python2.6/site-packages/Bio/Motif/_Motif.pyc in _from_jaspar_sites(self, stream)
    560             self.add_instance(inst)
    561
--> 562         self.set_mask("*"*len(inst))
    563         return self
    564

UnboundLocalError: local variable 'inst' referenced before assignment
-----------------------------------------------------------------------------------------

On another attempt, Python parsed the file without any error message, but the consensus sequence (motif.consensus()) did not match the matrix values. For that attempt I wrote the matrix as rows and columns of tab-separated numbers, without any letters or header.

The matrix then was:

0.0726 0.3307 0.9284 0.4731 -0.0761 1.9941 -0.8980 -0.8980
0.7140 0.9354 -0.0167 1.0279 1.1967 -0.7772 -0.8743 -0.8743
0.5377 0.3913 0.7350 -0.0072 0.3856 -0.8254 -0.8254 1.6783
0.3675 -0.1780 -0.4293 0.1086 -0.3954 -0.9879 2.3190 -0.9879

The code I used the second time:

>from Bio import Motif
>motif = Motif.read(open("consec1.pfm"),"jaspar-sites")
>motif.consensus()
Output: Seq('CCACCATT', IUPACUnambiguousDNA())

However, I was expecting CCACCATG (ending in G, not T).
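The expected consensus can be checked by hand without Biopython by taking the highest-scoring letter in each column. This is only a minimal sketch: it assumes the four rows are in A, C, G, T order (as in the labelled version of the matrix), and `consensus_from_scores` is a hypothetical helper name, not a Biopython function.

```python
# Rows copied from the question; the A, C, G, T row order is an assumption
# taken from the labelled version of the same matrix.
matrix = {
    "A": [0.0726, 0.3307, 0.9284, 0.4731, -0.0761, 1.9941, -0.8980, -0.8980],
    "C": [0.7140, 0.9354, -0.0167, 1.0279, 1.1967, -0.7772, -0.8743, -0.8743],
    "G": [0.5377, 0.3913, 0.7350, -0.0072, 0.3856, -0.8254, -0.8254, 1.6783],
    "T": [0.3675, -0.1780, -0.4293, 0.1086, -0.3954, -0.9879, 2.3190, -0.9879],
}


def consensus_from_scores(matrix):
    """Return the letter with the highest score in each column."""
    length = len(next(iter(matrix.values())))
    return "".join(
        max(matrix, key=lambda letter: matrix[letter][i]) for i in range(length)
    )


print(consensus_from_scores(matrix))  # CCACCATG
```

The last column is decided by G (1.6783), which is why a consensus ending in T suggests the file was not interpreted as this matrix.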

Thanks for the help.


Could you clarify: do you get an error from Biopython? If so, what is the error? Which version of Biopython do you have? Where did the PWM come from, and what format is it in?

See also: How to ask Good Questions on Technical and Scientific Forums


I found the PWM in a book, where it is given simply as rows and columns. I then tried to put it into a JASPAR-like format and parse it with Motif.read() from Biopython, but it doesn't work.


Just saying "it doesn't work" isn't enough for anyone to help you. At a minimum, please add the error message (traceback) Python gives you.

Also, it sounds like the problem is your input data is not exactly in the expected format - can you share your PWM file?
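A format mismatch would be consistent with the UnboundLocalError in the question. The "jaspar-sites" reader appears to expect FASTA-like records of site sequences rather than a numeric matrix, so with this input its reading loop may never assign `inst`. The sketch below is not Biopython's code; `last_site_length` is a hypothetical, simplified reader that only reproduces the same failure pattern.

```python
def last_site_length(lines):
    """Hypothetical reader expecting '>header' lines followed by a sequence."""
    it = iter(lines)
    for header in it:
        if not header.startswith(">"):
            break  # not a record header: stop reading
        inst = next(it).strip()  # 'inst' is only ever assigned here
    return len(inst)  # fails if the loop never reached the assignment


# A well-formed record works.
print(last_site_length([">site1", "ACGTACGT"]))

# A numeric row with no header leaves 'inst' unassigned.
try:
    last_site_length(["0.0726 0.3307 0.9284"])
except UnboundLocalError as err:
    print("UnboundLocalError:", err)
```

So the error points at the input not matching what the reader expects, not at a problem with the matrix values themselves.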


OK, better. Now how about the FULL error message, and the Python code used to try to load this example?


I edited your question to mark the sample file, code snippets, and error message with the "code" style (the icon with 0 and 1 in it) because otherwise it was impossible to read.

Regarding this part, "At that point, I created the matrix as columns and rows with tab separated numbers, without any other letters/headers." could you include that sample data file too please?


You can parse this file with the latest code in Bio.motifs:

>>> from Bio import motifs
>>> handle = open("consec1.pfm")
>>> motif = motifs.read(handle, 'jaspar')

You probably need Biopython 1.62b for this to work. Otherwise, use the latest version of Biopython from GitHub.
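If upgrading is not an option, the bracketed layout from the question is simple enough to read by hand. The following is only a rough sketch under that assumption: `parse_bracketed_matrix` and the `ROW` pattern are hypothetical names, not part of Biopython, and the parser handles only the exact layout shown in the question (an optional ">name" header, then one "LETTER [ numbers ]" row per nucleotide).

```python
import re

# Sample text copied from the question.
TEXT = """\
>consec1
A [ 0.0726 0.3307 0.9284 0.4731 -0.0761 1.9941 -0.8980 -0.8980 ]
C [ 0.7140 0.9354 -0.0167 1.0279 1.1967 -0.7772 -0.8743 -0.8743 ]
G [ 0.5377 0.3913 0.7350 -0.0072 0.3856 -0.8254 -0.8254 1.6783 ]
T [ 0.3675 -0.1780 -0.4293 0.1086 -0.3954 -0.9879 2.3190 -0.9879 ]
"""

# One row: a nucleotide letter, then whitespace-separated numbers in brackets.
ROW = re.compile(r"^([ACGT])\s*\[\s*(.*?)\s*\]$")


def parse_bracketed_matrix(text):
    """Return (name, {letter: [scores]}) for the layout shown above."""
    name = None
    rows = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            continue
        match = ROW.match(line)
        if match is None:
            raise ValueError(f"Unrecognised line: {line!r}")
        rows[match.group(1)] = [float(x) for x in match.group(2).split()]
    lengths = {len(values) for values in rows.values()}
    if set(rows) != set("ACGT") or len(lengths) != 1:
        raise ValueError("Expected four rows (A, C, G, T) of equal length")
    return name, rows


name, rows = parse_bracketed_matrix(TEXT)
print(name, len(rows["A"]))
```

The resulting dictionary keeps the scores as floats, so negative values survive unchanged; what to do with them next (for example, scoring sequences) depends on what the matrix values actually represent.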


Michiel - can you post that below as a suggested answer, rather than here as a comment? Does it give the expected consensus sequence?

10.0 years ago

Can you try using motility for your work?
