Critical Issues in PSWM Usage
13.4 years ago
Anima Mundi ★ 2.9k

Hello,

I would like you to share your personal opinions about the current critical issues in the use of PSWMs. Pointers to specific literature are also welcome.

pssm subjective matrix matrix • 2.6k views

What do you mean by "critical issues"? Please give some examples.


By "critical issues" I mean the limits of the approach and/or the advantages of other techniques over PSWMs in motif definition/discovery. A critical issue, for example, could be that the matrices do not capture correlations between positions.
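
To make the correlation example concrete, here is a minimal sketch. It assumes a toy set of two-letter binding sites invented purely for illustration, in which the two positions are perfectly dependent. The helper names (`column_frequencies`, `independent_probability`) are hypothetical and not taken from any library.

```python
from collections import Counter

def column_frequencies(sites):
    """Per-position base frequencies, which is all a PSWM is built from."""
    width = len(sites[0])
    return [
        {base: count / len(sites)
         for base, count in Counter(site[i] for site in sites).items()}
        for i in range(width)
    ]

def independent_probability(freqs, word):
    """Probability of a word if every position is treated independently."""
    p = 1.0
    for column, base in zip(freqs, word):
        p *= column.get(base, 0.0)
    return p

# Toy sites (an assumption for illustration): only AT and TA ever occur.
sites = ["AT", "AT", "TA", "TA"]
freqs = column_frequencies(sites)
observed = Counter(sites)

for word in ["AT", "TA", "AA", "TT"]:
    print(word, observed[word] / len(sites), independent_probability(freqs, word))
```

In this toy case the column-wise model assigns the same probability to AA and TT, which were never observed, as to the real sites AT and TA, because the dependency between the two positions is lost once the sites are collapsed into columns.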

13.4 years ago
Lyco ★ 2.3k

The first thing I teach biology students in bioinformatics class is my own variant of Dobzhansky's adage:

Nothing in bioinformatics makes sense except in the light of statistics.

This rule applies here as well (and should address Ian's complaint about PSWMs). The most critical issue is how to derive a proper cut-off value. A threshold can only be based on a statistical consideration of what can be expected by chance alone. In the simplest approximation, you could try to calculate the probability from the amino acid or nucleotide distribution. However, I wouldn't recommend doing this. What usually works much better is to run the PSWM against a database of similar size that is guaranteed NOT to contain a relevant instance of your motif (e.g. bacterial sequences when scanning for a eukaryotic motif). If such a database is not available, you could try to create a random database by one of the available methods (taking into account things like runs of nucleotides/amino acids).
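
A rough sketch of the empirical cut-off idea follows, under several assumptions: the PSWM is a list of per-position log-odds dictionaries, a "hit" is a window scoring strictly above the cut-off, and the acceptable rate of chance hits (`false_positive_rate`) is something you choose yourself. All function names are hypothetical. The `shuffled` helper is only a plain letter shuffle; it preserves composition but NOT runs of nucleotides/amino acids, so a real negative database or a more careful randomization would be preferable.

```python
import math
import random

def log_odds_pswm(counts, background, pseudocount=1.0):
    """Convert per-position counts into log2-odds weights against a background."""
    pswm = []
    for column in counts:
        total = sum(column.values()) + pseudocount * len(background)
        pswm.append({
            base: math.log2(((column.get(base, 0) + pseudocount) / total) / background[base])
            for base in background
        })
    return pswm

def window_scores(pswm, sequence):
    """Score every window of motif width along one sequence."""
    width = len(pswm)
    return [
        sum(pswm[i][sequence[start + i]] for i in range(width))
        for start in range(len(sequence) - width + 1)
    ]

def shuffled(sequence, rng):
    """Simple letter shuffle; keeps composition only (an assumed fallback)."""
    letters = list(sequence)
    rng.shuffle(letters)
    return "".join(letters)

def empirical_cutoff(pswm, negative_sequences, false_positive_rate=0.001):
    """Pick a cut-off so that at most the chosen fraction of windows in the
    negative set scores strictly above it."""
    null_scores = sorted(
        score for seq in negative_sequences for score in window_scores(pswm, seq)
    )
    allowed = int(false_positive_rate * len(null_scores))  # chance hits tolerated
    return null_scores[len(null_scores) - allowed - 1]
```

With a negative set in hand (real sequences that cannot contain the motif, or shuffled copies of your targets), `empirical_cutoff(pswm, negatives)` returns a score, and only windows in the real data scoring above it would be reported. The threshold is only as good as the negative set, so it should be of similar size and composition to the data being scanned.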

13.4 years ago
Ian 6.1k

I have a fairly negative opinion about using PSWMs for representing binding motifs. My primary objection, when scanning sequences, is that the 'hits' are highly dependent on the match cut-off. If I have used Weeder, for example, for motif discovery, I will use the dominant IUPAC consensus (DNA pattern) to scan sequences of interest. The matches (even with IUPAC ambiguity) are much simpler to interpret.

Sorry this is just my quick opinion on this subject. I would be interested if anyone else has views on this issue as well.
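
For comparison, scanning with an IUPAC pattern could look something like the sketch below. It assumes the consensus is already available as a plain IUPAC string (it does not parse any motif-finder output), scans the forward strand only, and uses hypothetical helper names. The ambiguity table itself is the standard IUPAC nucleotide code.

```python
import re

# Standard IUPAC nucleotide ambiguity codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def iupac_to_regex(pattern):
    """Compile an IUPAC pattern; the lookahead lets overlapping matches be found."""
    body = "".join(IUPAC[symbol] for symbol in pattern.upper())
    return re.compile("(?=(" + body + "))")

def scan(pattern, sequence):
    """Return (start, matched_text) for every forward-strand match."""
    regex = iupac_to_regex(pattern)
    return [(m.start(), m.group(1)) for m in regex.finditer(sequence.upper())]
```

Here a site either matches or it does not, so there is no cut-off to tune. The price is that every allowed base at a position counts equally, whereas a PSWM can weight them.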


I share your concerns.
