Hello,
I would like you to share your personal opinion about the current critical issues in the use of PSWMs. Pointers to specific literature are also welcome.
The first thing I teach biology students in bioinformatics class is my own variant of Dobzhansky's adage:
Nothing in bioinformatics makes sense except in the light of statistics.
This rule applies here as well (and should take care of Ian's complaint about PSWMs). The most critical issue is how to derive a proper cut-off value. A threshold can only be based on a statistical consideration of what can be expected by chance alone. In the simplest approximation, you could try to calculate the chance probability of a match from the amino acid or nucleotide distribution. However, I wouldn't recommend doing this. What usually works much better is to run the PSWM against a database of similar size that is guaranteed NOT to contain a relevant instance of your motif (e.g. bacterial sequences when scanning for a eukaryotic motif). If such a database is not available, you could try to create a random database by one of the available methods (taking into account things like runs of nucleotides/amino acids).
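To make the cut-off idea concrete, here is a minimal sketch in Python of how an empirical threshold could be derived from a negative database. Everything in it is an assumption for illustration: the toy count matrix, the function names (log_odds_matrix, window_scores, empirical_cutoff), the uniform background, the pseudocount and the 0.1% false-positive rate. The negative sequences are plain random DNA, which ignores runs of nucleotides; a real negative set or a more careful randomisation would be preferable.

```python
import math
import random

BASES = "ACGT"

def log_odds_matrix(counts, background=None, pseudocount=1.0):
    """Turn per-position base counts into log2-odds scores (a PSWM)."""
    background = background or {b: 0.25 for b in BASES}  # assumed uniform
    matrix = []
    for column in counts:
        total = sum(column.get(b, 0) for b in BASES) + 4 * pseudocount
        matrix.append({
            b: math.log2(((column.get(b, 0) + pseudocount) / total) / background[b])
            for b in BASES
        })
    return matrix

def window_scores(matrix, seq):
    """Score every window of motif width along seq (A/C/G/T only)."""
    width = len(matrix)
    return [
        sum(matrix[i][seq[start + i]] for i in range(width))
        for start in range(len(seq) - width + 1)
    ]

def empirical_cutoff(matrix, negative_seqs, false_positive_rate=0.001):
    """Return a cut-off such that at most the given fraction of
    negative windows score strictly above it."""
    scores = sorted(
        (s for seq in negative_seqs for s in window_scores(matrix, seq)),
        reverse=True,
    )
    if not scores:
        raise ValueError("negative set contains no scorable windows")
    allowed = int(false_positive_rate * len(scores))
    return scores[allowed]

# Toy example: a hypothetical 4-column motif and a random negative set.
rng = random.Random(0)
toy_counts = [
    {"A": 9, "C": 0, "G": 1, "T": 0},
    {"A": 0, "C": 10, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 1, "C": 0, "G": 0, "T": 9},
]
pswm = log_odds_matrix(toy_counts)
negatives = ["".join(rng.choice(BASES) for _ in range(2000)) for _ in range(5)]
cutoff = empirical_cutoff(pswm, negatives, false_positive_rate=0.001)
```

A window in the sequences of interest would then be reported as a hit only if its score is strictly above cutoff. The negative set should be of similar size to the database you actually scan, otherwise the expected number of chance hits does not carry over.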
I have a fairly negative opinion about using PSWMs for representing binding motifs. My primary objection, when scanning sequences, is that the 'hits' are highly dependent on the match cut-off. If I have used Weeder, for example, for motif discovery, I will use the dominant IUPAC (DNA) pattern to scan sequences of interest. The matches (even with IUPAC ambiguity) are much simpler to interpret.
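As a rough illustration of this kind of pattern scan, here is a minimal Python sketch that translates an IUPAC DNA pattern into a regular expression and reports every match. The function names (iupac_to_regex, find_matches) and the example pattern TGASTCA are made up for illustration and are not Weeder output; only the forward strand is scanned.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}

def iupac_to_regex(pattern):
    """Compile an IUPAC pattern; the lookahead allows overlapping matches."""
    body = "".join(IUPAC[symbol] for symbol in pattern.upper())
    return re.compile("(?=(" + body + "))")

def find_matches(pattern, seq):
    """Return (0-based start, matched substring) for each forward-strand hit."""
    regex = iupac_to_regex(pattern)
    return [(m.start(), m.group(1)) for m in regex.finditer(seq.upper())]

# Hypothetical pattern and sequence, for illustration only.
hits = find_matches("TGASTCA", "AATGACTCATTTGAGTCAGG")
# hits == [(2, "TGACTCA"), (11, "TGAGTCA")]
```

Every hit either matches the pattern or it does not, so there is no score cut-off to choose. The trade-off is that all matches are treated as equally good, whatever base fills an ambiguous position.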
Sorry, this is just my quick opinion on the subject. I would be interested to hear if anyone else has views on this issue as well.
What do you mean by "critical issues"? Please give some examples.
By "critical issues" I mean the limits of the approach and/or the advantages of other techniques over PSWMs for motif definition and discovery. One critical issue, for example, could be that matrices do not capture correlations between positions.