I've been asked to create an algorithm that will compare two genotyping platforms.
The algorithm would compare two genotypes and produce a message ("this is the same genotype, antiparallel, incompatible, inverted, ...").
Each allele can be encoded as: 'A', 'T', 'G', 'C', '0' (undefined), '-' (indel), or /[ATGC]{2,}/ (indel),
and all the combinations must be considered:
A1----B1
| \/ |
| /\ |
A2----B2
... but many combinations return the same message:
A/C vs G/T "antiparallel"
A/G vs C/T "antiparallel"
A/A vs A/A "same"
G/G vs G/G "same"
A/T vs A/A user-message-1
A/C vs C/T "inverted"
A/0 vs T/T user-message-2
Currently, my code is just an ugly series of nested 'else if' statements like
if(same(A0,B0) && same(A1,B1) && !revcomp(A0,A1)) { ... }
and I'm afraid some conditions have been missed. I would also like to be able to quickly change the way a pattern is handled.
Do you know of any design pattern to handle this problem? Would you have an elegant solution for this?
Pierre
PS: It looks like a problem for a "rules engine" like JBoss Drools, but I don't like this library (it requires too many dependencies).
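
A dependency-free version of the rules idea could be sketched as an ordered list of (predicate, message) pairs: the first matching rule wins, and a final catch-all guarantees that no combination is silently ignored. This is only a minimal sketch under assumptions. The names GenotypeRules, Genotype, Rule and compare are hypothetical, the definitions of "same" and "antiparallel" are inferred from the examples above, the "unclassified" message is a placeholder, and the rules for "inverted" and the user messages are not implemented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.BiPredicate;

class GenotypeRules {
    /** Hypothetical holder for two alleles; allele order is ignored. */
    static final class Genotype {
        final String[] alleles;
        Genotype(String a1, String a2) {
            alleles = new String[] { a1, a2 };
            Arrays.sort(alleles); // normalise so that A/C equals C/A
        }
        Genotype complement() {
            return new Genotype(revcomp(alleles[0]), revcomp(alleles[1]));
        }
        boolean sameAs(Genotype other) {
            return Arrays.equals(alleles, other.alleles);
        }
    }

    /** One rule: a condition on the two genotypes and the message it yields. */
    static final class Rule {
        final String message;
        final BiPredicate<Genotype, Genotype> condition;
        Rule(String message, BiPredicate<Genotype, Genotype> condition) {
            this.message = message;
            this.condition = condition;
        }
    }

    /** Reverse complement of an [ATGC]+ allele; '0' and '-' are returned unchanged (assumption). */
    static String revcomp(String allele) {
        if (!allele.matches("[ATGC]+")) return allele;
        StringBuilder sb = new StringBuilder();
        for (int i = allele.length() - 1; i >= 0; i--) {
            switch (allele.charAt(i)) {
                case 'A': sb.append('T'); break;
                case 'T': sb.append('A'); break;
                case 'G': sb.append('C'); break;
                default:  sb.append('G'); break; // only 'C' can reach here
            }
        }
        return sb.toString();
    }

    // Rules are tried in order, so changing how a pattern is handled means
    // editing or reordering one entry. Several rules may share a message.
    static final List<Rule> RULES = new ArrayList<>();
    static {
        RULES.add(new Rule("same", (a, b) -> a.sameAs(b)));
        RULES.add(new Rule("antiparallel", (a, b) -> a.complement().sameAs(b)));
        // Catch-all placeholder: every combination not covered above ends up here.
        RULES.add(new Rule("unclassified", (a, b) -> true));
    }

    static String compare(String a1, String a2, String b1, String b2) {
        Genotype a = new Genotype(a1, a2);
        Genotype b = new Genotype(b1, b2);
        for (Rule rule : RULES) {
            if (rule.condition.test(a, b)) return rule.message;
        }
        throw new IllegalStateException("no rule matched"); // unreachable with the catch-all
    }
}
```

With these rules, GenotypeRules.compare("A", "C", "G", "T") yields "antiparallel" and compare("A", "A", "A", "A") yields "same". Because the first match wins, an ambiguous pair such as A/T vs A/T is reported as "same" here; a dedicated rule placed earlier in the list would change that. Enumerating every allele combination and logging those that fall through to the catch-all is one way to check that no condition has been forgotten.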
I also think that this is the right way to go forward. I had to implement an algorithm similar to Pierre's (albeit much simpler) and also worked my way backwards in the way you described here.