Hello all,
This is a very general inquiry, just to brainstorm some ideas. I'm trying to create what I would like to call a "Prioritization Matrix". The goal is to give each gene cluster a score reflecting how interesting it is, where a higher score means higher priority (I am aware that this is very subjective and depends on what the user wants).
My matrix looks something like this:
Gene Cluster (GC) | Gene Cluster Family (GCF) | GC Type | Associated Product | Priority Score*
Example_GC1 | GCF134 | Terpene | Bacteriocin | 8
Example_GC2 | GCF134 | Saccharide | Bacteriocin | 8
Example_GC5 | GCF145 | Other | Penicillin | 5
- Gene Cluster: Biosynthetic gene cluster
- GCF: Group of gene clusters that have similar domains
- GC Type: Can vary from polyketides to terpenes, etc.
- Associated Product: Product known to be associated with the gene cluster
- Priority Score: How interested I am in investigating this gene cluster
I want to determine the priority score based on four main metrics:
- Characterized or Not Characterized: Whether or not the cluster has been characterized by LCMS (I have a database for this)
- Data Source: Whether or not it is present in NCBI
- BGC Class Type: I will give a higher score to the three classes I am most interested in. The score can vary according to interest
- Similar Domain or Not: Whether or not the clusters are part of the same GCF
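To make the idea concrete, one way these four metrics could be combined is a simple weighted sum. The following is only a minimal sketch: the weights, the set of preferred classes, the field names, the direction of each score (for example, rewarding clusters that are not yet characterized), and the example values for the LCMS and NCBI flags are all assumptions made up for illustration, not settled choices.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical weights (assumption): adjust to reflect actual priorities.
WEIGHTS = {
    "uncharacterized": 4,    # not yet characterized by LCMS
    "not_in_ncbi": 2,        # absent from NCBI
    "class_of_interest": 3,  # BGC class is one of the preferred classes
    "shared_gcf": 1,         # GCF contains more than one cluster
}

# Hypothetical set of the three classes of most interest (assumption).
PREFERRED_CLASSES = {"Terpene", "Saccharide", "Polyketide"}


@dataclass
class GeneCluster:
    name: str
    gcf: str
    gc_type: str
    characterized_by_lcms: bool
    in_ncbi: bool


def priority_score(gc, gcf_sizes):
    """Sum the weights of every metric that the cluster satisfies."""
    score = 0
    if not gc.characterized_by_lcms:
        score += WEIGHTS["uncharacterized"]
    if not gc.in_ncbi:
        score += WEIGHTS["not_in_ncbi"]
    if gc.gc_type in PREFERRED_CLASSES:
        score += WEIGHTS["class_of_interest"]
    if gcf_sizes[gc.gcf] > 1:
        score += WEIGHTS["shared_gcf"]
    return score


# Illustrative rows; the two boolean flags are invented for this example.
clusters = [
    GeneCluster("Example_GC1", "GCF134", "Terpene", False, False),
    GeneCluster("Example_GC2", "GCF134", "Saccharide", True, False),
    GeneCluster("Example_GC5", "GCF145", "Other", True, True),
]

# Count how many clusters fall into each GCF.
gcf_sizes = Counter(gc.gcf for gc in clusters)

scores = {gc.name: priority_score(gc, gcf_sizes) for gc in clusters}

# Cluster names ordered from highest to lowest priority.
ranked = sorted(scores, key=scores.get, reverse=True)

for name in ranked:
    print(name, scores[name])
```

Keeping the weights in a single dictionary would make it easy to change the relative importance of each metric as interests shift, without touching the scoring logic.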
If anyone has any ideas or suggestions, that would be great. :)