My group and I have two lists of genes (with related gene networks): one from Arabidopsis and one from Vitis vinifera. The lists were obtained by expanding the ERF gene network in both organisms starting from expression profiles (so we also have gene networks for both organisms).
Our goal is to infer V. vinifera gene functions using sequence homology with Arabidopsis genes, but I'm struggling to find a decent pipeline for our task. Are there any pipelines or papers that describe a method to infer gene function using the material that we have?
Our first idea was to cluster the Arabidopsis genes and then transfer those clusters to Vitis, using the known homology between genes of the two species, but we do not know exactly how a gene list can be clustered.
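To make the idea concrete, here is a minimal sketch of the transfer step only, assuming the Arabidopsis genes are already clustered and a list of homolog pairs is available. The gene IDs, cluster labels, and the function name `transfer_clusters` are hypothetical placeholders, and the majority vote is just one possible rule, not an established method.

```python
from collections import Counter, defaultdict


def transfer_clusters(arabidopsis_clusters, homologs):
    """Give each Vitis gene the most common cluster among its Arabidopsis homologs.

    arabidopsis_clusters: dict mapping Arabidopsis gene ID -> cluster label
    homologs: iterable of (vitis_gene, arabidopsis_gene) pairs
    """
    votes = defaultdict(Counter)
    for vitis_gene, ath_gene in homologs:
        cluster = arabidopsis_clusters.get(ath_gene)
        if cluster is not None:  # skip homologs that were never clustered
            votes[vitis_gene][cluster] += 1
    # Keep the label with the most votes for each Vitis gene
    return {gene: counts.most_common(1)[0][0] for gene, counts in votes.items()}


# Hypothetical placeholder data, for illustration only
example_clusters = {"ath_A": "c1", "ath_B": "c1", "ath_C": "c2"}
example_homologs = [
    ("vv_1", "ath_A"),
    ("vv_1", "ath_B"),
    ("vv_1", "ath_C"),
    ("vv_2", "ath_C"),
    ("vv_3", "ath_X"),  # homolog without a cluster, so vv_3 gets no label
]
example_result = transfer_clusters(example_clusters, example_homologs)
```

Vitis genes whose homologs are all unclustered are simply left out of the result.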
See figure 2 and table 2 from https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0001326
It may help.
More than 700 articles cite the one above, so there should be some updates:
https://journals.plos.org/plosone/article/citation?id=10.1371/journal.pone.0001326
What do you mean by a pipeline for this?
You can literally just BLAST the V. vinifera sequence to find what it's similar to. You'd probably also want to look at InterProScan and some HMM methods (e.g. HHpred) to identify functional domains within the sequence.
If the databases don't contain informative hits, though, you have to go into the lab.
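As a rough illustration of the BLAST step, the sketch below picks the best Arabidopsis hit for each V. vinifera query from tabular BLAST output. It assumes the default 12-column tabular layout (as produced by BLAST+ with `-outfmt 6`), where column 11 is the e-value and column 12 the bit score. The function name `best_hits`, the `1e-5` cutoff, and the sequence IDs are assumptions made for the example.

```python
import csv
import io


def best_hits(handle, max_evalue=1e-5):
    """Return {query_id: subject_id} keeping the highest bit score per query."""
    best = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue > max_evalue:
            continue  # discard weak hits (the cutoff is an arbitrary assumption)
        if query not in best or bitscore > best[query][1]:
            best[query] = (subject, bitscore)
    return {query: subject for query, (subject, _) in best.items()}


# Hypothetical tabular output, for illustration only
example_rows = [
    "vv_1\tath_A\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-50\t300",
    "vv_1\tath_B\t60.0\t180\t72\t0\t1\t180\t5\t184\t1e-20\t150",
    "vv_2\tath_C\t30.0\t50\t35\t0\t1\t50\t1\t50\t2e-3\t35",
]
example_best = best_hits(io.StringIO("\n".join(example_rows)))
```

Queries with no hit below the cutoff (here `vv_2`) are absent from the result; those are the ones where domain searches or lab work would have to take over.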