The professor I do research with has just proposed a new project for me, but if I accept we need to get it done in around 5 weeks, so I want to know what it will entail before I sign on. He wants to identify the gene responsible for regulating flower color (purple vs. white) in a species of mustard plant. We know that the gene(s) coding for pigment production are identical in white and purple plants; they are just not transcribed in white-flowered plants. He thinks the difference comes from a mutation in some kind of regulatory gene, because there aren't any other major differences in gene transcription. Our data set consists of four transcriptomes, each derived from about five individuals: one from white and one from purple plants in each of two populations. We think we should be able to find sites where white- and purple-flowered plants differ from each other but where each color is the same in both populations, but he doesn't know exactly how to do this. I'm sure I could write some kind of script to do this in an approximate manner, but I feel like the real thing would use an algorithm based on proper math and statistics. Does anyone know of any methods to accomplish this sort of thing?
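To make the "approximate script" idea concrete, here is a minimal sketch of that filter. It is only an illustration, not a statistical method. It assumes the reads have already been aligned and summarised as per-site allele read counts for each of the four pools. The input layout, the pool names (`pop1_white` and so on) and the depth and frequency thresholds are all hypothetical choices made for the example.

```python
from typing import Dict, List, Optional, Tuple

Site = Tuple[str, int]                 # (contig, position)
Counts = Dict[str, int]                # allele -> read count in one pool

# Hypothetical pool labels: two populations x two flower colors.
WHITE = ("pop1_white", "pop2_white")
PURPLE = ("pop1_purple", "pop2_purple")


def major_allele(counts: Counts, min_depth: int, min_frac: float) -> Optional[str]:
    """Return the allele that is (nearly) fixed in a pool, or None if unclear."""
    depth = sum(counts.values())
    if depth == 0 or depth < min_depth:
        return None                    # too few reads to make a call
    allele, n = max(counts.items(), key=lambda kv: kv[1])
    return allele if n / depth >= min_frac else None


def candidate_sites(
    pileup: Dict[Site, Dict[str, Counts]],
    min_depth: int = 10,               # arbitrary threshold, an assumption
    min_frac: float = 0.9,             # arbitrary threshold, an assumption
) -> List[Tuple[Site, str, str]]:
    """Sites where both white pools share one allele, both purple pools
    share another, and the two colors differ."""
    hits = []
    for site in sorted(pileup):
        pools = pileup[site]
        white = {major_allele(pools.get(p, {}), min_depth, min_frac) for p in WHITE}
        purple = {major_allele(pools.get(p, {}), min_depth, min_frac) for p in PURPLE}
        if None in white or None in purple:
            continue                   # at least one pool had no confident call
        if len(white) == 1 and len(purple) == 1 and white != purple:
            hits.append((site, next(iter(white)), next(iter(purple))))
    return hits


# Made-up counts, purely to show the behaviour of the filter.
example = {
    ("contig1", 120): {"pop1_white": {"A": 30}, "pop2_white": {"A": 25, "G": 1},
                       "pop1_purple": {"G": 40}, "pop2_purple": {"G": 22}},
    # Differs between populations rather than between colors: rejected.
    ("contig1", 450): {"pop1_white": {"C": 30}, "pop2_white": {"T": 28},
                       "pop1_purple": {"T": 35}, "pop2_purple": {"T": 31}},
    # Too little coverage in every pool: rejected.
    ("contig2", 77): {"pop1_white": {"A": 3}, "pop2_white": {"A": 2},
                      "pop1_purple": {"G": 4}, "pop2_purple": {"G": 1}},
}

print(candidate_sites(example))  # [(('contig1', 120), 'A', 'G')]
```

A hard-threshold filter like this ignores sampling noise in pools of about five individuals, which is exactly why something based on proper statistics would be preferable for the real analysis.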
Do flower colors segregate in a Mendelian fashion, with complete penetrance? That would provide evidence that there is a single locus controlling color. Failing that, what is the evidence that this is a single mutation? I suspect you're signing up for more than 5 weeks' worth of work.
Do you have a sequenced genome for the plant? What type of sequencing do you have?
Do you know the gene responsible for flower colour, and want to find its regulator? Is the data from microarrays or RNA-seq?
Hey! Maybe RNAi is playing a role in the colour regulation. In fact, RNAi was discovered in experiments on flower colour in petunias (http://en.wikipedia.org/wiki/RNA_interference#History_and_discovery), so I'm afraid it may be quite common in plants.
I agree with @David Quigley that this seems like more than 5 weeks of work.