During a discussion with my supervisor about a goal to publish a paper within the next 3 months, she told me we may not have time to do any experimental validation. I've only just finished my first year as a postgraduate, so I'm very new as a bioinformatician. I have always assumed that all bioinformatics projects require experimental validation before they will be accepted for publication. However, my supervisor says no, there are lots of publications that basically say "this is the data, this is the pipeline we developed, this is what we found", with no experimental validation of the findings.
I'd appreciate thoughts from some of the more experienced Biostars. I know we can't put a number on it, but is it really that common for bioinformatics papers to be published without experimental validation?
The pipeline calculates tissue specificity of genes based on expression data from many tissues. The future goal is to use it as part of a larger project to predict suitable targets for gene editing/phenotyping.
I really like that, Pierre, thank you, especially the reply from "Drecate":
"Biological validation doesn't necessarily mean that you need to do experiments to verify the computational results. There are many computational papers published without any accompanying experimental results (of course, it would be fantastic if you can provide them, either by yourself or in collaboration with experimentalists). The real question here is whether you provide a biological context in which to put and assess your work. No matter whether your work is experimental or quantitative, you need to demonstrate that you understand the previous work done on the biological system that you are trying to study: what has been discovered, what the interesting questions are, how your results build on/confirm/disprove previous work, etc. You need to demonstrate that your work is relevant to the biologists working in the field, in the sense that it attempts to address the relevant biological questions (or asks a new question that, despite its importance, has never been considered) and provides unique insight that is difficult if not impossible to obtain from experiments. Biologists are not interested in theory/computation for its own sake, and the failure to connect such work to the experimental reality is one of the biggest stumbling blocks for people with a "hard science" background working in biology."
This is heavily dependent on the type of story you want to tell and the type of data you have. At least for the sorts of things I've usually been involved in, bioinformatics has just provided a piece of the overall story we wanted to show. For example, you might do RNA-seq and then downstream functional assays in the wet lab, depending on the results. The counterexample would be a lot of current Hi-C papers, which tend to show things changing or not over time. That's then the final functional assay, and there's typically no validation because there's not a great way to even do that (on occasion you see people doing FISH, but that's about it).
The other thing to consider is the journal you're targeting. If you're going for something in Cell, then you're going to need a very thorough story, which will pretty much necessitate some wet-lab experiments just to tell it.
If you have a groundbreaking prediction that your group does not have the resources/expertise to validate, then you may be able to publish it with an open invitation for others to prove/disprove it.
That said, there are two aspects to this question: can you publish predictions, and should you validate those predictions? It may boil down to what you are comfortable with, since you will be the first author on this paper(?). The former is certainly possible if you are willing to step down the impact factor ladder. If your prediction is biological, validation will likely be required before you can hope to publish in a good journal.
And what type of analysis did you perform?
The pipeline calculates tissue specificity of genes based on expression data from many tissues. The future goal is to use it as part of a larger project to predict suitable targets for gene editing/phenotyping.
Another validation for this would be to use another (public) dataset and see if the same genes replicate.
Yes - that is the alternative idea to experimental validation being considered.
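For anyone wanting to try the replication idea, here is a minimal sketch of what it could look like. It is not the poster's pipeline: it assumes the tau index as the specificity score (one common choice; the thread does not say which metric is used), a hypothetical cutoff of 0.8, and made-up toy expression values. The function names are illustrative only.

```python
# Sketch only: tau is an ASSUMED specificity metric, not necessarily
# what the original pipeline computes. All data below are invented.

def tau_index(expression):
    """Tau specificity score for one gene.

    0 means evenly expressed across tissues, 1 means expressed in a
    single tissue. Returns None when the score is undefined (fewer than
    two tissues, or no expression anywhere).
    """
    n = len(expression)
    if n < 2:
        return None
    peak = max(expression)
    if peak <= 0:
        return None
    return sum(1 - x / peak for x in expression) / (n - 1)


def specific_genes(dataset, threshold=0.8):
    """Genes whose tau is at or above the (hypothetical) threshold."""
    selected = set()
    for gene, values in dataset.items():
        tau = tau_index(values)
        if tau is not None and tau >= threshold:
            selected.add(gene)
    return selected


def jaccard(a, b):
    """Overlap between two gene sets: |intersection| / |union|."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Toy data: gene -> expression per tissue (same tissue order in both sets).
discovery = {
    "geneA": [100.0, 1.0, 2.0, 1.0],
    "geneB": [10.0, 11.0, 9.0, 10.0],
    "geneC": [0.0, 50.0, 0.0, 1.0],
}
replication = {
    "geneA": [80.0, 2.0, 1.0, 3.0],
    "geneB": [12.0, 10.0, 11.0, 9.0],
    "geneC": [5.0, 6.0, 5.0, 4.0],
}

found = specific_genes(discovery)
replicated = specific_genes(replication)
print("discovery:", sorted(found))
print("replication:", sorted(replicated))
print("replicated in both:", sorted(found & replicated))
print("Jaccard overlap:", jaccard(found, replicated))
```

In this toy example geneA is called tissue-specific in both datasets while geneC is not replicated. With real data the two datasets would need comparable tissue panels and normalization for the overlap to mean much.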