Question

Paper publication without qPCR validation

1

10.5 years ago

eddie.im ▴ 140

Hello,

I'm running a study on predicting ncRNAs in a bacterium, and I used public RNA-seq data (with three biological replicates) to see if my predicted RNAs are being expressed. I got some very good results; however, I'm afraid that in order to publish, a qPCR analysis or Northern blot will be required. The problem is that, since I used public data, I don't have access to biological material to do such an analysis. How should I proceed? It seems that almost all journals require some kind of "experimental" validation. Are there some ("good ones", IF > 1.5 maybe?) that do not?

Also, can someone explain to me why RNA-seq data is not enough to prove that a gene is being expressed?

Thanks in advance.

publication ncRNA qpcr RNA-Seq • 4.6k views

ADD COMMENT • link updated 3.1 years ago by Ram 44k • written 10.5 years ago by eddie.im ▴ 140

2

I think a more serious issue with your project is demonstrating that the transcripts are not translated. You would need ribosome profiling data to show that there is an absence of translation for those transcripts.

ADD REPLY • link 10.5 years ago by dario.garvan ▴ 520

1

Hi dario, thanks for your answer.

Yes, I thought about that, but none of the papers on predicting new ncRNAs that I have read demonstrated that the transcripts are not being translated, just that they are expressed. Also, I believe the sequences are too small to code for a protein: they are around 130 nt long, they are located in intergenic regions, and they do not contain any known start or stop codon. Of course, in order to affirm that they are undoubtedly ncRNAs such an analysis would be required, but I believe that it's not required in this field. I have read dozens of papers on miRNAs, mirtrons, and sRNAs, and none actually proved that the transcripts weren't being translated.

PS: they also have a really high score in the SVM predictions...
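
The start/stop codon claim above is easy to check systematically. What follows is only a minimal sketch, not the pipeline used in this thread: it scans all six reading frames of a candidate sequence for a start codon followed by an in-frame stop codon. The function names, the `min_codons` threshold, and the choice of ATG/GTG/TTG as start codons are assumptions made for illustration. An empty result does not prove that a transcript is untranslated; it only shows that no conventional ORF of that length is present.

```python
# Hypothetical helper: six-frame ORF scan for a short candidate ncRNA.
START_CODONS = {"ATG", "GTG", "TTG"}  # assumed set of common bacterial starts
STOP_CODONS = {"TAA", "TAG", "TGA"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=10):
    """List (strand, start, end, n_codons) for each start..stop ORF.

    Coordinates are 0-based on the scanned strand, `end` is exclusive and
    includes the stop codon, and `n_codons` counts codons before the stop.
    Only the first start codon of each ORF is reported.
    """
    seq = seq.upper().replace("U", "T")  # accept RNA input as well
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon in START_CODONS:
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    n_codons = (i - start) // 3
                    if n_codons >= min_codons:
                        orfs.append((strand, start, i + 3, n_codons))
                    start = None
    return orfs
```

Calling `find_orfs(candidate)` on each predicted ~130 nt sequence and reporting how many return an empty list would give reviewers a concrete number to go with the statement.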

ADD REPLY • link updated 3.1 years ago by Ram 44k • written 10.5 years ago by eddie.im ▴ 140

0

No experienced scientist would advise you to go without experimental validation of some kind. However, I do think you can publish these results without it. Try some of the less rigorous journals like PLOS ONE (http://www.plosone.org/) and act based on the reviewers' comments.

ADD REPLY • link 10.5 years ago by Biomonika (Noolean) 3.2k

0

Hi Noolean, Thank you for the tip! I will take a look at it.

ADD REPLY • link 10.5 years ago by eddie.im ▴ 140

Ram · Answer 1 · 2014-07-03

3

10.5 years ago

Charles Warden 8.3k

I would strongly recommend trying to find an experimental collaborator to assist with validation, which could be technical (such as qPCR) or functional (such as showing the lack of translation, as mentioned in the comment).

ADD COMMENT • link updated 3.1 years ago by Ram 44k • written 10.5 years ago by Charles Warden 8.3k

0

Hi, thanks for your feedback.

I should have explained this better, sorry. I already have an experimental collaborator. I used this public data because we have a very closely related species of the same genus, and we don't have enough money (yes, we do have money to host a World Cup, but the investment in research is being cut, sad...) to do a large-scale prediction through RNA-seq and qPCR on our own species. So I thought about using public data to detect ncRNAs at large scale in a closely related bacterium, and then choosing a few that are similar in the species we have to validate through qPCR. Because qPCR is really expensive, I wanted to have more confidence before doing it, so as not to waste any money. I just didn't want to "waste" this result, since a lot of the predicted ncRNAs were not annotated in this bacterium and the public data is very complete (data on different stages of development, with replicates), so I also did differential expression analysis across stages of development. I thought about putting these results together as a methodology in our final paper, but the discussion of two different species would be very confusing in just one paper, IMO (one with RNA-seq differential expression without qPCR, the other with just qPCR validation), so I was looking to do two papers. Thank you for your feedback.

ADD REPLY • link updated 3.1 years ago by Ram 44k • written 10.5 years ago by eddie.im ▴ 140

0

So you don't have the sequences of the ncRNAs in your species of interest, just in the related species?

ADD REPLY • link updated 3.1 years ago by Ram 44k • written 10.5 years ago by pld 5.1k

0

I have them for both. One of the steps to detect those ncRNAs was intergenic conservation among closely related species, so I have the sequences for both species and some others, but only one of them (the one whose data I used) has public data available.

ADD REPLY • link updated 3.1 years ago by Ram 44k • written 10.5 years ago by eddie.im ▴ 140

0

I think it is common to include analysis of public data. For example, researchers with small cohorts and/or pre-clinical data can use public datasets to see if a given gene acts as a marker for a clinically relevant feature (such as survival in cancer patients). In that scenario, you would still probably need the qPCR validation of your own data, even for a journal like PLOS ONE.

As long as you do more detailed characterization (with validation like qPCR) in your collaborator's species, I think that is OK. However, I think it will be hard to publish a purely bioinformatics paper without that extra data.

ADD REPLY • link updated 3.1 years ago by Ram 44k • written 10.5 years ago by Charles Warden 8.3k

Ram · Answer 2 · 2014-07-03

3

10.5 years ago

Josh Herr 5.8k

I agree with Charles Warden here. There is a lot of public data out there, but you'll have to think its use through -- you should always do this before you spend a lot of time on a project.

If this is your first paper, hasn't your advisor or mentor given you some input here or encouraged you to establish collaborations in this area?

I would assume that each strain/species/genus has its own suite of ncRNAs, so to validate you should get hold of the specific strains you have mined publicly to do the lab validation.

ADD COMMENT • link updated 3.1 years ago by Ram 44k • written 10.5 years ago by Josh Herr 5.8k

0

Hi Josh, thank you for your feedback.

I explained it better under Charles's answer; I will just paste it here.

I should have explained this better, sorry. I already have an experimental collaborator. I used this public data because we have a very closely related species of the same genus, and we don't have enough money (yes, we do have money to host a World Cup, but the investment in research is being cut, sad...) to do a large-scale prediction through RNA-seq and qPCR on our own species. So I thought about using public data to detect ncRNAs at large scale in a closely related bacterium, and then choosing a few that are similar in the species we have to validate through qPCR. Because qPCR is really expensive, I wanted to have more confidence before doing it, so as not to waste any money. I just didn't want to "waste" this result, since a lot of the predicted ncRNAs were not annotated in this bacterium and the public data is very complete (data on different stages of development, with replicates), so I also did differential expression analysis across stages of development. I thought about putting these results together as a methodology in our final paper, but the discussion of two different species would be very confusing in just one paper, IMO (one with RNA-seq differential expression without qPCR, the other with just qPCR validation), so I was looking to do two papers. Thank you for your feedback.

ADD REPLY • link updated 3.1 years ago by Ram 44k • written 10.5 years ago by eddie.im ▴ 140

Ram · Answer 3 · 2014-07-03

3

10.5 years ago

pld 5.1k

Make some qPCR probes! If you have the data, you have all the information you need to obtain the biologics needed to validate. There are plenty of guides and tools out there, or find a collaborator!

As for the source material, unless you're working with major human or animal pathogens, just ask a bacteria lab at your uni or a collaborator for some help in extracting the RNA and doing the qPCR. You can probably just order whatever bacteria you need from ATCC if the lab you connect with doesn't have it already. Refer to the paper behind the data to get the conditions you need. The same would apply with the ribosome profiling.

If anything, I'd see if the ribosome profiling might be more cost effective. qPCR can get expensive quickly.

You don't always have to do biological validation, but IMO the validation is what prevents your work from becoming "just another algorithm". It adds to the quality of the work: you didn't just verify your method with known ncRNAs, you found new ones and validated them. So you further science on two fronts. Plus, it is an opportunity to leverage some quick wet-lab work into training and collaboration opportunities.

ADD COMMENT • link updated 3.1 years ago by Ram 44k • written 10.5 years ago by pld 5.1k

0

Hi Joe, thank you for your feedback!

I didn't know about the ATCC! Thank you for this tip! I explained it better in the other answers. As you can see, I'm planning to do a biological validation, but on another closely related species. The public data was so complete that I got excited and did some deep discussion through differential expression analysis. However, this was not the main objective of the "final" paper, which is about the pipeline/algorithm per se, so I thought about doing another paper.

ADD REPLY • link updated 3.1 years ago by Ram 44k • written 10.5 years ago by eddie.im ▴ 140

0

If you want to publish the discovery of novel ncRNAs in some species of bacteria, you will without a doubt have to validate them. If your argument is that these novel ncRNAs play roles at various points in the life cycle of the bacteria, you will probably have to validate that they are ncRNAs and then validate that they play some role in the life cycle/development. That is a massive task.

I'm not sure that such a "deep discussion" is a good idea. Developing software and demonstrating performance is a totally different issue than showing that ncRNAs play a role in the bacterial life cycle. If you have to put it in there, make it very brief. Show that your software is capable of dealing with ncRNA identification in relatively uncharacterized bacterial genomes. You're asking for trouble if you go beyond that.

ADD REPLY • link updated 3.1 years ago by Ram 44k • written 10.5 years ago by pld 5.1k