Using public data is always difficult because there are all sorts of confounders. I would never attempt to compare samples from two different protocols if one protocol is the control and the other is the test, because you have not blocked one of your confounders: protocol and condition cannot be separated. If you have some controls and some test samples from each protocol you might be OK, but the analysis will be very difficult and could well cost more than running a new, properly designed experiment.
By mRNA I assume you mean poly A selection vs some sort of Total RNA extraction where the rRNA is otherwise depleted. But the principles are the same for anything. Here are some differences that will introduce slop into your quantifications:
The poly A extraction will pull out almost exclusively processed RNA. The total RNA protocol will pull out both processed and unprocessed (with and without introns), so you will have a big difference there from the start.
You will also introduce a length bias. If the poly A selected RNA is degraded at all, you will lose reads from the start of your long transcripts because the 5' end will no longer be attached to the poly A tail. This will show up as differential expression compared to the total RNA, which doesn't have this problem.
Depending on the method, the difference in GC content can be enormous. Here are two methods with the same data: http://michelebusby.tumblr.com/image/62718357939 Points represent the reads aligned to transcripts in each method. In the DSNLite method transcripts with high GC content were systematically lost.
That data comes from a paper we did on some of the issues you see when you compare different methods:
http://www.nature.com/nmeth/journal/v10/n7/fig_tab/nmeth.2483_F6.html
If you make that chart (one method against the other, with high and low GC genes highlighted) you will see whether you have a GC artifact. Do the same with long and short transcripts and you will see whether you have length bias. Then run everything through RNA-SeQC and look at your intron/exon/intergenic rates.
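If you want a number to go with the GC and length charts, here is a rough sketch of the same check in Python. It is only an illustration, not what we ran for the paper. It assumes you already have per-gene read counts for each method plus a per-gene GC fraction and transcript length; the function names and the input dictionaries are made up for the example.

```python
import math
from statistics import median


def log2_ratios(counts_a, counts_b, pseudocount=1.0):
    """Per-gene log2(method A / method B) after scaling each library to
    counts per million. Only genes present in both inputs are used."""
    total_a = sum(counts_a.values())
    total_b = sum(counts_b.values())
    ratios = {}
    for gene in counts_a.keys() & counts_b.keys():
        cpm_a = counts_a[gene] / total_a * 1e6
        cpm_b = counts_b[gene] / total_b * 1e6
        ratios[gene] = math.log2((cpm_a + pseudocount) / (cpm_b + pseudocount))
    return ratios


def median_ratio_by_bin(ratios, covariate, n_bins=3):
    """Sort genes by a covariate (GC fraction or transcript length), split
    them into n_bins roughly equal groups, and return the median log2
    ratio of each group, ordered from lowest to highest covariate."""
    genes = sorted((g for g in ratios if g in covariate),
                   key=lambda g: covariate[g])
    n = len(genes)
    if n < n_bins:
        raise ValueError("need at least one gene per bin")
    medians = []
    for i in range(n_bins):
        # Integer arithmetic keeps the bin edges exact.
        chunk = genes[i * n // n_bins:(i + 1) * n // n_bins]
        medians.append(median(ratios[g] for g in chunk))
    return medians
```

Call median_ratio_by_bin once with GC fractions and once with transcript lengths. If the medians sit near zero in every bin, the two methods agree across that covariate. If they drift steadily from the low bin to the high bin, you have the kind of GC or length artifact described above.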
Unless the GC, length, and intron/exon/intergenic checks all come back clean, you have substantial artifact and you can't prove much biology. You might be able to get it past a reviewer (people do) but you shouldn't.
You can, perhaps, use this for pilot data for a grant, though, if you convey that you controlled for these things and understood what you were looking at.
OK, thank you. I have tried it.
Sorry it didn't help. The issue is not a batch effect that can be removed as the solution suggests, so it is still an open question.