I'm trying to design an experiment comparing the effect of Gene X expression level on a trait between Sample A and Sample B. If measurements of Gene X expression come from RNA-seq in Sample A and qPCR in Sample B, what assumptions must be made when comparing the effect of expression on the trait between the two samples?
Long version:
Expression levels of Gene X in Sample A can only be measured by RNA-seq (standard Illumina high-throughput). Expression levels of Gene X in Sample B can only be measured by qPCR, using a standard protocol in which RNA is extracted and reverse-transcribed into single-stranded cDNA, from which the target gene is amplified and quantified on a real-time PCR machine. In both cases, expression of Gene X is normalized against expression of the same reference gene.
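For concreteness, here is a minimal sketch of how I understand the reference normalization on each platform. All gene names and numbers are made up, and the qPCR side assumes a textbook amplification efficiency of exactly 2:

```python
import numpy as np

# Made-up single-sample measurements; all names and values are illustrative.
ct_gene_x = 24.1       # qPCR threshold cycle (Ct) for Gene X
ct_reference = 19.8    # qPCR Ct for the reference gene

counts_gene_x = 1520.0     # RNA-seq read count for Gene X
counts_reference = 8740.0  # RNA-seq read count for the reference gene

# qPCR: delta-Ct normalization, assuming a perfect doubling per cycle
# (amplification efficiency = 2); the quantity is log2-scaled by construction.
delta_ct = ct_reference - ct_gene_x
qpcr_relative = 2.0 ** delta_ct  # linear-scale ratio: Gene X / reference

# RNA-seq: reference-normalized expression as a simple ratio of counts
# (gene length cancels because the same gene is tracked across samples).
rnaseq_relative = counts_gene_x / counts_reference

print("qPCR    log2 ratio:", round(float(np.log2(qpcr_relative)), 3))
print("RNA-seq log2 ratio:", round(float(np.log2(rnaseq_relative)), 3))
```

Note that the qPCR value is built on a log2 (Ct) scale and depends on the efficiency assumption, while the RNA-seq value is a plain ratio of counts, which is part of why I'm unsure the two are directly interchangeable.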
The reasons for using a different method for each sample are:
- There are too many SNPs in Sample A to design qPCR primers that would work for enough samples
- RNA-seq data is already available for Sample A, so cost is not a factor
- RNA-seq of Sample B would be too costly and is unnecessary since qPCR primers will work for all samples
I wish to build linear models showing the effect of expression on the trait for each sample. My concern about the validity of this approach is possible bias from the two different measurement methods. Where can bias be introduced in either qPCR or RNA-seq? How can either method be more or less accurate in measuring gene expression levels?
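To illustrate why unequal accuracy worries me for the linear models specifically, here is a small simulation (every value is made up) of classical errors-in-variables attenuation: if the two platforms measure expression with different error variances, the fitted slopes shrink by different amounts even when the underlying biology is identical:

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)

# Simulated stand-in data; nothing here is real. true_expr is the latent
# expression level that actually drives the trait.
n = 60
true_expr = rng.normal(0.0, 1.0, n)
trait = 0.5 * true_expr + rng.normal(0.0, 0.3, n)

# Each platform observes expression with its own (assumed) noise level.
expr_rnaseq = true_expr + rng.normal(0.0, 0.2, n)  # assumed RNA-seq noise
expr_qpcr = true_expr + rng.normal(0.0, 0.5, n)    # assumed qPCR noise

# Fit trait ~ expression separately per platform; the noisier predictor
# yields a more attenuated slope despite identical underlying biology.
for label, x in [("RNA-seq", expr_rnaseq), ("qPCR", expr_qpcr)]:
    fit = sm.OLS(trait, sm.add_constant(x)).fit()
    print(label, "fitted slope:", round(float(fit.params[1]), 3))
```

If something like this holds, a difference in slopes between the two models could reflect measurement precision rather than biology.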
I'm working on a proposal for possible future research and trying to make the most of available resources and a limited budget.
There's no reason we can't perform both qPCR and RNA-seq on a subset of Sample A: we can find enough Sample A samples without SNPs that would prevent use of the same good qPCR primers. We could then compare reference-normalized expression levels between qPCR and RNA-seq to calculate a conversion factor. What is unclear to me, however, is why that conversion factor would be anything other than 1. Furthermore, I don't know whether, or why, the conversion factor would change over a range of expression levels.
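If we do run both assays on that subset, I imagine the calibration looking something like the sketch below. The data are simulated placeholders; the non-unit slope and the offset are assumptions baked in purely to show what a non-trivial conversion could look like (for example, if amplification efficiency were below 2):

```python
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(1)

# Placeholder paired data: log2 reference-normalized expression for the same
# Sample A individuals measured on both platforms. Real measurements would
# replace these simulated arrays.
n = 40
latent = rng.normal(0.0, 1.5, n)
log2_rnaseq = latent + rng.normal(0.0, 0.15, n)
# Assumption for illustration only: a qPCR efficiency below 2 rescales the
# log2 axis (slope != 1) and a platform offset shifts it (intercept != 0).
log2_qpcr = 0.9 * latent - 0.4 + rng.normal(0.0, 0.25, n)

# Linear calibration: log2_rnaseq ~ a + b * log2_qpcr. An (intercept, slope)
# far from (0, 1) means the conversion is not simply "multiply by 1".
linear = sm.OLS(log2_rnaseq, sm.add_constant(log2_qpcr)).fit()
print("intercept, slope:", np.round(linear.params, 3))

# Add a quadratic term; a clearly non-zero coefficient would suggest the
# conversion itself drifts across the expression range.
X2 = sm.add_constant(np.column_stack([log2_qpcr, log2_qpcr ** 2]))
quadratic = sm.OLS(log2_rnaseq, X2).fit()
print("quadratic term p-value:", round(float(quadratic.pvalues[2]), 3))
```

Checking the fitted intercept and slope against (0, 1), and testing for curvature, seems like it would answer both of my questions empirically once we have real paired data.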
I am trying to ascertain specifically where bias can come from in order to reach an evidence-based conclusion on whether this approach can be valid. Perhaps it will be best to reject this idea and instead run RNA-seq on everything, using baits to capture the gene of interest and standards, but rejecting the cheaper and faster approach will need to be justified.