Hi,
I have RNA-seq expression values in RPKM for 100 samples, and I would like to run an eQTL analysis in R. RPKM is all I have, so I have to work with it. So far I have fitted a linear regression with the raw RPKM expression as the outcome and the SNP genotype as the explanatory variable: RPKM ~ genotype. However, this regression on its own does not seem to give correct results (everything comes out as significant).
I must be missing some QC and data preparation steps before the linear regression (I am also not sure how much QC was done on these samples, so I would still have to go through that). For example, should I use raw RPKM or a log transform of it, and if a log, log2 or log10? Also, should the data be quantile normalized?
What is the standard process before running a linear association test for eQTL analysis using RNA-seq RPKM values?
I can't find good guidance on how to prepare the data (i.e. the sample QC and gene QC required) and run an eQTL analysis using RNA-seq RPKM values in R. If there are any papers or R manuals that explain the steps of this analysis, or if anybody knows how I can learn them, it would be a great help.
Thank you
Maybe these papers can help. One of them mentions that using whole read counts rather than RPKM is better. Anyway, good luck.
http://www.ncbi.nlm.nih.gov/pubmed/23667399
http://onlinelibrary.wiley.com/doi/10.1111/j.1541-0420.2011.01654.x/abstract
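
In case it helps to see the preparation steps from the question spelled out, here is a minimal sketch of one common per-gene approach: log2-transform the RPKM values, apply a rank-based inverse normal transform across samples so that outliers cannot dominate the fit, then regress expression on genotype dosage (0/1/2). This is only an illustration under stated assumptions, not the method of either paper above. The function names, the pseudocount of 1 and the toy data are all made up for the example, and a permutation p-value is swapped in for the usual t-test p-value to keep the code dependency-free. It is written in plain Python; in R the same steps would roughly correspond to log2(), qnorm() on scaled ranks, and lm().

```python
import math
import random
from statistics import NormalDist


def log2_rpkm(values, pseudocount=1.0):
    """log2(RPKM + pseudocount); the pseudocount of 1 is an assumed, arbitrary choice."""
    return [math.log2(v + pseudocount) for v in values]


def inverse_normal_transform(values):
    """Rank-based inverse normal transform of one gene across samples.

    Tied values receive the average of their ranks.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank of the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    nd = NormalDist()
    # (rank - 0.5) / n keeps every probability strictly between 0 and 1.
    return [nd.inv_cdf((r - 0.5) / n) for r in ranks]


def slope(genotypes, expression):
    """Least-squares slope of expression ~ genotype dosage."""
    mx = sum(genotypes) / len(genotypes)
    my = sum(expression) / len(expression)
    sxx = sum((x - mx) ** 2 for x in genotypes)
    if sxx == 0:
        raise ValueError("genotype is constant (monomorphic SNP); nothing to test")
    sxy = sum((x - mx) * (y - my) for x, y in zip(genotypes, expression))
    return sxy / sxx


def permutation_pvalue(genotypes, expression, n_perm=1000, seed=0):
    """Two-sided permutation p-value for the slope (expression labels shuffled)."""
    rng = random.Random(seed)
    observed = abs(slope(genotypes, expression))
    shuffled = list(expression)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(slope(genotypes, shuffled)) >= observed:
            hits += 1
    # The +1 terms keep the estimate away from an impossible p-value of 0.
    return (hits + 1) / (n_perm + 1)


# Toy data (invented): 10 samples, one SNP, one gene with a single extreme RPKM value.
genotypes = [0, 0, 0, 1, 1, 1, 2, 2, 2, 2]
rpkm = [1.2, 0.8, 1.5, 3.1, 2.7, 3.9, 7.4, 6.8, 8.1, 150.0]

expr = inverse_normal_transform(log2_rpkm(rpkm))
beta = slope(genotypes, expr)
p_value = permutation_pvalue(genotypes, expr)
print(f"slope = {beta:.3f}, permutation p = {p_value:.4f}")
```

After the transform the extreme value of 150 is simply the highest rank, so it no longer pulls the fit the way it would on raw RPKM. Note that this covers a single SNP-gene pair; when many pairs are tested, the raw p-values still need a multiple-testing correction, and real pipelines usually also filter lowly expressed genes and include covariates, none of which is shown here.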