I have some standard polyA RNA-seq data that I generated from whole human tissue. I want to compare the relative abundance of genes in my tissue dataset with all the other tissues of the body. GTEx seems like the right source, and indeed the standard plot on the site is a lot like what I want to make, except that it would also include my tissue data.
What is the right way to do this? I suspect it is not simply to use the TPM data as input for DE analysis with DESeq2 or edgeR. Should I instead get the normalized counts for GTEx, process my data in the same way to get normalized counts, and then combine the two?
Sorry if this is a very naive question.
Is the GTEx data also provided as normalized counts? Check what type of data it is; you can then generate matching relative abundance estimates for your own samples using Cufflinks, Salmon, etc. for your comparative study.
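To make "process both datasets the same way" concrete, here is a minimal sketch of converting raw gene counts to TPM for one sample, so that two datasets end up in the same unit. It assumes you have raw counts and a gene length table for each dataset; the function name `counts_to_tpm` and the gene names and numbers are hypothetical toy values, not real data.

```python
def counts_to_tpm(counts, lengths_bp):
    """Convert raw read counts to TPM for a single sample.

    counts: dict mapping gene -> raw read count
    lengths_bp: dict mapping gene -> gene (or effective) length in base pairs
    """
    # Reads per kilobase: normalize each count by gene length in kb
    rpk = {gene: counts[gene] / (lengths_bp[gene] / 1000.0) for gene in counts}
    # Per-sample scaling factor so that all TPM values sum to one million
    scale = sum(rpk.values()) / 1e6
    if scale == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: value / scale for gene, value in rpk.items()}


# Toy values for illustration only (hypothetical genes and counts)
toy_counts = {"GENE_A": 100, "GENE_B": 300, "GENE_C": 600}
toy_lengths = {"GENE_A": 1000, "GENE_B": 3000, "GENE_C": 2000}
tpm = counts_to_tpm(toy_counts, toy_lengths)
```

Because TPM is rescaled within each sample, the values always sum to one million, which is exactly why it is a relative rather than an absolute measure. The same gene set and the same length annotation would need to be used for both datasets for the numbers to be comparable at all.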
There is little point in comparing completely independent datasets, because RNA-seq is a relative measure, not an absolute one. What you can compare are datasets created in the same lab, with the same kit, etc., but not ones from different sources. Any differences you see can, and often will, be due to various batch effects you cannot account for. What exactly is the analysis goal?
The goal is to see how genes in my tissue compare to the tissues of the rest of the body. So for example, I want to see if there are any genes that are enriched in (or exclusively expressed in) my tissue compared to other tissues.
Because I cannot (and most cannot) generate RNA-seq data for every tissue of the body, I have no choice but to use something like GTEx.
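As a rough illustration of what "enriched or exclusively expressed" could mean in practice, the sketch below scores each gene by the log2 ratio of its expression in one tissue to its highest expression in any other tissue. This is only one possible definition, not an established method from this thread. It assumes per-tissue summary values (for example median TPM) are already on a comparable scale, and it does nothing to correct the batch effects raised above. The function name, the pseudocount of 1, and all values are hypothetical.

```python
import math


def enrichment_vs_other_tissues(own_tpm, other_tissue_tpm, pseudocount=1.0):
    """Score genes by log2(own tissue / highest other tissue).

    own_tpm: dict mapping gene -> expression in the tissue of interest
    other_tissue_tpm: dict mapping tissue name -> {gene -> expression}
    pseudocount: added to both values to avoid division by zero
    """
    scores = {}
    for gene, own_value in own_tpm.items():
        # Collect this gene's expression in every reference tissue that has it
        others = [t[gene] for t in other_tissue_tpm.values() if gene in t]
        if not others:
            continue  # gene absent from the reference; cannot be scored
        max_other = max(others)
        scores[gene] = math.log2((own_value + pseudocount) / (max_other + pseudocount))
    return scores


# Toy values for illustration only (hypothetical genes and tissues)
own = {"GENE_A": 63.0, "GENE_B": 7.0}
reference = {
    "tissue_1": {"GENE_A": 1.0, "GENE_B": 7.0},
    "tissue_2": {"GENE_A": 3.0, "GENE_B": 31.0},
}
scores = enrichment_vs_other_tissues(own, reference)
```

A large positive score flags a candidate tissue-enriched gene, but with a single external dataset as the reference, a high score can reflect protocol or lab differences as easily as biology, so candidates would need independent confirmation.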