I would very much like feedback on this DESeq2 issue from a biostatistics perspective.
I have 19 individual RNA-seq samples corresponding to different locations. They were barcoded and sequenced separately on an Illumina HiSeq, and then all reads were pooled to assemble a metatranscriptome.
Reads from each of the 19 samples were mapped to this metatranscriptome. Based on our annotations, we have many different taxa groups present.
What I would like to do is compare transcript abundance across locations for each major taxonomic group. I have subset the transcripts and their raw counts corresponding to groups A, B and C. I have normalized the counts for each group separately in DESeq2 and exported the pseudo-counts.
I would like to generate heatmaps for each of these groups and look for changes in transcript abundance among groups. The question is: is it incorrect to display these pseudo-counts side by side in the same heatmap? (I think yes.) Is there a better way to do this if you wish to directly compare normalized transcript abundance across independently normalized taxonomic groups?
I don't think the answer is to normalize groups A, B and C all together and then subset them afterwards, because there may be important differences in expression tendencies between groups.
I am not actually interested in a pairwise analysis here, mainly just the changes in transcript abundance with location and group.
Apologies if there is an obvious answer here. Thank you so much for your time and thoughts.
I don't think it's valid to compare normalized counts across data sets. Think about this: you have one data set consisting of 2 treatments with 3 reps each (EXPT1) and another data set with 2 different treatments and 3 reps each (EXPT2). You normalize both independently. It wouldn't make any sense to then compare all pseudo-counts between EXPT1 and EXPT2, because each set was scaled by its own normalization factors.
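To make that concrete, here is a minimal sketch in plain Python of a median-of-ratios size-factor calculation, which is the general approach DESeq2 uses by default. It is not DESeq2's code, and the counts and names (`size_factors`, `normalize`, `group_a`, `group_b`) are made up for illustration. It computes per-sample size factors from each subset alone and from both subsets together, so you can see that the same sample is scaled differently depending on which transcripts went into the normalization.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors, one per sample (column).

    counts: list of transcripts, each a list of raw counts per sample.
    """
    n_samples = len(counts[0])
    # Only transcripts with a nonzero count in every sample have a
    # finite log geometric mean, so the rest are skipped.
    usable = [row for row in counts if all(c > 0 for c in row)]
    if not usable:
        raise ValueError("no transcript has nonzero counts in every sample")
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        # Log ratio of each transcript's count in sample j to its geometric mean.
        log_ratios = [math.log(row[j]) - g for row, g in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalize(counts, factors):
    """Divide each sample's counts by that sample's size factor."""
    return [[c / f for c, f in zip(row, factors)] for row in counts]


# Hypothetical raw counts: rows are transcripts, columns are 3 locations.
group_a = [[100, 50, 10], [200, 100, 20], [400, 200, 40]]
group_b = [[10, 50, 100], [20, 100, 200], [40, 200, 400]]

sf_a = size_factors(group_a)              # group A normalized on its own
sf_b = size_factors(group_b)              # group B normalized on its own
sf_all = size_factors(group_a + group_b)  # both groups normalized together

norm_a = normalize(group_a, sf_a)

print("size factors, group A only:", [round(f, 3) for f in sf_a])
print("size factors, group B only:", [round(f, 3) for f in sf_b])
print("size factors, A and B together:", [round(f, 3) for f in sf_all])
print("group A, separately normalized:", [[round(x, 2) for x in row] for row in norm_a])
```

With these toy numbers, group A's overall tenfold drop from location 1 to location 3 is absorbed into its own size factors, so its separately normalized counts come out flat across locations, while group B's factors run the opposite way. Pseudo-counts from the two groups are therefore on different per-sample scales.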
So how would I get around this problem in my metatranscriptome where I normalized each subset (taxa) separately?
Oh right, I overlooked that you're using count data while writing my reply, so disregard my last post. Well, in your case and with your current data setup, you can use Venn diagrams, bar plots, heatmaps or other graphical displays such as radar plots. In other words, there is no more statistical analysis to do; just plot the data you have and write up your results.
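If you do go the heatmap route, one common display choice is to log-transform the normalized counts and scale each transcript (row) to a z-score before plotting. The sketch below is plain Python and only an illustration: `row_zscores` and the example values are hypothetical, and the log2(x + 1) transform is an assumption rather than anything DESeq2 prescribes. Note that row scaling shows each transcript's pattern across locations relative to its own mean; it does not make absolute abundances comparable between separately normalized groups.

```python
import math
from statistics import mean, pstdev


def row_zscores(norm_counts):
    """Scale each transcript (row) of normalized counts to z-scores.

    Counts are log2(x + 1) transformed first; rows with no variation
    across samples are returned as all zeros.
    """
    scaled = []
    for row in norm_counts:
        logged = [math.log2(c + 1) for c in row]
        mu = mean(logged)
        sd = pstdev(logged)
        if sd == 0:
            scaled.append([0.0] * len(row))
        else:
            scaled.append([(x - mu) / sd for x in logged])
    return scaled


# Hypothetical normalized counts: rows are transcripts, columns are locations.
example = [[10.0, 20.0, 40.0], [5.0, 5.0, 5.0]]
z = row_zscores(example)

for row in z:
    print([round(v, 2) for v in row])
```

The resulting matrix can be passed to whatever heatmap function you normally use, with one panel per taxonomic group.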