Hello everyone,
I am currently working on reverse engineering gene regulatory networks from RNA-seq data, using data from different stages of heart development. My question: when the RNA-seq data contain several transcripts from the same gene (which is of course unavoidable), should one use information from all of them to build the network? The network is at the gene level, so using transcript-level information can be tricky because the transcripts will have different expression values. How will this affect the overall network? Also, won't the network inference programs get confused by two entries with the same gene name but different values? One would have to make the gene names unique, and the transcripts would then become two different nodes in the network. Is it OK to use this kind of information, given that people do build networks from RNA-seq data? Or should one keep one transcript per gene, and if so, how and on what basis should that transcript be selected?
Any thoughts?
Thanks!!
For WGCNA, which type of RNA-seq input do you suggest? Normalized read counts? DESeq's varianceStabilizingTransformation? RSEM-normalized values? Like you said, the input is very important, as the clustering will be different.

In the WGCNA FAQ, the authors suggest VST or simply a log transformation. They did say that the main goal is to make sure you use the same kind of input throughout. However, I did get slightly different results when I used TPM compared with VST. It is up to you: use whichever transformation you deem fit, and make sure you are consistent across all of the samples.
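
To make the "simple log transformation, applied consistently" option concrete, here is a minimal sketch in plain Python. The gene names, the values, the helper name log2_transform, and the pseudocount of 1 are all assumptions made up for illustration; this is not the WGCNA or DESeq implementation, just the idea of running one transformation over every sample before building the network.

```python
import math


def log2_transform(expr, pseudocount=1.0):
    """Return log2(x + pseudocount) for every value of every sample.

    expr maps a gene name to a list of expression values, one per sample.
    The pseudocount (an assumed value of 1) avoids taking log2 of zero.
    """
    return {
        gene: [math.log2(value + pseudocount) for value in values]
        for gene, values in expr.items()
    }


# Hypothetical toy matrix: 3 genes x 4 samples of normalized counts.
counts = {
    "GeneA": [0.0, 3.0, 7.0, 15.0],
    "GeneB": [1.0, 1.0, 31.0, 63.0],
    "GeneC": [255.0, 127.0, 0.0, 3.0],
}

# The same function is applied to all samples, so every column is on one scale.
log_counts = log2_transform(counts)

for gene, values in log_counts.items():
    print(gene, values)
```

The point of the sketch is only the consistency: whichever transformation you pick, it goes through the same function for all samples rather than mixing, say, TPM for some and VST for others.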