I just downloaded the file "EBPlusPlusAdjustPANCAN_IlluminaHiSeq_RNASeqV2.geneExp.tsv" from the PanCancer Atlas supplemental data section of the NIH GDC website: https://gdc.cancer.gov/about-data/publications/pancanatlas
During processing, I realized that this file contains negative RSEM values that can be summarized as follows:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
-0.9912 -0.8951 -0.8633 -0.8613 -0.8168 -0.6687
As I understand it, an RSEM value is the expected number of reads mapped to a gene, so I would expect every value to be 0 or positive. Can anybody explain why the expression matrix contains negative values?
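For anyone who wants to reproduce the check, here is a minimal sketch of how the negative entries could be tallied. It is not the code used for the summary above. It assumes the file is a tab-separated matrix with one header row and gene identifiers in the first column, and the function name summarize_negatives is made up for illustration.

```python
import csv
import io
import statistics


def summarize_negatives(handle, delimiter="\t"):
    """Return (count, summary) for all negative numeric cells in a table.

    Assumption: the first row is a header and the first column holds
    gene identifiers. Non-numeric cells such as "NA" are skipped.
    """
    negatives = []
    reader = csv.reader(handle, delimiter=delimiter)
    next(reader, None)  # skip the header row
    for row in reader:
        for cell in row[1:]:  # row[0] is assumed to be the gene identifier
            try:
                value = float(cell)
            except ValueError:
                continue  # e.g. "NA" or an empty field
            if value < 0:
                negatives.append(value)
    if not negatives:
        return 0, None
    summary = {
        "min": min(negatives),
        "median": statistics.median(negatives),
        "mean": statistics.fmean(negatives),
        "max": max(negatives),
    }
    return len(negatives), summary


# Toy table standing in for the real matrix (made-up values).
toy = "gene_id\tS1\tS2\nA\t10.5\t-0.5\nB\tNA\t-1.5\n"
count, summary = summarize_negatives(io.StringIO(toy))
print(count, summary)
```

For the real file, the same function can be given an open file handle, e.g. open(path, newline=""), instead of the io.StringIO toy table. The rows are read one at a time, but all negative values are kept in memory for the median.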
Thank you in advance.
This previous discussion might be helpful in understanding the contents of that file: What batch correction was applied to pan-Cancer mRNA expression data?
Is the expression matrix log-transformed?
Please use a comment for this; the answer box is for answers only. That keeps the thread logically organized.