Hi!
I downloaded a differential gene expression summary data table like this from brainSCOPE.
All I need are the gene and log2FoldChange columns. However, each gene's data is split across multiple cell types, as indicated in the cell_type column.
column. These are too many cell types to work with- so I just want to merge some cell subtypes into larger cell types- e.g., I want to merge Vip, Sst, Pvalb cells into just "interneuron".
In this case, can I simply take the mean of the log2FoldChange values of Vip, Sst, and Pvalb for each gene? I don't care about p-values because I will analyze the log2FC values with threshold-agnostic methods.
logFCs without stats are meaningless, since logFCs can have large standard errors. Combining individual logFCs can introduce bias: a high logFC with a large SE can overwhelm reliable but moderate logFCs. I don't see the point. If the data format does not allow your analysis, then get the underlying data and analyse it as needed.
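To illustrate the bias with a toy case, here is a minimal sketch in plain Python. All numbers are invented for one hypothetical gene, and the standard errors are an assumption: I don't know whether the downloaded table includes an SE column. The inverse-variance weighted mean is shown only as a contrast to the plain mean; it is not something the table or brainSCOPE prescribes, and it is not a substitute for re-analysing the underlying data.

```python
def plain_mean(values):
    """Unweighted mean: every subtype counts equally, regardless of its SE."""
    return sum(values) / len(values)


def inverse_variance_mean(values, std_errors):
    """Weighted mean with weights 1/SE^2, so noisy estimates count for less."""
    weights = [1.0 / se ** 2 for se in std_errors]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Invented values for a single gene (hypothetical, not from brainSCOPE).
subtypes = ["Vip", "Sst", "Pvalb"]
log2fc = [0.5, 0.6, 4.0]   # Pvalb has a large estimate...
se = [0.1, 0.1, 2.0]       # ...but also a very large standard error

naive = plain_mean(log2fc)
weighted = inverse_variance_mean(log2fc, se)

print(f"plain mean:               {naive:.2f}")
print(f"inverse-variance weighted: {weighted:.2f}")
```

In this toy case the plain mean is pulled well above the two precise estimates by the single noisy one, while the weighted mean stays close to them.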
Makes sense. Thank you very much for replying!