Hi all,
Some deconvolution algorithms for RNA-seq data require a list of marker genes as a reference for different cell types. Does anyone know whether any of these offer an option to specify the "importance" or weight of a marker gene when the deconvolution is performed, so that more important markers have a greater impact on the calculation?
What I mean by weight is, for example, that GeneA is expressed at a higher level and more exclusively in CelltypeA, whereas GeneB is expressed at a lower level but is still a representative marker of CelltypeA. As a result, when I specify a list of marker genes for deconvolution, I would like GeneA to have a higher weight in the calculation than GeneB. What I'm not sure about is whether these deconvolution algorithms consider all marker genes "equally important", and whether there is a way to rank or specify their importance.
Read, for example, the Methods section of the CIBERSORTx paper to understand the underlying concept.
For example:
So, the markers are all taken together: the matrices (i.e., genes across cells) are used at the same time. The weight you're referring to is effectively the distance between the expected expression level and the observed one, which in this case must be computed for every gene in the list. Since this distance is computed against the expected expression level of every (expected) cell type, a gene that is expressed at a high level in only one cell type will strongly increase the resulting fraction for that cell type.
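To make that concrete, here is a minimal sketch of the idea. It uses plain least squares as a simple stand-in, not the regression that CIBERSORTx actually runs, and the signature values, gene names and the 20% error are all made up for illustration. It applies the same relative measurement error to GeneA and to GeneB and compares how far the estimated CelltypeA fraction moves.

```python
import numpy as np

# Hypothetical signature matrix: rows are marker genes, columns are cell types
# (CelltypeA, CelltypeB). All values are invented for illustration.
signature = np.array([
    [100.0, 1.0],   # GeneA: high and nearly exclusive to CelltypeA
    [10.0, 2.0],    # GeneB: weaker CelltypeA marker
    [1.0, 50.0],    # GeneC: CelltypeB marker
])

# Simulated bulk sample made of 30% CelltypeA and 70% CelltypeB.
true_fractions = np.array([0.3, 0.7])
bulk = signature @ true_fractions


def estimate_fractions(sig, mixture):
    """Least-squares fit over all genes at once, then normalise to sum to 1."""
    coef, *_ = np.linalg.lstsq(sig, mixture, rcond=None)
    coef = np.clip(coef, 0.0, None)  # crude guard against negative coefficients
    return coef / coef.sum()


def perturb(gene_index, relative_error=0.2):
    """Re-estimate after inflating one gene's bulk value by a relative error."""
    noisy = bulk.copy()
    noisy[gene_index] *= 1.0 + relative_error
    return estimate_fractions(signature, noisy)


baseline = estimate_fractions(signature, bulk)
shift_a = abs(perturb(0)[0] - baseline[0])  # effect of an error in GeneA
shift_b = abs(perturb(1)[0] - baseline[0])  # effect of an error in GeneB

print(f"baseline CelltypeA fraction: {baseline[0]:.3f}")
print(f"shift from GeneA error: {shift_a:.4f}")
print(f"shift from GeneB error: {shift_b:.4f}")
```

In this toy setup the GeneA perturbation moves the CelltypeA estimate far more than the GeneB perturbation, simply because GeneA contributes much larger values to the residual that the fit minimises. No explicit weight is involved.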
Thanks for the information. So my interpretation is: if a gene is more representative of a cell type, say expressed at a high level or exclusively in that cell type, then it would inherently increase the resulting fraction from the deconvolution, without my having to account for the "weight" explicitly? And genes that are less representative would increase the fraction less?
Yes, your interpretation is right.
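If you still wanted explicit per-gene weights, one generic way to express them is weighted least squares, where each gene's row is scaled by the square root of its weight before fitting. This is only a sketch of that general technique under made-up numbers; I am not claiming that any particular deconvolution tool exposes such an option, and the function name `weighted_fractions` is hypothetical.

```python
import numpy as np


def weighted_fractions(signature, mixture, weights):
    """Weighted least squares: scale each gene's row by sqrt(weight), then fit."""
    root_w = np.sqrt(np.asarray(weights, dtype=float))
    coef, *_ = np.linalg.lstsq(
        signature * root_w[:, None], mixture * root_w, rcond=None
    )
    coef = np.clip(coef, 0.0, None)  # crude guard against negative coefficients
    return coef / coef.sum()


# Hypothetical signature (rows: GeneA, GeneB, GeneC; columns: CelltypeA, CelltypeB).
signature = np.array([
    [100.0, 1.0],
    [10.0, 2.0],
    [1.0, 50.0],
])
bulk = signature @ np.array([0.3, 0.7])
bulk[1] *= 1.5  # pretend GeneB was measured 50% too high

equal = weighted_fractions(signature, bulk, [1.0, 1.0, 1.0])
downweighted = weighted_fractions(signature, bulk, [1.0, 0.01, 1.0])

print(f"equal weights:       CelltypeA = {equal[0]:.5f}")
print(f"GeneB down-weighted: CelltypeA = {downweighted[0]:.5f}")
```

Here, giving the noisy GeneB a small weight brings the CelltypeA estimate closer to the simulated 0.3, which is the behaviour one would want from a user-specified "importance".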