Question

Creating a signature matrix using CibersortX

5 months ago

Aspire ▴ 370

(1) It is possible to create a signature matrix with CibersortX from single-cell data, or alternatively from RNA-seq of sorted cell populations. Is there an expectation, even a general one, as to which type of data gives more accurate results?

(2) For the case of creating the signature matrix from single-cell data, the article describing the usage of CibersortX states:

if scRNA-seq data are used to build a signature matrix, it is straightforward to characterize its performance using synthetic tissues created from single-cell transcriptomes. To ensure an unbiased assessment, these source scRNA-seq transcriptomes used for the creation of a synthetic tissue should be held out from the creation of the signature matrix.

Does splitting the dataset into two, building the signature matrix from one half, and then validating the proportions on the other half of the dataset sound like a reasonable procedure?

CibersortX deconvolution • 402 views

updated 5 months ago by jared.andrews07 ★ 18k • written 5 months ago by Aspire ▴ 370

score 1 · Accepted Answer · 2024-08-12

(1) It is possible to create a signature matrix with CibersortX from single-cell data, or alternatively from RNA-seq of sorted cell populations. Is there an expectation, even a general one, as to which type of data gives more accurate results?

Pretty much impossible to say in general, since it depends on the accuracy of the cell-type annotations (for single-cell data) or of the sorting (for sorted populations). From experience, both seem to work well.

Does splitting the dataset into two, building the signature matrix from one half, and then validating the proportions on the other half of the dataset sound like a reasonable procedure?

That seems fine, though it's still somewhat double dipping, since both halves come from the same dataset. I'd mix the held-out cells at several different known proportions to see how well those proportions are recovered. I'd also try to find a separate, independent dataset with the same cell types to make mixtures from, to help validate.
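
To make that concrete, here is a rough sketch in Python (NumPy only) of the hold-out idea: split the cells per cell type, build pseudo-bulk mixtures from the held-out half at random known proportions, and later compare those proportions with whatever the deconvolution returns. Everything in it is an assumption for illustration: the function names `split_cells`, `make_mixtures` and `recovery_stats` are made up, the toy Poisson matrix stands in for a real cells x genes count matrix, and the CibersortX run itself is not shown. The "truth" here is cell-count fractions.

```python
import numpy as np

def split_cells(labels, rng, frac=0.5):
    """Stratified split: returns (train_idx, holdout_idx), split within each cell type."""
    train, hold = [], []
    for ct in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == ct))
        cut = int(round(len(idx) * frac))
        train.extend(idx[:cut])
        hold.extend(idx[cut:])
    return np.array(train), np.array(hold)

def make_mixtures(expr, labels, n_mix, cells_per_mix, rng):
    """Sum randomly drawn cells into pseudo-bulk profiles with known proportions.

    expr: cells x genes matrix in non-log space; labels: cell type of each cell.
    """
    types = np.unique(labels)
    # One random target composition per mixture.
    target = rng.dirichlet(np.ones(len(types)), size=n_mix)
    bulks = np.zeros((n_mix, expr.shape[1]))
    truth = np.zeros((n_mix, len(types)))
    for m in range(n_mix):
        counts = rng.multinomial(cells_per_mix, target[m])
        for k, ct in enumerate(types):
            if counts[k] == 0:
                continue
            pool = np.flatnonzero(labels == ct)
            picked = rng.choice(pool, size=counts[k], replace=True)
            bulks[m] += expr[picked].sum(axis=0)
        truth[m] = counts / cells_per_mix  # realized cell-count fractions
    return bulks, truth, types

def recovery_stats(truth, estimated):
    """Per-cell-type Pearson r and RMSE between true and estimated fractions."""
    r = np.array([np.corrcoef(truth[:, k], estimated[:, k])[0, 1]
                  for k in range(truth.shape[1])])
    rmse = np.sqrt(((truth - estimated) ** 2).mean(axis=0))
    return r, rmse

# Toy stand-in for a real annotated single-cell dataset (hypothetical data).
rng = np.random.default_rng(0)
labels = np.repeat(np.array(["B", "T", "Mono"]), 40)
expr = rng.poisson(5.0, size=(labels.size, 20)).astype(float)

train_idx, hold_idx = split_cells(labels, rng)
# Cells in train_idx would go into building the signature matrix;
# only the held-out cells are used to build the synthetic mixtures.
bulks, truth, types = make_mixtures(expr[hold_idx], labels[hold_idx],
                                    n_mix=10, cells_per_mix=200, rng=rng)
```

After running the deconvolution on `bulks`, the estimated fractions (same column order as `types`) can be passed to `recovery_stats(truth, estimated)`. The same `make_mixtures` call works on an independent dataset with the same cell types, which is the stronger check.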