PCA for count data
7 weeks ago
QX ▴ 60

Hi all,

I am working with copy-number-variation (CNV) count data, which are discrete rather than continuous. For example:

     a b c
reg1 1 2 1
reg2 1 1 2
reg3 3 3 1

Does anyone know whether I can apply traditional PCA to this count table? I think it is possible because the values are not categorical or ordinal, but I am not sure whether PCA is valid for count data.

I would say it depends on how many CNVs you have in your matrix and on your objectives. If you have only a few CNVs, there is no need to reduce the dimensions further. I would expect the distribution to be more or less similar to gene expression, so you could pick your most variable CNVs and run a PCA on them.

The CNV values range from 1 to 5 (no log transformation or normalization), for a matrix of 20,000 x 8.

I would do it the way it is done in single-cell analysis, where your reg1, reg2, reg3 are the cells and a, b, c are the genes. You don't need to log-transform your counts, but you can select a number of variable CNVs (a, b, c, ...) by plotting the standard deviation against the mean. Then you scale your new matrix and run a PCA.
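
The steps in this reply (select variable CNVs, scale, run a PCA) could be sketched roughly as follows. This is a minimal standard-library illustration, not a recommended pipeline. Ranking columns by the SD/mean ratio is an assumption standing in for inspecting the plot by eye. Column `d` and the helper names `most_variable`, `scale` and `first_component` are hypothetical. Only the first component is computed, by power iteration.

```python
from math import sqrt
from statistics import mean, pstdev

def most_variable(rows, n_keep):
    """Return indices of the n_keep columns with the highest SD/mean ratio."""
    ratios = []
    for j, col in enumerate(zip(*rows)):
        m = mean(col)
        ratios.append((pstdev(col) / m if m else 0.0, j))
    ratios.sort(reverse=True)
    return sorted(j for _, j in ratios[:n_keep])

def scale(rows, keep):
    """Z-score the kept columns (centre, then divide by the SD).

    Assumes every kept column varies; constant columns have a ratio of 0
    and are ranked last by most_variable().
    """
    scaled_cols = []
    for j in keep:
        col = [row[j] for row in rows]
        m, s = mean(col), pstdev(col)
        scaled_cols.append([(x - m) / s for x in col])
    return [list(row) for row in zip(*scaled_cols)]

def first_component(z, n_iter=100):
    """Leading PCA loading vector via power iteration on Z^T Z."""
    v = [1.0 / (j + 1) for j in range(len(z[0]))]  # arbitrary start vector
    for _ in range(n_iter):
        proj = [sum(a * b for a, b in zip(row, v)) for row in z]
        w = [sum(row[j] * p for row, p in zip(z, proj)) for j in range(len(v))]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Rows are reg1..reg3; columns a, b, c come from the question,
# column d is a hypothetical constant CNV added to show the filtering.
counts = [
    [1, 2, 1, 2],
    [1, 1, 2, 2],
    [3, 3, 1, 2],
]
keep = most_variable(counts, n_keep=2)
z = scale(counts, keep)
loadings = first_component(z)
scores = [sum(a * b for a, b in zip(row, loadings)) for row in z]
print(keep, loadings, scores)
```

On this toy table the two retained columns are a and b (indices 0 and 1). For a real 20,000 x 8 matrix, an established PCA implementation would be the more practical choice; the sketch only makes the order of operations explicit.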

Thank you!

7 weeks ago

A PCA is typically calculated from a covariance or correlation matrix, so its robustness depends on the assumptions that come with the methodology used - see page 55 on the list of assumptions underlying PCA. Given that your data are not continuous and take discrete states, you'll have to accept that the resulting PCA will not be robust, although this is partially mitigated by having larger datasets with more observations.

That said, this really depends on what you're trying to do with the data. Are these raw counts, or have they been previously normalised/transformed in some way? What are the aims of your analysis?
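
To make the link between PCA and the correlation matrix concrete, here is a small worked sketch using columns a and b from the table in the question. It relies on the fact that a 2 x 2 correlation matrix [[1, r], [r, 1]] has eigenvalues 1 + |r| and 1 - |r|, so the share of variance carried by each component follows directly from the Pearson coefficient. The helper name `pearson` is hypothetical, and three observations are far too few for a meaningful estimate; the numbers are only illustrative.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

# Columns a and b of the example table (reg1, reg2, reg3).
a = [1, 1, 3]
b = [2, 1, 3]

r = pearson(a, b)
corr = [[1.0, r], [r, 1.0]]

# Eigenvalues of a 2 x 2 correlation matrix, largest first.
eigenvalues = [1 + abs(r), 1 - abs(r)]

# Fraction of total variance per component (the eigenvalues sum to 2).
explained = [ev / 2 for ev in eigenvalues]
print(r, eigenvalues, explained)
```

Because the counts take only a few discrete values, many entries tie, and the estimated correlations (and therefore the components) can shift noticeably with small changes in the data.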

I have a matrix of 20,000 x 8. The values are CNV states derived from read counts.
