Usage of selectLab in PCAtools, R
4.6 years ago

Hi guys, I just cannot figure out how to use the selectLab option in the biplot of PCAtools. I tried creating a logical vector (with F for every sample I don't want a label for) as well as a vector with all the sample_ids that should have a label. But nothing works, and I can't wrap my head around how to make it work.

Thanks a lot for any comments! Sebastian

R PCAtools • 2.6k views

Can you explain a bit more what the problem is? From what I know, selectLab must be a subset of pca()$yvars, so most commonly the column names of the matrix that was used for pca().

If you use the example data of ?biplot then a possible choice could be biplot(p, selectLab = c("sample28")).

With Kevin Blighe you have the expert (author of the tool) here at biostars.


Hey, indeed, it should just be a character vector of samples that you want to label. If you do not define the lab variable, then the default labels are:

lab = rownames(pcaobj$metadata)

So, if you use selectLab, be careful about what you are passing to lab, too.
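
One way to catch a mismatch before plotting is to check that every entry of selectLab actually occurs in the vector passed as lab. The following is a minimal base-R sketch, not part of PCAtools; the metadata data frame, the sample IDs and the helper name check_select_lab are made up for illustration.

```r
# Hypothetical metadata: row names are sample IDs, genotype is a grouping column.
metadata <- data.frame(
  genotype = c("ELANE", "HAX1", "WT"),
  row.names = c("s1020", "s1031", "s1040"),
  stringsAsFactors = FALSE
)

# Hypothetical helper (not a PCAtools function): returns the entries of
# select_lab that do not occur in lab.
check_select_lab <- function(lab, select_lab) {
  setdiff(select_lab, as.character(lab))
}

# Default labels (row names): the sample IDs are found, so nothing is missing.
missing_default <- check_select_lab(rownames(metadata), c("s1020", "s1031"))

# Custom labels (genotype): the same sample IDs are not among the labels.
missing_custom <- check_select_lab(metadata$genotype, c("s1020", "s1031"))

print(missing_default)
print(missing_custom)
```

If the second call returns anything, the values in selectLab are being compared against a different kind of label than intended.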


Thanks for your answers, unfortunately it doesn't work.

Maybe I am wrong to assume that I can use the option to label only some of the data points in the PCA? If I hand it a character vector with just a few of the row names from my metadata, it doesn't work; instead, it disables all labels.

Is it correct that selectLab allows you to show labels only for a few selected data points instead of labels for every point, or am I just totally wrong about this?

Example of what I'm trying to do:

biplot(data_PCA, 
       colby = "date_processed", 
       legendPosition = "bottom", 
       lab = data_PCA$metadata$genotype, 
       selectLab = c("s1020", "s1031"),
       pointSize = 6,
       title = "PCA title",
       caption = 'There is a clear separation by date_processed')

It definitely works, but is designed specifically for sample IDs, which should be unique.

Using the data from the vignette:

p1 <- biplot(p)
p2 <- biplot(p, selectLab = c('GSM65776','GSM65779','GSM65781'))
p3 <- biplot(p, lab = rownames(p$metadata), selectLab = c('GSM65776','GSM65779','GSM65781'))
cowplot::plot_grid(p1, p2, p3, ncol = 3)

For 'grouped' variable names, as is perhaps your genotype data, the way to go would be via colby or shape.


Ah, ok. This explains it! It seems that selectLab is expecting the exact labels to include, not the corresponding row names / samples.

If I give selectLab a character vector with the label values to include, it works. It still needs a small workaround, wrapping the metadata column in as.character(), as otherwise it prints factor levels.

biplot(data_PCA, 
       colby = "date_processed", 
       legendPosition = "bottom", 
       lab = as.character(data_PCA$metadata$genotype), 
       selectLab = c("ELANE", "HAX1"),
       pointSize = 6,
       title = "PCA title",
       caption = 'There is a clear separation by date_processed')

This works fine now. Thanks a lot for your help! (will put this into an answer to the question below)
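
The behaviour described here can be imitated in base R without any plotting. This is only a sketch of the matching logic as it is described in this thread, not PCAtools' actual code; the vectors and the function shown_labels are hypothetical stand-ins.

```r
# Hypothetical stand-ins for the sample IDs and data_PCA$metadata$genotype.
sample_ids <- c("s1020", "s1031", "s1040", "s1051")
lab <- c("ELANE", "HAX1", "WT", "ELANE")

# Illustration only: keep a label if it is listed in select_lab, blank it otherwise.
shown_labels <- function(lab, select_lab) {
  lab <- as.character(lab)
  lab[!(lab %in% select_lab)] <- ""
  lab
}

# Selecting by sample ID matches none of the genotype labels: everything is blank.
by_sample_id <- shown_labels(lab, c("s1020", "s1031"))

# Selecting by the label values themselves keeps the matching labels.
by_label <- shown_labels(lab, c("ELANE", "HAX1"))

print(setNames(by_sample_id, sample_ids))
print(setNames(by_label, sample_ids))
```

Note that with a grouped label such as genotype, every sample carrying a selected value is labelled, which is why unique sample IDs are the more natural fit for selectLab.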


What is the content of data_PCA$metadata$genotype?


data_PCA$metadata$genotype contains a character vector that I use as a label. If I use it without the selectLab option, everything runs fine, but labels are shown for each data point, which is a bit overwhelming. That's why I'm trying to limit the labels to just a few selected points.

4.6 years ago

ATpoint and Kevin Blighe brought the solution!

selectLab expects a character vector with the exact labels we are handing to lab, not the corresponding row names. If your metadata column is a factor, lab needs as.character(); otherwise it prints the factor levels only.

biplot(data_PCA, 
       colby = "date_processed", 
       legendPosition = "bottom", 
       lab = as.character(data_PCA$metadata$genotype), 
       selectLab = c("Genotype1", "Genotype2"), #select here from $metadata$genotype!
       pointSize = 6,
       title = "PCA title",
       caption = 'There is a clear separation by date_processed')

Thanks a bunch guys, without you I couldn't have solved this!


Okay, great that it now works. Yes, the 'battle' between characters and factors is ongoing, and causes quite a few problems for developers. There are minor 'usability' issues with PCAtools that I am hoping to improve over time, but it functions fine and is well tested.
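
For anyone hitting the character/factor issue, the difference is easy to see in base R. This sketch uses a made-up genotype vector; where exactly the conversion happens inside the plotting code is not shown in this thread, so treat the link to the observed labels as an assumption.

```r
# Hypothetical genotype column stored as a factor.
genotype <- factor(c("HAX1", "ELANE", "WT", "ELANE"))

# A factor is stored as integer codes that index into its sorted levels.
codes <- as.integer(genotype)

# as.character() recovers the text labels, which is what lab should receive.
labels <- as.character(genotype)

print(levels(genotype))
print(codes)
print(labels)
```

Wrapping the column in as.character() before passing it to lab, as in the answer above, sidesteps the problem regardless of how the metadata were read in.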
