PCAtools data file not linking to metadata file
4.4 years ago
bryce.plu ▴ 10

Hi, I am attempting to use the PCAtools R package. I have imported my own data matrix (pca.matrix) and metadata (metadata) files. When just running this code, everything works great and I get plots:

p <- pca(pca.matrix, removeVar = 0.1)

-- removing the lower 10% of variables based on variance

screeplot(p)
biplot(p)

When I try to link and check the metadata, all seems to be working:

pca.matrix <- pca.matrix[, which(colnames(pca.matrix) %in% rownames(metadata))]
all(colnames(pca.matrix) == rownames(metadata))

[1] TRUE

However, when I try to run the PCA with the metadata, I get the following:

p <- pca(pca.matrix, metadata = metadata, removeVar = 0.1)

Error in pca(pca.matrix, metadata = metadata, removeVar = 0.1) : 'colnames(mat)' is not identical to 'rownames(metadata)'

Shouldn't it be trying to match up 'colnames(pca.matrix)' with 'rownames(metadata)'? What is 'colnames(mat)'? I feel like I'm totally missing some key information.

Any help would be great! Thank you!
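
On the question of what 'colnames(mat)' is: mat is the name of the first argument of pca(), so the message refers to whatever matrix was passed in (here pca.matrix). The sketch below uses a hypothetical stand-in function, check_names(), that is not part of PCAtools, together with toy data. It illustrates one way a check written with all(... == ...) can return TRUE while a stricter identical() comparison fails: if no column name matches, the subset is empty, and all() of an empty comparison is TRUE. Whether that is what happened in this case cannot be told from the post.

```r
# Hypothetical stand-in for the kind of check that raises the error.
# 'mat' is simply the name of the function's first argument.
check_names <- function(mat, metadata) {
  if (!identical(colnames(mat), rownames(metadata))) {
    stop("'colnames(mat)' is not identical to 'rownames(metadata)'")
  }
  invisible(TRUE)
}

# Toy data (assumed): the metadata has default row names "1", "2", "3"
pca.matrix <- matrix(rnorm(12), nrow = 4,
                     dimnames = list(paste0("gene", 1:4), c("s1", "s2", "s3")))
metadata <- data.frame(sample = c("s1", "s2", "s3"), group = c("a", "a", "b"))

# No column name matches "1", "2", "3", so the subset keeps zero columns
sub <- pca.matrix[, which(colnames(pca.matrix) %in% rownames(metadata)), drop = FALSE]
print(ncol(sub))

# Comparing against an empty set of names gives logical(0); all(logical(0)) is TRUE
vacuous <- all(colnames(sub) == rownames(metadata))
print(vacuous)

# identical() is stricter and reports the mismatch
strict <- identical(colnames(sub), rownames(metadata))
print(strict)

# The stand-in check fails with the same wording as the error in the post
result <- tryCatch(check_names(sub, metadata),
                   error = function(e) conditionMessage(e))
print(result)
```

Checking dim(pca.matrix) after subsetting, or using identical() instead of all(... == ...), makes this kind of silent mismatch visible.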

R PCA RNA-Seq PCAtools • 3.8k views
4.4 years ago
bryce.plu ▴ 10

Ok, I fixed it!

When importing the data file and the metadata file with the read.csv(file = ....) function, both calls needed row.names = 1 as an argument. Classic rookie mistake! Leaving the post up in case others run into the same issue.
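
A minimal sketch of what row.names = 1 changes, using small made-up CSV text in place of the real files (the sample and gene names are assumptions for illustration only):

```r
# Made-up metadata: first column holds the sample IDs
meta_text <- paste(c("sample,group",
                     "s1,control",
                     "s2,control",
                     "s3,treated"), collapse = "\n")

# Without row.names = 1 the sample IDs stay an ordinary column
meta_default <- read.csv(text = meta_text)
print(rownames(meta_default))   # "1" "2" "3"

# With row.names = 1 the first column becomes the row names
meta_named <- read.csv(text = meta_text, row.names = 1)
print(rownames(meta_named))     # "s1" "s2" "s3"

# Made-up data matrix: genes in rows, samples in columns
counts_text <- paste(c("gene,s1,s2,s3",
                       "g1,5,7,9",
                       "g2,1,0,3"), collapse = "\n")
counts <- read.csv(text = counts_text, row.names = 1)

# Column names of the data now line up with row names of the metadata
print(identical(colnames(counts), rownames(meta_named)))   # TRUE
```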


Hi, I developed this package. Are you sure that everything is okay now? That error message can appear for a variety of reasons; yours is just one particular case.


Hi Kevin,

Thank you for the response! Yes! Everything is working wonderfully now - just a rookie user error on my part!


Hi Kevin, what other reasons might lead to this error? I'm having a similar issue to the OP, but their solution isn't working for me.


Can you show a sample of your input metadata and the command you are using to import it?
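
While waiting for those details, a rough way to narrow the problem down is to compare the two sets of names directly. The helper below, diagnose_names(), is hypothetical and not part of PCAtools, and the sample names are made up. It covers two common causes that are general R behaviour rather than anything stated in this thread: read.csv() rewriting column names when check.names = TRUE (the default), and the same names appearing in a different order.

```r
# Hypothetical helper: report how column names and row names differ
diagnose_names <- function(mat, metadata) {
  cn <- colnames(mat)
  rn <- rownames(metadata)
  list(only_in_matrix   = setdiff(cn, rn),
       only_in_metadata = setdiff(rn, cn),
       same_set         = setequal(cn, rn),
       same_order       = identical(cn, rn))
}

csv_text <- "gene,1387-FE,1388-MM9\ng1,5,7\ng2,1,0"
meta <- data.frame(treatment = c("mm9", "iron"),
                   row.names = c("1388-MM9", "1387-FE"))

# Cause 1: the default check.names = TRUE turns "1387-FE" into "X1387.FE"
counts_mangled <- read.csv(text = csv_text, row.names = 1)
print(colnames(counts_mangled))
print(diagnose_names(counts_mangled, meta)$only_in_matrix)

# Cause 2: same names, different order
counts_ok <- read.csv(text = csv_text, row.names = 1, check.names = FALSE)
report <- diagnose_names(counts_ok, meta)
print(report$same_set)     # TRUE
print(report$same_order)   # FALSE

# Reorder the metadata rows to follow the matrix columns
meta_sorted <- meta[colnames(counts_ok), , drop = FALSE]
print(identical(colnames(counts_ok), rownames(meta_sorted)))   # TRUE
```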

12 months ago

Hi Kevin,

I am also having a similar issue:

counts_data <- read.csv('count_values_featurecounts.csv', row.names = 1)

head(counts_data)
colnames(counts_data)

[1] "Sample1_1387_S76_FE_1B"  "sample2_1391_S80_FE_1B"  "sample3_1388_S77_MM9_2B"
[4] "sample4_1389_S78_MM9_1B" "sample5_1390_S79_BIP_1C" "sample6_1392_S81_BIP_2B"

read in sample info

colData <- read.csv('metadata.csv', row.names = 1)
rownames(colData)

[1] "Sample1_1387_S76_FE_1B"  "sample2_1391_S80_FE_1B"  "sample3_1388_S77_MM9_2B"
[4] "sample4_1389_S78_MM9_1B" "sample5_1390_S79_BIP_1C" "sample6_1392_S81_BIP_2B"

making sure the row names in colData matches to column names in counts_data (smaller dataset %in% larger dataset)

all(rownames(colData) %in% colnames(counts_data))

[1] TRUE

are they in the same order?

all(colnames(counts_data1) == rownames(colData))

[1] TRUE

Step 2: construct a DESeqDataSet object ----------

dds <- DESeqDataSetFromMatrix(countData = counts_data,
                              colData = colData,
                              design = ~ treatment)

Warning message:
In DESeqDataSet(se, design = design, ignoreRank) :
  some variables in design formula are characters, converting to factors

dds

class: DESeqDataSet
dim: 5406 6
metadata(1): version
assays(1): counts
rownames(5406): R65_hybrid_00001 R65_hybrid_00002 ... R65_hybrid_05462_gene R65_hybrid_05463_gene
rowData names(0):
colnames(6): Sample1_1387_S76_FE_1B sample2_1391_S80_FE_1B ... sample5_1390_S79_BIP_1C sample6_1392_S81_BIP_2B
colData names(1): treatment

transform the data to variance-stabilised expression levels

vst <- vst(dds)

-- note: fitType='parametric', but the dispersion trend was not well captured by the
   function: y = a/x + b, and a local regression fit was automatically substituted.
   specify fitType='local' or 'mean' to avoid this message next time.

generate a PCA plot

p <- pca(vst,c("iron","mm9","bip") )

Error in t.default(mat) : argument is not a matrix


Kevin, I have 3 treatments (mm9, bip and iron). Can the same command be used for plotting the data in such a case?


I tried with another code:

vsd <- vst(dds, blind = FALSE)

head(assay(vsd), 3)

                 Sample1_1387_S76_FE_1B sample2_1391_S80_FE_1B sample3_1388_S77_MM9_2B
R65_hybrid_00001              12.040811              11.832936               11.711007
R65_hybrid_00002               3.163354               1.903476                2.956567
R65_hybrid_00003              11.630781              11.263561               11.434657
                 sample4_1389_S78_MM9_1B sample5_1390_S79_BIP_1C sample6_1392_S81_BIP_2B
R65_hybrid_00001               11.079648                12.55149               13.050876
R65_hybrid_00002                3.186474                 3.59161                3.289681
R65_hybrid_00003               10.113185                12.55919               12.039214

plotPCA(vsd, intgroup = c("iron","mm9","bip"))

using ntop=500 top features by variance
Error in .local(object, ...) : the argument 'intgroup' should specify columns of colData(dds)


Please stop posting comments in the answer field, and format your code. Use the 10101 button: select the code, then press the button to trigger markdown highlighting. This has nothing to do with the top-level question. intgroup must name columns of the colData of dds. Your colData has a single column, treatment, so intgroup should be "treatment", and iron, mm9 and bip must be levels of that column.
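
A small sketch of that distinction between columns and levels, using the sample names from the post. The treatment labels assigned to each sample are an assumption, since the contents of metadata.csv are not shown. The runnable part is base R only; the DESeq2 and PCAtools calls appear as comments showing the shape the calls would take, not as tested code. The earlier t.default(mat) error is consistent with pca() being handed the vst object itself rather than a matrix, which is why assay() is used in the commented call.

```r
samples <- c("Sample1_1387_S76_FE_1B", "sample2_1391_S80_FE_1B",
             "sample3_1388_S77_MM9_2B", "sample4_1389_S78_MM9_1B",
             "sample5_1390_S79_BIP_1C", "sample6_1392_S81_BIP_2B")

# Assumed labels: the real metadata.csv is not shown in the thread
colData <- data.frame(
  treatment = factor(c("iron", "iron", "mm9", "mm9", "bip", "bip")),
  row.names = samples
)

# The group names are levels of 'treatment', not columns of colData
print(c("iron", "mm9", "bip") %in% colnames(colData))   # FALSE FALSE FALSE
print("treatment" %in% colnames(colData))                # TRUE
print(levels(colData$treatment))                         # "bip" "iron" "mm9"

# With DESeq2 and PCAtools loaded, the calls would take this shape (not run here):
# vsd <- vst(dds, blind = FALSE)
# plotPCA(vsd, intgroup = "treatment")
# p <- pca(assay(vsd), metadata = colData)   # pass the matrix, not the vst object
# biplot(p, colby = "treatment")
```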
