read in sample info

Question

PCAtools data file not linking to metadata file

4.5 years ago

bryce.plu ▴ 10

Hi, I am attempting to use the PCAtools R package. I have imported my own data matrix (pca.matrix) and metadata (metadata) files. When just running this code, everything works great and I get plots:

p <- pca(pca.matrix, removeVar = 0.1)

-- removing the lower 10% of variables based on variance

screeplot(p)
biplot(p)

When I try to link and check the metadata, all seems to be working:

pca.matrix <- pca.matrix[, which(colnames(pca.matrix) %in% rownames(metadata))]
all(colnames(pca.matrix) == rownames(metadata))

[1] TRUE

However, when I try to run the PCA with the metadata, I get the following:

p <- pca(pca.matrix, metadata = metadata, removeVar = 0.1)

Error in pca(pca.matrix, metadata = metadata, removeVar = 0.1) : 'colnames(mat)' is not identical to 'rownames(metadata)'

Shouldn't it be trying to match up 'colnames(pca.matrix)' with 'rownames(metadata)'? What is 'colnames(mat)'? I feel like I'm totally missing some key information.

Any help would be great! Thank you!

R PCA RNA-Seq PCAtools • 3.8k views

Updated 13 months ago by ATpoint 86k • written 4.5 years ago by bryce.plu ▴ 10

score 0 · Answer 1 · 2020-06-19

4.5 years ago

bryce.plu ▴ 10

Ok, I fixed it!

When importing the data file and the metadata file with the read.csv(file = .... function, both calls needed to have row.names = 1 as one of the arguments. Classic rookie mistake! Leaving the post up in case others run into the same issue.

4.5 years ago by bryce.plu ▴ 10
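The fix described above can be illustrated with a minimal base-R sketch. The sample names and the inline CSV text are made up for illustration (the real files are not shown in the thread), and `text =` stands in for the elided file path.

```r
# Hypothetical metadata contents; the real file is not shown in the thread.
csv_text <- "sample,treatment\nS1,control\nS2,treated\nS3,treated"

# Without row.names = 1 the sample IDs stay an ordinary column
# and the row names default to "1", "2", "3".
meta_default <- read.csv(text = csv_text)
rownames(meta_default)

# With row.names = 1 the first column is used as the row names.
meta_named <- read.csv(text = csv_text, row.names = 1)
rownames(meta_named)

# A toy expression matrix whose columns are the same samples.
mat <- matrix(rnorm(12), nrow = 4,
              dimnames = list(paste0("gene", 1:4), c("S1", "S2", "S3")))

identical(colnames(mat), rownames(meta_default))  # FALSE
identical(colnames(mat), rownames(meta_named))    # TRUE
```

Only the second metadata frame has row names that can line up with the matrix columns, which is presumably what the error message was complaining about.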

Hi, I developed this package. Are you sure that everything is okay now? There are likely several reasons why that error message can appear - yours is just one particular use case.

4.5 years ago by Kevin Blighe 88k
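As the reply above notes, the error can have more than one cause. As far as I can tell, `mat` is simply the name of the first argument of pca(), so the message refers to whatever matrix was passed in. Below is a rough base-R sketch for narrowing the cause down; `diagnose_names` is a hypothetical helper, not part of PCAtools, and the toy data are invented.

```r
# Hypothetical helper (not part of PCAtools) that reports how the
# matrix column names and the metadata row names differ.
diagnose_names <- function(mat, metadata) {
  cn <- colnames(mat)
  rn <- rownames(metadata)
  list(
    n_columns      = length(cn),
    n_rows         = length(rn),
    identical      = identical(cn, rn),
    same_set       = setequal(cn, rn),
    only_in_matrix = setdiff(cn, rn),
    only_in_meta   = setdiff(rn, cn)
  )
}

mat <- matrix(1:6, nrow = 2,
              dimnames = list(c("g1", "g2"), c("S1", "S2", "S3")))
meta <- data.frame(treatment = c("a", "b", "c"),
                   row.names = c("S3", "S1", "S2"))

res <- diagnose_names(mat, meta)
res$identical  # FALSE: same samples, different order
res$same_set   # TRUE

# Reordering the metadata rows to follow the matrix columns fixes this case.
meta <- meta[colnames(mat), , drop = FALSE]
identical(colnames(mat), rownames(meta))  # TRUE

# Pitfall: all() on a zero-length comparison is TRUE, so a matrix that
# lost all its columns during subsetting still passes an all(... == ...) check.
empty <- mat[, colnames(mat) %in% c("1", "2", "3"), drop = FALSE]
ncol(empty)                             # 0
all(colnames(empty) == rownames(meta))  # TRUE, even though nothing matches
```

This is one possible reason an all(... == ...) check can print TRUE while a stricter identical() comparison still fails; checking ncol() and identical() directly is a safer test.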

Hi Kevin,

Thank you for the response! Yes! Everything is working wonderfully now - just a rookie user error on my part!

4.5 years ago by bryce.plu ▴ 10

Hi Kevin, what other reasons might lead to this error? I'm having a similar issue to the OP, but their solution isn't working for me.

2.3 years ago by bhan059 • 0

Can you show a sample of your input metadata and the command you are using to import it?

2.3 years ago by Kevin Blighe 88k

score 0 · Answer 2 · 2023-11-23

Hi Kevin,

I am also having a similar issue:

counts_data <- read.csv('count_values_featurecounts.csv', row.names = 1)

head(counts_data)
colnames(counts_data)

[1] "Sample1_1387_S76_FE_1B" "sample2_1391_S80_FE_1B" "sample3_1388_S77_MM9_2B"
[4] "sample4_1389_S78_MM9_1B" "sample5_1390_S79_BIP_1C" "sample6_1392_S81_BIP_2B"

# read in sample info

colData <- read.csv('metadata.csv', row.names = 1)
rownames(colData)

[1] "Sample1_1387_S76_FE_1B" "sample2_1391_S80_FE_1B" "sample3_1388_S77_MM9_2B"
[4] "sample4_1389_S78_MM9_1B" "sample5_1390_S79_BIP_1C" "sample6_1392_S81_BIP_2B"

# make sure the row names in colData match the column names in counts_data (smaller dataset %in% larger dataset)

all(rownames(colData) %in% colnames(counts_data))

[1] TRUE

# are they in the same order?

all(colnames(counts_data1) == rownames(colData))

[1] TRUE

# Step 2: construct a DESeqDataSet object ----------

dds <- DESeqDataSetFromMatrix(countData = counts_data,
                              colData = colData,
                              design = ~ treatment)

Warning message:
In DESeqDataSet(se, design = design, ignoreRank) :
  some variables in design formula are characters, converting to factors

dds

class: DESeqDataSet
dim: 5406 6
metadata(1): version
assays(1): counts
rownames(5406): R65_hybrid_00001 R65_hybrid_00002 ... R65_hybrid_05462_gene R65_hybrid_05463_gene
rowData names(0):
colnames(6): Sample1_1387_S76_FE_1B sample2_1391_S80_FE_1B ... sample5_1390_S79_BIP_1C sample6_1392_S81_BIP_2B
colData names(1): treatment

# transform the data to variance-stabilised expression levels

vst <- vst(dds)

-- note: fitType='parametric', but the dispersion trend was not well captured by the
   function: y = a/x + b, and a local regression fit was automatically substituted.
   specify fitType='local' or 'mean' to avoid this message next time.

# generate a PCA plot

p <- pca(vst, c("iron", "mm9", "bip"))

Error in t.default(mat) : argument is not a matrix
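A possible reading of the last error, offered as a sketch rather than a confirmed diagnosis: `vst` here is the object returned by vst(dds), not a plain numeric matrix, and the second positional argument of pca() is the metadata, which should be a data frame whose row names match the matrix column names, not a vector of group labels. The commented lines show the usual route; the code below uses base-R stand-ins so that it runs without Bioconductor. The shortened sample names and the treatment assignment are assumptions made for illustration.

```r
# Assumed route with the real objects (not run here):
#   mat <- SummarizedExperiment::assay(vst)   # extract the numeric matrix
#   p   <- PCAtools::pca(mat, metadata = colData)

# Base-R stand-ins; sample names and treatments are hypothetical.
samples <- c("Sample1", "sample2", "sample3", "sample4", "sample5", "sample6")
mat <- matrix(rnorm(60), nrow = 10,
              dimnames = list(paste0("gene", 1:10), samples))
colData <- data.frame(treatment = c("iron", "iron", "mm9", "mm9", "bip", "bip"),
                      row.names = samples)

# Checks worth running before calling pca():
is.matrix(mat)                               # TRUE: t() needs a matrix
is.data.frame(colData)                       # TRUE
identical(colnames(mat), rownames(colData))  # TRUE

# A bare vector of group labels has no row names to match against.
groups <- c("iron", "mm9", "bip")
is.null(rownames(groups))                    # TRUE
```

If all three checks pass on the real objects, passing the extracted matrix and `metadata = colData` by name should avoid both this error and the row-name mismatch discussed earlier in the thread.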