Question

Create SingleCellExperiment object with colData

2.5 years ago

tien ▴ 40

Hello all,

I'm trying to create a SingleCellExperiment object with the following command:

SingleCellExperiment(assays = list(logcounts = logtpm),
                     colData = df[colnames(logtpm), ])

However, it fails with this error:

Error in if (!ok) { : missing value where TRUE/FALSE needed
Calls: SingleCellExperiment -> SummarizedExperiment
Execution halted

I think it is due to colData = df[colnames(logtpm),], since the call works fine when I leave that argument out. However, I can't debug it because I don't understand why it causes an error. I followed the example of creating a SingleCellExperiment object from https://bioconductor.org/packages/devel/bioc/vignettes/SingleCellExperiment/inst/doc/intro.html#5_Adding_alternative_feature_sets

If anybody has run into this error before, could you give me any tips for debugging?

Thanks for your help.

singlecellexperiment • 2.4k views

Does logtpm have column names, and does df have rownames that match the column names? You may want to include a snippet of your matrix logtpm[1:10, 1:10] and column data head(df).
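A minimal sketch of that check in base R, using toy stand-ins for logtpm and df. The values are made up for illustration, and a plain data.frame is used here as an assumption; a Bioconductor DataFrame may behave differently when indexed by a row name that does not exist.

```r
# Toy stand-ins for the real objects (hypothetical values).
logtpm <- matrix(0, nrow = 2, ncol = 3,
                 dimnames = list(c("geneA", "geneB"),
                                 c("cell1", "cell2", "cell3")))
df <- data.frame(Age = c(19, 19, 16),
                 row.names = c("cell2", "cell1", "cell3"))

# Does the matrix have column names at all?
has_colnames <- !is.null(colnames(logtpm))

# Is every column name present among the row names of df?
all_present <- all(colnames(logtpm) %in% rownames(df))

# Are the two sets of names already in the same order?
same_order <- identical(colnames(logtpm), rownames(df))

# Reorder df to follow the matrix columns, then re-check the alignment.
df_ordered <- df[colnames(logtpm), , drop = FALSE]
aligned <- identical(rownames(df_ordered), colnames(logtpm))
```

If all_present is FALSE, the reordering step is where the mismatch would surface.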

ADD REPLY • link 2.5 years ago by rpolicastro 13k

Here are the results:

head(df)

DataFrame with 6 rows and 8 columns
                  WGCNAcluster Cluster.Interpretation                Cell
                   <character>            <character>         <character>
Hi_GW21_4.Hi_GW21_4         Glyc             Glycolysis Hi_GW21_4.Hi_GW21_4
Hi_GW21_5.Hi_GW21_5          tRG  Truncated Radial Glia Hi_GW21_5.Hi_GW21_5
Hi_GW21_8.Hi_GW21_8      RG-div1 Dividing Radial Glia.. Hi_GW21_8.Hi_GW21_8
Hi_GW21_1.Hi_GW21_1   nEN-early2 Newborn Excitatory N.. Hi_GW21_1.Hi_GW21_1
Hi_GW21_2.Hi_GW21_2   nEN-early2 Newborn Excitatory N.. Hi_GW21_2.Hi_GW21_2
Hi_GW21_3.Hi_GW21_3     nEN-late Newborn Excitatory N.. Hi_GW21_3.Hi_GW21_3
                         Name       Age  RegionName     Laminae        Area
                  <character> <numeric> <character> <character> <character>
Hi_GW21_4.Hi_GW21_4     Sample1        19      Cortex         All           0
Hi_GW21_5.Hi_GW21_5     Sample1        19      Cortex         All           0
Hi_GW21_8.Hi_GW21_8     Sample1        19      Cortex         All           0
Hi_GW21_1.Hi_GW21_1     Sample1        19      Cortex         All           0
Hi_GW21_2.Hi_GW21_2     Sample1        19      Cortex         All           0
Hi_GW21_3.Hi_GW21_3     Sample1        19      Cortex         All           0

colnames(logtpm)

 [1] "Hi_GW21_4.Hi_GW21_4"   "Hi_GW21_5.Hi_GW21_5"   "Hi_GW21_8.Hi_GW21_8"  
 [4] "Hi_GW21_1.Hi_GW21_1"   "Hi_GW21_2.Hi_GW21_2"   "Hi_GW21_3.Hi_GW21_3"  
 [7] "Hi_GW21_7.Hi_GW21_7"   "Hi_GW21_6.Hi_GW21_6"   "Hi_GW16_11.Hi_GW16_11"
[10] "Hi_GW16_3.Hi_GW16_3"   "Hi_GW16_4.Hi_GW16_4"   "Hi_GW16_1.Hi_GW16_1"  
[13] "Hi_GW16_10.Hi_GW16_10" "Hi_GW16_2.Hi_GW16_2"   "Hi_GW16_8.Hi_GW16_8"  
[16] "Hi_GW16_9.Hi_GW16_9"   "Hi_GW16_13.Hi_GW16_13" "Hi_GW16_14.Hi_GW16_14"
[19] "Hi_GW16_15.Hi_GW16_15" "Hi_GW16_16.Hi_GW16_16" "Hi_GW16_17.Hi_GW16_17"
[22] "Hi_GW16_18.Hi_GW16_18" "Hi_GW16_20.Hi_GW16_20" "Hi_GW16_25.Hi_GW16_25"
[25] "Hi_GW16_12.Hi_GW16_12" "Hi_GW16_19.Hi_GW16_19" "Hi_GW16_24.Hi_GW16_24"
...

ADD REPLY • link 2.5 years ago by tien ▴ 40

So, what happens when you run head(df[colnames(logtpm), ])?

ADD REPLY • link 2.5 years ago by Friederike 9.0k

The output of head(df[colnames(logtpm), ]) is shown below. It seems odd to me that running this on its own works, but passing it to the SCE constructor produces the error. Do you know if I need to set anything to TRUE or FALSE when adding colData to an SCE? Or is there any way to add colData after creating the SCE? When I run SingleCellExperiment(assays = list(logcounts = logtpm)), it works normally.

DataFrame with 6 rows and 8 columns
                  WGCNAcluster Cluster.Interpretation                Cell
                   <character>            <character>         <character>
Hi_GW21_4.Hi_GW21_4         Glyc             Glycolysis Hi_GW21_4.Hi_GW21_4
Hi_GW21_5.Hi_GW21_5          tRG  Truncated Radial Glia Hi_GW21_5.Hi_GW21_5
Hi_GW21_8.Hi_GW21_8      RG-div1 Dividing Radial Glia.. Hi_GW21_8.Hi_GW21_8
Hi_GW21_1.Hi_GW21_1   nEN-early2 Newborn Excitatory N.. Hi_GW21_1.Hi_GW21_1
Hi_GW21_2.Hi_GW21_2   nEN-early2 Newborn Excitatory N.. Hi_GW21_2.Hi_GW21_2
Hi_GW21_3.Hi_GW21_3     nEN-late Newborn Excitatory N.. Hi_GW21_3.Hi_GW21_3
                         Name       Age  RegionName     Laminae        Area
                  <character> <numeric> <character> <character> <character>
Hi_GW21_4.Hi_GW21_4     Sample1        19      Cortex         All           0
Hi_GW21_5.Hi_GW21_5     Sample1        19      Cortex         All           0
Hi_GW21_8.Hi_GW21_8     Sample1        19      Cortex         All           0
Hi_GW21_1.Hi_GW21_1     Sample1        19      Cortex         All           0
Hi_GW21_2.Hi_GW21_2     Sample1        19      Cortex         All           0
Hi_GW21_3.Hi_GW21_3     Sample1        19      Cortex         All           0

ADD REPLY • link 2.5 years ago by tien ▴ 40

Could you share a small subset of both datasets using dput(head(logtpm)) and dput(head(df))? As it stands, I do not see any errors in your code, but there seems to be something wrong with the data formatting.
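One formatting problem worth ruling out is a missing (NA) or duplicated name. The sketch below uses a hypothetical vector of cell names, not the poster's data. It shows how a name comparison involving an NA yields NA rather than TRUE or FALSE, and how an if() on that value then fails. Whether this is what happens inside the constructor here is an assumption the thread does not confirm.

```r
# Hypothetical cell names with two deliberate problems: an NA and a duplicate.
cell_names <- c("cell1", "cell2", NA, "cell2")
reference  <- c("cell1", "cell2", "cell3", "cell2")

# Count missing names, and duplicates among the non-missing names.
n_missing   <- sum(is.na(cell_names))
n_duplicate <- sum(duplicated(cell_names[!is.na(cell_names)]))

# Comparing against a vector with an NA gives NA, not TRUE/FALSE.
same <- all(cell_names == reference)

# An if() on that NA raises an error, which is captured here as text.
msg <- tryCatch({
  if (!same) stop("names differ")
  "ok"
}, error = function(e) conditionMessage(e))
```

Running anyNA() and anyDuplicated() on colnames(logtpm), rownames(logtpm), and rownames(df) is a quick way to apply the same check to the real objects.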

ADD REPLY • link 2.5 years ago by Basti ★ 2.0k

Here are the results of dput:

dput(head(df[1:5, 1:5]))

new("DFrame", rownames = c("Hi_GW21_1.Hi_GW21_1", "Hi_GW21_2.Hi_GW21_2", 
"Hi_GW21_3.Hi_GW21_3", "Hi_GW21_7.Hi_GW21_7", "Hi_GW21_6.Hi_GW21_6"
), nrows = 5L, listData = list(WGCNAcluster = c("nEN-early2", 
"nEN-early2", "nEN-late", "EN-V1-2", "EN-V1-2"), Cluster.Interpretation = c("Newborn Excitatory Neuron - early born", 
"Newborn Excitatory Neuron - early born", "Newborn Excitatory Neuron - late born", 
"Early and Late Born Excitatory Neuron V1", "Early and Late Born Excitatory Neuron V1"
), Cell = c("Hi_GW21_1.Hi_GW21_1", "Hi_GW21_2.Hi_GW21_2", "Hi_GW21_3.Hi_GW21_3", 
"Hi_GW21_7.Hi_GW21_7", "Hi_GW21_6.Hi_GW21_6"), Name = c("Sample1", 
"Sample1", "Sample1", "Sample1", "Sample1"), Age = c(19, 19, 
19, 19, 19)), elementType = "ANY", elementMetadata = NULL, metadata = list())
DataFrame with 5 rows and 5 columns
                    WGCNAcluster Cluster.Interpretation                Cell
                     <character>            <character>         <character>
Hi_GW21_1.Hi_GW21_1   nEN-early2 Newborn Excitatory N.. Hi_GW21_1.Hi_GW21_1
Hi_GW21_2.Hi_GW21_2   nEN-early2 Newborn Excitatory N.. Hi_GW21_2.Hi_GW21_2
Hi_GW21_3.Hi_GW21_3     nEN-late Newborn Excitatory N.. Hi_GW21_3.Hi_GW21_3
Hi_GW21_7.Hi_GW21_7      EN-V1-2 Early and Late Born .. Hi_GW21_7.Hi_GW21_7
Hi_GW21_6.Hi_GW21_6      EN-V1-2 Early and Late Born .. Hi_GW21_6.Hi_GW21_6
                           Name       Age
                    <character> <numeric>
Hi_GW21_1.Hi_GW21_1     Sample1        19
Hi_GW21_2.Hi_GW21_2     Sample1        19
Hi_GW21_3.Hi_GW21_3     Sample1        19
Hi_GW21_7.Hi_GW21_7     Sample1        19
Hi_GW21_6.Hi_GW21_6     Sample1        19

dput(head(logtpm[1:5, 1:5]))

structure(c(0, 0, 6.81954856968559, 0, 0, 0, 0, 6.14963663024324, 
0, 0, 0, 0, 6.0877400108758, 0, 0, 0, 0, 7.13622016683789, 0, 
0.133200664189903, 0, 0, 6.65784728406714, 0, 0), .Dim = c(5L, 
5L), .Dimnames = list(c("DDC8", "AC092057.1", "RPS11", "CREB3L1", 
"RPL10P14"), c("Hi_GW21_4.Hi_GW21_4", "Hi_GW21_5.Hi_GW21_5", 
"Hi_GW21_8.Hi_GW21_8", "Hi_GW21_1.Hi_GW21_1", "Hi_GW21_2.Hi_GW21_2"
)))
         Hi_GW21_4.Hi_GW21_4 Hi_GW21_5.Hi_GW21_5 Hi_GW21_8.Hi_GW21_8
DDC8                  0.000000            0.000000             0.00000
AC092057.1            0.000000            0.000000             0.00000
RPS11                 6.819549            6.149637             6.08774
CREB3L1               0.000000            0.000000             0.00000
RPL10P14              0.000000            0.000000             0.00000
         Hi_GW21_1.Hi_GW21_1 Hi_GW21_2.Hi_GW21_2
DDC8                 0.0000000            0.000000
AC092057.1           0.0000000            0.000000
RPS11                7.1362202            6.657847
CREB3L1              0.0000000            0.000000
RPL10P14             0.1332007            0.000000

Do you think the input structure is inappropriate for an SCE?

ADD REPLY • link 2.5 years ago by tien ▴ 40

I suspect that not all colnames(logtpm) are present in rownames(df). Could you compare length(intersect(colnames(logtpm), rownames(df))) and length(colnames(logtpm))? The two numbers should be equal; otherwise you can't create the SCE object.
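A small base R sketch of this comparison, with hypothetical name vectors standing in for colnames(logtpm) and rownames(df). setdiff() is added here as a convenience to list which names are unmatched.

```r
# Hypothetical names: one matrix column has no matching row in the metadata.
assay_cols <- c("cell1", "cell2", "cell3", "cell4")
meta_rows  <- c("cell1", "cell2", "cell3", "cell5")

# The two numbers suggested above; they differ when names are unmatched.
n_shared <- length(intersect(assay_cols, meta_rows))
n_cols   <- length(assay_cols)

# Names present on one side only.
only_in_assay <- setdiff(assay_cols, meta_rows)
only_in_meta  <- setdiff(meta_rows, assay_cols)

# Shared cells in matrix order, usable to subset both objects consistently.
keep <- assay_cols[assay_cols %in% meta_rows]
```

With the real objects, logtpm[, keep] and df[keep, ] would then describe the same cells in the same order.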