Create SingleCellExperiment object with colData
2.5 years ago
tien ▴ 40

Hello all,

I'm trying to create a SingleCellExperiment object with the following command:

SingleCellExperiment(assays = list(logcounts = logtpm),
                                 colData = df[colnames(logtpm),])

However, it raised this error:

Error in if (!ok) { : missing value where TRUE/FALSE needed
Calls: SingleCellExperiment -> SummarizedExperiment
Execution halted

I think it is due to colData = df[colnames(logtpm),], since it worked fine when I excluded this line. However, I cannot debug this because I don't understand why it causes an error. I followed the example of creating a SingleCellExperiment object from https://bioconductor.org/packages/devel/bioc/vignettes/SingleCellExperiment/inst/doc/intro.html#5_Adding_alternative_feature_sets

Has anybody experienced this error before, and can you give me any tips for debugging it?

Thanks for your help.


Does logtpm have column names, and does df have rownames that match the column names? You may want to include a snippet of your matrix logtpm[1:10, 1:10] and column data head(df).


Here are the results:

  • head(df)
    DataFrame with 6 rows and 8 columns
                      WGCNAcluster Cluster.Interpretation                Cell
                       <character>            <character>         <character>
    Hi_GW21_4.Hi_GW21_4         Glyc             Glycolysis Hi_GW21_4.Hi_GW21_4
    Hi_GW21_5.Hi_GW21_5          tRG  Truncated Radial Glia Hi_GW21_5.Hi_GW21_5
    Hi_GW21_8.Hi_GW21_8      RG-div1 Dividing Radial Glia.. Hi_GW21_8.Hi_GW21_8
    Hi_GW21_1.Hi_GW21_1   nEN-early2 Newborn Excitatory N.. Hi_GW21_1.Hi_GW21_1
    Hi_GW21_2.Hi_GW21_2   nEN-early2 Newborn Excitatory N.. Hi_GW21_2.Hi_GW21_2
    Hi_GW21_3.Hi_GW21_3     nEN-late Newborn Excitatory N.. Hi_GW21_3.Hi_GW21_3
                             Name       Age  RegionName     Laminae        Area
                      <character> <numeric> <character> <character> <character>
    Hi_GW21_4.Hi_GW21_4     Sample1        19      Cortex         All           0
    Hi_GW21_5.Hi_GW21_5     Sample1        19      Cortex         All           0
    Hi_GW21_8.Hi_GW21_8     Sample1        19      Cortex         All           0
    Hi_GW21_1.Hi_GW21_1     Sample1        19      Cortex         All           0
    Hi_GW21_2.Hi_GW21_2     Sample1        19      Cortex         All           0
    Hi_GW21_3.Hi_GW21_3     Sample1        19      Cortex         All           0
    
  • colnames(logtpm)
    [1] "Hi_GW21_4.Hi_GW21_4"   "Hi_GW21_5.Hi_GW21_5"   "Hi_GW21_8.Hi_GW21_8"  
     [4] "Hi_GW21_1.Hi_GW21_1"   "Hi_GW21_2.Hi_GW21_2"   "Hi_GW21_3.Hi_GW21_3"  
     [7] "Hi_GW21_7.Hi_GW21_7"   "Hi_GW21_6.Hi_GW21_6"   "Hi_GW16_11.Hi_GW16_11"
    [10] "Hi_GW16_3.Hi_GW16_3"   "Hi_GW16_4.Hi_GW16_4"   "Hi_GW16_1.Hi_GW16_1"  
    [13] "Hi_GW16_10.Hi_GW16_10" "Hi_GW16_2.Hi_GW16_2"   "Hi_GW16_8.Hi_GW16_8"  
    [16] "Hi_GW16_9.Hi_GW16_9"   "Hi_GW16_13.Hi_GW16_13" "Hi_GW16_14.Hi_GW16_14"
    [19] "Hi_GW16_15.Hi_GW16_15" "Hi_GW16_16.Hi_GW16_16" "Hi_GW16_17.Hi_GW16_17"
    [22] "Hi_GW16_18.Hi_GW16_18" "Hi_GW16_20.Hi_GW16_20" "Hi_GW16_25.Hi_GW16_25"
    [25] "Hi_GW16_12.Hi_GW16_12" "Hi_GW16_19.Hi_GW16_19" "Hi_GW16_24.Hi_GW16_24"
    ...
    

So, what happens when you run head(df[colnames(logtpm),])?


Here is the output of head(df[colnames(logtpm),]). It's strange to me that running this alone works, but putting it in the SCE constructor raises an error. Do you know if I need to set anything to TRUE or FALSE when adding colData to an SCE? Or is there any way to add colData after creating the SCE? When I run SingleCellExperiment(assays = list(logcounts = logtpm)), it works normally.

DataFrame with 6 rows and 8 columns
                  WGCNAcluster Cluster.Interpretation                Cell
                   <character>            <character>         <character>
Hi_GW21_4.Hi_GW21_4         Glyc             Glycolysis Hi_GW21_4.Hi_GW21_4
Hi_GW21_5.Hi_GW21_5          tRG  Truncated Radial Glia Hi_GW21_5.Hi_GW21_5
Hi_GW21_8.Hi_GW21_8      RG-div1 Dividing Radial Glia.. Hi_GW21_8.Hi_GW21_8
Hi_GW21_1.Hi_GW21_1   nEN-early2 Newborn Excitatory N.. Hi_GW21_1.Hi_GW21_1
Hi_GW21_2.Hi_GW21_2   nEN-early2 Newborn Excitatory N.. Hi_GW21_2.Hi_GW21_2
Hi_GW21_3.Hi_GW21_3     nEN-late Newborn Excitatory N.. Hi_GW21_3.Hi_GW21_3
                         Name       Age  RegionName     Laminae        Area
                  <character> <numeric> <character> <character> <character>
Hi_GW21_4.Hi_GW21_4     Sample1        19      Cortex         All           0
Hi_GW21_5.Hi_GW21_5     Sample1        19      Cortex         All           0
Hi_GW21_8.Hi_GW21_8     Sample1        19      Cortex         All           0
Hi_GW21_1.Hi_GW21_1     Sample1        19      Cortex         All           0
Hi_GW21_2.Hi_GW21_2     Sample1        19      Cortex         All           0
Hi_GW21_3.Hi_GW21_3     Sample1        19      Cortex         All           0
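
One possible reason the head() output above can look fine while the full call fails: head() only shows the first six rows, so a cell that is missing from the metadata further down stays hidden. Below is a minimal sketch of that effect. It uses a toy matrix and a base data.frame as stand-ins (both hypothetical; an S4Vectors DataFrame may handle unknown row names differently), so treat it as an illustration rather than a reproduction of the data in this thread.

```r
# Toy stand-ins (hypothetical data): 8 cells in the matrix, metadata for only 7.
cells  <- paste0("cell", 1:8)
logtpm <- matrix(0, nrow = 3, ncol = 8,
                 dimnames = list(paste0("gene", 1:3), cells))
df     <- data.frame(Age = rep(19, 7), row.names = cells[-8])  # "cell8" has no metadata

# Name-based row indexing on a base data.frame returns an all-NA row
# for any name that is not found, instead of failing.
sub <- df[colnames(logtpm), , drop = FALSE]

head(sub)        # first six rows: nothing looks wrong
anyNA(sub$Age)   # TRUE, because the row requested for "cell8" is NA

# Positions of the matrix columns that have no matching metadata row
missing_idx <- which(!colnames(logtpm) %in% rownames(df))
missing_idx      # 8
```

Checking the whole result with anyNA(), or looking at tail() as well as head(), is a cheap way to catch this before calling the constructor.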

Could you share a small subset of both datasets using dput(head(logtpm)) and dput(head(df))? As it stands, I do not see errors in your code, but there seems to be something wrong with the data formatting.


Here are results for dput:

  • dput(head(df[1:5, 1:5]))
new("DFrame", rownames = c("Hi_GW21_1.Hi_GW21_1", "Hi_GW21_2.Hi_GW21_2", 
"Hi_GW21_3.Hi_GW21_3", "Hi_GW21_7.Hi_GW21_7", "Hi_GW21_6.Hi_GW21_6"
), nrows = 5L, listData = list(WGCNAcluster = c("nEN-early2", 
"nEN-early2", "nEN-late", "EN-V1-2", "EN-V1-2"), Cluster.Interpretation = c("Newborn Excitatory Neuron - early born", 
"Newborn Excitatory Neuron - early born", "Newborn Excitatory Neuron - late born", 
"Early and Late Born Excitatory Neuron V1", "Early and Late Born Excitatory Neuron V1"
), Cell = c("Hi_GW21_1.Hi_GW21_1", "Hi_GW21_2.Hi_GW21_2", "Hi_GW21_3.Hi_GW21_3", 
"Hi_GW21_7.Hi_GW21_7", "Hi_GW21_6.Hi_GW21_6"), Name = c("Sample1", 
"Sample1", "Sample1", "Sample1", "Sample1"), Age = c(19, 19, 
19, 19, 19)), elementType = "ANY", elementMetadata = NULL, metadata = list())
DataFrame with 5 rows and 5 columns
                    WGCNAcluster Cluster.Interpretation                Cell
                     <character>            <character>         <character>
Hi_GW21_1.Hi_GW21_1   nEN-early2 Newborn Excitatory N.. Hi_GW21_1.Hi_GW21_1
Hi_GW21_2.Hi_GW21_2   nEN-early2 Newborn Excitatory N.. Hi_GW21_2.Hi_GW21_2
Hi_GW21_3.Hi_GW21_3     nEN-late Newborn Excitatory N.. Hi_GW21_3.Hi_GW21_3
Hi_GW21_7.Hi_GW21_7      EN-V1-2 Early and Late Born .. Hi_GW21_7.Hi_GW21_7
Hi_GW21_6.Hi_GW21_6      EN-V1-2 Early and Late Born .. Hi_GW21_6.Hi_GW21_6
                           Name       Age
                    <character> <numeric>
Hi_GW21_1.Hi_GW21_1     Sample1        19
Hi_GW21_2.Hi_GW21_2     Sample1        19
Hi_GW21_3.Hi_GW21_3     Sample1        19
Hi_GW21_7.Hi_GW21_7     Sample1        19
Hi_GW21_6.Hi_GW21_6     Sample1        19
  • dput(head(logtpm[1:5, 1:5]))
    structure(c(0, 0, 6.81954856968559, 0, 0, 0, 0, 6.14963663024324, 
    0, 0, 0, 0, 6.0877400108758, 0, 0, 0, 0, 7.13622016683789, 0, 
    0.133200664189903, 0, 0, 6.65784728406714, 0, 0), .Dim = c(5L, 
    5L), .Dimnames = list(c("DDC8", "AC092057.1", "RPS11", "CREB3L1", 
    "RPL10P14"), c("Hi_GW21_4.Hi_GW21_4", "Hi_GW21_5.Hi_GW21_5", 
    "Hi_GW21_8.Hi_GW21_8", "Hi_GW21_1.Hi_GW21_1", "Hi_GW21_2.Hi_GW21_2"
    )))
             Hi_GW21_4.Hi_GW21_4 Hi_GW21_5.Hi_GW21_5 Hi_GW21_8.Hi_GW21_8
    DDC8                  0.000000            0.000000             0.00000
    AC092057.1            0.000000            0.000000             0.00000
    RPS11                 6.819549            6.149637             6.08774
    CREB3L1               0.000000            0.000000             0.00000
    RPL10P14              0.000000            0.000000             0.00000
             Hi_GW21_1.Hi_GW21_1 Hi_GW21_2.Hi_GW21_2
    DDC8                 0.0000000            0.000000
    AC092057.1           0.0000000            0.000000
    RPS11                7.1362202            6.657847
    CREB3L1              0.0000000            0.000000
    RPL10P14             0.1332007            0.000000
    

Do you think the input structure is inappropriate for an SCE?


I suspect that not all colnames(logtpm) are present in rownames(df). Could you compare length(intersect(colnames(logtpm), rownames(df))) and length(colnames(logtpm))? They should be equal; otherwise you can't create the SCE object.
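
The comparison suggested here can be sketched as follows. The toy objects are hypothetical stand-ins for logtpm and df (a base data.frame is used so the snippet runs without Bioconductor), and keeping only the shared cells is just one way to resolve a mismatch, not necessarily what was done in this thread.

```r
# Hypothetical toy data: 4 cells in the expression matrix, metadata for 3 of them.
logtpm <- matrix(1:8, nrow = 2,
                 dimnames = list(c("g1", "g2"), c("c1", "c2", "c3", "c4")))
df     <- data.frame(Age = c(19, 19, 16), row.names = c("c1", "c2", "c4"))

# The two lengths differ when some cells lack metadata.
n_shared <- length(intersect(colnames(logtpm), rownames(df)))  # 3
n_cells  <- length(colnames(logtpm))                           # 4

# Which cells are in the matrix but not in the metadata?
missing_cells <- setdiff(colnames(logtpm), rownames(df))       # "c3"

# One option: keep only the cells present in both, in matrix column order.
shared    <- intersect(colnames(logtpm), rownames(df))
logtpm_ok <- logtpm[, shared, drop = FALSE]
df_ok     <- df[shared, , drop = FALSE]

# Sanity check before building the object: names must line up exactly.
stopifnot(identical(colnames(logtpm_ok), rownames(df_ok)))
```

With the real objects, the same shared vector could then be used in the original call, for example assays = list(logcounts = logtpm[, shared]) and colData = df[shared, ]. Whether to drop the unmatched cells or to recover their metadata depends on the analysis.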


Hello, indeed, some cells (columns of logtpm) are missing from df. Thanks for your suggestion.
