Hi,
I am trying to run ImpulseDE2 (R package) on my small RNA-Seq time-series data. I have checked that my count matrix and my sample annotation data frame contain the same sample names. However, I receive the following error.
Error in .validate_names(colnames, ans_colnames, "assay colnames()", "colData rownames()") :
assay colnames() must be NULL or identical to colData rownames()
Could this be due to the way that I am importing my read counts matrix or sample data frame? Any advice that you can give will be greatly appreciated!
My code is below:
library(ImpulseDE2)
# Read Data into R as a table
countdata <- read.table("counts_sncRNA_DE.tsv", header=T, sep='\t',row.names=1)
# Convert table into a counts matrix
countdata <- as.matrix(countdata)
# Import my time series metadata into R and convert to data frame
annot <- read.table("coldata_3.txt", header=TRUE, sep='\t')
annot <- as.data.frame.matrix(annot)
# DEG analysis with ImpulseDE2
impulse_results <- runImpulseDE2(matCountData = countdata,
                                 dfAnnotation = annot,
                                 boolCaseCtrl = TRUE,
                                 vecConfounders = c("Batch"),
                                 scaNProc = 2,
                                 scaQThres = 0.05,
                                 boolIdentifyTransients = TRUE)
Process input
Processing Details:
ImpulseDE2 runs in case-ctrl mode.
Found time points: 2,6,16
Case: Found the samples at time point 2: M1C_2hr,M2C_2hr,M3C_2hr,M4C_2hr
Case: Found the samples at time point 6: M1C_6hr,M2C_6hr,M3C_6hr,M4C_6hr
Case: Found the samples at time point 16: M1C_16hr,M2C_16hr,M3C_16hr,M4C_16hr
Control: Found the following samples at time point 2:M1D_2hr,M2D_2hr,M3D_2hr,M4D_2hr
Control: Found the following samples at time point 6:M1D_6hr,M2D_6hr,M3D_6hr,M4D_6hr
Control: Found the following samples at time point 16:M1D_16hr,M2D_16hr,M3D_16hr,M4D_16hr
Found the following samples for confounder Batch and batch M1: M1C_2hr,M1C_6hr,M1C_16hr,M1D_2hr,M1D_6hr,M1D_16hr
Found the following samples for confounder Batch and batch M2: M2C_2hr,M2C_6hr,M2C_16hr,M2D_2hr,M2D_6hr,M2D_16hr
Found the following samples for confounder Batch and batch M3: M3C_2hr,M3C_6hr,M3C_16hr,M3D_2hr,M3D_6hr,M3D_16hr
Found the following samples for confounder Batch and batch M4: M4C_2hr,M4C_6hr,M4C_16hr,M4D_2hr,M4D_6hr,M4D_16hr
Input contained 667 genes/regions.
Selected 667 genes/regions for analysis.
Run DESeq2: Using dispersion factorscomputed by DESeq2.
<simpleError in .validate_names(colnames, ans_colnames, "assay colnames()", "colData rownames()"): assay colnames() must be NULL or identical to colData rownames()>
[1] "WARNING: DESeq2 failed on full model - dispersions may be inaccurate.Estimating dispersions on reduced model. Supply externally generated dispersion parameters via vecDispersionsExternal if there is a more accurate model for your data set."
Error in .validate_names(colnames, ans_colnames, "assay colnames()", "colData rownames()") :
assay colnames() must be NULL or identical to colData rownames()
In addition: Warning message:
In value[[3L]](cond) :
Warning generated in dispersion factor estimation. Eead stdout.
Timing stopped at: 0.072 0.004 0.077
Timing stopped at: 0.495 0.012 0.507
I have had the same experience. The row.names() of the annotation data frame should be the same as the sample IDs, i.e. the column names of the count matrix. I also found that it is better to sort the samples by time point before creating the ImpulseDE2 object.
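
To make that concrete, here is a minimal sketch of the check and the fix using only base R. The toy values are made up for illustration, and it assumes the annotation table has columns named Sample, Condition, Time and Batch, with Sample holding the same IDs as the column names of the count matrix. The real files from the question may be laid out differently, so adjust the column names as needed.

```r
# Toy stand-ins for the count matrix and annotation table (hypothetical values).
countdata <- matrix(1:12, nrow = 2,
                    dimnames = list(c("geneA", "geneB"),
                                    c("M1C_2hr", "M1C_16hr", "M1C_6hr",
                                      "M1D_2hr", "M1D_6hr", "M1D_16hr")))
annot <- data.frame(
  Sample    = c("M1C_6hr", "M1C_2hr", "M1C_16hr",
                "M1D_16hr", "M1D_2hr", "M1D_6hr"),
  Condition = c("case", "case", "case", "control", "control", "control"),
  Time      = c(6, 2, 16, 16, 2, 6),
  Batch     = "M1",
  stringsAsFactors = FALSE
)

# A data frame read without row.names gets default row names "1", "2", ...,
# so this comparison is FALSE even though the Sample column has the right IDs.
identical(colnames(countdata), rownames(annot))

# Use the sample IDs as row names of the annotation table.
rownames(annot) <- annot$Sample

# Sort the annotation by condition and then by time point.
annot <- annot[order(annot$Condition, annot$Time), ]

# Every sample must appear on both sides before reordering.
stopifnot(setequal(colnames(countdata), rownames(annot)))

# Put the count matrix columns in the same order as the annotation rows.
countdata <- countdata[, rownames(annot), drop = FALSE]

# This is the condition the error message complains about.
stopifnot(identical(colnames(countdata), rownames(annot)))
```

If the last stopifnot() passes on the real data, the names that DESeq2 compares (assay colnames() against colData rownames()) agree, and runImpulseDE2() can be called with the same arguments as in the question.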