DEXseq batch effect removal error
0
0
Entering edit mode
2.1 years ago
ta_awwad ▴ 350

Hi everyone,

first, I am not expert in using DEXseq. I am trying to analyze DEU from two different datasets. I am just worried about the batch effect. I tried to use svaseq function as suggested by others to remove this effect but ended up with an error message:

> sampleAnno
file condition libType type
1 ES.1.txt ESC paired-end 1
2 ES.2.txt ESC paired-end 1
3 LV_rep1.txt LV paired-end 2
4 LV_rep2.txt LV paired-end 2

> dxd = DEXSeqDataSetFromHTSeq(
  countfiles = sampleAnno$file, 
  sampleData = sampleAnno,
  design= ~ sample + exon + condition:exon,
  flattenedfile = gffFile)
> dxd = estimateSizeFactors( dxd )
> dxd <- dxd[rowMeans(counts(dxd))>1,]
> dat <- counts(dxd, normalized=TRUE)
> library(sva)
> mod1 <- model.matrix(~ sample + exon , colData(dxd))
> mod0 <- model.matrix(~ 1, colData(dxd))
> svseq <- svaseq(dat, mod1, mod0, n.sv=2, B=5)
> dxdsva = dxd
> dxdsva$SV1=svseq$sv[,1]
> dxdsva$SV2=svseq$sv[,2]
> design(dxdsva) <- ~ SV1 + SV2 + sample + exon + condition:exon
> dxdsva <- estimateDispersions( dxdsva )

Error: BiocParallel errors
  1 remote errors, element index: 1
  0 unevaluated and other errors
  first remote error:
Error in estimateDispersionsGeneEst(x, maxit = maxit, quiet = quiet, modelMatrix = modelMatrix, : the number of samples and the number of model coefficients are equal,
  i.e., there are no replicates to estimate the dispersion.
  use an alternate design formula




sessionInfo( )
R version 4.2.0 (2022-04-22)
Platform: x86_64-apple-darwin17.0 (64-bit)
Running under: macOS Monterey 12.2.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats4    stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] sva_3.44.0                  genefilter_1.78.0           mgcv_1.8-40                 nlme_3.1-159               
 [5] forcats_0.5.2               stringr_1.4.1               dplyr_1.0.10                purrr_0.3.5                
 [9] readr_2.1.3                 tidyr_1.2.1                 tibble_3.1.8                ggplot2_3.3.6              
[13] tidyverse_1.3.2             DEXSeq_1.42.0               RColorBrewer_1.1-3          AnnotationDbi_1.58.0       
[17] DESeq2_1.36.0               SummarizedExperiment_1.26.1 GenomicRanges_1.48.0        GenomeInfoDb_1.32.4        
[21] IRanges_2.30.1              S4Vectors_0.34.0            MatrixGenerics_1.8.1        matrixStats_0.62.0         
[25] Biobase_2.56.0              BiocGenerics_0.42.0         BiocParallel_1.30.3        

loaded via a namespace (and not attached):
 [1] fs_1.5.2               bitops_1.0-7           lubridate_1.8.0        bit64_4.0.5            filelock_1.0.2        
 [6] progress_1.2.2         httr_1.4.4             tools_4.2.0            backports_1.4.1        utf8_1.2.2            
[11] R6_2.5.1               DBI_1.1.3              colorspace_2.0-3       withr_2.5.0            tidyselect_1.1.2      
[16] prettyunits_1.1.1      bit_4.0.4              curl_4.3.3             compiler_4.2.0         rvest_1.0.3           
[21] cli_3.4.1              xml2_1.3.3             DelayedArray_0.22.0    scales_1.2.1           rappdirs_0.3.3        
[26] digest_0.6.29          Rsamtools_2.12.0       XVector_0.36.0         pkgconfig_2.0.3        limma_3.52.4          
[31] dbplyr_2.2.1           fastmap_1.1.0          readxl_1.4.1           rlang_1.0.6            rstudioapi_0.14       
[36] RSQLite_2.2.18         generics_0.1.3         jsonlite_1.8.2         hwriter_1.3.2.1        vroom_1.6.0           
[41] googlesheets4_1.0.1    RCurl_1.98-1.9         magrittr_2.0.3         GenomeInfoDbData_1.2.8 Matrix_1.5-1          
[46] Rcpp_1.0.9             munsell_0.5.0          fansi_1.0.3            lifecycle_1.0.3        edgeR_3.38.4          
[51] stringi_1.7.8          zlibbioc_1.42.0        BiocFileCache_2.4.0    grid_4.2.0             blob_1.2.3            
[56] parallel_4.2.0         crayon_1.5.2           lattice_0.20-45        haven_2.5.1            Biostrings_2.64.1     
[61] splines_4.2.0          annotate_1.74.0        hms_1.1.2              KEGGREST_1.36.3        locfit_1.5-9.6        
[66] pillar_1.8.1           geneplotter_1.74.0     codetools_0.2-18       biomaRt_2.52.0         reprex_2.0.2          
[71] XML_3.99-0.11          glue_1.6.2             modelr_0.1.9           tzdb_0.3.0             png_0.1-7             
[76] vctrs_0.4.2            cellranger_1.1.0       gtable_0.3.1           assertthat_0.2.1       cachem_1.0.6          
[81] xtable_1.8-4           broom_1.0.1            survival_3.4-0         googledrive_2.0.0      gargle_1.2.1          
[86] memoise_2.0.1          statmod_1.4.37         ellipsis_0.3.2

any help?

many many thanks TA

RNA-Seq DEXseq R DEseq2 • 712 views
ADD COMMENT
0
Entering edit mode

the number of samples and the number of model coefficients are equal, i.e., there are no replicates to estimate the dispersion. use an alternate design formula

This error from DESeq2::estimateDispersions is saying that there aren't any replicate samples in your design and so an average dispersion value for a given condition can't be calculated. If you Google the error message you will get some more information, some good examples include https://support.bioconductor.org/p/134505/ and https://support.bioconductor.org/p/89746/

Based on sampleAnno it does look like you have replicates so my guess is that design(dxdsva) <- ~ SV1 + SV2 + sample + exon + condition:exon is not producing an appropriate model or perhaps the model design is confounded?

The other thing to note is that you appear to have performed estimateSizeFactors before running sva. I believe the recommendation is to use raw counts data for sva then run through estimateSizeFactors, estimateDispersions, etc. with the new model design https://bioconductor.org/packages/release/workflows/vignettes/rnaseqGene/inst/doc/rnaseqGene.html#using-sva-with-deseq2 (though I'm not sure if the recommendation is different for DEXSeq)

ADD REPLY

Login before adding your answer.

Traffic: 2392 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6