DESeq after ComBat_seq
0
0
Entering edit mode
2.6 years ago
AStudent • 0

Hello,

I'm relatively new to RNAseq and have ok experience with R. I'm having trouble performing DESeq on ComBat_seq processed data.


Background: I'm comparing my bulk RNAseq data to other published data I have collected. For pre-processing, I aligned my trimmed fastq files with STARaligner and used the built in --quantMode GeneCounts option to get raw counts for each sample. This outputted a *ReadsPerGene.out.tab file for each file, which I extracted the first (gene name) and second (unstranded counts) columns for analysis. Using DESeqDataSetFromHTSeqCount, I have successfully produced a PCA plot and heatmap, where it became clear I should not go further without performing batch correction.

I backpedaled and performed batch correction using Combat_seq on the raw counts which now outputted a matrix of corrected raw reads. Now to run DESeq on this, I believe I must now use DESeqDataSetFromMatrix.


The Problem: I run DESeqDataSetFromMatrix as follows:

counts_STAR_batch_corrected <- DESeqDataSetFromMatrix(countData = adjusted_counts,
                                                      colData = sample_table_matrix,
                                                      design = ~condition,
                                                      tidy = F)

This results in the following error:

Error in validObject(.Object) : 
  invalid class “SummarizedExperiment” object: 1: invalid object for slot "NAMES" in class "SummarizedExperiment": got class "matrix", should be or extend class "character_OR_NULL"
invalid class “SummarizedExperiment” object: 2: 
    'names(x)' must be NULL or a character vector with no attributes

The input data I have looks as follows:

adjusted_counts: (class is "matrix" "array")

Sample1 Sample2 Sample3 Sample4 Sample5 Sample6 Sample7

DDX11L1 0 0 0 0 0 0 0

WASH7P 304 526 414 403 448 420 347

MIR6859-3 0 0 0 0 0 0 0

MIR6859-2 0 0 0 0 0 0 0

MIR6859-1 0 0 0 0 0 0 0

MIR6859-4 0 0 0 0 0 0 0

sample_table_matrix: (class is data.frame)

group condition batch

Sample1 Cell1 2D 2

Sample2 Cell1 2D 2

Sample3 Cell1 2D 2

Sample4 Cell2 2D 2

Sample5 Cell2 2D 2

Sample5 Cell2 2D 2

The first columns were both created with colnames

Any thoughts on how to troubleshoot this or what I'm doing wrong would be greatly appreciated (or alternatively how I can use the output from ComBat_seq with DESeqDataSetFromHTSeqCount) would be greatly appreciated.

star combat sva • 1.1k views
ADD COMMENT
0
Entering edit mode

Are your gene names and sample names their own column or the rownames?

ADD REPLY
0
Entering edit mode

They are rownames and columnnames. When I tried the former it threw an error for inconsistent matrix sizes.

ADD REPLY
0
Entering edit mode

The typical way to deal with this is not to alter the counts, but to just include batch in the design.

ADD REPLY
0
Entering edit mode

Do you mean not to use ComBat_seq? My reasoning is to follow the same SOP as where I got the additional data originally and include my data as an additional batch from what they had (we both had sequencing of some of the same cell types and culture conditions). I'd then follow to see changes based on a new different condition I have to see potential significance and/or corroborate my results.

ADD REPLY

Login before adding your answer.

Traffic: 2415 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6