Batch-correction on tximport data from salmon
1
0
Entering edit mode
4.0 years ago
ic23oluk • 0

Hello,

I am analyzing two disting BioProjects together from SRA. My pipeline was: Fastp > salmon > tximport into R

Now I would like to perform batch correction using the combat-seq() comand from the sva package based on the BioProject. However, since the tximport data frame is more complex than a simple count-matrix, I am not sure how to do that. Which column of the txi dataframe is DESeq2 using for calculating DE genes?

My idea was to using the following code:

txi <- tximport(files, type = "salmon", tx2gene = tx2gene, ignoreTxVersion = T)

### Batch Correction ###
batch = colData$BioProject

ComBat_seq(txi$counts, batch=batch, group=NULL)

After which I would proceed with DEseq2 as follows:

ddsTxi <- DESeqDataSetFromTximport(txi,
                                   colData = colData1,
                                   design = ~  BioProject + condition)

Could somebody tell me if that is the correct approach?

Thank you in advance.

salmon sva combat RNA-Seq DESeq2 • 2.4k views
ADD COMMENT
5
Entering edit mode
4.0 years ago
ATpoint 85k

Put it into the design as recommended in the vignette of DESeq2, is there a problem with that? In case you want to explicitely correct the counts outside of the dds object you would need to first make a dds object with DESeqDataSetFromTximport, extract gene level (raw) counts, apply the correction and feed the corrected counts back to the dds, but why the bother, just put it into the design.

ADD COMMENT
0
Entering edit mode

Are you saying that the DESeq2 internal batch correction does the job and I dont need theComBat_seq function?

ADD REPLY
1
Entering edit mode

Yes, please read the manual and scan for the buzzword batch, this is discussed there. https://www.bioconductor.org/packages/devel/bioc/vignettes/DESeq2/inst/doc/DESeq2.html

ADD REPLY
0
Entering edit mode

Thank you.

If I got it right, the way i formulate my design in my DESeqDataSetFromTximport should be correct, as the BioProject variable contains the batch. Again:

ddsTxi <- DESeqDataSetFromTximport(txi,
                               colData = colData1,
                               design = ~  BioProject + condition)
ADD REPLY
1
Entering edit mode

Probably yes, would help to see content of colData1.

ADD REPLY

Login before adding your answer.

Traffic: 2746 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6