Find surrogate variables with svaseq()
8.5 years ago

Hi, I'm currently struggling with an RNA-Seq experiment, especially with batch effects that may affect my analysis. I want to use svaseq() from the sva package, as recommended here (chapter "Removing hidden batch effects"), to find and account for hidden surrogate variables.

Because I do not see clear batch-effect clustering in my raw data, I thought I would estimate the number of surrogate variables with num.sv() and then pass that result to svaseq(). Interestingly, num.sv() gives me 10 surrogate variables, which confused me a bit. Additionally, svaseq() runs into an error when using 10 surrogate variables. Because of that, I wanted to set the n.sv argument by hand to the number of batch effects I assume, which is 3.
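For concreteness, the workflow described above can be sketched as follows. This is only a minimal sketch on simulated counts, not the actual data: the objects `counts`, `condition`, `batch`, `mod` and `mod0` are hypothetical stand-ins, and the sva calls only run if the package is installed.

```r
# Sketch on simulated counts; all object names here are hypothetical.
set.seed(1)
n_genes   <- 1000
n_samples <- 16

# Known variable of interest, plus a batch that is NOT given to the model.
condition <- factor(rep(c("ctrl", "trt"), each = 8))
batch     <- rep(1:4, times = 4)

# Mean counts: the first 300 genes are shifted by the hidden batch.
mu <- matrix(100, nrow = n_genes, ncol = n_samples)
batch_fold <- c(1, 2, 4, 8)[batch]
mu[1:300, ] <- sweep(mu[1:300, ], 2, batch_fold, `*`)
counts <- matrix(rnbinom(n_genes * n_samples, mu = as.vector(mu), size = 10),
                 nrow = n_genes, ncol = n_samples)

# Full model (with the variable of interest) and null model (intercept only).
mod  <- model.matrix(~ condition)
mod0 <- model.matrix(~ 1, data = data.frame(condition))

if (requireNamespace("sva", quietly = TRUE)) {
  # Drop genes with very low counts before estimating surrogate variables.
  dat <- counts[rowMeans(counts) > 1, ]

  # Data-driven estimate of the number of surrogate variables.
  # svaseq() works on log(counts + constant), so estimate on the same scale.
  n_est <- sva::num.sv(log(dat + 1), mod, method = "be")
  print(n_est)

  # Override the estimate with a number chosen by hand.
  svseq <- sva::svaseq(dat, mod, mod0, n.sv = 3)
  print(dim(svseq$sv))  # one row per sample, one column per surrogate variable
}
```

Whether a hand-picked `n.sv` is appropriate for a given data set is exactly the open question here; the sketch only shows where the number enters the call.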

My question now is: what should I pass as the n.sv argument of svaseq()? Is it simply 3? In the above-mentioned manual they write:

As we described above, we are trying to recover any hidden batch effects, supposing that we do not know the cell line information... Finally we specify that we want to estimate 2 surrogate variables.

Here is what they define for svaseq:

svseq <- svaseq(dat, mod, mod0, n.sv=2)

So they want to recover the cell line as a possible surrogate variable, but then set n.sv to two surrogate variables. Why? Are they assuming another batch effect besides the cell line? At the end, they add these two variables to the DESeq2 design, where they seem to represent the cell line effect. Maybe I missed something, but it is not described very clearly.
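The last step mentioned above, adding the surrogate variables to the design, can be sketched in base R. This is a hedged illustration with placeholder values: `coldata`, `sv` and `design_formula` are hypothetical names, and `sv` is filled with random numbers standing in for the `sv` matrix that svaseq() returns.

```r
# Sketch with placeholder values; all names are hypothetical.
set.seed(1)
coldata <- data.frame(condition = factor(rep(c("ctrl", "trt"), each = 4)))

# Placeholder for svseq$sv: one row per sample, one column per surrogate variable.
sv <- matrix(rnorm(nrow(coldata) * 2), ncol = 2)
coldata$SV1 <- sv[, 1]
coldata$SV2 <- sv[, 2]

# Surrogate variables go first, the variable of interest last.
design_formula <- ~ SV1 + SV2 + condition
design <- model.matrix(design_formula, data = coldata)
print(colnames(design))

# With a DESeq2 object `dds` (not created here), the same columns would be added
# to colData(dds) and the formula assigned via: design(dds) <- design_formula
```

Each surrogate variable is a continuous per-sample covariate, so two of them can jointly capture one categorical factor with several levels, or more than one source of unwanted variation.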

Thanks for all your help in advance.

RNA-Seq R • 3.3k views