Harmony integration: group.by.vars parameter
6 months ago
Picasa ▴ 650

Hi,

I am looking for advice about a parameter for Harmony integration. My goal is to find differences between WT and KO.

I have paired data:

For each donor I have 2 samples (WT and KO)

Example of data:

Sample_ID   Donor_ID   Condition
1_WT        1          WT
1_KO        1          KO
2_WT        2          WT
2_KO        2          KO

The donors were sequenced at different times (e.g., Donor 1 at 10 days, Donor 2 at 60 days).

This is my original command for Harmony:

RunHarmony(seu_obj, group.by.vars = c("orig.ident"))

I am not sure about the group.by.vars parameter.

orig.ident corresponds to the Sample_ID, but should I include Donor_ID or Condition in group.by.vars?

Thanks

single-cell harmony • 1.4k views
6 months ago

The variability explained by the variables provided to group.by.vars is what Harmony will try to remove. Assuming you want to remove the differences between the donors, that's what I'd be feeding to it.

If you're trying to remove variation between all the samples, then yes, "Sample_ID" may be appropriate. In some cases, doing so prior to clustering/cell type annotation (and then re-doing it with the variable you'd actually like removed) can be helpful in order to annotate consistently across samples.
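As a rough sketch of the two options above, the helper below wraps the Harmony call so that each choice of grouping variable is stored under its own reduction name. The helper `integrate_by` and the reduction names are hypothetical, and it assumes `seu_obj` is a Seurat object that has already been normalized and had `RunPCA()` applied, with `Donor_ID` and `orig.ident` present in its metadata.

```r
# Hypothetical helper (not from the thread): runs Harmony on the existing PCA
# and saves the corrected embedding under the reduction name `save_as`.
# Assumes the harmony package is installed and seu_obj already has a "pca"
# reduction plus the metadata column named in `variable`.
integrate_by <- function(seu_obj, variable, save_as) {
  harmony::RunHarmony(seu_obj,
                      group.by.vars = variable,
                      reduction.save = save_as)
}

# Usage (commented out because it needs a real Seurat object):
# seu_obj <- integrate_by(seu_obj, "Donor_ID",   "harmony_donor")   # donor effect only
# seu_obj <- integrate_by(seu_obj, "orig.ident", "harmony_sample")  # every sample as a batch
# seu_obj <- Seurat::RunUMAP(seu_obj, reduction = "harmony_donor", dims = 1:30)
```

Keeping both embeddings side by side makes it easy to cluster and annotate on one and compare against the other.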


Thanks jared.andrews07 for your answer.

So, you are suggesting using only "Donor_ID" in the integration?

RunHarmony(seu_obj, group.by.vars = c("Donor_ID"))

Since I am interested in the differences between the conditions (WT vs KO), I am wondering whether using "Sample_ID" is appropriate:

RunHarmony(seu_obj, group.by.vars = c("Sample_ID", "Donor_ID"))

I am not sure, but using "Sample_ID" might remove the differences between the conditions, right?
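One way to see why this concern is plausible is to cross-tabulate the sample sheet. The sketch below uses only base R and the four samples listed in the question; the variable names are illustrative. It checks that every Sample_ID belongs to a single condition, while every Donor_ID contains both.

```r
# Sample sheet from the question.
meta <- data.frame(
  Sample_ID = c("1_WT", "1_KO", "2_WT", "2_KO"),
  Donor_ID  = c("1", "1", "2", "2"),
  Condition = c("WT", "KO", "WT", "KO")
)

# Each Sample_ID maps to exactly one Condition, so treating Sample_ID as the
# batch also treats the WT/KO split within each donor as batch.
sample_by_condition <- table(meta$Sample_ID, meta$Condition)
sample_nested_in_condition <- all(rowSums(sample_by_condition > 0) == 1)

# Each Donor_ID contains both conditions, so correcting by donor alone does
# not single out the WT vs KO contrast.
donor_by_condition <- table(meta$Donor_ID, meta$Condition)
donor_crossed_with_condition <- all(donor_by_condition > 0)

print(sample_by_condition)
print(donor_by_condition)
```

Because Sample_ID already determines Donor_ID in this layout, passing both to group.by.vars is unlikely to add grouping information beyond Sample_ID alone.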


On whether using "Sample_ID" might remove the differences between the conditions: more than likely it would at least affect them, yes. You can always try both and see which looks better.

Also not sure what you plan to do downstream, as generally integration isn't going to impact differential expression (as you can always include unwanted variables in your model). I generally just use integration to cram identical cell types together between conditions to make them easier to cluster/annotate.
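To illustrate what "include unwanted variables in your model" could look like for this paired layout, here is a minimal base R sketch that builds a design matrix with donor as a blocking factor. The sample sheet is the one from the question; the choice of a `~ Donor_ID + Condition` formula is an assumption about how one might model it, and the resulting matrix is the kind of input that packages such as limma or edgeR accept.

```r
# One row per sample, taken from the sample sheet in the question.
samples <- data.frame(
  Sample_ID = c("1_WT", "1_KO", "2_WT", "2_KO"),
  Donor_ID  = factor(c("1", "1", "2", "2")),
  Condition = factor(c("WT", "KO", "WT", "KO"), levels = c("WT", "KO"))
)

# Paired design: donor enters as a blocking factor, and the ConditionKO
# coefficient is the within-donor KO vs WT effect.
design <- model.matrix(~ Donor_ID + Condition, data = samples)
rownames(design) <- samples$Sample_ID
print(design)
```

With this setup the donor effect is absorbed by the model itself, independently of which variable was used for Harmony.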


For downstream analysis, after integration, my plan is to annotate each cluster/cell type and then perform a DGE analysis of KO versus WT for each cluster/cell type separately.
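A per-cluster comparison like this is often done on pseudobulk counts, i.e. raw counts summed per sample within each cluster. The sketch below shows only that aggregation step in base R. The count matrix, cluster labels, and the helper `pseudobulk_for` are toy stand-ins invented for illustration, not data or code from my project.

```r
set.seed(1)

# Toy stand-in data (assumption): 5 genes x 40 cells of random counts.
n_cells <- 40
counts <- matrix(rpois(5 * n_cells, lambda = 3), nrow = 5,
                 dimnames = list(paste0("gene", 1:5), paste0("cell", 1:n_cells)))

# Toy per-cell metadata: 10 cells per sample, alternating between two clusters.
cell_meta <- data.frame(
  Sample_ID = rep(c("1_WT", "1_KO", "2_WT", "2_KO"), each = 10),
  cluster   = rep(c("A", "B"), times = 20)
)

# Hypothetical helper: sum raw counts per Sample_ID for the cells of one cluster.
# rowsum() groups rows, so the matrix is transposed to cells x genes and back.
pseudobulk_for <- function(cl) {
  keep <- cell_meta$cluster == cl
  t(rowsum(t(counts[, keep, drop = FALSE]), group = cell_meta$Sample_ID[keep]))
}

pb_A <- pseudobulk_for("A")  # genes x samples matrix for cluster A
print(pb_A)
```

Each resulting matrix has one column per sample, so it can be tested for KO versus WT with a paired design that includes the donor.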


jared.andrews07

If I integrate using Donor:

RunHarmony(seu_obj, group.by.vars = c("Donor_ID"))

Should I also normalize my data by splitting it by Donor_ID so that each object in the list contains both samples (WT and KO)?

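# Split by donor so that each list element holds both WT and KO cells
seu_list <- SplitObject(seu_obj, split.by = "Donor_ID")
# Run SCTransform on each donor separately, keeping all genes in scale.data
seu_list <- lapply(X = seu_list,
                   FUN = SCTransform,
                   return.only.var.genes = FALSE)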

Couldn't tell you, I avoid Seurat like the plague.
