Hi,
I am looking for advice about the parameters for Harmony integration. My goal is to find differences between WT and KO.
I have paired data: for each donor I have 2 samples (WT and KO).
Example of data:
Sample_ID Donor_ID Condition
1_WT 1 WT
1_KO 1 KO
2_WT 2 WT
2_KO 2 KO
The donors were sequenced at different times (e.g. Donor 1 at 10 days, Donor 2 at 60 days).
This is my original command for Harmony:
RunHarmony(seu_obj, group.by.vars = c("orig.ident"))
I am not sure about the group.by.vars parameter. orig.ident corresponds to the Sample_ID, but should I include Donor_ID or Condition in group.by.vars?
Thanks
Thanks jared.andrews07 for your answer.
So, you are suggesting to use only "Donor" in the integration?
Since I am interested in the differences between the conditions (WT vs KO), I am wondering whether using "Sample_ID" is appropriate. I am not sure, but wouldn't integrating on "Sample_ID" risk removing the differences between the conditions?
More than likely, it'd at least impact them, yes. You can always try both and see which looks better.
Also not sure what you plan to do downstream, as generally integration isn't going to impact differential expression (as you can always include unwanted variables in your model). I generally just use integration to cram identical cell types together between conditions to make them easier to cluster/annotate.
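As a rough sketch of what "including unwanted variables in your model" could look like for this paired design: put Donor_ID in the design as a blocking factor, so the KO vs WT effect is estimated within each donor. This assumes you aggregate counts per sample (and per cluster) and then use a DE tool that accepts a design matrix. The sample sheet here is just the example table from the question, and the object names are made up for illustration.

```r
# Hypothetical sample sheet mirroring the example in the question.
samples <- data.frame(
  Sample_ID = c("1_WT", "1_KO", "2_WT", "2_KO"),
  Donor_ID  = factor(c(1, 1, 2, 2)),
  # WT is listed first so it becomes the reference level.
  Condition = factor(c("WT", "KO", "WT", "KO"), levels = c("WT", "KO"))
)

# Donor_ID absorbs donor-to-donor (and sequencing-time) differences;
# the ConditionKO column then carries the paired KO vs WT contrast.
design <- model.matrix(~ Donor_ID + Condition, data = samples)
rownames(design) <- samples$Sample_ID
print(design)
```

With only two donors this leaves a single residual degree of freedom, so treat it as an illustration of the design rather than a recommendation on sample size.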
For downstream analysis, after integration, my plan is to annotate each cluster/cell type and then perform a DGE analysis of KO versus WT for each cluster/cell type separately.
jared.andrews07
If I integrate using Donor, should I also normalize my data by splitting it by Donor_ID, so that each object in the list contains both samples (WT and KO)?
Couldn't tell you, I avoid Seurat like the plague.