Hi,
I'm studying the Connectivity Map approach and data. Looking at the instances table, I noticed that some instances refer to the same drug in the same cell type, at the same concentration and duration, while the perturbation scan id and the instance id differ. My question is: how should I treat these instances, given that their lists of probe set positions differ quite a bit? I tried to examine the difference, but I could not find a sound method to establish how similar or different they really are, so I would appreciate any suggestions. I expected instances of this kind to be very similar, since they share the same perturbation conditions. Correct me if I am wrong: is this problem related to batch effects?
Here are some instances that have the same perturbation conditions but different perturbation scan ids:
instance_id  batch_id  cmap_name     INN  concentration (M)  duration (h)  cell  array        perturbation_scan_id
1            1         metformin     INN  0,00001            6             MCF7  HG-U133A     EC2003090503AA
2            1         metformin     INN  0,00001            6             MCF7  HG-U133A     EC2003090504AA
1480         632       idoxuridine   INN  0,0000112          6             MCF7  HT_HG-U133A  '5500024024211121606513.E02
1899         610       idoxuridine   INN  0,0000112          6             PC3   HG-U133A     '610611110806.E02
5262         726       azathioprine  INN  0,0000144          6             MCF7  HT_HG-U133A  5500024030403071907253.D09
5627         758       azathioprine  INN  0,0000144          6             MCF7  HT_HG-U133A  5500024035100021608460.D09
1528         633       azathioprine  INN  0,0000144          6             MCF7  HT_HG-U133A  5500024024211121606513.D09
Can someone explain where the perturbation scan id comes from (practically and technologically) and what it means? That should explain why the probe set lists of these instances differ. Finally, how should I treat these instances when I'm looking at the cmap results?
Thanks a lot for your help.
Best regards
Elisa
Thanks for the answer. Yes, these instances are indeed biological replicates. I was just perplexed by the "slight" difference between the lists of probe set positions, but I see that this is a matter of reproducibility. Thanks again.
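For anyone who wants to put a number on how similar two replicate instances are, one common option (not something prescribed in this thread) is the Spearman rank correlation between their probe set rank lists. Below is a minimal sketch in Python, assuming each instance is available as a list of tie-free ranks over the same probe sets in the same order. The function name and the toy ranks are made up for illustration.

```python
def spearman_from_ranks(ranks_a, ranks_b):
    """Spearman correlation for two tie-free rank lists over the same probe sets."""
    n = len(ranks_a)
    if n != len(ranks_b) or n < 2:
        raise ValueError("rank lists must have the same length (at least 2)")
    # Sum of squared rank differences, probe set by probe set
    d2 = sum((a - b) ** 2 for a, b in zip(ranks_a, ranks_b))
    # Closed-form Spearman formula, valid only when there are no ties
    return 1 - 6 * d2 / (n * (n ** 2 - 1))


# Hypothetical toy ranks for five probe sets in two replicate instances
rep1 = [1, 2, 3, 4, 5]
rep2 = [2, 1, 3, 5, 4]
rho = spearman_from_ranks(rep1, rep2)
print(f"Spearman rho between replicates: {rho:.2f}")
```

A value close to 1 would suggest the replicates rank the probe sets similarly. What counts as "similar enough" is a judgment call and depends on the data.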
Just one more question: is anyone aware of what the different but similar batch ids indicate? For instance, 3 instances are in batch_id=2 and one is in batch_id=2a, although the same vehicle_id is shared between all instances in batch_id 2 and 2a.
I noticed the same issue too, but I'm not sure what the best way to manage it is. I suppose that because the vehicle_id is the same, the batch id must also be the same.
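One way to check how often this happens is to group the instances by vehicle_id and list the batch ids that share each vehicle. This is only a sketch in Python: the rows and the vehicle id values are hypothetical, and whether batches sharing a vehicle should really be treated as one batch is an assumption, not something confirmed here.

```python
from collections import defaultdict

# Hypothetical rows: (instance_id, batch_id, vehicle_id); all values are made up
rows = [
    (101, "2", "v7"),
    (102, "2", "v7"),
    (103, "2", "v7"),
    (104, "2a", "v7"),
    (105, "3", "v9"),
]


def batches_by_vehicle(rows):
    """Map each vehicle_id to the sorted list of batch ids that use it."""
    groups = defaultdict(set)
    for _instance_id, batch_id, vehicle_id in rows:
        groups[vehicle_id].add(batch_id)
    return {vehicle: sorted(batches) for vehicle, batches in groups.items()}


# Keep only the vehicles that are shared by more than one batch id
shared = {v: b for v, b in batches_by_vehicle(rows).items() if len(b) > 1}
print(shared)
```

Run on the real instances table, the vehicles listed in `shared` would point to batch ids such as 2 and 2a that may be worth analysing together.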