This is Aimin. I am using the following GEO dataset for some analysis:
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE15061
The dataset includes 164 MDS, 202 AML, and 69 non-leukemia bone marrow samples (435 samples in total).
After downloading the CEL.gz files, I have 870 of them, twice the number of samples. For example, sample N1_0895 has two CEL.gz files, “N1_0895, AML, DQN3” and “N1_0895, AML, DS”:
GSM376049 N1_0895, AML, DQN3
GSM376500 N1_0895, AML, DS
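To check how the files pair up, I would group the GSM titles by sample ID, roughly as in the sketch below. It assumes every title follows the same "sample ID, diagnosis, label" pattern as the two examples above. The function names are my own and do not come from any GEO tool, and only the two titles from this post are used as input.

```python
from collections import defaultdict

# Two GSM titles copied from this post; a full run would use all 870 titles.
titles = {
    "GSM376049": "N1_0895, AML, DQN3",
    "GSM376500": "N1_0895, AML, DS",
}

def parse_title(title):
    """Split 'sample ID, diagnosis, label' into its three fields."""
    sample_id, diagnosis, processing = [part.strip() for part in title.split(",")]
    return sample_id, diagnosis, processing

def group_by_sample(titles):
    """Map each sample ID to {processing label: GSM accession}."""
    grouped = defaultdict(dict)
    for gsm, title in titles.items():
        sample_id, _diagnosis, processing = parse_title(title)
        grouped[sample_id][processing] = gsm
    return dict(grouped)

def select_processing(titles, label):
    """Return the GSM accessions whose title carries the given label."""
    return sorted(gsm for gsm, t in titles.items() if parse_title(t)[2] == label)

grouped = group_by_sample(titles)
print(grouped)
print(select_processing(titles, "DQN3"))
```

If every sample ID ends up with exactly one DQN3 entry and one DS entry, the 870 files are the 435 samples in two versions each.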
As I understand it, DQN3 is the DQN signal, normalized with quantiles of the beta distribution with parameters p=1.2 and q=3. My questions are:
What does the “DS” in “GSM376500 N1_0895, AML, DS” mean?
For differential analysis between AML and non-leukemia bone marrow samples, can I use the DQN3 values directly, or do I need to use both DQN3 and DS?
If I use the DQN3 values directly, do I still need to perform normalization (for example, with RMA)? I was thinking these values are already normalized.
Thank you,
Aimin