I am working with Illumina microarray data.
I am using GenomeStudio and followed the steps given in the user guide, but I run into a problem when loading the .idat files in the repository tab: when I click the folder that appears (the barcode) under Sentrix Array, it does not recognize the samples (the idat files). Which files need to be in the folder where my idat files are saved, and why are my files not being recognized?
Actually, after going through the GenomeStudio Gene Expression manual, I had kept only the idat files in one folder. Once I put all the files in the same folder, it worked. I will also try Bioconductor for my data.
Anyway, thanks Andrew and poisonAlien for helping me.
ADD REPLY
• link
updated 2.4 years ago by
Ram
44k
•
written 8.8 years ago by
Kritika
▴
270
Not sure about GenomeStudio, but if you are comfortable using R, use this script. It takes idat files as input, normalizes them, and tests for differential expression between two groups (assuming there are no batch effects).
Just to tack onto poisonAlien's answer: the behaviour you're seeing in GenomeStudio is a quirk of the software, and I'm sure there was a reason for it once upon a time. You need the IDATs to be in folders separated by chip ID (each folder is named with the chip ID number), and you'll need the SDF files in the folder too. GenomeStudio is not as flexible as the Bioconductor methods for analysing microarray data, so I'd second poisonAlien's suggestion to try the analysis in R. You'll get more of an appreciation for what actually happens in a typical differential expression analysis. If you still have trouble with GenomeStudio, I'd suggest you contact Illumina support. You've paid for a license, so you should make use of the support they provide.
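The per-chip folder layout described above can be sketched in base R. This is only a rough illustration: `organise_idats` is a hypothetical helper, and it assumes file names follow the `<chipID>_<section>_Grn.idat` pattern seen elsewhere in this thread. It does not handle the SDF files, which still have to be placed alongside the IDATs.

```r
# Hypothetical helper (not part of GenomeStudio or Bioconductor):
# copy every IDAT file into a subfolder named after its chip ID.
# Assumes names like <chipID>_<section>_Grn.idat.
organise_idats <- function(src_dir, dest_dir) {
  idats <- list.files(src_dir, pattern = "\\.idat$",
                      full.names = TRUE, ignore.case = TRUE)
  for (f in idats) {
    # The chip ID is taken to be everything before the first underscore
    chip_id <- strsplit(basename(f), "_", fixed = TRUE)[[1]][1]
    chip_dir <- file.path(dest_dir, chip_id)
    dir.create(chip_dir, recursive = TRUE, showWarnings = FALSE)
    file.copy(f, file.path(chip_dir, basename(f)))
  }
  # Return the names of the chip folders now present in dest_dir
  invisible(list.dirs(dest_dir, recursive = FALSE, full.names = FALSE))
}
```

Calling `organise_idats("raw_idats", "Image Data")` (both paths are placeholders) would leave one folder per chip ID under the destination, which is the arrangement GenomeStudio is said to expect.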
You gotta be more specific. That code assumes that you're working on human arrays (specifically the HT12 v4 chip, because that's what we use most often in our lab). If you're using another array, you will need to change the annotation. Do you have replicates? And do you have all the libraries installed (beadarray, limma, illuminaHumanv4.db)?
source("Microarray/AnalyzeBead.R")
result = beadAnalyze(idats = c("/dummy_data/Image Data/9666412702/9666412702_A_Grn.idat" , "/dummy_data/Image Data/9666412702/9666412702_B_Grn.idat"), names = c("control","treated1"), condition = c("control","control","treated","treated"), ref.condition = "treated")
Error in `[<-.data.frame`(`*tmp*`, , "sampleFac", value = c("control", :
replacement has 4 rows, data has 2
Ahh! See, you are providing two idat files (one treated and one control), but your condition vector lists two control and two treated samples. That's what the error message is telling you.
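A mismatch like this can be caught before the analysis runs. The sketch below is a hypothetical pre-flight check, not part of the AnalyzeBead.R script. It only assumes the four arguments shown in the call above, and the file names are placeholders.

```r
# Hypothetical pre-flight check (not part of AnalyzeBead.R):
# every idat file needs exactly one name and one condition.
check_inputs <- function(idats, names, condition, ref.condition) {
  n <- length(idats)
  if (length(names) != n) {
    stop(sprintf("names has %d entries but there are %d idat files",
                 length(names), n))
  }
  if (length(condition) != n) {
    stop(sprintf("condition has %d entries but there are %d idat files",
                 length(condition), n))
  }
  if (!(ref.condition %in% condition)) {
    stop("ref.condition must be one of the values in condition")
  }
  invisible(TRUE)
}

# Same shape as the failing call: two files, four conditions
msg <- tryCatch(
  check_inputs(idats = c("file1.idat", "file2.idat"),
               names = c("control", "treated1"),
               condition = c("control", "control", "treated", "treated"),
               ref.condition = "treated"),
  error = function(e) conditionMessage(e)
)
print(msg)
```

With two files and four conditions the check stops with a message naming the two lengths, which is easier to read than the data frame replacement error.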
If I have replicates, which command should I use? The same one you provided above?
I tried to understand the source code, but it is beyond my understanding.
Thanks
What I understand from this command is that names = c("control", "treated1") refers to the control and treated samples. Will condition = c("control", "treated") take care of the error or warning?
And what is ref.condition?
So file1.idat and file2.idat are replicates for treated, and file3.idat and file4.idat are replicates for control? Am I correct?
As I said, I am working with dummy data. I tried some more samples, and after running this command I got the following messages:
Annotating control probes using package illuminaHumanv4.db Version:1.26.0
Calculating array weights
Array weight
After typing
result
it shows values under these columns:
ID logFC AveExpr t P.Value adj.P.Val B
ILMN_XXXXX
idats is a vector of your idat files (in the above example there are 4 idat files).
names is the sample names for those idat files (above they are named control1, control2, treated1 and treated2). Yes, they're replicates.
condition is the sample characteristics. The first two are control and the last two are treated. It can be anything based on your experiment (like knockdown, overexpression, etc.).
ref.condition is the condition to use as the reference. Here I am comparing everything against treated, so all up- or down-regulated genes are relative to the treated samples.
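To see what choosing a reference does, here is a small base R sketch. It assumes the script handles ref.condition the way limma analyses commonly do, by making it the first factor level in the design matrix. That is an assumption about the script, not something confirmed in this thread.

```r
# Assumption: the reference condition becomes the baseline factor level.
condition <- c("control", "control", "treated", "treated")
ref.condition <- "treated"

# relevel() moves the reference level to the front
f <- relevel(factor(condition), ref = ref.condition)

# The non-intercept column is the contrast "control minus treated"
design <- model.matrix(~ f)
print(colnames(design))  # "(Intercept)" "fcontrol"
```

Under this assumption, a positive logFC with ref.condition = "treated" would mean the probe is higher in control than in treated, and swapping the reference flips the sign.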
The output is a typical limma result table; you may want to read the limma manual. In short, logFC is the log2 fold change relative to the reference condition, AveExpr is the average expression across all your samples, t is the moderated t-statistic, P.Value is the raw p-value, adj.P.Val is the FDR-adjusted p-value, and B is the log-odds that the gene is differentially expressed. There are also other columns such as the probe sequence, probe quality, its locus on the genome, where it lies on the transcript, etc.
The script itself is well commented, so you should be able to follow it. However, life will be easier if you know how an ExpressionSet object is represented in Bioconductor and what its slots are. Tomorrow I will update the script with PCA; you can check again.
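For reading the result columns described above, a toy example in base R may help. The values and probe IDs below are made up purely for illustration. The Benjamini-Hochberg method is used because it is limma's default adjustment in topTable, though the script may be configured differently.

```r
# Toy values for illustration only; these are not real results.
res <- data.frame(
  ID      = c("probe_1", "probe_2", "probe_3", "probe_4"),  # placeholder IDs
  logFC   = c(2, -1, 0.1, 0),
  P.Value = c(0.001, 0.01, 0.5, 0.9)
)

# logFC is on the log2 scale, so 2^logFC gives the linear fold change
res$fold_change <- 2^res$logFC

# FDR adjustment of the raw p-values (Benjamini-Hochberg)
res$adj.P.Val <- p.adjust(res$P.Value, method = "BH")

# A common (but arbitrary) filter: FDR below 5% and at least a 2-fold change
sig <- res[res$adj.P.Val < 0.05 & abs(res$logFC) >= 1, ]
print(sig)
```

Here a logFC of 2 corresponds to a 4-fold increase and a logFC of -1 to a halving. The cutoffs in the filter are conventions, not something the script prescribes.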