Question

illumina gene expression

1

9.3 years ago

Kritika ▴ 270

Hello all

I am working with some Illumina microarray data.

I am using GenomeStudio and followed the steps given in the user guide, but I have a problem loading the .idat files in the repository tab: when I click the barcode folder that appears under the Sentrix array, it does not recognize the samples (.idat files). Which files need to be in the folder where my .idat files are saved, and why are my files not being recognized?

genomestudio microarray geneexpression illumina • 5.6k views

written 9.3 years ago by Kritika

0

Actually, after going through the GenomeStudio Gene Expression manual, I had kept only the IDAT files in one folder. Once I put all the files in the same folder, it worked. I will also try Bioconductor for my data.

Anyway, thanks andrew and poisonAlien for helping me.

written 9.3 years ago by Kritika

Ram · Answer 1 · 2016-02-12

3

9.3 years ago

poisonAlien ★ 3.2k

I'm not sure about GenomeStudio, but if you are comfortable using R, you can use this script. It takes IDAT files as input, normalizes them, and performs differential expression analysis between two groups (assuming there are no batch effects).

Usage:

source("AnalyzeBead.R")
result = beadAnalyze(
  idats = c("file1.idat", "file2.idat", "file3.idat", "file4.idat"),
  names = c("control1", "control2", "treated1", "treated2"),
  condition = c("control", "control", "treated", "treated"),
  ref.condition = "treated"
)

written 9.3 years ago by poisonAlien

1

Just to add to poisonAlien's answer: the behaviour you're seeing in GenomeStudio is a quirk of the software, and I'm sure there was a reason for it once upon a time. The IDATs need to be in folders separated by chip ID (each folder named with the chip ID number), and you'll need the SDF files in those folders too. GenomeStudio is not as flexible as Bioconductor methods for analysing microarray data, so I'd second poisonAlien's suggestion to try the analysis in R; you'll get more of an appreciation for what actually happens in a typical differential expression analysis. If you still have trouble with GenomeStudio, I'd suggest contacting Illumina support. You've paid for a license, so you should make use of the support they provide.
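As a rough illustration of the per-chip folder layout described above, here is a minimal base R sketch. The helper name `group_idats_by_chip` is hypothetical, and it assumes file names follow the `<chipID>_<position>_Grn.idat` pattern seen elsewhere in this thread. It only copies IDAT files; the SDF files still have to be placed in the chip folders separately.

```r
# Hypothetical helper (not part of GenomeStudio or any package):
# copy IDAT files into one sub-folder per chip ID.
# Assumption: file names look like "<chipID>_<position>_Grn.idat".
group_idats_by_chip <- function(idat_files, dest_dir) {
  # The chip ID is everything before the first underscore in the file name.
  chip_ids <- sub("_.*$", "", basename(idat_files))
  for (i in seq_along(idat_files)) {
    chip_dir <- file.path(dest_dir, chip_ids[i])
    dir.create(chip_dir, recursive = TRUE, showWarnings = FALSE)
    file.copy(idat_files[i], chip_dir)
  }
  # Return the resulting layout (file names grouped by chip ID), invisibly.
  invisible(split(basename(idat_files), chip_ids))
}
```

Calling it with a vector of IDAT paths and a destination directory gives one folder per chip ID, which is the layout GenomeStudio is said to expect here.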

written 9.3 years ago by andrew.j.skelton73

0

Hello poisonAlien

The script you showed above gives this error:

Error in idatData$Quants[, "CodesBinData"] : subscript out of bounds

Please tell me what this error means and how to rectify it.

Thank you

written 9.3 years ago by Kritika

0

Which platform are you using? What is the chip ID?

written 9.3 years ago by poisonAlien

0

Currently I am working on one dummy sample.

written 9.3 years ago by Kritika

0

You'll have to be more specific. That code assumes you're working with human arrays (specifically the HT12 v4 chip, because that's what we use most often in our lab). If you're using another array, you will need to change the annotation. Do you have replicates? And do you have all the libraries installed (beadarray, limma, illuminaHumanv4.db)?

written 9.3 years ago by poisonAlien

0

Yes, the chip is HT12 v4; I confirmed this with the source of the samples. And yes, all the libraries are installed.

written 9.3 years ago by Kritika

0

Can you post your command?

written 9.3 years ago by poisonAlien

0

source("Microarray/AnalyzeBead.R")

result = beadAnalyze(idats = c("/dummy_data/Image Data/9666412702/9666412702_A_Grn.idat" , "/dummy_data/Image Data/9666412702/9666412702_B_Grn.idat"),  names = c("control","treated1"), condition = c("control","control","treated","treated"), ref.condition = "treated")

 Error in `[<-.data.frame`(`*tmp*`, , "sampleFac", value = c("control",  :
  replacement has 4 rows, data has 2

written 9.2 years ago by Kritika

1

Ahh! You are providing two IDAT files (one control and one treated), but your condition vector lists two controls and two treated samples. That's what the error message is telling you: the replacement has 4 rows, but the data has 2.

Try:

result = beadAnalyze(idats = c("/dummy_data/Image Data/9666412702/9666412702_A_Grn.idat" , "/dummy_data/Image Data/9666412702/9666412702_B_Grn.idat"),  names = c("control","treated1"), condition = c("control","treated"), ref.condition = "treated")

Note that you don't have replicates, so you won't get any p-values.
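A mismatch like this can be caught before the script runs. The following is only a sketch in base R: `check_bead_inputs` is a hypothetical helper, not part of AnalyzeBead.R, and the checks simply mirror the points made in this thread (the three vectors must be the same length, and a condition without replicates will not yield p-values).

```r
# Hypothetical pre-flight check (not part of AnalyzeBead.R).
# Verifies that idats, names and condition line up sample by sample.
check_bead_inputs <- function(idats, names, condition, ref.condition) {
  n <- length(idats)
  if (length(names) != n || length(condition) != n) {
    stop(sprintf(
      "idats has %d entries, names has %d, condition has %d; all three must match",
      n, length(names), length(condition)
    ))
  }
  if (!ref.condition %in% condition) {
    stop("ref.condition must be one of the values in condition")
  }
  # Per the thread: without replicates, expect no p-values.
  if (any(table(condition) < 2)) {
    warning("at least one condition has no replicates; expect no p-values")
  }
  invisible(TRUE)
}
```

Run it with the same four arguments you intend to pass to beadAnalyze; it stops with a readable message when the lengths disagree, rather than failing inside the script.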

written 9.2 years ago by poisonAlien

0

Oh! Thank you :)

poisonAlien, can you please explain this line to me?

names = c("control","treated1"), condition = c("control","treated"), ref.condition = "treated")

If I have replicates, what command should I use? The same one you provided above?

I tried to understand the source code, but it is beyond my understanding.

Thanks

written 9.2 years ago by Kritika

0

What I understand from this command is that names = c("control","treated1") refers to the control and treated objects, and condition = c("control","treated") handles errors or warnings. Is that right?

And what is ref.condition?

written 9.2 years ago by Kritika

0

According to this command:

result = beadAnalyze(idats = c("file1.idat","file2.idat","file3.idat","file4.idat"),names = c("control1","control2","treated1","treated2"),condition = c("control","control","treated","treated"),ref.condition = "treated")

file1.idat and file2.idat are replicates for treated, and file3.idat and file4.idat are replicates for control? Am I correct?

As I said, I am working with dummy data. I tried some more samples, and after running this command I got this message:

Annotating control probes using package illuminaHumanv4.db Version:1.26.0
Calculating array weights
Array weight

After typing

result

it shows values under these columns:

  ID                        logFC      AveExpr             t      P.Value adj.P.Val         B
 ILMN_XXXXX

written 9.2 years ago by Kritika

1

idats is a vector of your IDAT files (the example above has four).
names is the sample names for those IDAT files (above they are named control1, control2, treated1 and treated2). Yes, they're replicates.
condition is the sample characteristic. The first two are control and the last two are treated. It can be anything that fits your experiment (knockdown, overexpression, etc.).
ref.condition is the condition to use as the reference. Here I am comparing everything with treated, so all up- or down-regulated genes are relative to the treated samples.

The output is a typical limma result table; you may want to read the limma manual. In short, logFC is the log2 fold change relative to the reference condition (ref.condition), AveExpr is the average expression across all your samples, t is the moderated t-statistic, P.Value is the raw p-value, adj.P.Val is the FDR-adjusted p-value, and B is the log-odds that the gene is differentially expressed. There are also other columns such as probe sequence, probe quality, its locus on the genome, where it lies on the transcript, etc.

The script itself is well commented, so you should be able to follow it. However, life will be easier if you know how an ExpressionSet object and its slots are represented in Bioconductor. Tomorrow I will update the script with PCA, so you can check again.
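To make the column descriptions above a little more concrete, here is a small base R sketch of picking out hits from a limma-style table. Everything in it is illustrative: `select_hits` is a hypothetical helper, the toy values are made up, and the cutoffs (FDR below 0.05, absolute logFC of at least 1) are assumptions rather than anything the script prescribes.

```r
# Toy table using the column names from the limma output above.
# All values are invented purely for illustration.
toy <- data.frame(
  ID        = c("probe_1", "probe_2", "probe_3", "probe_4"),
  logFC     = c(2.1, -1.6, 0.3, 1.2),
  AveExpr   = c(8.5, 7.2, 9.1, 6.4),
  adj.P.Val = c(0.001, 0.02, 0.04, 0.30),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep probes passing both an FDR cutoff and a
# minimum absolute log2 fold change, most significant first.
select_hits <- function(res, fdr = 0.05, min_abs_logfc = 1) {
  keep <- res$adj.P.Val < fdr & abs(res$logFC) >= min_abs_logfc
  hits <- res[keep, , drop = FALSE]
  hits[order(hits$adj.P.Val), , drop = FALSE]
}

hits <- select_hits(toy)
```

The same filter should apply to a real result table as long as it carries the logFC and adj.P.Val columns; the sign of logFC is read relative to whichever condition was used as the reference.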

written 9.2 years ago by poisonAlien

0

Very useful information

Thanks a lot PoisonAlien

written 9.2 years ago by Kritika

0

Hi Kritika,

Like you, I need to do gene expression analysis with IDAT files. Could you please tell me how you did your analysis, including the workflow and packages you used?

Thank you

written 8.7 years ago by Vasu

0

Hi poisonAlien,

I am trying to use your script (AnalyzeBead.R), but I am running into this error. What could be the problem?

result = beadAnalyze(idats = c("4487653088_J_Grn.idat","4487653088_K_Grn.idat","4487653088_L_Grn.idat","4487653151_A_Grn.idat"),
                      names = c("4487653088_J","4487653088_K","4487653088_L","4487653151_A"),
                    condition = c("day0","day0","day2","day2"),
                      ref.condition = "day0", fdr = 0.05, plotPCA = T)

Annotating control probes using package illuminaHumanv3.db Version:1.26.0
Calculating array weights
Array weights

Error in `levels<-`(`*tmp*`, value = if (nl == nL) as.character(labels) else paste0(labels,  : 
  factor level [4] is duplicated

Hope to hear from you.

Cheers

written 6.7 years ago by evanskataka

0

Please use ADD COMMENT/ADD REPLY when responding to existing posts to keep threads logically organized.

This should be posted as a comment under poisonAlien's answer.

written 6.7 years ago by GenoMax