Question

Normalization Or Filtering Using Bioconductor

1

12.4 years ago

deeksha.malhan ▴ 10

Hi

I am working on a GEO dataset and have already normalized it, but I am stuck on the filtering step and am not sure how to proceed. It would be great if you could help me.

Thanx

bioconductor microarray geo • 2.8k views

updated 12.4 years ago by Obi Griffith 20k • written 12.4 years ago by deeksha.malhan ▴ 10

3

Hi. Which dataset do you have, what normalization have you done, where are you stuck, and what do you want to do? I am sure someone can help if you provide more details. You can edit your post easily by clicking on edit. Also, you should change your tag "ddd" to something meaningful like "GEO".

ADD REPLY • link 12.4 years ago by Vikas Bansal ★ 2.4k

0

Yes, we need more details to answer this question. "Filtering" is very ambiguous.

ADD REPLY • link 12.4 years ago by Neilfws 49k

0

In addition to Obi's answer (below), you should read through the vignettes for the genefilter package.

ADD REPLY • link 12.4 years ago by Steve Lianoglou 5.2k

score 2 · Answer 1 · 2012-07-03

This is the filtering I typically do in R for an Affymetrix gene expression microarray dataset after normalization (e.g., gcrma/rma). The same general concept can be applied to other platforms (e.g., RNA-seq), but the thresholds would need to be modified.

#Preliminary probe/gene filtering
#If there are predictor variables that are constant/invariant, consider removing them
library(genefilter)
#Expression values are assumed to start at column 4 of rawdata (after the annotation columns); header is assumed to hold the column names
X=rawdata[,4:length(header)]
#Un-log2 the values, then filter genes according to the following criteria (recommended in multtest/MTP documentation):
#At least 20% of samples should have raw intensity greater than 100
#The coefficient of variation (sd/mean) is between 0.7 and 10
ffun=filterfun(pOverA(p = 0.2, A = 100), cv(a = 0.7, b = 10))
filt=genefilter(2^X,ffun)
#Keep only the rows (probes) that pass both filters
filt_Data=rawdata[filt,]
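
To see what the two criteria above actually do, here is a minimal base-R sketch on a made-up matrix. It does not use genefilter; the helper names (frac_over_a, coef_var) and the toy values are hypothetical and only meant to mirror the "20% of samples above 100" and "CV between 0.7 and 10" rules, so treat it as an illustration rather than a replacement for pOverA/cv.

```r
# Toy log2 expression matrix: 4 probes (rows) x 5 samples (columns).
# All values are invented for illustration.
log2_expr <- rbind(
  low_flat  = c(5, 5, 5, 5, 5),       # raw 32 everywhere: low and invariant
  high_flat = c(10, 10, 10, 10, 10),  # raw 1024 everywhere: high but invariant
  high_var  = c(6, 6, 6, 12, 13),     # raw 64, 64, 64, 4096, 8192
  low_var   = c(2, 2, 2, 2, 6)        # raw 4, 4, 4, 4, 64: variable but never above 100
)

# Un-log2 the values so the thresholds apply to raw intensities
raw <- 2^log2_expr

# Hypothetical helpers mirroring the two criteria
frac_over_a <- function(x, A) mean(x > A)        # fraction of samples above A
coef_var    <- function(x) sd(x) / abs(mean(x))  # coefficient of variation

# A probe is kept only if it passes both criteria
keep <- apply(raw, 1, function(x) {
  cvx <- coef_var(x)
  frac_over_a(x, 100) >= 0.2 && is.finite(cvx) && cvx >= 0.7 && cvx <= 10
})

filtered <- log2_expr[keep, , drop = FALSE]
print(keep)
print(filtered)
```

In this toy case only the probe that is both expressed above 100 in at least 20% of samples and variable across samples should survive; the invariant probes fail the CV rule and the weakly expressed one fails the intensity rule.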