I am working on a GEO dataset. I have normalized it, but I am stuck on the filtering step and am not sure how to proceed. It would be great if you could help me.
Hi. Which dataset do you have, what normalization have you done, where are you stuck, and what do you want to do? I am sure someone can help you if you provide more details. You can edit your post easily by clicking on edit. Also, you should change your tag "ddd" to something meaningful like "GEO".
This is the filtering I typically do in R for an Affymetrix gene expression microarray dataset after normalization (e.g., gcrma/rma). The same general concept can be applied to other platforms (e.g., RNA-seq), but the thresholds would need to be modified.
#Preliminary probe/gene filtering
#If there are predictor variables that are constant/invariant, consider removing them
library(genefilter)
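#Columns 4 onward hold the log2 expression values; the earlier columns are kept out of the filter
X=rawdata[,4:ncol(rawdata)]
#Take the values and un-log2 them, then filter out any genes that fail the following criteria (recommended in the multtest/MTP documentation):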
#At least 20% of samples should have raw intensity greater than 100
#The coefficient of variation (sd/mean) is between 0.7 and 10
ffun=filterfun(pOverA(p = 0.2, A = 100), cv(a = 0.7, b = 10))
filt=genefilter(2^X,ffun)
filt_Data=rawdata[filt,]
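To make it clearer what those two filters are doing, here is a minimal base R sketch of the same logic on a made-up toy matrix. The probe names and values are invented purely for illustration, and this is my reading of how pOverA and cv behave rather than the genefilter implementation itself, so check it against the package documentation before relying on it.

```r
# Toy log2 expression matrix: 4 probes x 5 samples (hypothetical values)
X <- rbind(
  low_flat     = c(5, 5, 5, 5, 5),            # never expressed, no variation
  high_varying = c(4, 5, 12, 13, 4),          # expressed in some samples, variable
  high_flat    = c(10, 10.1, 9.9, 10, 10.05), # expressed, but nearly constant
  low_varying  = c(1, 6, 2, 1, 6)             # variable, but never above 100
)

raw <- 2^X  # back to the raw intensity scale

# pOverA(p = 0.2, A = 100): at least 20% of samples with intensity > 100
expressed <- rowMeans(raw > 100) >= 0.2

# cv(a = 0.7, b = 10): coefficient of variation (sd/mean) between 0.7 and 10
cv_val <- apply(raw, 1, sd) / abs(rowMeans(raw))
variable <- cv_val >= 0.7 & cv_val <= 10

# A probe is kept only if it passes both filters
keep <- expressed & variable
filtered <- X[keep, , drop = FALSE]
print(keep)
```

Only the probe that is both expressed in enough samples and variable across samples survives, which is the point of combining the two criteria.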
Yes, we need more details to answer this question. "Filtering" is very ambiguous.
In addition to Obi's answer (below), you should read through the vignettes for the genefilter package.