I'm working with GSE28551_RAW. In this dataset, every probe in each of the 37 samples has a flag that seems to indicate whether the program called the probe detected (flag=0) or not detected (flag>0).
Currently, my first filtering step is to discard any probe that has a "not detected" flag in any of the 37 samples. However, this is quite stringent: I go from 37,842 probes to 9,102.
Is this the appropriate way to filter out undetected probes? If not, what should I do when a probe was detected in some samples but not in others? Should I include a probe in my final analysis if it was detected in more than 50% of the samples? I'm not sure how I would do the analysis if I include the data for that probe in some samples and not in others...
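To make the two options concrete, here is a minimal Python sketch of the strict filter (detected in every sample) next to a majority filter (detected in more than 50% of samples). It assumes the flags have already been read into a dictionary keyed by probe ID, with one flag per sample. The probe names, the toy values, and the `detected_fraction` helper are made up for illustration and do not come from GSE28551.

```python
from typing import Dict, List


def detected_fraction(sample_flags: List[int]) -> float:
    """Fraction of samples in which the probe is flagged as detected (flag == 0)."""
    if not sample_flags:
        raise ValueError("a probe needs at least one sample flag")
    detected = sum(1 for flag in sample_flags if flag == 0)
    return detected / len(sample_flags)


# Hypothetical toy data: 3 probes x 4 samples (0 = detected, >0 = not detected).
toy_flags: Dict[str, List[int]] = {
    "probe_A": [0, 0, 0, 0],  # detected in all samples
    "probe_B": [0, 1, 0, 0],  # detected in 3 of 4 samples
    "probe_C": [1, 1, 1, 0],  # detected in 1 of 4 samples
}

# Strict filter: keep a probe only if it is detected in every sample.
strict = [p for p, flags in toy_flags.items() if detected_fraction(flags) == 1.0]

# Majority filter: keep a probe if it is detected in more than 50% of samples.
majority = [p for p, flags in toy_flags.items() if detected_fraction(flags) > 0.5]

print("strict:  ", strict)
print("majority:", majority)
```

In this sketch the majority filter keeps the probe's values for all samples, including the ones flagged as not detected, which is exactly the part I'm unsure about.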
Thank you for your help!