Hello. Can anyone help me obtain the actual gene expression data, that is, the expression levels of the genes probed in a sample dataset on GEO? I am currently working on the dataset GSE12945.
Thanks.
In R, I'd suggest using the GEOquery package (see tutorial). This package downloads the dataset directly into R, usually as a specific R class object (similar to an ExpressionSet) that contains the expression matrix, the metadata, etc. It helps to become familiar with these R objects so you know how to access each component.
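If you are not working in R, GEO also distributes the processed values as a tab-delimited "series matrix" text file, where the expression table sits between the `!series_matrix_table_begin` and `!series_matrix_table_end` lines. Below is a minimal Python sketch of reading such a file with the standard library only. The function name `read_series_matrix`, the tiny `demo` text, and its sample and probe names are all made up for illustration, and treating empty cells as missing values is an assumption; real files may mark missing values differently.

```python
import csv
import io


def read_series_matrix(handle):
    """Return (sample_ids, {probe_id: [values]}) from a series matrix file.

    Hypothetical helper: it only reads the expression table and ignores
    the '!'-prefixed metadata lines above it.
    """
    sample_ids, expression = [], {}
    in_table = False
    for row in csv.reader(handle, delimiter="\t"):
        if not row:
            continue  # skip blank lines
        tag = row[0]
        if tag == "!series_matrix_table_begin":
            in_table = True
        elif tag == "!series_matrix_table_end":
            break
        elif in_table and tag == "ID_REF":
            sample_ids = row[1:]  # header row: one column per sample
        elif in_table:
            # Assumption: an empty cell means a missing value.
            expression[tag] = [float(v) if v else None for v in row[1:]]
    return sample_ids, expression


# Made-up miniature file, NOT real GSE12945 content.
demo = (
    '!Series_title\t"made-up example"\n'
    "!series_matrix_table_begin\n"
    '"ID_REF"\t"GSM_A"\t"GSM_B"\n'
    '"probe_1"\t7.25\t8.5\n'
    '"probe_2"\t3.0\t\n'
    "!series_matrix_table_end\n"
)

samples, expr = read_series_matrix(io.StringIO(demo))
print(samples)
print(expr)
```

For a real file you would pass an open file handle (decompressing it first if it is gzipped) instead of the `io.StringIO` demo.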
Hi everyone. Is it better to use the actual expression data of my genes or the normalized data (their z values) to analyze which genes were lowly or highly expressed? I was planning to make a table or use box plots to find, for example, the mean, median, standard deviation, minimum, and maximum expression for each gene.
I have little experience with expression array analysis. The GSE you cite uses Affymetrix arrays, and I seem to remember that the most common workflows normalize with the RMA algorithm. You may find this tutorial helpful. Aside from that, once normalization has taken care of background and distribution differences, if your goal is just an exploratory analysis of the genes, you can try different ways to summarize them (mean, median, SD, and also Z-scores) and see what you find.
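As a rough illustration of those summaries, here is a small Python sketch using only the standard library. The gene names and numbers are invented, and the helpers `summarize` and `z_scores` are hypothetical names, not part of any package. It assumes the values are already normalized and that Z-scores are computed per gene across samples. Note that under that assumption every gene ends up with mean 0 and SD 1, so Z-scores show in which samples a gene is relatively high or low, while the table of means and medians is what tells you which genes are highly or lowly expressed overall.

```python
import statistics


def summarize(values):
    """Descriptive statistics for one gene across samples."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),  # sample SD (n - 1 denominator)
        "min": min(values),
        "max": max(values),
    }


def z_scores(values):
    """Per-gene Z-scores: (value - gene mean) / gene SD."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


# Invented, already-normalized expression values (one list per gene).
expression = {
    "gene_a": [2.0, 4.0, 6.0],
    "gene_b": [5.0, 5.5, 9.0],
}

summary = {gene: summarize(vals) for gene, vals in expression.items()}
scaled = {gene: z_scores(vals) for gene, vals in expression.items()}

for gene in expression:
    print(gene, summary[gene], scaled[gene])
```

The same per-gene table is what a box plot draws, so either view should lead to the same picture.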
Ok thank you Papyrus and patelk26. I'm still trying to wrap my mind around these tutorials but I'm sure I will find something.