Hello everyone,
I have a simple question: do the series_matrix.txt files
in GEO always contain the normalized data of the study?
Best,
Leite
Hey Leite,
The answer is that, yes, the series matrix files should contain normalised, log2 values. However, GEO describes situations in which these files may not contain normalised data:
GEO2R operates on Series Matrix files which contain data extracted directly from the VALUE column of Sample tables. Submitters are asked to supply normalized data in the VALUE column, rendering the Samples cross-comparable. The majority of GEO data do conform to this rule. GEO applies no further processing other than to perform a log2 transformation on values determined not to be in log space (see Options section). However, some studies, such as dual channel loop design data, may generate values that do not have a common reference and are not directly comparable. Some studies may contain Sample value data that are not normalized, or have a design such that the Samples were never intended to be directly compared. Yet other studies do not have sufficient replicate Samples to perform a robust statistical analysis. Users should examine the original Series to understand the experimental design, and check the 'Data processing' field or VALUE description in the original Sample records for information on what the values represent. The box plot feature on the Value distribution tab is provided to help users assess whether the distributions of values across Samples are median-centered, which is generally indicative that the data are normalized and cross-comparable.
[source: https://www.ncbi.nlm.nih.gov/geo/info/geo2r.html]
When you obtain data, you should always check the distribution with box plots, scatter plots, and histograms, in order to gauge whether they are normalised or not.
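As a rough illustration of the box plot check, here is a minimal Python sketch that computes the numbers a box plot draws for each sample and asks whether the medians line up. The sample names, the values, and the 0.5 tolerance are all made up for the example; they do not come from GEO or from any real series matrix file.

```python
import statistics

def sample_summaries(columns):
    """Return per-sample (min, Q1, median, Q3, max), the numbers a box plot draws."""
    summaries = {}
    for name, values in columns.items():
        q1, med, q3 = statistics.quantiles(values, n=4)
        summaries[name] = (min(values), q1, med, q3, max(values))
    return summaries

def medians_aligned(summaries, tolerance=0.5):
    """True if all sample medians lie within `tolerance` of each other.

    The tolerance is an arbitrary assumption, not a GEO rule.
    """
    medians = [s[2] for s in summaries.values()]
    return max(medians) - min(medians) <= tolerance

# Made-up log2-scale values for three hypothetical samples.
toy = {
    "sample_A": [6.1, 7.0, 7.9, 8.4, 9.2, 10.5],
    "sample_B": [6.0, 7.1, 8.0, 8.3, 9.1, 10.4],
    "sample_C": [8.2, 9.0, 9.9, 10.6, 11.3, 12.4],
}
summaries = sample_summaries(toy)
print(medians_aligned(summaries))
```

Here sample_C sits about two log2 units above the other two, so the check fails. On real data you would still want to look at the plots themselves, since a single number cannot show skew or bimodality.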
Kevin
https://ibb.co/B2RV1CD Is this comparable? Thanks a lot; I copied the code from you.
Some samples appear to be outliers. Please check them.
Thanks a lot. How did you come to this conclusion? Just because the medians of 712058 and 712060 are much higher?
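One way to make "the median is much higher" concrete, offered only as a sketch and not as the method used in the reply above, is to compare each sample's median with the median of all sample medians and flag those that deviate by more than a few median absolute deviations (MADs). The function name, the cutoff of 3 MADs, and the numbers below are hypothetical.

```python
import statistics

def flag_shifted_samples(sample_medians, n_mads=3.0):
    """Return names of samples whose median is far from the typical median.

    `n_mads` is an assumed cutoff; adjust it to your data.
    """
    center = statistics.median(sample_medians.values())
    mad = statistics.median(abs(m - center) for m in sample_medians.values())
    if mad == 0:
        # All typical samples agree exactly; flag anything that differs.
        return [name for name, m in sample_medians.items() if m != center]
    return [name for name, m in sample_medians.items()
            if abs(m - center) > n_mads * mad]

# Made-up per-sample medians on a log2 scale.
toy_medians = {"s1": 8.1, "s2": 8.0, "s3": 8.2, "s4": 8.1, "s5": 10.3, "s6": 10.6}
print(flag_shifted_samples(toy_medians))
```

With these toy values the two samples with clearly higher medians are flagged. A flag like this is only a prompt to inspect those samples and the 'Data processing' field, not proof that they are bad.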