69% missing genotypes in GEO processed matrix file
9.6 years ago

Dear all,

I downloaded the GEO processed matrix file for GSE66157 and extracted the GType column for each individual. Missing genotypes (NC in the GType column) reached 69% for each individual. That is implausibly high for an Illumina HumanOmni1-Quad BeadChip.
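For reference, here is a minimal sketch of how the per-individual missing rate can be computed. It assumes the matrix is tab-delimited with a single header row and that each individual has a genotype column whose name ends in `.GType`. That column naming, the function name `missing_rate_per_sample`, and the demo data are assumptions for illustration, not taken from the actual GEO file.

```python
import csv
import io


def missing_rate_per_sample(handle, suffix=".GType", no_call="NC"):
    """Return {column name: fraction of SNP rows equal to no_call}.

    Assumes a tab-delimited table with one header row; only columns whose
    name ends with `suffix` (an assumed naming convention) are counted.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    # Positions of the per-individual genotype columns.
    gtype_cols = [i for i, name in enumerate(header) if name.endswith(suffix)]
    missing = {i: 0 for i in gtype_cols}
    total = 0
    for row in reader:
        if len(row) < len(header):
            continue  # skip blank or truncated lines
        total += 1
        for i in gtype_cols:
            if row[i].strip() == no_call:
                missing[i] += 1
    return {header[i]: (missing[i] / total if total else 0.0) for i in gtype_cols}


# Tiny made-up table standing in for the real matrix file.
demo = io.StringIO(
    "ID_REF\tS1.GType\tS1.Score\tS2.GType\tS2.Score\n"
    "rs1\tAA\t0.91\tNC\t0.10\n"
    "rs2\tNC\t0.05\tNC\t0.12\n"
    "rs3\tAB\t0.88\tBB\t0.93\n"
    "rs4\tBB\t0.90\tNC\t0.08\n"
)
rates = missing_rate_per_sample(demo)
for sample, rate in rates.items():
    print(f"{sample}\t{rate:.2%}")
```

For the real download, the same function could be given a handle from `gzip.open(path, "rt")` instead of the in-memory demo, after adjusting `suffix` to match the actual header.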

My questions are:

  1. Why is the missing rate so high? The authors reported less than 1.5% missing genotypes per individual in the original publication. My guess is that the authors submitted the 'signal_intensities' file to GEO, and GEO used it as input to recalculate the matrix file with a default threshold. That said, I have no experience submitting genotypes to GEO.
  2. If I still want the individual genotypes, how can I set a lower threshold to obtain reasonable calls? Or can I call the genotypes from the signal_intensities file?

The processed matrix file: ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE66nnn/GSE66157/suppl/GSE66157_1M_Matrix_processed.txt.gz

The signal_intensities file: ftp://ftp.ncbi.nlm.nih.gov/geo/series/GSE66nnn/GSE66157/suppl/GSE66157_1M_Matrix_signal_intensities.txt.gz
