According to their website they have 12M customers.
The number of SNPs on a SNP array is lower than with WGS, but still ca. 600,000 according to their website.
12,000,000 * 600,000 = 7,200,000,000,000 = 7.2 trillion data points
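To put that number in terms of disk space, here is a rough back-of-the-envelope sketch in Python. The customer and SNP counts are the ones quoted above. The 2 bits per genotype figure is the packing used by PLINK's binary .bed format; the 1 byte per genotype figure is only an assumed naive encoding for comparison. The function name is made up for illustration, and headers, metadata and compression are ignored.

```python
N_SAMPLES = 12_000_000   # customers, per their website
N_SNPS = 600_000         # SNPs on the array, per their website


def genotype_matrix_bytes(n_samples: int, n_snps: int, bits_per_call: int) -> int:
    """Raw size in bytes of a dense samples x SNPs matrix (no headers, no compression)."""
    total_bits = n_samples * n_snps * bits_per_call
    return (total_bits + 7) // 8  # round up to whole bytes


n_calls = N_SAMPLES * N_SNPS
print(f"{n_calls:,} genotype calls")

encodings = [
    ("2 bits per call (PLINK .bed style packing)", 2),
    ("1 byte per call (assumed naive encoding)", 8),
]
for label, bits in encodings:
    size_tb = genotype_matrix_bytes(N_SAMPLES, N_SNPS, bits) / 1e12
    print(f"{label}: {size_tb:.1f} TB")
```

So even a tightly packed dense table would be on the order of terabytes, which is why the storage format matters.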
The easiest way to store that would be as separate files per sample or per plate.
But for joint analysis (e.g. GWAS) a table is needed.
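To make the "table" concrete, here is a minimal sketch of turning per-sample genotype calls into one samples x SNPs table. Everything in it is hypothetical: the sample names, the rs IDs, the 0/1/2 allele-count coding and the -1 missing value are assumptions for illustration, not how any particular company stores its data.

```python
# Hypothetical per-sample calls: SNP id -> number of alternate alleles (0, 1 or 2).
per_sample = {
    "sample_A": {"rs1": 0, "rs2": 1, "rs3": 2},
    "sample_B": {"rs1": 1, "rs3": 0},            # rs2 was not called
    "sample_C": {"rs1": 2, "rs2": 2, "rs3": 1},
}

MISSING = -1  # assumed code for a missing genotype call


def build_genotype_table(calls):
    """Return (samples, snps, table) where table[i][j] is the call of sample i at SNP j."""
    samples = sorted(calls)
    snps = sorted({snp for sample_calls in calls.values() for snp in sample_calls})
    snp_index = {snp: j for j, snp in enumerate(snps)}

    # Start with everything missing, then fill in the calls that exist.
    table = [[MISSING] * len(snps) for _ in samples]
    for i, sample in enumerate(samples):
        for snp, allele_count in calls[sample].items():
            table[i][snp_index[snp]] = allele_count
    return samples, snps, table


samples, snps, table = build_genotype_table(per_sample)
print("sample", *snps, sep="\t")
for sample, row in zip(samples, table):
    print(sample, *row, sep="\t")
```

At real scale a nested Python list would not be practical; the point is only the shape of the data, one row per sample and one column per SNP.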
Do they store that table as:
- a VCF file
- a relational database
- some sort of NoSQL database
- PLINK files
- some dedicated SNP array storage format