Hi
How can I find out how much data is on the GDC portal across all cancer types and samples?
Or are there any statistics on the volume of NGS data generated in cancer so far?
Thanks S
Writing a report or something?
The two main repositories of cancer data that come to mind are the GDC (Genomic Data Commons), which is mostly TCGA data, and the ICGC (International Cancer Genome Consortium) Data Portal, which is an amalgamation of multiple global repositories that also includes the GDC.
Even if you just go to these websites, you can see an estimate of the size of the controlled and open access data that each contains:
Genomic Data Commons (GDC): https://portal.gdc.cancer.gov/repository
International Cancer Genome Consortium (ICGC) Data Portal: https://dcc.icgc.org/repositories
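For the GDC specifically, you could also get a number programmatically rather than reading it off the portal page. Below is a rough sketch, not an official recipe: it pages through the public GDC API `files` endpoint and adds up the `file_size` field. The endpoint URL, the `fields` / `size` / `from` / `filters` parameters and the `data.hits` / `data.pagination.total` response layout are my understanding of the GDC API, so check them against the current documentation before relying on the result. The function names (`fetch_page`, `total_file_size`, `to_petabytes`) are just made up for this example.

```python
import json
import urllib.parse
import urllib.request

# Assumed public GDC API endpoint for file metadata (verify against the GDC docs).
GDC_FILES_ENDPOINT = "https://api.gdc.cancer.gov/files"


def fetch_page(offset, page_size, filters=None):
    """Fetch one page of file records, requesting only the file_size field."""
    params = {
        "fields": "file_size",
        "size": page_size,
        "from": offset,
        "format": "JSON",
    }
    if filters is not None:
        # GDC filters are passed as a JSON-encoded string.
        params["filters"] = json.dumps(filters)
    url = GDC_FILES_ENDPOINT + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url, timeout=60) as response:
        return json.load(response)["data"]


def total_file_size(fetch=fetch_page, page_size=1000, filters=None):
    """Sum file_size over all pages; returns (number_of_files, total_bytes)."""
    offset = 0
    n_files = 0
    total_bytes = 0
    while True:
        data = fetch(offset, page_size, filters)
        hits = data["hits"]
        if not hits:
            break
        n_files += len(hits)
        # Records without a file_size are counted as 0 bytes.
        total_bytes += sum(hit.get("file_size", 0) for hit in hits)
        offset += len(hits)
        if offset >= data["pagination"]["total"]:
            break
    return n_files, total_bytes


def to_petabytes(n_bytes):
    """Convert bytes to decimal petabytes (1 PB = 10**15 bytes)."""
    return n_bytes / 1e15


if __name__ == "__main__":
    # Restrict to a single project to keep the run short; the field name
    # cases.project.project_id is an assumption based on the GDC data model.
    brca_only = {
        "op": "in",
        "content": {"field": "cases.project.project_id", "value": ["TCGA-BRCA"]},
    }
    n_files, n_bytes = total_file_size(filters=brca_only)
    print(n_files, "files,", round(to_petabytes(n_bytes), 4), "PB")
```

Dropping the filter would walk through every file record in the GDC, which means a lot of requests, so it is probably better to go project by project. Also keep in mind this only counts what the API lists, which is not necessarily everything that has ever been generated.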
So, even in those two, you're looking at upward of 1.8 petabytes. I cannot confirm that this actually represents all files that have ever been produced; most likely it does not. Also, this does not reflect the cancer data that is stored in all of the other online repositories, such as:
Kevin