I am using digital pathology slides from CPTAC, downloaded from the Cancer Imaging Archive (https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=70227748). However, I am having difficulty mapping the slide filenames to metadata, because the download from the Cancer Imaging Archive did not include a metadata file.
How can I map the slide filenames to sample IDs or other metadata? These filenames do not follow the same schema as TCGA barcodes. Some case IDs have multiple slides, and I would like to know how they differ.
This is an example of a CPTAC BRCA whole slide image file:
01BR031-223a7222-d281-4bc3-865b-f774e2.svs
This filename is the case ID ("01BR031") followed by a UUID. If I search for the case on the GDC portal, I am brought to https://portal.gdc.cancer.gov/cases/717b4704-523e-4b75-8bc9-24dde2812c8e but this has no mention of the digital pathology slides.
It seems you have already found the case metadata in the GDC, as you wrote: "If I search for the case on the GDC portal, I am brought to https://portal.gdc.cancer.gov/cases/717b4704-523e-4b75-8bc9-24dde2812c8e but this has no mention of the digital pathology slides."
Because the slide images are hosted by the Cancer Imaging Archive and not by the GDC, the GDC is of course not aware of them, which is why the case page "has no mention of the digital pathology slides".
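Since the filename is the case ID followed by a UUID, one way to get from slides to GDC case pages is to split each filename and group the slides by case ID. The following is only a minimal sketch: the function names are hypothetical, and it assumes that every filename follows the `<case ID>-<UUID>.svs` pattern of the example and that the case ID itself never contains a hyphen.

```python
from collections import defaultdict
from pathlib import Path


def split_slide_filename(filename):
    """Split '<case_id>-<uuid>.svs' into (case_id, slide_uuid).

    Assumption: the case ID contains no hyphen, so the first hyphen
    separates it from the UUID part.
    """
    stem = Path(filename).stem  # drop the '.svs' extension
    case_id, _, slide_uuid = stem.partition("-")
    return case_id, slide_uuid


def group_slides_by_case(filenames):
    """Map each case ID to the list of slide UUIDs found for it."""
    groups = defaultdict(list)
    for name in filenames:
        case_id, slide_uuid = split_slide_filename(name)
        groups[case_id].append(slide_uuid)
    return dict(groups)


# The example filename from the question.
example = "01BR031-223a7222-d281-4bc3-865b-f774e2.svs"
case_id, slide_uuid = split_slide_filename(example)
print(case_id)
print(slide_uuid)
```

The resulting case ID is what you can search for on the GDC portal. Cases with more than one entry in the grouped result are the ones with multiple slides, and what distinguishes those slides still has to come from slide-level metadata on the Cancer Imaging Archive side.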