Question

How to map CPTAC digital pathology slides to sample ID and metadata

18 months ago

Jakub • 0

I am using digital pathology slides from CPTAC, downloaded from the Cancer Imaging Archive (https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=70227748). However, I am having difficulty mapping the slide filenames to metadata. The download from the Cancer Imaging Archive did not include a metadata file.

How can I map the slide filenames to sample IDs or other metadata? These filenames do not follow the same schema as TCGA barcodes. Some case IDs have multiple slides, and I would like to know how they differ.

This is an example of a CPTAC BRCA whole slide image file:

01BR031-223a7222-d281-4bc3-865b-f774e2.svs

This filename is the case ID ("01BR031") followed by a UUID. If I search for the case on the GDC portal, I am brought to https://portal.gdc.cancer.gov/cases/717b4704-523e-4b75-8bc9-24dde2812c8e but this has no mention of the digital pathology slides.
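Going only by the naming pattern described above, a minimal Python sketch can at least split each filename into its case ID and the trailing slide identifier, and group slides by case to see which cases have more than one. The function names are hypothetical, and the sketch assumes the case ID itself contains no hyphen. It does not recover any sample-level metadata.

```python
from collections import defaultdict
from pathlib import Path


def split_slide_filename(filename):
    """Split '<case_id>-<slide_id>.svs' into (case_id, slide_id).

    Assumption: the case ID contains no hyphen, so everything after
    the first hyphen is treated as the slide identifier.
    """
    stem = Path(filename).stem  # drop the '.svs' extension
    case_id, _, slide_id = stem.partition("-")
    return case_id, slide_id


def group_slides_by_case(filenames):
    """Map each case ID to the list of slide identifiers seen for it."""
    slides = defaultdict(list)
    for name in filenames:
        case_id, slide_id = split_slide_filename(name)
        slides[case_id].append(slide_id)
    return dict(slides)


if __name__ == "__main__":
    example = "01BR031-223a7222-d281-4bc3-865b-f774e2.svs"
    print(split_slide_filename(example))
```

Cases that map to more than one identifier in the result of `group_slides_by_case` are the ones with multiple slides.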

cptac digital-pathology • 672 views

updated 18 months ago by Zhenyu Zhang ★ 1.3k • written 18 months ago by Jakub • 0

It seems you have already found the case metadata in the GDC: as you wrote, searching for the case on the GDC portal brings you to https://portal.gdc.cancer.gov/cases/717b4704-523e-4b75-8bc9-24dde2812c8e.

Because the slide images are not hosted in the GDC, the GDC is of course not aware of them, which is why the case page "has no mention of the digital pathology slides".

18 months ago by Zhenyu Zhang ★ 1.3k