Hey all,
I am currently trying to run an analysis in R using the skin dataset available in recount2: https://jhubiostatistics.shinyapps.io/recount/
While analysing the sample metadata, I found that both the patient's tumor stage (stages 0-4) and the type of tissue sample (e.g. primary tumor, metastasis, healthy tissue) are available.
However, when I cross-tabulate stage against sample type, I run into the following problem: samples from patients in stages 0-2 are tagged as metastatic tissue, but as far as I know metastasis only occurs from stage 3 onwards.
Here is my code in R:
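library(SummarizedExperiment)  # provides colData() for the RangedSummarizedExperiment
# load() restores the saved object (rse_gene) into the workspace; it returns only the object's name
load(url("http://duffel.rail.bio/recount/v2/TCGA/rse_gene_skin.Rdata"))
table(colData(rse_gene)[, "gdc_cases.samples.sample_type"], rse_gene$gdc_cases.diagnoses.tumor_stage)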
                        stage 0 stage i stage ia stage ib stage ii stage iia stage iib stage iic
  Additional Metastatic       0       0        0        1        0         0         0         0
  Metastatic                  7      29       18       28       26        14        19        15
  Primary Tumor               0       1        0        1        4         4         9        49
                        stage iii stage iiia stage iiib stage iiic stage iv
  Additional Metastatic         0          0          0          0        0
  Metastatic                   39         15         35         55       21
  Primary Tumor                 2          1         12         12        3
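To make the pattern easier to see, here is a minimal sketch that collapses the substages (ia, ib, ...) into their main stages. It re-enters the counts from the table above by hand instead of loading the data, and the names counts, main_stage and collapsed are only illustrative, not part of recount2.

```r
# Counts copied by hand from the table above
# (rows: sample type, columns: tumor stage).
stages <- c("stage 0", "stage i", "stage ia", "stage ib", "stage ii",
            "stage iia", "stage iib", "stage iic", "stage iii",
            "stage iiia", "stage iiib", "stage iiic", "stage iv")
counts <- rbind(
  "Additional Metastatic" = c(0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0),
  "Metastatic"            = c(7, 29, 18, 28, 26, 14, 19, 15, 39, 15, 35, 55, 21),
  "Primary Tumor"         = c(0, 1, 0, 1, 4, 4, 9, 49, 2, 1, 12, 12, 3)
)
colnames(counts) <- stages

# Drop a trailing substage letter (a, b or c) to get the main stage,
# e.g. "stage iib" becomes "stage ii"; "stage iv" is left unchanged.
main_stage <- sub("[abc]$", "", stages)

# rowsum() adds up rows by group, so transpose, sum per main stage,
# and transpose back to keep sample types as rows.
collapsed <- t(rowsum(t(counts), group = main_stage))
print(collapsed)
```

Even after collapsing, the Metastatic row has 7, 75 and 74 samples in stages 0, i and ii, so the pattern is not limited to a single substage.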
Can anyone explain what is happening here? I would appreciate any input!