13 months ago
Ozymandeus
I am doing differential expression analysis of isoform quantification data in RStudio. I am looking at the floor of mouth, tongue, and tonsil subsites of HNSCC, using data from TCGA. The problem is that there is not enough subsite-specific control data. Are there other sources of control data that I could substitute in, or is there more subsite-specific data that I am missing?
I think very few people would know that HNSCC stands for head and neck squamous cell carcinoma, which is needed to understand the context and therefore what would constitute a control. Is this the data from Sayans et al. 2019 J Clin Med. 8(11): 1896? The problem with swapping in other data sets is that the batch effect will be confounded with your variable of interest.
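To make the confounding point concrete, here is a minimal Python sketch (not part of any real pipeline). The helper `is_confounded` and the sample labels are hypothetical, made up for illustration. It only checks whether every batch contains a single condition, in which case a batch difference and a tumor/normal difference cannot be told apart.

```python
def is_confounded(condition, batch):
    """Return True if every batch holds only one condition.

    In that case the condition effect and the batch effect are
    indistinguishable: any difference could come from either one.
    """
    conditions_per_batch = {}
    for cond, b in zip(condition, batch):
        conditions_per_batch.setdefault(b, set()).add(cond)
    return all(len(conds) == 1 for conds in conditions_per_batch.values())


# Hypothetical labels: four tumors and four controls.
condition = ["tumor"] * 4 + ["normal"] * 4

# Controls swapped in from another study: batch lines up exactly with condition.
swapped_in = ["TCGA"] * 4 + ["external"] * 4

# All samples from one study: the single batch contains both conditions.
within_study = ["TCGA"] * 8

print(is_confounded(condition, swapped_in))    # True
print(is_confounded(condition, within_study))  # False
```

With swapped-in controls, no statistical model can separate "tumor vs normal" from "TCGA vs external", which is why the comparison has to stay within one study.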
The data is taken entirely from The Cancer Genome Atlas (https://portal.gdc.cancer.gov/projects/TCGA-HNSC). If swapping in data isn't a viable option, what do you think I should do in this situation?
Compare within the dataset: between subtypes, or between samples that express certain markers at high versus low levels. You cannot compare between studies quantitatively.
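As a rough sketch of the marker-based grouping, assuming a median cutoff (one common choice, not the only one): the function `split_by_marker`, the sample IDs, and the expression values below are all hypothetical. The resulting high and low groups would then be the two conditions in the differential expression test.

```python
from statistics import median


def split_by_marker(expression):
    """Split samples into high/low groups around the median marker level.

    `expression` maps sample ID -> expression of one marker.
    Samples exactly at the median go to the low group.
    """
    cutoff = median(expression.values())
    high = sorted(s for s, v in expression.items() if v > cutoff)
    low = sorted(s for s, v in expression.items() if v <= cutoff)
    return high, low


# Hypothetical marker expression for four samples from the same dataset.
marker = {"S1": 1.0, "S2": 5.0, "S3": 2.0, "S4": 8.0}

high, low = split_by_marker(marker)
print(high)  # ['S2', 'S4']
print(low)   # ['S1', 'S3']
```

Because both groups come from the same study, the comparison avoids the between-study batch problem.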