Hi,
I downloaded TCGA RNA-seq data using the TCGAbiolinks package. In the data I see that a few samples share the same sample ID, i.e. the first part of the barcode. Which sample should I prefer?
For example:
TCGA-A6-2684-01A-01R-1410-07
TCGA-A6-2684-01A-01R-A278-07
From the TCGA barcodes I see that the plates are different (1410 vs. A278), while the center (07) and the sample ID (TCGA-A6-2684-01) are the same. Which one should I prefer for the analysis? Do I need to keep both samples? If I use only the sample ID, they become duplicate samples.
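To make the comparison concrete, here is a minimal sketch that splits a full barcode into its fields and lists the barcodes that collapse to the same sample ID. It assumes the standard seven-part aliquot barcode layout (project, tissue source site, participant, sample type + vial, portion + analyte, plate, center). The function names `parse_barcode` and `find_duplicates` are just illustrative, not part of TCGAbiolinks.

```python
from collections import defaultdict


def parse_barcode(barcode):
    """Split a full TCGA aliquot barcode into named fields."""
    project, tss, participant, sample_vial, portion_analyte, plate, center = (
        barcode.split("-")
    )
    return {
        # Sample ID as used in the question: everything up to the sample type.
        "sample_id": "-".join([project, tss, participant, sample_vial[:2]]),
        "sample_type": sample_vial[:2],   # e.g. "01" = primary solid tumor
        "vial": sample_vial[2:],          # e.g. "A"
        "portion": portion_analyte[:2],   # e.g. "01"
        "analyte": portion_analyte[2:],   # e.g. "R" = RNA
        "plate": plate,
        "center": center,
    }


def find_duplicates(barcodes):
    """Return {sample_id: [barcodes]} for sample IDs that occur more than once."""
    groups = defaultdict(list)
    for barcode in barcodes:
        groups[parse_barcode(barcode)["sample_id"]].append(barcode)
    return {sid: bcs for sid, bcs in groups.items() if len(bcs) > 1}


barcodes = [
    "TCGA-A6-2684-01A-01R-1410-07",
    "TCGA-A6-2684-01A-01R-A278-07",
]
duplicates = find_duplicates(barcodes)
for sample_id, members in duplicates.items():
    for barcode in members:
        fields = parse_barcode(barcode)
        print(sample_id, "plate:", fields["plate"], "center:", fields["center"])
```

Running the same grouping over all column names of the count matrix gives the full list of sample IDs that need a decision, rather than just this one pair.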
Thanks for the reply. I'm thinking of checking the number of genes with zero read counts. For each duplicated sample ID, I will keep the sample with fewer zero-count genes for further analysis. Do you think this is a good idea?
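A rough sketch of that selection rule, assuming the raw counts are available as one list of per-gene counts per barcode. The count values below are made up purely for illustration, and `pick_fewest_zeros` is a hypothetical helper name. On a tie, the barcode seen first is kept.

```python
def count_zero_genes(counts):
    """Number of genes with a read count of exactly zero."""
    return sum(1 for c in counts if c == 0)


def pick_fewest_zeros(counts_by_barcode, id_length=15):
    """For each sample ID, keep the barcode with the fewest zero-count genes.

    The sample ID is taken as the first `id_length` characters of the
    barcode (15 characters gives e.g. "TCGA-A6-2684-01").
    """
    best = {}  # sample_id -> (barcode, number of zero-count genes)
    for barcode, counts in counts_by_barcode.items():
        sample_id = barcode[:id_length]
        zeros = count_zero_genes(counts)
        # Strict "<" means the first barcode wins when counts are tied.
        if sample_id not in best or zeros < best[sample_id][1]:
            best[sample_id] = (barcode, zeros)
    return {sid: barcode for sid, (barcode, _) in best.items()}


# Toy data: five genes per aliquot, values invented for this example.
counts_by_barcode = {
    "TCGA-A6-2684-01A-01R-1410-07": [0, 12, 0, 7, 3],
    "TCGA-A6-2684-01A-01R-A278-07": [5, 9, 0, 4, 1],
}
selected = pick_fewest_zeros(counts_by_barcode)
print(selected)
```

With real data the same idea applies column-wise to the count matrix. It may also be worth looking at the total library size of each aliquot, since a shallower run will tend to have more zero-count genes.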
Yes, that is also a good idea.
I would also just check some of the other IDs that are like this, just to be sure.
Yes, as a first step I'm checking the sample ID in cBioPortal, and in the next step I will check the gene counts.