Hi everyone, I am new to TCGA data and I think I might have missed something here. I used the Data Matrix option to look at RNASeqV2 samples for LUAD and selected only "Tumor-matched normal (TN)" and "Normal - matched tumor (NT)". I was expecting to see the same number of TN and NT samples, since they are supposed to match. However, I am seeing a lot more TN than NT.
I checked the TCGA barcodes and did a quick match between TN and NT, and found that a lot of the TNs don't have matching patient IDs among the NTs. Am I missing anything here? Or can anyone point me to information that I should be looking at?
Many thanks in advance!
You are right. When I was searching for matched samples, I observed the same thing. Unfortunately, the matched results do not contain equal numbers of tumor and normal samples. What you can do is download everything that comes up in the matched samples' results and then compare the barcodes to get the "true" matched samples.
Tip: If you use the TCGA-Assembler R package, it is easier to download everything and then filter for the matched samples.
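In case it helps, here is a rough Python sketch of the barcode comparison. It assumes the standard TCGA barcode layout, where the first three dash-separated fields identify the patient (e.g. TCGA-AA-0001) and the first two digits of the fourth field are the sample type code (01-09 for tumor, 10-19 for normal). The function names and the barcodes in the example are made up for illustration; replace the list with the barcodes from your own download.

```python
from collections import defaultdict


def patient_id(barcode):
    # Patient ID = first three dash-separated fields, e.g. "TCGA-AA-0001".
    return "-".join(barcode.split("-")[:3])


def sample_type_code(barcode):
    # Fourth field is the two-digit sample type plus a vial letter, e.g. "01A".
    return int(barcode.split("-")[3][:2])


def matched_pairs(barcodes):
    """Return {patient_id: (tumor_barcodes, normal_barcodes)} for patients
    that have at least one tumor and at least one normal sample."""
    tumors = defaultdict(list)
    normals = defaultdict(list)
    for bc in barcodes:
        code = sample_type_code(bc)
        if 1 <= code <= 9:        # tumor sample types
            tumors[patient_id(bc)].append(bc)
        elif 10 <= code <= 19:    # normal sample types
            normals[patient_id(bc)].append(bc)
    shared = sorted(set(tumors) & set(normals))
    return {pid: (sorted(tumors[pid]), sorted(normals[pid])) for pid in shared}


# Made-up barcodes for illustration only.
example = [
    "TCGA-AA-0001-01A-11R-0000-07",  # tumor, patient 0001
    "TCGA-AA-0001-11A-11R-0000-07",  # normal, patient 0001
    "TCGA-AA-0002-01A-11R-0000-07",  # tumor only, patient 0002
    "TCGA-AA-0003-01A-11R-0000-07",  # tumor, patient 0003
    "TCGA-AA-0003-11A-11R-0000-07",  # normal, patient 0003
]

pairs = matched_pairs(example)
for pid, (tumor, normal) in pairs.items():
    print(pid, tumor, normal)
```

Patients with only a tumor sample (0002 in this example) are dropped, which is what leaves you with the "true" matched pairs. Some patients may have more than one tumor or normal aliquot, so the sketch keeps lists rather than assuming exactly one of each.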
Thank you! Yes, I am just setting up TCGA-Assembler now.
Another tool that you can use is the package TCGAbiolinks.