Question

Why do TCGA "Tumor - matched normal" and "Normal - matched tumor" not have the same number of samples?

2

10.4 years ago

cafelumiere12 ▴ 80

Hi everyone, I am new to TCGA data and I think I might have missed something here. I used the Data Matrix option to look at RNASeqV2 samples for LUAD and selected only "Tumor - matched normal (TN)" and "Normal - matched tumor (NT)". I expected to see the same number of TN and NT samples, since they are supposed to match. However, I am seeing a lot more TN than NT.

I checked the TCGA barcodes and did a quick match between TN and NT, and found that a lot of TNs don't have matching patient IDs among the NTs. Am I missing anything here? Can anyone point me to information that I should be looking at?

Many thanks in advance!

RNA-Seq TCGA • 6.8k views

Updated 3.1 years ago by Ram 45k • written 10.4 years ago by cafelumiere12 ▴ 80

3

You are right. I observed the same thing when searching for matched samples: unfortunately, the numbers of TN and NT samples are not equal. What you can do is download everything that comes up in the matched samples' results and then compare the barcodes to get the "true" matched samples.

Tip: If you use the TCGA-Assembler R package, it is easier to download everything and then filter for the matched samples.

Reply written 10.4 years ago by komal.rathi ★ 4.1k
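The barcode comparison suggested in the reply above could look roughly like the sketch below. It relies on the standard TCGA barcode layout, where the first 12 characters identify the patient and characters 14-15 give the sample type ("01" for primary tumor, "11" for solid tissue normal). The barcodes listed here are made-up placeholders for illustration; in practice they would come from the downloaded sample list or the column names of the expression matrix.

```r
# Placeholder barcodes for illustration only (an assumption, not real query results).
barcodes <- c(
  "TCGA-05-4244-01A-01R-1107-07",
  "TCGA-44-2655-01A-01R-0946-07",
  "TCGA-44-2655-11A-01R-1758-07",
  "TCGA-50-5931-01A-11R-1755-07",
  "TCGA-50-5931-11A-01R-1858-07"
)

# Characters 1-12 are the patient ID; characters 14-15 are the sample type code.
patient_id  <- substr(barcodes, 1, 12)
sample_type <- substr(barcodes, 14, 15)

# Patients with a primary tumor sample ("01") and with a solid tissue normal ("11").
tumor_patients  <- unique(patient_id[sample_type == "01"])
normal_patients <- unique(patient_id[sample_type == "11"])

# Only patients present in both sets have a true tumor/normal pair.
paired_patients <- intersect(tumor_patients, normal_patients)

# Keep the tumor and normal barcodes that belong to paired patients.
paired_barcodes <- barcodes[patient_id %in% paired_patients &
                            sample_type %in% c("01", "11")]

print(paired_patients)
print(paired_barcodes)
```

Tumors without an entry in `normal_patients` are the unpaired samples that make the two counts differ.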

0

Thank you! Yes, I am also just setting up TCGA-Assembler now.

Reply written 10.4 years ago by cafelumiere12 ▴ 80

1

Another tool that you can use is the package TCGAbiolinks.

Reply updated 5.3 years ago by Ram 45k • written 9.4 years ago by tiagochst ▴ 70

Ram · Answer 1 · 2015-10-28

The matched normal for most TCGA tumors was a sample of peripheral blood, which is preferred for somatic variant calling from DNA-seq. Only for ~600 TCGA tumors did the surgeons take a sample of tumor-adjacent tissue, later classified by pathologists as normal tissue, on which RNA-seq is appropriate. The expression profile of a matched blood normal would be very different from that of the tumor, so RNA-seq is not run on those. Also see this related question: Tcga: "Tumor, Matched Normal" Vs. "Normal, Matched Tumor"

Ram · Answer 2 · 2015-02-10

I followed the instructions, which were to (1) unpack the files and (2) run the source command as mentioned in the quick start guide. Not exactly rocket science, you would think; however, the code below does not work for me.

setwd("C:/TCGA-Assembler")
source("/Module_A.r");
source("/Module_B.r");

gives:

> source("/Module_A.r");
Error in file(filename, "r", encoding = encoding) : 
  cannot open the connection
In addition: Warning message:
In file(filename, "r", encoding = encoding) :
  cannot open file '/Module_A.r': No such file or directory
> source("/Module_B.r");
Error in file(filename, "r", encoding = encoding) : 
  cannot open the connection
In addition: Warning message:
In file(filename, "r", encoding = encoding) :
  cannot open file '/Module_B.r': No such file or directory

...so that is not very useful...

Does this assembler actually work? Any suggestions on how to get the modules to load? Maybe it's not made for Windows?
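One likely cause of the error above is the leading slash in `source("/Module_A.r")`: it makes the path absolute, so R looks for the file at the root of the drive and ignores the directory set with `setwd()`. A minimal sketch of a workaround, assuming the modules really were unpacked into `C:/TCGA-Assembler` (the location is taken from the `setwd()` call in the post and is not verified), builds the full paths explicitly and checks that each file exists before sourcing it:

```r
# Assumed install location, taken from the setwd() call in the post.
assembler_dir <- "C:/TCGA-Assembler"

# Build full paths instead of relying on a leading "/" (which points at the drive root).
module_a <- file.path(assembler_dir, "Module_A.r")
module_b <- file.path(assembler_dir, "Module_B.r")

# Source each module only if it is actually there, so a wrong folder is reported clearly.
for (f in c(module_a, module_b)) {
  if (file.exists(f)) {
    source(f)
  } else {
    message("Not found: ", f)
  }
}
```

Equivalently, after `setwd("C:/TCGA-Assembler")`, calling `source("Module_A.r")` without the leading slash resolves the file relative to the working directory.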