TCGA RNA-Seq, Expression levels in comparison with normal and connection with CNVs
9.6 years ago
thmourikis ▴ 10

Hello,

I am new to RNA-Seq pipelines/analysis and TCGA data. I am trying to identify genes with differential expression between tumor and normal. The ultimate goal is to connect differential expression with copy number alterations in a patient-specific manner. I cannot find RNA-Seq data for many normal samples (matched with their corresponding tumor sample) per cancer type. Is it true that there isn't much normal RNA-Seq expression data? If yes, is there any alternative approach to reach the above-described goal?

Thanks a lot in advance.

RNA-Seq • 5.6k views

About 10% of the RNA-Seq samples are from normal tissue. To perform differential expression analysis you can still use all the samples, but you have to include a patient-matching indicator and a tumour/normal indicator in your design matrix.
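As a rough sketch of what such a design matrix could look like, here it is in plain Python rather than in any particular differential-expression package. The patient IDs (P1, P2, P3) and the function name build_design are made up for illustration, and a real analysis would pass an equivalent matrix to a dedicated tool.

```python
def build_design(samples):
    """Build a paired design: intercept + patient indicators + tumor indicator.

    samples: list of (patient_id, is_tumor) tuples (hypothetical input format).
    The first patient (alphabetically) is the reference level, so it gets
    no indicator column of its own.
    """
    patients = sorted({patient for patient, _ in samples})
    columns = ["intercept"] + [f"patient_{p}" for p in patients[1:]] + ["tumor"]
    rows = []
    for patient, is_tumor in samples:
        row = [1]                                               # intercept
        row += [1 if patient == p else 0 for p in patients[1:]]  # patient block
        row.append(1 if is_tumor else 0)                        # tumor vs normal
        rows.append(row)
    return columns, rows


# Made-up example: three patients, each with one tumor and one normal sample.
samples = [
    ("P1", True), ("P1", False),
    ("P2", True), ("P2", False),
    ("P3", True), ("P3", False),
]
columns, design = build_design(samples)
print(columns)
for row in design:
    print(row)
```

In a model fitted with this matrix, the coefficient on the tumor column is the tumour-versus-normal effect after accounting for each patient's own baseline.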


Thanks for your reply, russ_hyde. Could you please elaborate a little more? Could you give an example? Is that like using a pooled normal?

9.6 years ago
TriS ★ 4.7k

You can find the normal samples by checking the sample's barcode. Basically, if the 14th character is a "0" it's a tumor; if it's a "1" it's a normal.

For example, a matched normal-tumor sample pair would be:

TCGA-12-4567-01-blah-blah --> this is tumor
TCGA-12-4567-11-blah-blah --> this is normal
             ^

However, the number of "normal" samples depends on the tumor type; some have more, some have fewer.
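To check how many matched pairs a given cohort actually has, something like the following could be used. It is only a sketch: it assumes the sample type is the first two digits of the fourth hyphen-separated barcode field, with 01-09 meaning tumor and 10-19 meaning normal. The barcodes in the example and the function name matched_pairs are made up.

```python
from collections import defaultdict


def matched_pairs(barcodes):
    """Return patients that have at least one tumor and one normal sample.

    Assumes barcodes of the form TCGA-XX-XXXX-NN..., where NN is the
    sample type code (01-09 tumor, 10-19 normal).
    """
    by_patient = defaultdict(lambda: {"tumor": [], "normal": []})
    for barcode in barcodes:
        fields = barcode.split("-")
        patient = "-".join(fields[:3])   # e.g. TCGA-12-4567
        code = int(fields[3][:2])        # e.g. "01A" -> 1
        if 1 <= code <= 9:
            by_patient[patient]["tumor"].append(barcode)
        elif 10 <= code <= 19:
            by_patient[patient]["normal"].append(barcode)
    # Keep only patients with both sample kinds.
    return {p: s for p, s in by_patient.items() if s["tumor"] and s["normal"]}


# Made-up barcodes: one matched patient, one tumor-only patient.
barcodes = [
    "TCGA-12-4567-01A",
    "TCGA-12-4567-11A",
    "TCGA-34-8901-01A",
]
pairs = matched_pairs(barcodes)
print(len(pairs), "patient(s) with a matched tumor/normal pair")
```

Running this per cancer type on the full barcode list gives a quick count of how many patients can go into a paired comparison.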

Another way to download tumor and normal samples separately is to use TCGA Assembler, which you can download from here. It's an R package that allows you to download a variety of TCGA datasets, including clinical data and CNVs.

In the clinical data sheet you will be able to see which patients are which and which ones have clinical data available. You can download clinical data from the Data Matrix following the tutorial here.


Thanks for your answer and the useful links. I think blah-01-blah is the tumor and everything >09 is the normal. Anyway, I guess the answer to my question is that indeed there aren't many "normal" RNA-Seq samples.


So, 01-09 is tumor, 10-19 is normal, 20-29 is controls, 5x is cell lines, 60-61 is xenografts.
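Those ranges can be turned into a small lookup, roughly like this. It is a sketch that uses only the ranges listed in this comment, reading "5x" as 50-59. The function name classify_sample_type is made up, and any code outside these ranges is simply reported as "unknown".

```python
def classify_sample_type(code):
    """Map a two-digit sample type code to a broad category.

    Ranges follow the comment above; anything else is "unknown".
    """
    code = int(code)
    if 1 <= code <= 9:
        return "tumor"
    if 10 <= code <= 19:
        return "normal"
    if 20 <= code <= 29:
        return "control"
    if 50 <= code <= 59:
        return "cell line"
    if 60 <= code <= 61:
        return "xenograft"
    return "unknown"


for code in ["01", "11", "20", "50", "61", "99"]:
    print(code, classify_sample_type(code))
```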

It depends on what you mean by "many", but yes, there are not as many as tumors. If you need a wider cohort you might want to tap into the GTEx project.


Thanks once again for your answer, TriS!
