Analysis of gene expression data

Hi

I have a list of ~70,000 transcripts with their expression values in cancer and adjacent normal samples. I want to calculate a p-value for each transcript's cancer/normal pair. How can I do that? Thanks in advance.

Regards


You should provide further information, such as:

  • the source of your data (e.g. web resource, microarray, RNA-seq, NanoString, or something else)
  • how many samples you have
  • any pre-processing that has been performed

Nobody can help you sufficiently with the information that you have currently provided.


OK, thanks for the information. The data are raw TCGA expression data and contain the transcript ID and the expression value (TPM) in cancer and adjacent normal samples:

Transcript_id   Expression value (TPM) in cancer   Expression value (TPM) in adjacent normal
uc001aaa.3      0.519993743                        2.07946602736613E-95
uc001aab.3      1.195267176                        0.4213373079
uc001aac.3      0.2816408622                       0
uc001aae.3      4.16436457205392E-67               6.37487575711405E-207
uc001aah.3      1.24270443179236E-241              0.1938754949
uc001aai.1      16.1663933608                      5.3783318364
uc001aak.2      0                                  0.0286585119

Hello again. With this data, you cannot produce a p-value per gene: you have just a single value for each gene in the tumour and normal samples. Can you elaborate on the source of the data? Many third-party websites (i.e. outside of the National Cancer Institute of the USA) host TCGA data at various stages of processing.


Hi. The data were derived from TCGA by Cancer RNA-Seq Nexus, and we obtained them from there. Can I apply Student's t-test to these data?


No, with the data that you have, you cannot use Student's t-test. For example, if you wanted to derive a p-value for uc001aaa.3, your comparison would just be 0.519993743 vs. 2.07946602736613E-95. A p-value cannot be derived from just 2 values, because a single value per group gives no estimate of the variability within each group.

From Cancer RNA-seq Nexus, you should try to obtain the expression values for the genes across all tumours and all normal samples. Then, you could begin to think about conducting differential expression analysis.
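
To illustrate why one value per group is not enough, here is a minimal sketch using only the Python standard library. It uses an exact permutation test on the difference in group means, which is a stand-in chosen for simplicity and is neither the t-test discussed above nor a recommended differential expression method. The function name `permutation_p_value` and the replicated per-sample values are hypothetical, invented purely for illustration; only the uc001aab.3 pair comes from the table in this thread.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(tumour, normal):
    """Exact two-sided permutation p-value for a difference in group means.

    Hypothetical helper for illustration only; not a differential
    expression method.
    """
    pooled = list(tumour) + list(normal)
    observed = abs(mean(tumour) - mean(normal))
    as_extreme = 0
    total = 0
    # Enumerate every way of relabelling the pooled values as "tumour".
    for idx in combinations(range(len(pooled)), len(tumour)):
        group_a = [pooled[i] for i in idx]
        group_b = [pooled[i] for i in range(len(pooled)) if i not in idx]
        # Small tolerance guards against floating-point rounding.
        if abs(mean(group_a) - mean(group_b)) >= observed - 1e-12:
            as_extreme += 1
        total += 1
    return as_extreme / total


# One value per group, as in the table above (uc001aab.3).
single = permutation_p_value([1.195267176], [0.4213373079])

# Hypothetical per-sample values (assumed, not from TCGA):
# four tumour samples and four normal samples for one transcript.
replicated = permutation_p_value([4.1, 3.8, 4.4, 4.0], [2.1, 2.5, 1.9, 2.3])

print(single, replicated)
```

With one value per group there are only two possible relabellings, and both give the same absolute difference, so the p-value is 1 regardless of how far apart the two numbers are. Only with several samples per group can the test distinguish a real difference from chance.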


If you do not have much experience with bioinformatics, may I suggest that you contact a local collaborator (at your university or college, or elsewhere) and ask them for assistance?

Also, there are web-based GUIs that allow you to analyse TCGA data, such as cBioPortal.


If you have count data, try following one of the RNA-seq expression tutorials online. Here's a good one to start with: https://f1000research.com/articles/4-1070/v1


I want to do it using a t-test or another statistical test in SPSS.


Those tests are not appropriate for expression data.
