How do I visualize RNA-seq data?
9.8 years ago

I have downloaded various sample files from the TCGA database, including gene, splice junction (spljxn), and exon quantification files, but I am unable to visualize them, specifically the gene quantification file. How can I see the data it contains?

RNA-Seq • 5.7k views

What do you mean by visualise? Are you unable to see the quantified values, or do you want to make heat maps, etc.?

Actually, some files open and some do not. I have to find the Pearson correlation coefficient between the genes in that data.

How many files do you have? Which files open, and which do not?

I have 100 sample files. How do I find the Pearson correlation coefficient of genes?

As I understand it, I first have to locate the specific genes I am interested in and then calculate the Pearson correlation coefficient for them.

Please suggest a better way to do it.

That's a simple enough way to proceed. It's probably easiest to just make a matrix of expression metrics and run the correlation on that. You can then just use cor().
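For readers who would rather script the matrix-building step, here is a minimal Python sketch using only the standard library. It assumes each sample is a tab-separated table with a header row; the column names `gene_id` and `normalized_count` and the helper names `read_quantification` and `build_matrix` are illustrative assumptions, so check them against your own files.

```python
import csv
import io


def read_quantification(handle, gene_col="gene_id", value_col="normalized_count"):
    """Read one sample's tab-separated table into {gene: value}.

    The column names are assumptions; adjust them to match your files.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return {row[gene_col]: float(row[value_col]) for row in reader}


def build_matrix(samples):
    """Combine {sample_name: {gene: value}} into {gene: [value per sample]}.

    Only genes present in every sample are kept, so all rows have equal length.
    """
    names = sorted(samples)
    shared = set.intersection(*(set(samples[name]) for name in names))
    matrix = {gene: [samples[name][gene] for name in names] for gene in sorted(shared)}
    return names, matrix


# Two tiny made-up samples standing in for real files opened with open(path).
sample_a = io.StringIO("gene_id\tnormalized_count\nTP53\t10.0\nBRCA1\t4.5\n")
sample_b = io.StringIO("gene_id\tnormalized_count\nTP53\t12.0\nBRCA1\t3.0\n")
samples = {
    "sample_a": read_quantification(sample_a),
    "sample_b": read_quantification(sample_b),
}
names, matrix = build_matrix(samples)
print(names)
print(matrix)
```

With real data you would pass an open file handle for each of the sample files instead of the in-memory strings used here.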

Thank you, Devon Ryan. Is there any other way to do the calculation, rather than taking pairs individually? It is a huge data set.

Err, you asked about looking at the correlation, so no.

It's the simplest way to do it. You get a matrix of gene names and their values across all the samples. Just run cor() on it (cor() correlates columns, so transpose with t() if your genes are in rows). It will calculate correlations for all the pairs, and the output can be visualised in the matrix2png web interface.
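To make the "all pairs at once" idea concrete outside R, here is a rough Python equivalent using only the standard library. The gene names and values are made up, and `pearson` and `correlation_pairs` are hypothetical helper names; this is a sketch of the calculation, not a replacement for cor() on large data.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length value lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length with at least 2 values")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # correlation is undefined for a constant gene
    return sxy / math.sqrt(sxx * syy)


def correlation_pairs(matrix):
    """Return {(gene1, gene2): r} for every unordered pair of genes."""
    return {
        (g1, g2): pearson(matrix[g1], matrix[g2])
        for g1, g2 in combinations(sorted(matrix), 2)
    }


# Made-up expression values: one row per gene, one value per sample.
matrix = {
    "GENE_A": [1.0, 2.0, 3.0, 4.0],
    "GENE_B": [2.0, 4.0, 6.0, 8.0],
    "GENE_C": [4.0, 3.0, 2.0, 1.0],
}
pairs = correlation_pairs(matrix)
for (g1, g2), r in pairs.items():
    print(g1, g2, round(r, 3))
```

The number of pairs grows quadratically with the number of genes, so for a full gene list it is usually worth restricting the matrix to the genes of interest first.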

Is there any way to do this task using scripts?

Yes, there is.

But I really don't know how to search for it; I am not familiar with scripting.

There is one file that contains my desired genes and another sample data file from the database. I have to locate the desired genes in the data and then calculate the p-value for each desired gene with respect to one particular gene.
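As a starting point for such a script, the sketch below looks up a list of desired genes in an expression matrix and correlates each one with a single reference gene. It uses only the Python standard library; the gene names and values are made up, and `correlate_with_reference` is a hypothetical helper name. It reports only the correlation coefficient r. A p-value needs the t distribution, which the standard library does not provide; if SciPy is installed, `scipy.stats.pearsonr` returns both r and a p-value.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length value lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")  # undefined when a gene is constant across samples
    return sxy / math.sqrt(sxx * syy)


def correlate_with_reference(matrix, reference, desired):
    """Correlate each desired gene with the reference gene.

    Returns ({gene: r}, [genes from the desired list that are not in the matrix]).
    """
    ref_values = matrix[reference]
    found = {
        gene: pearson(ref_values, matrix[gene])
        for gene in desired
        if gene in matrix and gene != reference
    }
    missing = [gene for gene in desired if gene not in matrix]
    return found, missing


# Made-up matrix: one row per gene, one value per sample.
matrix = {
    "REF_GENE": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_X": [2.0, 4.0, 5.0, 4.0, 5.0],
    "GENE_Y": [5.0, 4.0, 3.0, 2.0, 1.0],
}
desired = ["GENE_X", "GENE_Y", "GENE_Z"]  # GENE_Z is deliberately absent
found, missing = correlate_with_reference(matrix, "REF_GENE", desired)
print(found)
print("not found:", missing)
```

In practice the desired list would be read from your gene file, one name per line, and the matrix would come from the downloaded quantification files.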

Right, so write a script to do that.

Edit: If you don't know how to write the script, then either find/pay someone to write it for you or, better yet, just learn a bit about how to write scripts.

9.8 years ago

They're likely just text files, so open them with a pager or text editor (e.g., less or more), or search them with grep. If by "visualize" you mean plot values, then R is useful.
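If a pager is not available, a few lines of Python can show the first rows of a file in the same spirit. This sketch assumes the file is tab-separated; `preview` is a hypothetical helper name, and the demo content is made up.

```python
import io
from itertools import islice


def preview(handle, n=5, delimiter="\t"):
    """Return the first n lines of an open text file, split into fields."""
    return [line.rstrip("\n").split(delimiter) for line in islice(handle, n)]


# In-memory stand-in for a real file; with real data use open(path) instead.
demo = io.StringIO("gene_id\tnormalized_count\nTP53\t10.0\nBRCA1\t4.5\nEGFR\t7.25\n")
rows = preview(demo, n=3)
for fields in rows:
    print(fields)
```

Only the requested lines are read, so this stays fast even on very large files.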

9.8 years ago
pstew ▴ 50

This is a loaded, very broad question. Each of the tasks you mention is a non-trivial exercise in data processing. To receive help in a place like this, you need to say specifically what you have tried and what you need, and give examples of the data. If you have a very specific set of TCGA data you need to look at, then you will probably need some experience in Python or R to do this yourself. If you have a very broad question about X gene in Y cancer, then you can probably check out the sites I mention below. If you have no experience working with sequencing data and don't know where to start, then you will probably need to enlist help from a bioinformatician or biostatistician mentor or collaborator.

Here is a list of RNA-seq visualization tools to give you a better idea about what you want.

For TCGA data, I find the cBioPortal to be helpful. You can't specify specific patient samples, but you can break it down pretty well as far as cancer, subtypes, and specific genes.

9.8 years ago
EagleEye 7.6k
Hello Payal, there are many online courses and tutorials.

First, you should learn basic Unix: http://www.ee.surrey.ac.uk/Teaching/Unix/
Then some scripting in Perl or Python: http://books.google.se/books?id=Z0K504TfkkUC&output=html_text
Or the R language: http://www.cyclismo.org/tutorial/R/
9.8 years ago
ivivek_ngs ★ 5.2k

It is important to learn R and Unix scripting to work with RNA-Seq data, especially when you need to do correlation analysis and then visualize the results. Once you have the matrix of genes across all samples with your expression values or read counts, simply load it into R; you can then use cor(..) and heatmap(cor(..)) to plot and visualize the data. There is also a very nice visualization package in R called pheatmap, which saves a lot of the R coding effort you might need with heatmap.2. I find it quite useful for visualization. There is also MeV for online heatmap creation, but you first have to create the matrix of genes across your 100 samples with their respective count or expression values. Without Unix scripting and R, it is not easy to do what you want with RNA-Seq data unless you have some built-in or commercial software that produces plots and outputs for you.
