Plotting Exon Statistics In R
11.8 years ago

Hi,

I have a table in the following format which provides some information (counts of mapped reads) about exons in some genes across some samples.

Exon    Gene    sampleA    sampleB    sampleC
E1      A       43         52         12
E2      A       0          24         34
E3      A       19         48         32
E4      A       76         0          23
E5      A       5          87         12
E1      B       12         109        98
E2      B       32         76         11
E1      C       12         0          5
E2      C       4          8          76
E3      C       0          0          32

That is, information about every exon of every gene. I wish to generate a per sample plot (therefore, 3 plots) of counts of all exons in all 3 genes. Within each sample plot, my X-axis would be the exon number and the Y-axis would be the count. And so, I would have 3 "data series" lines (since there are 3 genes) within each plot.

I am new to R and I have no clue how to go about it.

I am wondering whether I have to "factor" the gene column in some way to get the exons specific to each gene.

Any suggestions would be much appreciated.


Don't forget to normalize your read counts by sequencing depth per sample.
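As a rough sketch of what that normalization could look like, the snippet below scales each sample to counts per million (CPM). It assumes the table from the question is in a data frame called counts, and it uses the column sums of this small table as a stand-in for sequencing depth; with real data you would normally use the total number of mapped reads per sample instead. The names sample_cols, lib_size, cpm and norm_counts are made up for illustration.

```r
# Example data from the question (assumed to be stored in `counts`)
counts <- data.frame(
  Exon    = c("E1", "E2", "E3", "E4", "E5", "E1", "E2", "E1", "E2", "E3"),
  Gene    = c(rep("A", 5), rep("B", 2), rep("C", 3)),
  sampleA = c(43, 0, 19, 76, 5, 12, 32, 12, 4, 0),
  sampleB = c(52, 24, 48, 0, 87, 109, 76, 0, 8, 0),
  sampleC = c(12, 34, 32, 23, 12, 98, 11, 5, 76, 32)
)

sample_cols <- c("sampleA", "sampleB", "sampleC")

# Assumption: library size is approximated by the column sums of this table.
# Replace with the real number of mapped reads per sample if you have it.
lib_size <- colSums(counts[sample_cols])

# Divide every column by its library size and scale to counts per million
cpm <- sweep(as.matrix(counts[sample_cols]), 2, lib_size, "/") * 1e6

# Put the exon and gene labels back next to the normalized values
norm_counts <- cbind(counts[c("Exon", "Gene")], cpm)
```

The norm_counts data frame has the same layout as the original table, so it can be plotted in the same way as the raw counts.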

Also, if you want to run statistics on differential exon usage (seems to be where you are going with this), you should look at the DEXSeq package ... an added bonus is that it includes functionality to plot expression over exons

11.8 years ago
Irsan ★ 7.8k
# Install ggplot2 and reshape (only needed once)
install.packages(c("ggplot2", "reshape"))

# Load the packages
library(reshape)
library(ggplot2)

# Melt the data frame into long format so that ggplot can handle it.
# I assume you have the data in an object called counts.
melt_count <- melt(counts, id.vars = c("Exon", "Gene"))
colnames(melt_count) <- c("exon", "gene", "sample", "count")

# Plot the counts for each exon of each gene, coloured by sample
ggplot(melt_count) + geom_point(aes(x = exon, y = count, color = sample)) + facet_grid(exon ~ gene, scales = "free_x")

See resulting image here
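If you want the layout described in the question instead (one panel per sample, exon on the X-axis, one line per gene), a variant along these lines might work. This is only a sketch: it assumes the data frame is called counts, builds the long format with base R rather than reshape, and the names sample_cols, long and p are hypothetical.

```r
library(ggplot2)

# Example data from the question (assumed to be stored in `counts`)
counts <- data.frame(
  Exon    = c("E1", "E2", "E3", "E4", "E5", "E1", "E2", "E1", "E2", "E3"),
  Gene    = c(rep("A", 5), rep("B", 2), rep("C", 3)),
  sampleA = c(43, 0, 19, 76, 5, 12, 32, 12, 4, 0),
  sampleB = c(52, 24, 48, 0, 87, 109, 76, 0, 8, 0),
  sampleC = c(12, 34, 32, 23, 12, 98, 11, 5, 76, 32)
)

sample_cols <- c("sampleA", "sampleB", "sampleC")

# Long format: one row per exon, gene and sample.
# unlist() stacks the sample columns in order, so the labels are repeated to match.
long <- data.frame(
  exon   = rep(counts$Exon, times = length(sample_cols)),
  gene   = rep(counts$Gene, times = length(sample_cols)),
  sample = rep(sample_cols, each = nrow(counts)),
  count  = unlist(counts[sample_cols], use.names = FALSE)
)

# One panel per sample; group = gene draws a separate line for each gene,
# so the gene column does not need to be converted to a factor by hand
p <- ggplot(long, aes(x = exon, y = count, colour = gene, group = gene)) +
  geom_line() +
  geom_point() +
  facet_wrap(~ sample)

print(p)
```

Genes with fewer exons (B and C here) simply get shorter lines, since they have no rows for E3 to E5.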

11.8 years ago
Dan ▴ 540

If you're new to R, you should definitely read Chapter 1 (Introduction) of 'S Poetry': http://www.burns-stat.com/documents/books/s-poetry/

I can't recommend it enough!

For manipulating data frames, you should look at tapply and friends. I don't quite understand what you want to do, but I'm sure you can do it with tapply ;-)
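To give a flavour of tapply on the table from the question, here is a small sketch that summarises one sample per gene. It assumes the data frame is called counts; the result names reads_per_gene and exons_per_gene are made up for illustration.

```r
# Example data from the question (assumed to be stored in `counts`)
counts <- data.frame(
  Exon    = c("E1", "E2", "E3", "E4", "E5", "E1", "E2", "E1", "E2", "E3"),
  Gene    = c(rep("A", 5), rep("B", 2), rep("C", 3)),
  sampleA = c(43, 0, 19, 76, 5, 12, 32, 12, 4, 0),
  sampleB = c(52, 24, 48, 0, 87, 109, 76, 0, 8, 0),
  sampleC = c(12, 34, 32, 23, 12, 98, 11, 5, 76, 32)
)

# Total reads per gene in sampleA: tapply splits the first vector
# by the groups in the second and applies the function to each piece
reads_per_gene <- tapply(counts$sampleA, counts$Gene, sum)

# Number of exons listed for each gene
exons_per_gene <- tapply(counts$Exon, counts$Gene, length)

print(reads_per_gene)
print(exons_per_gene)
```

tapply treats the grouping vector as a factor internally, so the Gene column can stay as plain text.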


If you're new to R, you should in general read as much documentation, including online tutorials, as you can. You can't expect to have a "clue how to go about it" with no background knowledge whatsoever.
