How to plot which exons are affected along the gene body?
24 months ago
PK ▴ 130

Hi,

I have WT and KO samples and ran a differential exon usage analysis with DEXSeq. I noticed a pattern: in most genes, the first exon (or the first few exons) is the one affected, with a significant adjusted p-value. I want to show this in a plot, but I don't know how to do that. Assume I have 300 genes with evidence of differential exon usage; in most of them, the first exon or the first few exons seem to be affected.

I have the final DEXSeq result matrix, to which I added some information. This is just an example row; I have ~300 rows.

GENE_ID            TOTAL-EXON    AFFECTED_EXON    Log2foldchange
ENSG00000142145    61            E008             2.19717286971528

My table looks like this, and I also have some additional information per gene.
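To compare genes with different exon counts, the exon label can be turned into a number and scaled by the total number of exons. The following is a minimal sketch, assuming a data frame with the columns shown above (with `TOTAL-EXON` renamed to `TOTAL_EXON` so it is a valid R name); only the first row comes from the example, the other rows and the names `dxr`, `exon_nr` and `rel_pos` are made up for illustration.

```r
# Hypothetical data frame mirroring the columns in the example above.
# Only the first row is taken from the post; the others are invented.
dxr <- data.frame(
  GENE_ID        = c("ENSG00000142145", "GENE_B", "GENE_C"),
  TOTAL_EXON     = c(61, 10, 25),
  AFFECTED_EXON  = c("E008", "E001", "E003"),
  Log2foldchange = c(2.19717286971528, -1.2, 0.8),
  stringsAsFactors = FALSE
)

# "E008" -> 8: drop the leading "E" and convert to integer
dxr$exon_nr <- as.integer(sub("^E", "", dxr$AFFECTED_EXON))

# Relative position in (0, 1]: small values are near the first exon bin,
# 1 is the last exon bin of the gene
dxr$rel_pos <- dxr$exon_nr / dxr$TOTAL_EXON

print(dxr[, c("GENE_ID", "exon_nr", "rel_pos")])
```

It may be worth checking whether the exon bin numbers in the annotation run 5' to 3' for genes on the minus strand; if they follow genomic coordinates instead, the position would need to be flipped for those genes before plotting.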

RNA-seq • 763 views

Are you looking for suggestions on how such a dataset can be represented visually, or do you already have a particular plot in mind but don't know how to create it? In the latter case, it would be helpful if you could include a rough sketch of the plot.

In the former case, it helps to decide first what you want to show. What is your message? If your message is, for example, "The first exon is affected in most genes", that would suggest an aggregate plot. If you would rather show details for particular genes, subtle differences, or an effect that depends on the distance from the TSS, a per-gene representation with accurate distances may be needed. You can use colour, position or size to encode a fold change.
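As a rough illustration of the aggregate option, one could count how many genes have their affected exon at each exon number, or at each relative position along the gene. This is only a sketch on simulated data: the data frame `hits` and its columns are hypothetical stand-ins for the real result table, and the bias towards early exons is built into the simulation.

```r
library(ggplot2)

set.seed(42)
# Simulated stand-in for a ~300-row result table (hypothetical values)
n <- 300
total_exon <- sample(5:60, n, replace = TRUE)
# Bias the affected exon towards the start of the gene, for illustration only
exon_nr <- pmin(total_exon, 1L + rgeom(n, prob = 0.3))
hits <- data.frame(total_exon, exon_nr, rel_pos = exon_nr / total_exon)

# Aggregate view 1: number of genes whose affected bin is exon 1, 2, 3, ...
p_count <- ggplot(hits, aes(x = exon_nr)) +
  geom_bar() +
  labs(x = "Exon bin number", y = "Number of genes")

# Aggregate view 2: distribution of the relative position in the gene body
p_rel <- ggplot(hits, aes(x = rel_pos)) +
  geom_histogram(breaks = seq(0, 1, by = 0.05)) +
  labs(x = "Relative position (exon number / total exons)",
       y = "Number of genes")

print(p_count)
print(p_rel)
```

The first view answers "which exon number is hit most often", while the second corrects for genes having different numbers of exons.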

To how many genes do your 300 rows correspond? What about genes with multiple transcripts?


Yes, I have a density plot in mind, but each gene has a different length, so I don't know how to do that. I'm looking at the gene level, not the transcript level. Here the exons are split where they overlap between transcripts, and each resulting exon bin gets a number within the gene (E001, E002, ...).

[image]

Something like this picture, with the log2 fold change on the y-axis, the position along the gene on the x-axis, and red and green for up- and down-regulated exons. It is not only the first exon but the first few exons.
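A plot of this kind could be sketched by scaling each exon number by the gene's total exon count, so that genes of different length share one x-axis. The example below uses simulated data; the data frame `res`, its column names, and the choice of red for up and green for down are assumptions made for illustration and can be swapped.

```r
library(ggplot2)

set.seed(1)
# Simulated exon-level results (hypothetical): one row per affected exon bin
n <- 300
res <- data.frame(
  total_exon = sample(5:60, n, replace = TRUE),
  log2fc     = rnorm(n, mean = 0, sd = 1.5)
)
# Affected exons are placed mostly near the start of the gene, for illustration
res$exon_nr   <- pmin(res$total_exon, 1L + rgeom(n, prob = 0.3))
res$rel_pos   <- res$exon_nr / res$total_exon
res$direction <- ifelse(res$log2fc > 0, "up", "down")

# log2 fold change against relative position, coloured by direction
p <- ggplot(res, aes(x = rel_pos, y = log2fc, colour = direction)) +
  geom_hline(yintercept = 0, linetype = "dashed") +
  geom_point(alpha = 0.6) +
  scale_colour_manual(values = c(up = "red", down = "green4")) +
  labs(x = "Relative position in gene (exon number / total exons)",
       y = "log2 fold change")
print(p)
```

Adding `geom_density(aes(x = rel_pos), inherit.aes = FALSE)` in a separate panel, or a marginal histogram, would give the density view mentioned above.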


Maybe some inspiration:

library(tidyverse)

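# sample data generation
set.seed(1)  # make the random example reproducible
genelengths <- sample(1:65, 25)  # number of exons for each of 25 genes

sampledata <- data.frame("GeneID"=rep(LETTERS[1:25],times=genelengths),
                         "ExonNr"=unlist(lapply(genelengths,seq)),
                         # fold changes sorted decreasingly, so early exons get the largest values;
                         # abs() keeps the Weibull scale parameter positive
                         "Foldchange"=unlist(lapply(genelengths,function(x){sort(rweibull(x,1.5,abs(rnorm(1,1,0.3))),decreasing = TRUE)})))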


baseplot <- ggplot(sampledata,aes(x=ExonNr,y=Foldchange))

#plotting options

baseplot + geom_line(aes(group=GeneID),alpha=0.5) + geom_smooth(method="gam")

baseplot + geom_boxplot(aes(x=factor(ExonNr)))

baseplot + geom_tile(aes(x=factor(ExonNr),y=GeneID,fill=Foldchange))

# Not very helpful now, but might work well for real data (adjust bins according to number of observations)
# after_stat(level) replaces the deprecated ..level.. notation (ggplot2 >= 3.3.0)
baseplot + geom_bin2d(bins = 1e2) + stat_density_2d(aes(fill = after_stat(level), alpha = after_stat(level)), geom = "polygon", bins = 5) + scale_alpha(range = c(0.4, 1), guide = "none")