Hello Biostars-Community,
I would need your kind suggestions regarding an issue with the Gviz package from Bioconductor. I am attempting plot Enhancer-Promoter connections and tried to (mis)use the AlignmentsTrack() function and the "sashimi" plots display type for it.
Typically, alignment tracks are generated from BAM files and if I do that, I can nicely display the sashimi plots as shown in the manual. However I would like to visualize enhancer-promoter connections and thus attempted to create a fake alignment file from simulated reads spanning the coordinates of the enhancers and promoters and connecting them with introns.
Unfortunately for this fake file, I can't get the sashimi plots to work. It will display the fake reads in the "pileup" mode, but when I choose type="sashimi", the track remains blank and I get a strange error. I suppose the sashimi plot depends on additional columns like the mapping quality or such, that would automatically be read from a BAM file, but which I didn't provide with the simulated data. Any suggestions?
Thanks a lot Matthias
Minimal working (not working) example:
library("Gviz")
library("GenomicRanges")
eppairs <-
structure(
list(
ID_trscpt = c("Ppp3cb.NM_001310426", "Ppp3cb.NM_001310427"),
ID_enh = c(
"WTonly_chr14:21460154-21460564",
"WTonly_chr14:21297822-21298296"
),
Relevance_int = c(21.32, 10.66),
chr = c("chr14", "chr14"),
start = c(21460154L, 21297822L),
stop = c(21460564L, 21298296L),
chr_trscpt = c("chr14", "chr14"),
start_trscpt = c(21365694L,
21365694L),
stop_trscpt = c(21366295L, 21366295L),
Gene = c("Ppp3cb",
"Ppp3cb")
),
.Names = c(
"ID_trscpt",
"ID_enh",
"Relevance_int",
"chr",
"start",
"stop",
"chr_trscpt",
"start_trscpt",
"stop_trscpt",
"Gene"
),
row.names = c(7981L, 7984L),
class = "data.frame"
)
# construct a fake alignment file for sashimi plots. For each relevance point connection, add 10 fake aligned read pairs
read_weights <- floor(eppairs[,"Relevance_int"]*10)
sashimidata <- eppairs[,c("start","stop","start_trscpt","stop_trscpt")]
sashimidata <- apply(sashimidata,1,function(x){
y <- sort(as.numeric(x))
data.frame("start"=y[1],"stop"=y[4],"cigar"=paste0(y[2]-y[1],"M",y[3]-y[2],"N",y[4]-y[3],"M"))
})
sashimidata <- data.frame("chr"=rep(eppairs[,"chr"],read_weights),do.call("rbind",sashimidata[rep(names(sashimidata),read_weights)]))
track_ep_pairs <- AlignmentsTrack(range=with(sashimidata,GRanges(chr,IRanges(start,stop),"*",cigar)),genome="mm9",isPaired=FALSE)
# works, I see gapped reads
plotTracks(list(track_ep_pairs),extend.left = 0.1, extend.right = 0.1)
# Error
plotTracks(list(track_ep_pairs),extend.left = 0.1, extend.right = 0.1,type="sashimi")
#Error in GAlignments(seqnames = seqnames(range), pos = start(range), cigar = range$cigar, :
# 'cigar' must be a character vector with no NAs
# BUT:
#table(sashimidata$cigar)
#601M93859N410M 474M67398N601M
#213 106
#anyis.na(sashimidata$cigar))
#[1] FALSE