R+Gviz: Fake AlignmentsTrack() to visualize Enhancer-Promoter connections
1
1
Entering edit mode
5.6 years ago

Hello Biostars-Community,

I would need your kind suggestions regarding an issue with the Gviz package from Bioconductor. I am attempting plot Enhancer-Promoter connections and tried to (mis)use the AlignmentsTrack() function and the "sashimi" plots display type for it.

Typically, alignment tracks are generated from BAM files and if I do that, I can nicely display the sashimi plots as shown in the manual. However I would like to visualize enhancer-promoter connections and thus attempted to create a fake alignment file from simulated reads spanning the coordinates of the enhancers and promoters and connecting them with introns.

Unfortunately for this fake file, I can't get the sashimi plots to work. It will display the fake reads in the "pileup" mode, but when I choose type="sashimi", the track remains blank and I get a strange error. I suppose the sashimi plot depends on additional columns like the mapping quality or such, that would automatically be read from a BAM file, but which I didn't provide with the simulated data. Any suggestions?

Thanks a lot Matthias

Minimal working (not working) example:

    library("Gviz")
    library("GenomicRanges")


    eppairs <-
      structure(
        list(
          ID_trscpt = c("Ppp3cb.NM_001310426", "Ppp3cb.NM_001310427"),
          ID_enh = c(
            "WTonly_chr14:21460154-21460564",
            "WTonly_chr14:21297822-21298296"
          ),
          Relevance_int = c(21.32, 10.66),
          chr = c("chr14", "chr14"),
          start = c(21460154L, 21297822L),
          stop = c(21460564L, 21298296L),
          chr_trscpt = c("chr14", "chr14"),
          start_trscpt = c(21365694L,
                           21365694L),
          stop_trscpt = c(21366295L, 21366295L),
          Gene = c("Ppp3cb",
                   "Ppp3cb")
        ),
        .Names = c(
          "ID_trscpt",
          "ID_enh",
          "Relevance_int",
          "chr",
          "start",
          "stop",
          "chr_trscpt",
          "start_trscpt",
          "stop_trscpt",
          "Gene"
        ),
        row.names = c(7981L, 7984L),
        class = "data.frame"
      )


# construct a fake alignment file for sashimi plots. For each relevance point connection, add 10 fake aligned read pairs
read_weights <- floor(eppairs[,"Relevance_int"]*10)

sashimidata <- eppairs[,c("start","stop","start_trscpt","stop_trscpt")]
sashimidata <- apply(sashimidata,1,function(x){
  y <- sort(as.numeric(x))
  data.frame("start"=y[1],"stop"=y[4],"cigar"=paste0(y[2]-y[1],"M",y[3]-y[2],"N",y[4]-y[3],"M"))
  })

sashimidata <- data.frame("chr"=rep(eppairs[,"chr"],read_weights),do.call("rbind",sashimidata[rep(names(sashimidata),read_weights)]))


track_ep_pairs <- AlignmentsTrack(range=with(sashimidata,GRanges(chr,IRanges(start,stop),"*",cigar)),genome="mm9",isPaired=FALSE)

# works, I see gapped reads
plotTracks(list(track_ep_pairs),extend.left = 0.1, extend.right = 0.1)
# Error
plotTracks(list(track_ep_pairs),extend.left = 0.1, extend.right = 0.1,type="sashimi")

#Error in GAlignments(seqnames = seqnames(range), pos = start(range), cigar = range$cigar,  : 
#                        'cigar' must be a character vector with no NAs

# BUT:
#table(sashimidata$cigar)
#601M93859N410M 474M67398N601M 
#213            106 

#anyis.na(sashimidata$cigar))
#[1] FALSE
R Gviz Sashimi GenomicRanges • 2.0k views
ADD COMMENT
1
Entering edit mode
5.6 years ago

Thanks to everybody, who read this post and tried to help. I could solve the problem now.

I was unfamiliar with the data format of Gviz tracks, but it turned out that the underlying data can be manipulated just as a regular GRanges object. After I finally found out that a simple

ranges(track_ep_pairs)

gives you access to the data used to generate a track, I could analyze it and track down issues. While mcols() won't work on tracks, ranges() does. It turns out that a simple factor to character conversion of the cigar column will do:

track_ep_pairs <- AlignmentsTrack(range=with(sashimidata,GRanges(chr,IRanges(start,stop),"*",cigar)),genome="mm9",isPaired=FALSE)
ranges(track_ep_pairs)$cigar <- as.character(ranges(track_ep_pairs)$cigar)

Now the sashimi plots render like a charm. Yeah!

ADD COMMENT

Login before adding your answer.

Traffic: 2455 users visited in the last hour
Help About
FAQ
Access RSS
API
Stats

Use of this site constitutes acceptance of our User Agreement and Privacy Policy.

Powered by the version 2.3.6