Can't display protein domain data from UCSC table unipDomain using Gviz
4.2 years ago

Hi all, I am using Gviz to create a figure with the exons of a gene in one track and, below that, the protein domains mapped to the genome sequence. I can map the transcript/exon information easily by grabbing it from the UCSC site with the UcscTrack() function, but I cannot seem to properly pull the data out of the protein domain table 'unipDomain', whose layout is described by the schema here.

The problem seems to be the way the data is organized within the table. A single 'chromStart'/'chromEnd' pair does not describe the feature, because each row is split into several blocks whose start positions are stored in 'chromStarts' as comma-separated offsets relative to 'chromStart'. That is, each block starts at chromStart plus the corresponding chromStarts value, and its width is given by the matching entry in 'blockSizes', which is also stored as comma-separated values. I don't see any way to express this with the GeneRegionTrack, AnnotationTrack, or DataTrack classes.

Any help would be greatly appreciated, either solving this problem or pointing me towards an easier way to represent domain information on the chromosome with another method. Thanks!
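A minimal sketch of the offset arithmetic described above, in base R. It assumes 'blockSizes' and 'chromStarts' arrive as comma-separated character strings (possibly with a trailing comma) and that 'chromStart' is 0-based, as in UCSC BED-style tables. The helper name `expand_blocks` and the example values are made up for illustration and are not real unipDomain rows.

```r
# Expand one BED12-style row into one row per block.
# Assumption: chromStart is 0-based, so +1 yields a 1-based block start.
expand_blocks <- function(chrom, chromStart, strand, name,
                          blockSizes, chromStarts) {
  # "100,50," -> c(100L, 50L); a trailing comma is dropped first
  parse_csv <- function(x) as.integer(strsplit(sub(",$", "", x), ",")[[1]])
  sizes   <- parse_csv(blockSizes)
  offsets <- parse_csv(chromStarts)
  stopifnot(length(sizes) == length(offsets))
  data.frame(
    chromosome = chrom,
    start      = chromStart + offsets + 1L,    # 1-based inclusive start
    end        = chromStart + offsets + sizes, # inclusive end
    strand     = strand,
    group      = name,                         # ties blocks of one domain together
    stringsAsFactors = FALSE
  )
}

# Hypothetical row: two blocks of width 100 and 50 at offsets 0 and 300
blocks <- expand_blocks("chr11", 1000L, "+", "exampleDomain",
                        "100,50,", "0,300,")
print(blocks)
```

A data frame shaped like this (start, end, chromosome, strand, group) is the kind of input AnnotationTrack() is documented to accept as its range argument, so one row per block with a shared group value might be enough to draw each domain as split boxes. Whether the table columns come back from UCSC as plain character strings is an assumption that would need checking.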

This is what I would expect it to look like, based on the track in the UCSC browser: [UCSC browser image]

This is what mine looks like (one big box instead of blocks split along the exons): [Gviz image]

Code used to generate the image above:

library(Gviz)
library(GenomicRanges)

gen <- "hg38"
chr <- "chr11"

# Create the Ideogram track
itrack <- IdeogramTrack(genome = gen, chromosome = chr)

# Create the GenomeAxisTrack
gtrack <- GenomeAxisTrack()

## example of FKBP2
from <- 64240500
to <- 64244500
knownGenes <- UcscTrack(genome = "hg38", 
                        chromosome = "chr11",
                        track = "knownGene", 
                        from = from,
                        to = to, 
                        trackType = "GeneRegionTrack",
                        rstarts = "exonStarts", 
                        rends = "exonEnds", 
                        gene = "name",
                        symbol = "name",
                        transcript = "name",
                        strand = "strand",
                        fill = "#8282d2",
                        name = "UCSC Genes")

domains <- UcscTrack(genome = "hg38",
                     chromosome = "chr11",
                     track = "uniprot",
                     table = "unipDomain",
                     from = from,
                     to = to,
                     trackType = "GeneRegionTrack",
                     rstarts = "chromStart",
                     rends = "chromEnd",
                     gene = "name",
                     symbol = "name",
                     strand = "strand",
                     name = "domain")

plotTracks(list(itrack, gtrack, knownGenes, domains), transcriptAnnotation = "gene")

Thanks, didn't know I needed to do that.


The idea is that you wouldn't normally do that. Cross-posting is considered bad form.
