How do I loop over each unique gene to subset a GRanges object by that gene?
2.8 years ago

I have a data frame, txf_df, which I subset by gene.list$entrez; I then took the unique values of geneName among the remaining rows. The filtered txf_df is then converted to a GRanges object, txf_grange.

Now I want to loop over the 15 unique genes and, on each iteration, subset txf_grange to only the ranges belonging to that gene.

Traceback:

Error in (function (classes, fdef, mtable)  :    unable to find an inherited method for function ‘subsetByOverlaps’ for signature ‘"GRanges", "character"’

Code:

# Subset by the Entrez IDs
txf_df <- txf_df %>% filter(geneName %in% gene.list$entrez)

# Find the number of common transcripts
unique <- unique(txf_df$geneName)
length(unique)

# Recast this dataframe back to a GRanges object
txf_grange <- makeGRangesFromDataFrame(txf_df, keep.extra.columns=T)

# For each of the 15 genes, subset the Granges objects by only the gene
for (i in unique) {
  subsetByOverlaps(txf_grange, i)
}

Data:

> dput(head(txf_df))
structure(list(seqnames = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "16", class = "factor"), 
    start = c(12058964L, 12059311L, 12059311L, 12060052L, 12060198L, 
    12060198L), end = c(12059311L, 12060052L, 12061427L, 12060198L, 
    12060877L, 12061427L), width = c(348L, 742L, 2117L, 147L, 
    680L, 1230L), strand = structure(c(1L, 1L, 1L, 1L, 1L, 1L
    ), .Label = c("+", "-", "*"), class = "factor"), type = structure(c(3L, 
    1L, 1L, 2L, 1L, 1L), .Label = c("J", "I", "F", "L", "U"), class = "factor"), 
    txName = structure(list(c("uc002dbv.3", "uc010buy.3", "uc010buz.3"
    ), c("uc002dbv.3", "uc010buy.3"), "uc010buz.3", c("uc002dbv.3", 
    "uc010buy.3"), "uc010buy.3", "uc002dbv.3"), class = "AsIs"), 
    geneName = structure(list("608", "608", "608", "608", "608", 
        "608"), class = "AsIs")), row.names = c(NA, 6L), class = "data.frame")

> dput(head(gene.list))
structure(list(Name = c("AQP8", "CLCA1", "GUCA2B", "ZG16", "CA4", 
"CA1"), Pvalue = c(3.24077275512836e-22, 2.57708986670727e-21, 
5.53491656902485e-21, 4.14482213350182e-20, 2.7795892896524e-19, 
1.23890644641685e-18), adjPvalue = c(8.3845272720681e-18, 6.66744690314504e-17, 
1.43199361473811e-16, 1.07234838237959e-15, 7.19135341018869e-15, 
3.20529875816967e-14), logFC = c(-3.73323340223377, -2.96422555675244, 
-3.34493724166712, -2.87787132076412, -2.87670608798164, -3.15664667432159
), entrez = c(AQP8 = "343", CLCA1 = "1179", GUCA2B = "2981", 
ZG16 = "653808", CA4 = "762", CA1 = "759")), row.names = c(NA, 
6L), class = "data.frame")

If you mean that you want to split a GRanges object by the values in some column (e.g. gene names), you can use split():

library(GenomicRanges)

# Create example data
foo <- data.frame(seqnames = paste0("chr",1:10),
                  start = 31:40,
                  end = 41:50,
                  strand = "*",
                  geneName = letters[1:10])

# Convert to GRanges, keeping geneName as a metadata column
foo <- makeGRangesFromDataFrame(foo, keep.extra.columns = TRUE)

# Filter the GRanges by genes of interest, then split into a GRangesList (one element per gene)
genes <- c("c","b","g")
foo.f <- foo[foo$geneName %in% genes]
foo.s <- split(x = foo.f, f = as.character(foo.f$geneName))
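
Applied to the data in the question, the error arises because subsetByOverlaps() expects a second set of genomic ranges as its second argument, not a character gene ID. Below is a minimal sketch of the per-gene subsetting, not a definitive implementation. It assumes geneName holds exactly one ID per row (in the posted dput output it is a list column of class AsIs, so it is flattened with unlist() first). The toy txf_df is made up for illustration and is not the asker's data.

```r
library(GenomicRanges)

# Toy stand-in for txf_df (coordinates and IDs are illustrative only)
txf_df <- data.frame(seqnames = "16",
                     start = c(100L, 200L, 300L, 400L),
                     end = c(150L, 250L, 350L, 450L),
                     strand = "+")
# Mimic the list-type (AsIs) geneName column seen in the dput output
txf_df$geneName <- I(list("608", "608", "343", "762"))

# Flatten the list column to a plain character vector
# (assumption: exactly one gene ID per row)
txf_df$geneName <- as.character(unlist(txf_df$geneName))

txf_grange <- makeGRangesFromDataFrame(txf_df, keep.extra.columns = TRUE)

# Option 1: split once into a GRangesList with one element per gene
txf_by_gene <- split(txf_grange, txf_grange$geneName)

# Option 2: explicit loop, subsetting on the geneName metadata column
for (gene in unique(txf_grange$geneName)) {
  gr_gene <- txf_grange[txf_grange$geneName == gene]
  message(gene, ": ", length(gr_gene), " ranges")
}
```

With the split() version, the ranges for a single gene can be pulled out by name, e.g. txf_by_gene[["608"]], which avoids re-scanning the full object on every iteration.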