4.1 years ago
pt.taklifi
I have a list of exons and a list of peak calls, both in .txt format.
Exons
structure(list(chr1 = structure(c(1L, 1L, 1L, 1L, 1L), .Label = "chr1", class = "factor"),
X3857280 = c(3858717L, 3865811L, 3867973L, 3869604L, 3872471L
), X3857717 = c(3858844L, 3866000L, 3868053L, 3869775L, 3872572L
), ENST00000378209.7_exon_0_0_chr1_3857281_f = structure(1:5, .Label = c("ENST00000378209.7_exon_1_0_chr1_3858718_f",
"ENST00000378209.7_exon_2_0_chr1_3865812_f", "ENST00000378209.7_exon_3_0_chr1_3867974_f",
"ENST00000378209.7_exon_4_0_chr1_3869605_f", "ENST00000378209.7_exon_5_0_chr1_3872472_f"
), class = "factor"), X0 = c(0L, 0L, 0L, 0L, 0L), X. = structure(c(1L,
1L, 1L, 1L, 1L), .Label = "+", class = "factor")), class = "data.frame", row.names = c(NA,
-5L))
Peak Calls
structure(list(seqnames = structure(c(1L, 1L, 1L, 1L, 1L, 1L), .Label = "chr1", class = "factor"),
start = c(975451L, 1014228L, 1290080L, 1291099L, 1291742L,
1327977L), end = c(975952L, 1014729L, 1290581L, 1291600L,
1292243L, 1328478L), name = structure(c(5L, 6L, 1L, 2L, 3L,
4L), .Label = c("BRCA_123", "BRCA_124", "BRCA_125", "BRCA_143",
"BRCA_39", "BRCA_55"), class = "factor"), score = c(1.87842575038562,
4.07469686212787, 2.44358820293876, 3.18019908767794, 8.26783029566134,
1.08246502080444), annotation = structure(c(1L, 1L, 1L, 1L,
1L, 1L), .Label = "3' UTR", class = "factor"), percentGC = c(0.6187624750499,
0.62874251497006, 0.678642714570858, 0.702594810379242, 0.640718562874252,
0.676646706586826), percentAT = c(0.3812375249501, 0.37125748502994,
0.321357285429142, 0.297405189620758, 0.359281437125749,
0.323353293413174)), class = "data.frame", row.names = c(NA,
-6L))
For each exon, I want to determine whether it overlaps any of the peaks and, if it does, what percentage of the exon is covered by the peak. If an exon overlaps more than one peak, I want to report that as well. I then want to store the results in a new table or data frame. Other than a for loop, I can't think of an approach, and since my data is rather big, I'm looking for efficient code. I'm currently working in R, but I can do some coding in the Ubuntu terminal as well.
FYI, if you have data in R and want to share it in an easy copy/paste fashion, use dput() on the object. It creates an ASCII representation of the data that you can share here, so users can quickly load your example data rather than typing it in. You can use edit to add content to your post.

OK, thanks for the advice. I converted my data to ASCII format.
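
One way to avoid the for loop is to let the Bioconductor package GenomicRanges do the interval matching. The sketch below is only an illustration under some assumptions. The coordinates are toy values invented for the example, because the posted exons and peaks do not overlap each other. The column names `chr`, `start`, `end` and `name` are hypothetical, since the posted exon table appears to have lost its header row. The exon starts are treated as 0-based (BED style), which the exon names suggest but which you should confirm for your own file. `findOverlaps()` returns every exon/peak pair, `pintersect()` gives the overlapping part of each pair, and `countOverlaps()` reports how many peaks each exon hits.

```r
library(GenomicRanges)

# Toy coordinates, invented for illustration only (not the poster's data).
exons <- data.frame(
  chr   = "chr1",
  start = c(100L, 500L, 900L),   # assumed 0-based, BED-style starts
  end   = c(200L, 600L, 1000L),
  name  = c("exon_A", "exon_B", "exon_C")
)
peaks <- data.frame(
  seqnames = "chr1",
  start    = c(151L, 481L, 581L),  # assumed 1-based starts
  end      = c(250L, 520L, 700L),
  name     = c("peak_1", "peak_2", "peak_3")
)

# Build GRanges objects; add 1 to the exon starts to convert to 1-based.
exon_gr <- GRanges(exons$chr, IRanges(exons$start + 1L, exons$end),
                   exon = exons$name)
peak_gr <- GRanges(peaks$seqnames, IRanges(peaks$start, peaks$end),
                   peak = peaks$name)

# All overlapping exon/peak pairs, found without an explicit loop.
hits <- findOverlaps(exon_gr, peak_gr)
q <- queryHits(hits)     # index into exon_gr
s <- subjectHits(hits)   # index into peak_gr

# Width in bp of the shared region for each pair.
overlap_bp <- width(pintersect(exon_gr[q], peak_gr[s]))

# One row per exon/peak pair; exons without any peak are not listed.
result <- data.frame(
  exon       = exon_gr$exon[q],
  peak       = peak_gr$peak[s],
  overlap_bp = overlap_bp,
  pct_exon   = 100 * overlap_bp / width(exon_gr)[q],
  n_peaks    = countOverlaps(exon_gr, peak_gr)[q]  # peaks hit by this exon
)
print(result)
```

Rows with `n_peaks` greater than 1 are the exons that overlap more than one peak. If overlapping peaks can themselves overlap each other, summing `pct_exon` per exon would double count, so in that case you may prefer to `reduce()` the peaks first.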