I have designed a CRISPR library with dual guide RNAs flanking specific genomic regions, using crisprDesign.
I now have too many gRNA pairs targeting the same region. The data are stored as GRanges objects.
Is there a way to filter the ranges in a GRanges object for non-overlapping regions, or at least to allow up to 10% overlap? I tried findOverlaps, but to no avail.
If you're not tied to GRanges, you could use bedmap --fraction-both 0.1 to require at least 10% overlap between reference and map regions. Then use the bedops --not-element-of operation to get all elements not in this overlap set (i.e., regions that do not overlap at all or that overlap by less than 10%).
There is no single function for this in GenomicRanges, but you can put something together: use findOverlaps to find the overlapping pairs, then pintersect + width to count the bases that overlap. Then you just need to relate the overlap width to some reference width (for example, the width of the query range) to get a percentage:
library(GenomicRanges)

# Two toy sets of ranges on chr1
a <- GRanges(seqnames = "chr1", ranges = IRanges(start = c(1, 20), end = c(10, 200)))
b <- GRanges(seqnames = "chr1", ranges = IRanges(start = c(5, 500), end = c(10, 1000)))

# Pairs of overlapping ranges (query = a, subject = b)
fo <- findOverlaps(a, b)
fo
# Hits object with 1 hit and 0 metadata columns:
#       queryHits subjectHits
#       <integer>   <integer>
#   [1]         1           1
#   -------
#   queryLength: 2 / subjectLength: 2

# Number of overlapping bases for each hit
wi <- width(pintersect(a[from(fo)], b[to(fo)]))
wi
# [1] 6
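To finish the calculation mentioned above, here is a minimal sketch that turns the overlap width into a fraction and drops ranges that exceed a cutoff. The 10% cutoff and the choice to measure overlap relative to the shorter of the two ranges are assumptions; the names frac_a, frac_b, max_frac, too_much and a_kept are made up for illustration.

```r
library(GenomicRanges)

a <- GRanges(seqnames = "chr1", ranges = IRanges(start = c(1, 20), end = c(10, 200)))
b <- GRanges(seqnames = "chr1", ranges = IRanges(start = c(5, 500), end = c(10, 1000)))

fo <- findOverlaps(a, b)
wi <- width(pintersect(a[from(fo)], b[to(fo)]))

# Overlap as a fraction of the query (a) and of the subject (b) width
frac_a <- wi / width(a)[from(fo)]
frac_b <- wi / width(b)[to(fo)]

# Assumed cutoff: a hit is "too much" if it covers more than 10%
# of the shorter of the two ranges (the larger of the two fractions)
max_frac <- 0.1
too_much <- pmax(frac_a, frac_b) > max_frac

# Keep the ranges of a that have no hit above the cutoff
a_kept <- a[setdiff(seq_along(a), from(fo)[too_much])]
a_kept
```

If you would rather measure the overlap relative to the query only, compare frac_a against the cutoff instead of pmax(frac_a, frac_b).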
Here is some representative data and an IGV image of one gene, for which I loaded the .bed file into IGV. As you can see, many of the regions overlap, but I only need one gRNA pair to cover each stretch of the genomic region. I would like to filter the .bed file so that one of these blocks is kept and the others that overlap it are removed. Ideally, I would like to allow some minor overlap, but this is not so important at the moment.
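For a single set of regions like this, one possible approach is a greedy pass: sort the ranges, walk through them in order, and keep a range only if it overlaps every already-kept range by at most the allowed fraction. This is only a sketch under some assumptions: keep_low_overlap and max_frac are hypothetical names, strand is ignored, overlap is measured relative to the shorter range of each pair, and "first in sorted order wins" is an arbitrary rule (you may prefer to rank pairs by a guide score first).

```r
library(GenomicRanges)

# Hypothetical helper: greedily keep ranges whose overlap with every
# previously kept range is at most max_frac of the shorter range.
keep_low_overlap <- function(gr, max_frac = 0.1) {
  gr <- sort(gr, ignore.strand = TRUE)
  kept <- integer(0)
  for (i in seq_along(gr)) {
    cand <- gr[i]
    if (length(kept) > 0) {
      hits <- findOverlaps(cand, gr[kept], ignore.strand = TRUE)
      if (length(hits) > 0) {
        k <- gr[kept][subjectHits(hits)]
        # Overlapping bases between the candidate and each kept range it hits
        ov <- pmin(end(cand), end(k)) - pmax(start(cand), start(k)) + 1
        frac <- ov / pmin(width(cand), width(k))
        if (any(frac > max_frac)) next  # too much overlap, drop the candidate
      }
    }
    kept <- c(kept, i)
  }
  gr[kept]
}

# Toy example: the second range overlaps the first by about half and is
# dropped; the third overlaps the first by only 6 bases and is kept.
gr <- GRanges("chr1", IRanges(start = c(1, 50, 95, 300), end = c(100, 150, 200, 400)))
filtered <- keep_low_overlap(gr, max_frac = 0.1)
filtered
```

Setting max_frac = 0 gives strictly non-overlapping regions. The loop is not optimized, so it may be slow for very large libraries.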
Does this work with a single .bed file?
Yes, bedmap will perform operations on one or two BED files.