Hello,
Is there a way in GenomicRanges or BEDTools to ask if SNPs in a file are included within the ranges specified in a BED file?
Thanks for your help.
cat snps.txt
1:15820:G:T
1:876499:A:G
1:887560:A:C
1:887801:A:G
1:888639:T:C
1:888659:T:C
1:889158:G:C
1:889159:A:C
1:897325:G:C
1:897738:C:T
1:906272:A:C
1:908823:G:A
1:909238:G:C
1:909309:T:C
1:909419:C:T
cat sequencedRegions.bed
CHROMOSOME START STOP LENGTH
chr1 92320598 92320634 36
chr1 196482330 196482379 49
chr2 32141903 32141946 43
chr6 33442452 33442494 42
chr8 25268844 25268885 41
chr10 5924750 5924801 51
chr10 28533997 28534034 37
chr11 117433440 117433471 31
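For readers without awk or BEDTools to hand, the same check can be sketched in plain Python. This is only an illustration, not the awk approach discussed in the replies below. It makes three assumptions: the BED file uses 0-based, half-open coordinates (consistent with LENGTH = STOP - START in the file above), the SNP positions are 1-based, and the chr prefix has to be stripped so that chr1 matches 1. The function names and the extra range chr1 887500 888000 are hypothetical; the range is added only so that something overlaps.

```python
from collections import defaultdict


def parse_bed(lines):
    """Return {chrom: [(start, end), ...]}, skipping blank lines and a header row."""
    regions = defaultdict(list)
    for line in lines:
        fields = line.split()
        if len(fields) < 3 or not fields[1].isdigit():
            continue  # blank line or the CHROMOSOME/START/STOP header
        chrom = fields[0]
        if chrom.startswith("chr"):
            chrom = chrom[3:]  # 'chr1' -> '1' so it matches the SNP IDs
        regions[chrom].append((int(fields[1]), int(fields[2])))
    return regions


def snp_in_regions(snp_id, regions):
    """True if a SNP ID of the form chrom:pos:ref:alt falls inside any region."""
    chrom, pos = snp_id.split(":")[:2]
    pos = int(pos)
    # Assumed convention: BED start is 0-based and end is exclusive, so a
    # 1-based position is covered when start < pos <= end.
    return any(start < pos <= end for start, end in regions.get(chrom, []))


bed_lines = [
    "CHROMOSOME START STOP LENGTH",
    "chr1 92320598 92320634 36",
    "chr1 887500 888000 500",  # hypothetical range, not in the original file
]
snps = ["1:15820:G:T", "1:887560:A:C", "1:887801:A:G", "1:888639:T:C"]

regions = parse_bed(bed_lines)
hits = [snp for snp in snps if snp_in_regions(snp, regions)]
print(hits)
```

This linear scan over every range is fine for a few thousand records; for whole-genome inputs, sorted intervals or an interval tree (which is what BEDTools and GenomicRanges use internally) would scale better.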
Thanks for your response! I'm trying out your solution, but for awk I get the error: awk: invalid -v option. Is this something to do with my version of awk on Mac perhaps?
On Mac, you might install GNU awk (gawk) via Homebrew: brew install gawk and just replace awk with gawk.
On rechecking, it looks like brew install gawk symlinks to awk, so that you can just use awk directly.
OS X ships with a BSD-specific build of awk, which results in some options being different or unavailable. This is an issue with BSD sed, as well. It can make running Unix scripts on OS X a bit of a pain in the ass, but the Homebrew gawk and coreutils packages are popular ways of dealing with this, by installing the GNU kit.
This is along the same lines as @Alex's code. snps.txt and sequencedRegions.bed are from the OP. I added a range to sequencedRegions.bed, because none of the ranges in the OP's sequencedRegions.bed intersect the SNP records.
Input:
output: