Hello,
Is there a way in GenomicRanges or BEDTools to ask if SNPs in a file are included within the ranges specified in a BED file?
Thanks for your help.
cat snps.txt
1:15820:G:T
1:876499:A:G
1:887560:A:C
1:887801:A:G
1:888639:T:C
1:888659:T:C
1:889158:G:C
1:889159:A:C
1:897325:G:C
1:897738:C:T
1:906272:A:C
1:908823:G:A
1:909238:G:C
1:909309:T:C
1:909419:C:T
cat sequencedRegions.bed
CHROMOSOME START STOP LENGTH
chr1 92320598 92320634 36
chr1 196482330 196482379 49
chr2 32141903 32141946 43
chr6 33442452 33442494 42
chr8 25268844 25268885 41
chr10 5924750 5924801 51
chr10 28533997 28534034 37
chr11 117433440 117433471 31
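One way to check this with plain awk is sketched below. It is only a sketch under a few assumptions, and not necessarily what other replies in this thread used: SNP positions in snps.txt are taken to be 1-based, BED intervals are taken to be 0-based and half-open (so a SNP at position p falls inside an interval when START < p <= STOP), and the SNP chromosome names need a "chr" prefix to match the BED file. The sample files are cut down from the question, the interval chr1 887000 889000 is hypothetical and added only so that something overlaps, and the output name snps_in_regions.txt is made up for illustration.

```shell
#!/bin/sh
# A few SNPs from the question (CHR:POS:REF:ALT).
cat > snps.txt <<'EOF'
1:15820:G:T
1:876499:A:G
1:887560:A:C
1:887801:A:G
1:897325:G:C
EOF

# BED file with the header line from the question.
# The second interval is HYPOTHETICAL, added so that some SNPs overlap.
cat > sequencedRegions.bed <<'EOF'
CHROMOSOME START STOP LENGTH
chr1 92320598 92320634 36
chr1 887000 889000 2000
EOF

awk '
  # First file (BED): skip a non-numeric header, store intervals per chromosome.
  NR == FNR {
    if (FNR == 1 && $2 !~ /^[0-9]+$/) next
    n[$1]++
    lo[$1, n[$1]] = $2 + 0   # 0-based start (assumed)
    hi[$1, n[$1]] = $3 + 0   # exclusive end (assumed)
    next
  }
  # Second file (SNPs): split on ":", add the "chr" prefix, test each interval.
  {
    split($0, f, ":")
    chrom = "chr" f[1]
    pos = f[2] + 0           # 1-based position (assumed)
    for (i = 1; i <= n[chrom]; i++) {
      if (pos > lo[chrom, i] && pos <= hi[chrom, i]) {
        print $0             # report the SNP once, then stop searching
        break
      }
    }
  }
' sequencedRegions.bed snps.txt > snps_in_regions.txt

cat snps_in_regions.txt
```

With the sample files above, this prints 1:887560:A:C and 1:887801:A:G. The script does not use the -v option, so it should not depend on GNU-specific awk behaviour. The linear scan over intervals is fine for small files like these; for large inputs, converting the SNPs to BED and using bedtools intersect with the -u option is likely the better route (the BED header line would probably need to be removed or commented out first).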
Thanks for your response! I'm trying out your solution, but for awk I get the error: awk: invalid -v option. Is this something to do with my version of awk on Mac, perhaps?

On Mac, you might install GNU awk (gawk) via Homebrew: brew install gawk, and just replace awk with gawk.

On rechecking, it looks like brew install gawk symlinks to awk, so that you can just use awk directly.

OS X ships with a BSD-specific build of awk, which results in some options being different or unavailable. This is an issue with BSD sed, as well. It can make running Unix scripts on OS X a bit of a pain in the ass, but the Homebrew gawk and coreutils packages are popular for dealing with this, since they install the GNU versions of these tools.

This is along the same lines as @Alex's code. snps.txt and sequencedRegions.bed are from the OP. I added a range to sequencedRegions.bed, because the sequencedRegions.bed in the OP doesn't intersect any of the SNP records.
Input:
Output: