If you want to sample N elements without replacement from your set of input BED elements, replace N with your value:
$ shuf input.bed | head -n N | sort-bed - > answer.bed
Or you can use my sample program on GitHub: https://github.com/alexpreynolds/sample
$ sample -k N -p input.bed > answer.bed
This also offers sampling with replacement and some other features that might be useful, depending on the experiment.
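For anyone who would rather do this in a short script, here is a minimal Python sketch of the same idea: draw N lines without replacement, then sort them by chromosome, start and end. The function name `sample_bed_lines` and the demo records are made up for illustration; this is not how shuf or sample work internally, and the sort key only approximates sort-bed ordering (lexicographic chromosome name, numeric coordinates).

```python
import random


def sample_bed_lines(lines, n, seed=None):
    """Return n BED lines drawn without replacement, sorted by chrom/start/end.

    Hypothetical helper for illustration; assumes tab-delimited BED lines
    with at least three columns.
    """
    rng = random.Random(seed)
    records = [ln.rstrip("\n") for ln in lines if ln.strip()]
    if n > len(records):
        raise ValueError("cannot draw more elements than the input contains")
    picked = rng.sample(records, n)  # sampling without replacement

    def key(rec):
        fields = rec.split("\t")
        # Chromosome name as a string, coordinates as integers.
        return (fields[0], int(fields[1]), int(fields[2]))

    return sorted(picked, key=key)


# Made-up demo records; with a real file, pass open("input.bed") instead.
demo = ["chr1\t100\t200\n", "chr1\t10\t20\n", "chr2\t5\t50\n", "chr10\t0\t9\n"]
subset = sample_bed_lines(demo, 2, seed=1)
```

Passing a fixed seed makes a draw reproducible, which helps when an experiment needs to be rerun.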
If you want to sample N elements without replacement from a population of individual bases made up from your set of input BED elements, use the bedops --chop 1 command to split your input BED file into individual bases, then do the usual shuf or sample operation:
$ bedops --chop 1 input.bed > perBaseInput.bed
$ sample -k N -p perBaseInput.bed > perBaseAnswer.bed
You could pipe the per-base output from bedops into shuf directly, but if you want to draw multiple samples, it can be a waste of time to regenerate the per-base data each time. Saving to a separate file makes for easy re-use (re-sampling).
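If the per-base file would be too large to keep around, one alternative is to sample base offsets directly and map them back to intervals, so the per-base data is never written out. The sketch below is a rough illustration of that idea and not part of BEDOPS or sample. The name `sample_bases` is hypothetical, and it assumes the input intervals do not overlap (merge them first if they do), since otherwise shared bases would be counted more than once.

```python
import bisect
import random


def sample_bases(intervals, n, seed=None):
    """Draw n distinct single-base BED elements from non-overlapping intervals.

    intervals: list of (chrom, start, end) tuples, 0-based half-open as in BED.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    # cumulative[i] = total number of bases in intervals[0] .. intervals[i]
    cumulative = []
    total = 0
    for _chrom, start, end in intervals:
        total += end - start
        cumulative.append(total)
    if n > total:
        raise ValueError("cannot draw more bases than the intervals contain")

    bases = []
    # Each offset in [0, total) names exactly one base, so sampling offsets
    # without replacement is uniform over all bases combined.
    for offset in rng.sample(range(total), n):
        i = bisect.bisect_right(cumulative, offset)  # interval holding this offset
        chrom, start, _end = intervals[i]
        before = cumulative[i - 1] if i > 0 else 0
        pos = start + (offset - before)
        bases.append((chrom, pos, pos + 1))
    return sorted(bases)


# Made-up demo regions covering 15 bases in total.
regions = [("chr1", 10, 20), ("chr2", 0, 5)]
picked = sample_bases(regions, 4, seed=7)
```

Because longer intervals own more offsets, they are picked proportionally more often, which matches sampling from the chopped per-base population.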
How does your BED file differ from the genome file that is required for bedtools random?
bedtools random takes chromosome sizes as input:
<chromName><TAB><chromSize>
Is it required that the tool be pre-made, or are you happy with a small R or Python script? Also, when you say "pick N positions", do you mean from each of the regions or from all of the regions combined? I assume the latter, but this is ambiguous.
From all the regions combined.
I just wonder whether such a tool exists. I can indeed write a small script that uses bedtools random and checks whether the positions fall in the regions in the BED file.
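A rough pure-Python stand-in for that generate-and-check idea might look like the sketch below. It does not call bedtools; it draws positions from a chromosome-size table (weighting chromosomes by size) and keeps only those that land inside a region. The names `rejection_sample` and `max_draws` and the demo sizes are invented for illustration.

```python
import random


def rejection_sample(chrom_sizes, regions, n, seed=None, max_draws=1_000_000):
    """Collect n distinct (chrom, pos) positions that fall inside regions.

    chrom_sizes: dict of chromosome name -> length.
    regions: list of (chrom, start, end), 0-based half-open as in BED.
    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    chroms = list(chrom_sizes)
    weights = [chrom_sizes[c] for c in chroms]
    picked = set()
    draws = 0
    while len(picked) < n:
        if draws >= max_draws:
            raise RuntimeError("gave up: too few draws landed inside the regions")
        draws += 1
        # Choose a chromosome in proportion to its size, then a position on it.
        chrom = rng.choices(chroms, weights=weights)[0]
        pos = rng.randrange(chrom_sizes[chrom])
        # Keep the position only if some region contains it (linear scan).
        if any(c == chrom and s <= pos < e for c, s, e in regions):
            picked.add((chrom, pos))
    return sorted(picked)


# Made-up demo data: two small chromosomes and two regions.
sizes = {"chr1": 100, "chr2": 50}
regions = [("chr1", 10, 20), ("chr2", 0, 5)]
hits = rejection_sample(sizes, regions, 3, seed=42)
```

This approach wastes most draws when the regions cover only a small fraction of the genome, so sampling directly from the regions is usually the cheaper option.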