9.3 years ago
QVINTVS_FABIVS_MAXIMVS
I've been using this command for the past few days now and I wish I had set it up earlier.
It functions similarly to sortBed in bedtools but uses the standard UNIX sort, which can be used in pipes.
Find your .bashrc file in your home directory
$ cd $HOME
$ vi .bashrc # vi ~/.bashrc should work fine from any directory
Add this to .bashrc
alias sortbed="sort -k1,1 -k2,2g "
Save and source your .bashrc to get it to work
$ source ~/.bashrc
Example of use
$ intersectBed -a in.bed -b /segDup_unmappable.bed -wao | sortbed | uniq | cut -f 1,2,3,8 > in_segDup_unmappable_overlap.txt
This intersects a BED file of chr, start, end with a list of segmental duplications and unmappable regions in hg19. The output is then piped through standard UNIX commands to keep only the positions from in.bed and the number of base pairs overlapping each one.
sortbed is used to sort the output and uniq is applied to return only unique lines. You can treat sortbed like sort. Just a nice shortcut I thought others might like.
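To see what the alias does on its own, here is a minimal sketch on made-up data. The file names demo.bed and demo.sorted.bed and the records in them are hypothetical, and the alias is spelled out in full because bash does not expand aliases in non-interactive scripts unless expand_aliases is turned on.

```shell
# Hypothetical three-column BED records, deliberately out of order,
# with one duplicated line.
printf 'chr2\t500\t600\nchr1\t2000\t2100\nchr1\t150\t300\nchr1\t150\t300\n' > demo.bed

# Same keys as the sortbed alias: chromosome compared as text,
# then start coordinate compared as a number.
# uniq then drops the duplicate, which sorting has made adjacent.
sort -k1,1 -k2,2g demo.bed | uniq > demo.sorted.bed

cat demo.sorted.bed
```

The result should list the two chr1 records in order of start coordinate, followed by the chr2 record, with the duplicate line removed.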
Like I said, bedtools sort gives the UNIX command to sort bed files; I don't see any advantage in your approach.
and set LC_ALL=C to make things faster.
BEDOPS sort-bed works faster at sorting BED files than GNU sort, and you can pipe data in and out via standard UNIX streams.
Unlike other tools, it also handles arbitrary numbers of columns and can be assigned a chunk of memory, to sort very large BED files that will not otherwise fit into system memory.
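The LC_ALL=C suggestion above could look something like this with plain sort. This is only a sketch: the file names and records are made up, and -k2,2n is used for the start column here as an assumption (the alias in the original post uses -k2,2g).

```shell
# Hypothetical input; any tab-delimited BED file would do.
printf 'chr1\t300\t400\nchr1\t20\t30\nchrX\t5\t10\n' > demo_c.bed

# Setting LC_ALL=C in front of the command affects this one sort call only.
# It makes sort compare raw bytes instead of applying locale collation rules.
LC_ALL=C sort -k1,1 -k2,2n demo_c.bed > demo_c.sorted.bed

cat demo_c.sorted.bed
```

Putting the variable in front of the command keeps the rest of the shell session in its usual locale, which may matter if other tools in the pipe depend on it.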
Add version sort (the V ordering option in GNU sort, e.g. -k1,1V) to the first key so that the chromosomes are sorted correctly, with chr2 before chr10.
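A minimal sketch of version sort on the first key, assuming GNU sort's V ordering option is what is meant here. The file names and records are made up for illustration, and -k2,2n on the start column is my own choice rather than something from the thread.

```shell
# Hypothetical records: chr10 would come before chr2 in a plain text sort.
printf 'chr10\t100\t200\nchr2\t300\t400\nchr1\t50\t60\nchr2\t30\t40\n' > demo_v.bed

# V on the first key compares embedded numbers by value,
# so the chromosomes come out as chr1, chr2, chr10.
# The second key orders records within a chromosome by start coordinate.
sort -k1,1V -k2,2n demo_v.bed > demo_v.sorted.bed

cat demo_v.sorted.bed
```

Note that some downstream tools expect the plain lexicographic chromosome order instead, so it is worth checking which order the next step in the pipeline wants.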