HOMER parallel annotation for big .bed file
6.5 years ago
gamma.jian ▴ 40

Dear all, I'm trying to annotate a huge file with HOMER, since I need information about a few million sites. I would like to parallelize this process in batches of, say, 10000 entries of my .bed file. Is there a straightforward way to do so? I tried to get this done with GNU parallel, but I can't figure out whether and how I can pass arguments through a pipe to HOMER's annotatePeaks.pl command.

annotatePeaks.pl mybig.bed hg19 > output.txt

The idea would be to split the .bed file into N pieces, run multiple jobs (both in parallel and in sequence), and then combine all their annotations into a single output. It might be trivial, but I'm really confused about argument piping in this context. The other option would be to write a bash script that creates those pieces as files and only then iterates through them by name, but I was looking for something more elegant. Thank you in advance.

ChIP-Seq • 1.8k views
6.5 years ago
gamma.jian ▴ 40

I solved this using split and then parallel, and then merged the annotated files again downstream. Note: each annotated file contains its own header line, which should be removed before merging! I'm sure there are more elegant solutions, but this works!

    split -l 50000 ./../Big_bed.bed
    ls * | parallel -j 10 'annotatePeaks.pl {} hg19 > ./../anno_chunks/{.}_annotated.txt'
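
For the merge step mentioned above, here is a minimal sketch that concatenates the annotated chunks while keeping only the header line of the first file. The function name `merge_annotated` and the output file name are made up for illustration; it assumes every chunk starts with exactly one header line.

```shell
# merge_annotated OUT FILE...
# Concatenate annotated chunks into OUT, keeping only the first header line.
# (merge_annotated is a hypothetical helper name, not part of HOMER.)
merge_annotated() {
    out=$1
    shift
    # FNR == 1 is the first line of each input file (its header);
    # NR != 1 excludes the very first line overall, so one header survives.
    awk 'FNR == 1 && NR != 1 { next } { print }' "$@" > "$out"
}
```

Possible usage, with the output file written outside the chunk directory so it is not picked up by the glob: `merge_annotated ./../all_annotated.txt ./../anno_chunks/*_annotated.txt`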
6.5 years ago
ole.tange ★ 4.5k

Can you test if this works, too:

parallel -a ../Big_bed.bed --pipe-part --block -1 --fifo \
  annotatePeaks.pl {} hg19 '>' ./../anno_chunks/{#}_annotated.txt

or this (slower):

parallel -a ../Big_bed.bed --pipe-part --block -1 --cat \
  annotatePeaks.pl {} hg19 '>' ./../anno_chunks/{#}_annotated.txt
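
If GNU parallel is not installed, or the installed version does not accept these options, the same split-then-annotate pattern can be sketched with `split` and `xargs -P` instead. This swaps GNU parallel for xargs and is only a rough equivalent: the function name `annotate_chunks`, the `ANNOTATE_CMD` override, the directory layout, and the default chunk size and job count are all assumptions made for illustration.

```shell
# annotate_chunks BED GENOME OUTDIR [LINES_PER_CHUNK] [JOBS]
# Split BED into chunks and annotate them concurrently with xargs -P.
# ANNOTATE_CMD is a hypothetical override hook; it defaults to annotatePeaks.pl.
annotate_chunks() (
    bed=$1
    genome=$2
    outdir=$3
    lines=${4:-50000}
    jobs=${5:-4}
    cmd=${ANNOTATE_CMD:-annotatePeaks.pl}

    mkdir -p "$outdir/chunks" "$outdir/anno" || exit 1

    # Split on line boundaries so that no BED record is cut in half.
    split -l "$lines" "$bed" "$outdir/chunks/chunk_" || exit 1

    # Feed one chunk path per line to xargs, which runs up to $jobs
    # annotation commands at once; each chunk gets its own output file.
    for f in "$outdir"/chunks/chunk_*; do
        printf '%s\n' "$f"
    done | xargs -P "$jobs" -I{} sh -c \
        '"$1" "$2" "$3" > "$4/anno/$(basename "$2")_annotated.txt"' \
        sh "$cmd" {} "$genome" "$outdir"
)
```

A call such as `annotate_chunks ../Big_bed.bed hg19 ../anno_run 50000 10` would leave one annotated file per chunk in `../anno_run/anno`, each still carrying its own header line.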

Thank you for your answer, and sorry for the late reply. I tried both solutions but got the same error:

Died at /usr/bin/parallel line 241

I was not able to troubleshoot this...
