Dear all, I'm trying to annotate a huge file with HOMER, since I need information about a few million sites. I would like to parallelize the process in batches of, say, 10,000 lines of my .bed file. Is there a straightforward way to do so? I tried to get this done with GNU parallel, but I can't figure out if and how I can pass arguments through a pipe to HOMER's annotatePeaks.pl command:
annotatePeaks.pl mybig.bed hg19 > output.txt
The idea would be to split the .bed file into N pieces, run multiple jobs (both in parallel and in sequence), and then merge their annotations into a single output. It might be trivial, but I'm really confused about argument piping in this context. The other option would be to write a bash script that creates those pieces as files and only then iterates over them by name, but I was looking for something more elegant. Thank you in advance.
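Something along these lines is what I'm after (an untested sketch; it assumes GNU parallel's --pipe/--cat options, that annotatePeaks.pl writes its annotation table to stdout with progress on stderr, and that each run's header line starts with "PeakID"):

# Feed mybig.bed to GNU parallel in blocks of 10,000 lines. --cat writes each
# block to a temporary file and substitutes its name for {}, since
# annotatePeaks.pl expects a file name rather than stdin; -k keeps the
# output blocks in input order.
parallel --pipe --cat -N 10000 -k 'annotatePeaks.pl {} hg19' < mybig.bed \
    | awk 'NR == 1 || $0 !~ /^PeakID/' > output.txt

The awk step keeps only the first header line, because every parallel job would otherwise emit its own. The file-based alternative I mentioned would be the same idea spelled out: split -l 10000 mybig.bed chunk_ to create the pieces, then parallel 'annotatePeaks.pl {} hg19 > {}.ann' ::: chunk_* and a merge of the .ann files.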
Thank you for your answer, and sorry for the late reply. I tried both solutions but got the same error:
I was not able to troubleshoot it...