dbgh5 does not accept input from pipe [-] or from <(cat )
dbgh5 -in <(zcat big_file.fastq.gz another_huge.fq.gz ...) ...
EXCEPTION: Empty bank
It works fine if the uncompressed or compressed input is given without process substitution, either one file at a time or after concatenating the files beforehand.
The problem is that the temporary concatenated file becomes humongous when there are lots of huge files, and it takes time to create.
Is there some reason not to support this? It would be good to avoid using extra disk space in some cases.
Or is there something that I am missing here?
Hello,
I think it's not possible to do so because the dbgh5 command actually reads the input file several times.
Since you can't rewind a pipe (see here), there is no way right now to use pipes with dbgh5.
I assume you are using DiscoSNP? Check this post: DiscoSNP++ 2.2.0 problem
Not that, by some magic, the program authors implemented random access to gz files...
Here I talk about the dbgh5 command itself (DiscoSNP uses this command to build a de Bruijn graph from the input reads).
For the record, the -in parameter of dbgh5 can be one of the following:
However, a named pipe should not work here because of the several passes over the -in parameter (in other words, the pipe would be consumed during the first pass, leaving nothing to read for the other passes).
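To illustrate why a second pass over a pipe finds nothing, here is a minimal shell sketch. It is not how dbgh5 reads its input: wc -l merely stands in for "one pass over the reads", and the function name two_passes and the four-line FASTQ record are made up for the example.

```shell
#!/bin/sh
# Two "passes" over the same stream, with wc -l standing in for one pass.
# Assumption: this only mimics a multi-pass reader; it is not dbgh5's code.
two_passes() {
    first=$(wc -l | tr -d ' ')    # pass 1 reads the stream until end-of-file
    second=$(wc -l | tr -d ' ')   # pass 2 starts at end-of-file: a pipe cannot be rewound
    echo "pass1=$first pass2=$second"
}

# Hypothetical four-line FASTQ record, standing in for the zcat output.
pipe_result=$(printf '@read1\nACGT\n+\nIIII\n' | two_passes)
echo "pipe: $pipe_result"

# The same data in a regular file can be read twice, because each pass reopens it.
tmp=$(mktemp)
printf '@read1\nACGT\n+\nIIII\n' > "$tmp"
file_first=$(wc -l < "$tmp" | tr -d ' ')
file_second=$(wc -l < "$tmp" | tr -d ' ')
file_result="pass1=$file_first pass2=$file_second"
echo "file: $file_result"
rm -f "$tmp"
```

Through the pipe, the first pass sees all 4 lines and the second sees 0. With the regular file, both passes see 4. A multi-pass reader handed a pipe or process substitution would therefore find an empty stream on any pass after the first.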