Hi, I have been working on a small pipeline and am looking for a more efficient way to do things. Currently, these are two of the commands:
chromap --preset hic -x INLUP00233.hap1.fasta.index -r INLUP00233.hap1.fasta -1 INLUP00233_1.fq.gz -2 INLUP00233_2.fq.gz --SAM -o INLUP0023.sam -t 16
bgzip -c@ 16 INLUP0023.sam | samtools view -bS -@ 16 | samtools sort -n -@ 16 | samtools view -h | sed -e 's/\/.//' | samtools view -bS -o INLUP00233.bam -@ 16
I was wondering whether there is a way to get the --SAM output from chromap piped directly into samtools view -bS -@ 16 to get a BAM out of it for the following operations. It seems wasteful in terms of time to pass through bgzip, compressing the SAM stream only so that it can then be converted to BAM format.
I've done a few tests, e.g. adding a - or /dev/stdin at the end of samtools view -bS -@ 16, and even changing the command to samtools view INLUP0023.sam -bS -@ 16, which I knew for a fact wouldn't work but tried anyway since I was out of options. If anyone has ideas on how to do this correctly, any help is much appreciated!
@cmdcolin thanks a lot! Indeed, that works perfectly; I came across it and experimented with it as well, and in general cases it functions just fine. However, I have to deal with an sh environment called within my script to filter out some files from the analysis done with chromap. For this reason, the trick with /dev/stdout stopped working in that situation, and the mapping aborted after three or four iterations. Hence, I'm looking for an alternative that can pipe the SAM written via the -o flag directly into samtools to get a BAM out of it, if there is one. Thanks again!
I'm not exactly sure what situation you are running into, and I don't have experience with sh. However, carefully read the comments at https://github.com/haowenz/chromap/issues/150. They note that there was an issue with temp files, which would likely overwrite each other if chromap was run repeatedly (in parallel, for example), so maybe you ran into that. They said it's fixed on master, so you could try running the master branch. Alternatively, comment on that issue to add more details about your problem.
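For reference, the /dev/stdout trick discussed in this exchange can be sketched as a small sh function. This is only an illustration: map_to_bam and the CHROMAP and SAMTOOLS variables are hypothetical names introduced here, the index is assumed to be named after the reference as in the question, and the sketch does not address the abort seen after a few iterations.

```shell
#!/bin/sh
# Hypothetical wrapper: map one sample with chromap and convert the SAM
# stream to BAM without writing an intermediate SAM file.
# CHROMAP and SAMTOOLS are assumed variable names that allow the
# binaries to be overridden; they default to the tools found on PATH.
CHROMAP=${CHROMAP:-chromap}
SAMTOOLS=${SAMTOOLS:-samtools}

# Usage: map_to_bam SAMPLE_PREFIX REFERENCE_FASTA [THREADS]
map_to_bam() {
    sample=$1
    ref=$2
    threads=${3:-16}

    # chromap writes SAM to /dev/stdout, which here is the pipe;
    # samtools reads SAM from stdin ("-") and writes BAM to SAMPLE_PREFIX.bam.
    "$CHROMAP" --preset hic -x "$ref.index" -r "$ref" \
        -1 "${sample}_1.fq.gz" -2 "${sample}_2.fq.gz" \
        --SAM -o /dev/stdout -t "$threads" |
    "$SAMTOOLS" view -b -@ "$threads" -o "$sample.bam" -
    # The function returns the exit status of the last command in the
    # pipeline (samtools), so a chromap failure is not reported by it.
}
```

With the file names from the question, the call would be map_to_bam INLUP00233 INLUP00233.hap1.fasta 16.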
Sure, I will look more carefully into it and eventually provide a more detailed explanation of what I'm trying to do.
I haven't tested this, but how about using mkfifo to create a named pipe and passing that as the -o option for chromap? You can add & at the end of the command so that chromap runs in the background. Then samtools just reads from the pipe file.
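The mkfifo idea above could look roughly like the following sketch. It is untested with chromap, as noted, and via_fifo is a hypothetical helper introduced only for illustration: it starts a writer command in the background, runs a reader command in the foreground, and passes both the pipe path in $FIFO.

```shell
#!/bin/sh
# Sketch only: via_fifo is a hypothetical helper, not part of chromap
# or samtools. It runs WRITER and READER concurrently, connected by a
# named pipe whose path is exported to both commands as $FIFO.
# Usage: via_fifo WRITER_COMMAND READER_COMMAND
via_fifo() {
    writer=$1
    reader=$2

    # A fresh temporary directory per call keeps pipe paths from
    # colliding when several runs are active at once.
    tmpdir=$(mktemp -d) || return 1
    FIFO="$tmpdir/stream.sam"
    export FIFO
    mkfifo "$FIFO" || { rm -rf "$tmpdir"; return 1; }

    # Start the writer in the background; opening the pipe for writing
    # blocks until the reader opens it for reading.
    sh -c "$writer" &
    writer_pid=$!

    # Run the reader in the foreground; it sees end-of-file once the
    # writer closes the pipe.
    sh -c "$reader"
    reader_status=$?

    # If the reader failed it may never have opened the pipe, leaving
    # the writer blocked, so try to stop it (best effort).
    if [ "$reader_status" -ne 0 ]; then
        kill "$writer_pid" 2>/dev/null
    fi
    wait "$writer_pid"
    writer_status=$?

    rm -rf "$tmpdir"
    [ "$writer_status" -eq 0 ] && [ "$reader_status" -eq 0 ]
}

# Intended use (file names taken from the question):
# via_fifo \
#   'chromap --preset hic -x INLUP00233.hap1.fasta.index -r INLUP00233.hap1.fasta -1 INLUP00233_1.fq.gz -2 INLUP00233_2.fq.gz --SAM -o "$FIFO" -t 16' \
#   'samtools view -b -@ 16 -o INLUP00233.bam "$FIFO"'
```

Unlike a plain shell pipeline in sh, this reports failure if either side exits with a non-zero status, because the writer's status is collected with wait.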
I do have a plan to formalize support for stdout in Chromap. Could you please open an issue on GitHub and elaborate a bit more on the problem with the sh environment? With that information, I can make sure my implementation also works for you, and we may also bother you with testing over there :).
@mourisl oh thanks a lot! I didn't think this was worthy of a GitHub issue, since the tool works perfectly and my use is very specific, for a niche case. However, since you recommended it, I will explain my case in more detail on GitHub!
P.S. No need to worry about the testing, I'm happy to help.