Hello,
HISAT2 can only output SAM files, which can be quite large. I found an option to output only reads that mapped to the reference, but the files can still be hundreds of GB.
Is there any problem with converting directly to BAM like this:
${hisat2}/hisat2 -p 4 --rg-id=${4} -x $l --dta ${strand} --no-unal -1 $2 -2 $3 -U $4 | samtools view -Sbh > hisat2_output.bam
I am specifically asking about piping stdout to samtools view to convert to bam.
Thanks.
OK, it makes sense to pipe to sort, as swbarnes2 and you suggested. What is the impact on RAM usage? Piping to an unsorted BAM can be done on the fly. Does sorting require loading the whole output into memory?
samtools sort has a -m option that specifies how much memory to use before it spills data to disk as intermediate/temporary files. The default is 768 MB per thread, I think.
Hi, I have a problem with this command. It works when I run it from a bash script, but not on the command line. On the command line I get the error message: -sh: syntax error near unexpected token `(' . I can't figure out what is happening. Does anybody know what is wrong?
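For reference, here is a minimal sketch of the pipe-to-sort approach discussed above, with the sort memory capped through -m. The function names, the argument order, and the file names in the usage line are hypothetical; the hisat2 and samtools options are the standard ones, but treat the whole thing as a starting point rather than a tested pipeline.

```shell
#!/usr/bin/env bash
set -o pipefail  # report a failure if hisat2 fails, not only if samtools does

# Rough upper bound on sort memory in MiB: threads x per-thread limit (-m).
# samtools spills sorted chunks to temporary files once this is reached.
sort_ram_mib() {
  local threads=$1 mib_per_thread=$2
  echo $(( threads * mib_per_thread ))
}

# Hypothetical wrapper: align paired reads and write a coordinate-sorted BAM.
# Arguments: index prefix, R1 FASTQ, R2 FASTQ, output BAM,
#            threads (default 4), MiB per sort thread (default 768).
align_to_sorted_bam() {
  local index=$1 r1=$2 r2=$3 out=$4 threads=${5:-4} mib=${6:-768}
  hisat2 -p "$threads" -x "$index" --dta --no-unal -1 "$r1" -2 "$r2" \
    | samtools sort -@ "$threads" -m "${mib}M" -T "${out%.bam}.tmp" -o "$out" -
}
```

Example call (placeholder file names): align_to_sorted_bam genome_index reads_1.fq.gz reads_2.fq.gz hisat2_output.sorted.bam 4 768. With these settings sort_ram_mib 4 768 gives roughly 3 GiB as the most the sort step should hold in memory, so the whole output never has to fit in RAM.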
Let me answer my own question. The problem was caused by the "sh" shell. If you want to do this, be sure to use one of bash/ksh/zsh; plain "sh" won't work.
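To illustrate the shell difference: the failing command is not shown in this thread, so the following assumes it used process substitution, <( ... ), which is what a complaint about an unexpected `(' under sh usually points to. Process substitution is supported by bash, ksh and zsh but is not part of POSIX sh. The helper name run_in_bash is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command string under bash explicitly,
# regardless of which login shell the terminal session is using.
run_in_bash() {
  bash -c "$1"
}

# A command that relies on process substitution; under a strict POSIX sh
# this form is a syntax error, under bash it is valid.
run_in_bash 'cat <(printf "ok\n")'
```

Running echo $0 in the terminal shows which shell is in use; if it reports -sh or sh, starting bash first (or wrapping the command as above) should avoid the syntax error.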