Question

Samtools Pipelines

1

11.8 years ago

Geparada ★ 1.5k

The most common output of a mapper is a SAM file, but in order to process or visualize it, I need a sorted BAM file and its index. I usually work with a group of alignment files, so I use a pipeline like this:

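for i in *.sam
do
    name=${i%.sam}
    # SAM -> BAM, sort, index, then delete the intermediates.
    # Legacy sort syntax: the second argument is an output prefix, so this writes $name.sort.bam
    samtools view -Sb "$name.sam" -o "$name.bam" &&
        samtools sort "$name.bam" "$name.sort" &&
        samtools index "$name.sort.bam" "$name.sort.bam.bai" &&
        rm "$name.sam" "$name.bam"
done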

The problem with this kind of pipeline is that it uses a lot of disk space while it is running. I know there is a way to use temporary files in the pipeline to avoid the excessive disk usage. Can you tell me how to use temporary files instead of writing and deleting intermediate files (like $name.sam and $name.bam)?

Thanks for your time, Cheers

samtools ngs pipeline • 8.1k views

ADD COMMENT • link updated 11.8 years ago by matted 7.8k • written 11.8 years ago by Geparada ★ 1.5k

score 4 · Answer 1 · 2013-07-12

4

11.8 years ago

Pierre Lindenbaum 166k

Instead of using samtools view+sort+index you could use picard/SortSam:

java -jar SortSam.jar I=$name.sam O=$name.bam SORT_ORDER=coordinate MAX_RECORDS_IN_RAM=50000000 CREATE_INDEX=true

ADD COMMENT • link 11.8 years ago by Pierre Lindenbaum 166k

score 3 · Answer 2 · 2013-07-13

3

11.8 years ago

matted 7.8k

You can pipe the SAM output from the aligner directly to samtools view, and then pipe from there into samtools sort. That will create the sorted BAM on disk without using a ton of extra disk space (there will be some extra temporary files from sort, but that's about it). Here's an example using bwa:

bwa aln ref.fa reads.fq | bwa samse ref.fa - reads.fq | samtools view -bS - | samtools sort -o -m 2G -@ 8 - sorted > sorted.bam

You can index sorted.bam afterwards with no trouble.
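The command above uses the samtools 0.1.x form of sort, where -o means "write to stdout". For readers on a newer release, here is a minimal sketch of the same idea, assuming samtools 1.3 or later is on PATH. The helper name sam_to_sorted_bam and the thread and memory values are illustrative assumptions, not part of the original answer.

```shell
# Sketch: turn one SAM file into a sorted, indexed BAM with samtools 1.x.
# Assumption: samtools >= 1.3, where `sort -o` names the output file.
# sam_to_sorted_bam is a hypothetical helper name chosen for this example.
sam_to_sorted_bam() {
    sam="$1"
    name="${sam%.sam}"
    # sort accepts SAM input directly and writes BAM, so no unsorted BAM
    # is ever written; -T sets the prefix for sort's temporary files.
    samtools sort -@ 4 -m 1G -T "$name.tmp" -o "$name.sort.bam" "$sam" &&
        samtools index "$name.sort.bam"
}
```

It could be called once per file, for example `for f in *.sam; do sam_to_sorted_bam "$f"; done`, or the aligner could be piped straight into `samtools sort ... -` to skip the SAM file as well.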

ADD COMMENT • link 11.8 years ago by matted 7.8k

1

That's only sensible for the shortest of alignment tasks. If, for whatever reason, any of the steps fails you get no or erroneous output.

Plus, trying to debug problems in that nested pipe is a nightmare, as you've no idea which step is erroring, especially given samtools' poor error messages. I moved to Picard shortly after trying to debug a simpler samtools pipe.
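For what it's worth, bash can report which stage of a pipe failed through its PIPESTATUS array. The sketch below is bash-specific and uses `true` and `false` as stand-ins for the aligner and samtools stages, so it only illustrates the mechanism rather than a real alignment run.

```shell
#!/usr/bin/env bash
# Sketch: find out which stage of a pipeline failed (bash only).
# `true` and `false` are stand-ins for real stages such as bwa or samtools.

# With pipefail, the pipeline as a whole returns non-zero if any stage fails,
# so a following `&&` step would not run on a broken result.
set -o pipefail

true | false | true
# Copy PIPESTATUS immediately; the next command overwrites it.
stage_status=("${PIPESTATUS[@]}")

report=""
for idx in "${!stage_status[@]}"; do
    if [ "${stage_status[$idx]}" -ne 0 ]; then
        # Stage numbers are zero-based, left to right.
        report="stage $idx exited with status ${stage_status[$idx]}"
        echo "$report"
    fi
done
```

This does not make the error messages any better, but it does at least narrow a failure down to one command in the pipe.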

ADD REPLY • link 11.8 years ago by Chris Cole ▴ 800

0

Ok, that is a nightmare. I'm gonna try Picard, but I'm afraid that, because it is written in Java, it will be slower than samtools. Is it?

ADD REPLY • link 11.8 years ago by Geparada ★ 1.5k

0

Nice!! Does the "-" argument mean to use the standard input?

ADD REPLY • link 11.8 years ago by Geparada ★ 1.5k

1

Yes. You could also use /dev/stdin.
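As a small illustration of the convention, assuming a Linux-like system where the device file /dev/stdin exists: many tools treat "-" as "read standard input", and for those tools passing /dev/stdin behaves the same way. `cat` is used here only as a stand-in for samtools.

```shell
# Sketch: "-" and /dev/stdin both make the program read from the pipe.
# cat is a stand-in for samtools; the two read names are made up.
from_dash=$(printf 'read1\nread2\n' | cat -)
from_dev=$(printf 'read1\nread2\n' | cat /dev/stdin)

if [ "$from_dash" = "$from_dev" ]; then
    echo "both forms read the same piped data"
fi
```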

ADD REPLY • link 11.8 years ago by matted 7.8k