Aligning, sorting and converting to BAM in the same command - possible?
4.9 years ago
Aspire ▴ 370

Is it possible to align, sort and convert to bam using a pipeline?

i.e.

bowtie .... | samtools sort | samtools view -bS -o sorted_output.bam

In case that is possible, is this solution very inefficient in terms of demands on the pc?

alignment sorting bam • 2.8k views

I'm not too familiar with bowtie, but yes, as long as it produces SAM with a header, or BAM. There is no need to run samtools view afterwards; samtools sort reads and writes BAM itself. Why would it be inefficient? The alternative is to wait for the whole unsorted file to be written before sorting can start, which doubles your storage. Is that more efficient?
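A minimal sketch of that pipe, written as a small shell function so the aligner command stays whatever you already use. The function name `align_to_sorted_bam` is made up for illustration; the only real tool call is `samtools sort -o FILE -`, which reads the aligner's output from stdin.

```shell
# align_to_sorted_bam OUT.bam ALIGNER [ARGS...]
# (hypothetical helper, not part of samtools)
# Runs the aligner, which must write SAM with a header (or BAM) to stdout,
# and pipes it straight into samtools sort. There is no samtools view step
# and no unsorted intermediate file on disk.
align_to_sorted_bam() {
    out_bam=$1
    shift
    # "$@" is the aligner command line; "-" tells samtools sort to read stdin.
    "$@" | samtools sort -o "$out_bam" -
}
```

Usage would look something like `align_to_sorted_bam sorted_output.bam bowtie2 -x my_index -U reads.fq`, where `bowtie2`, `my_index` and `reads.fq` are placeholders standing in for your own aligner call.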


If you think you'll be running markdup at some point, you may also want to add a "samtools fixmate -m" after the bowtie command, as this way it doesn't require an additional sort later on. Also, when piping it's often best to pipe uncompressed BAM. Some samtools commands have a "-u" option, others need "-l 0", and others have neither, so the compression level has to be passed via -O instead. A rather unfortunate lack of consistency.

E.g.:

bowtie ... | samtools fixmate -m -O bam,level=0 - - | samtools sort -l 0 | samtools markdup - sort_markdup.bam

Depending on speeds, you may want to add threading ("-@ 8" etc.) to specific commands. Some do better than others, but note that as of samtools 1.10 the SAM parsing can be multi-threaded too; in the past it could sometimes be a bottleneck when paired with a high-thread-count aligner.
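For what it's worth, here is a rough sketch of the same fixmate / sort / markdup chain with a thread count passed to each samtools step. The function name `sort_and_markdup` is made up, and using the same "-@" value everywhere is just an assumption for brevity; in practice you would tune it per command.

```shell
# sort_and_markdup THREADS OUT.bam ALIGNER [ARGS...]
# (hypothetical helper, not part of samtools)
# The aligner must write name-grouped SAM/BAM to stdout, as aligners normally do.
sort_and_markdup() {
    threads=$1
    out_bam=$2
    shift 2
    # Intermediate stages write uncompressed BAM (level=0 / -l 0) to keep the
    # pipes cheap; only the final markdup output is compressed normally.
    "$@" \
        | samtools fixmate -@ "$threads" -m -O bam,level=0 - - \
        | samtools sort -@ "$threads" -l 0 - \
        | samtools markdup -@ "$threads" - "$out_bam"
}
```

Called as, say, `sort_and_markdup 8 sort_markdup.bam <your aligner command>`.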

4.9 years ago

Most people pipe everything through like that. You probably don't need the view command; I'm pretty sure newer versions of samtools sort will take .sam as input and output .bam by default.


Samtools outputs whatever you specify, either via a suffix like .sam or .bam or via the -O parameter (SAM/BAM), so yes, the view is not necessary. Using pipes is a good approach, and it saves time by avoiding intermediate files that have to be written to disk. The larger your memory, the more efficient it is: samtools sort has an option -m to specify how much RAM to use (per thread) for sorting before it spills data to disk as intermediate files once the allocated memory is full. Your pipes can be arbitrarily long in theory; I have commands in some specialized pipelines that go through 10 tools without producing a single intermediate file.
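As a small illustration of the -m point: the value is a per-thread limit, so with "-@ N" the sort can use roughly N times that amount. The sketch below splits a total budget across threads; both function names are made up for this example, and the integer division is only a rough rule of thumb.

```shell
# per_thread_mem TOTAL_MB THREADS
# (hypothetical helper) Prints a value suitable for samtools sort -m.
per_thread_mem() {
    echo "$(( $1 / $2 ))M"
}

# sort_with_budget TOTAL_MB THREADS OUT.bam
# (hypothetical helper) Reads SAM/BAM from stdin and writes sorted BAM,
# stating the output format explicitly with -O instead of relying on the suffix.
sort_with_budget() {
    total_mb=$1
    threads=$2
    out_bam=$3
    samtools sort -@ "$threads" \
        -m "$(per_thread_mem "$total_mb" "$threads")" \
        -O bam -o "$out_bam" -
}

# per_thread_mem 8192 4   prints 2048M
```

So `bowtie ... | sort_with_budget 8192 4 sorted_output.bam` would run the sort with 4 threads at 2048M each.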
