Sort BAM files by reference, and then by read name within each reference?
3.9 years ago
vitor.rca ▴ 10

Is there an efficient way to sort a BAM file by chromosome, and then by read names within each chromosome?

I could split the BAM by chromosome, sort each resulting BAM by read name, and finally merge the sorted BAMs. But I wonder whether there is a single, straightforward command to do that.

RNA-Seq alignment • 2.0k views

I'm not even sure you can sort a BAM by reference. Or do you mean by position?


The default behavior of samtools sort does that: "When the -n option is not present, reads are sorted by reference (according to the order of the @SQ header records), then by position in the reference". I am asking how to sort by reference, then by read name.


OK, yes, that's what I meant by 'by position'. I don't think what you are asking for is possible with any of the existing tools. Splitting the BAM, as you mentioned, is likely the easiest way to get there.


OK, thank you. Splitting a BAM with bamtools split takes longer than I anticipated for a BAM that is already sorted by position, so I was hoping for an alternative. But it seems there is little reason for anyone to implement such a sorting scheme, since no tool requires it except the UBU sam-xlate tool, which I'm trying to use (https://github.com/mozack/ubu/wiki).
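
Since the BAM here is already sorted by position, all records for one reference are contiguous, so each reference can be re-sorted by read name as it streams past, without writing per-chromosome files. The following is a minimal sketch of that idea, not a tested replacement for the split-and-merge approach. It assumes SAM text as input (for example from `samtools view -h`), and the function name `sort_by_ref_then_name` is made up for illustration. It compares read names as plain strings, which is not necessarily the ordering `samtools sort -n` produces or the one sam-xlate expects, so check that against the tool's requirements.

```python
from itertools import chain, groupby


def rname(line):
    """Return the reference name (3rd SAM column) of an alignment line."""
    return line.split("\t", 3)[2]


def qname(line):
    """Return the read name (1st SAM column) of an alignment line."""
    return line.split("\t", 1)[0]


def sort_by_ref_then_name(sam_lines):
    """Yield SAM lines ordered by reference, then by read name.

    Assumption: the input is coordinate-sorted, so records sharing a
    reference are adjacent. Only one reference is held in memory at a time.
    """
    lines = iter(sam_lines)
    first_alignment = []
    for line in lines:
        if not line.startswith("@"):
            first_alignment.append(line)
            break
        yield line  # header lines pass through unchanged

    # Reference order is kept as-is; reads are reordered within each block.
    for _, block in groupby(chain(first_alignment, lines), key=rname):
        # sorted() is stable, so reads with equal names keep their input order.
        yield from sorted(block, key=qname)
```

One way to use it would be to call `sys.stdout.writelines(sort_by_ref_then_name(sys.stdin))` in a script placed between `samtools view -h in.bam` and `samtools view -b -o out.bam -`. Note that the header is copied unchanged, so an `SO:coordinate` tag in the `@HD` line would no longer describe the output.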


Damn, I should have known you would have code at hand for that, Pierre Lindenbaum. :-)

Thanks.
