3.8 years ago
vitor.rca
Is there an efficient way to sort a BAM file by chromosome, and then by read names within each chromosome?
I could split the BAM by chromosome, sort each resulting BAM by read names, and finally merge the list of BAMs. But I wonder if there is a single and straightforward command to do that.
I'm not even sure you can sort a BAM by reference. Or do you mean by position?
The default behavior of samtools sort does that: "When the -n option is not present, reads are sorted by reference (according to the order of the @SQ header records), then by position in the reference". I am asking how to sort by reference, then by read name.
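To make the difference between the two orderings concrete, here is a small sketch on toy data rather than a real BAM. The reference names, positions, and read names are made up for illustration, and read names are compared as plain strings, which is an assumption (samtools sort -n uses its own name comparison).

```python
# Reference order as it might appear in the @SQ header lines (hypothetical).
sq_order = ["chr1", "chr2", "chrX"]
ref_rank = {name: i for i, name in enumerate(sq_order)}

# (reference, position, read name) tuples standing in for alignments.
reads = [
    ("chr2", 500, "readA"),
    ("chr1", 900, "readC"),
    ("chr1", 100, "readB"),
    ("chr1", 300, "readA"),
]


def by_reference_then_position(read):
    """Key for the default coordinate sort: @SQ order, then position."""
    ref, pos, _name = read
    return (ref_rank[ref], pos)


def by_reference_then_name(read):
    """Key for the ordering asked about here: @SQ order, then read name."""
    ref, _pos, name = read
    return (ref_rank[ref], name)


coordinate_sorted = sorted(reads, key=by_reference_then_position)
name_within_ref = sorted(reads, key=by_reference_then_name)

for read in name_within_ref:
    print(*read, sep="\t")
```

Both orderings group reads by reference in header order; only the secondary key differs.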
OK, yes, that's what I meant by 'by position'. I don't think what you're asking for is possible with any of the existing tools. Splitting the BAM, as you mentioned, is likely the easiest way to do it.
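For reference, the split, sort, and merge route could be sketched roughly as below. This only builds and prints the commands; it does not run them. It assumes the input BAM is coordinate-sorted and indexed so that samtools view can extract one reference at a time, and it uses samtools view per reference instead of bamtools split, which is a substitution on my part. The file names and the plan_commands helper are hypothetical. Note that reads with no reference assigned would not be picked up by any of the per-reference extractions.

```python
import shlex


def plan_commands(in_bam, references, out_bam, tmp_prefix="tmp"):
    """Return the samtools commands for a per-reference name sort.

    `references` should be listed in the same order as the @SQ header lines.
    """
    commands = []
    sorted_parts = []
    for ref in references:
        part = f"{tmp_prefix}.{ref}.bam"
        sorted_part = f"{tmp_prefix}.{ref}.namesorted.bam"
        # Extract the reads aligned to one reference (needs an indexed BAM).
        commands.append(["samtools", "view", "-b", "-o", part, in_bam, ref])
        # Sort that subset by read name.
        commands.append(["samtools", "sort", "-n", "-o", sorted_part, part])
        sorted_parts.append(sorted_part)
    # Concatenate the pieces in reference order without re-sorting them.
    commands.append(["samtools", "cat", "-o", out_bam] + sorted_parts)
    return commands


for cmd in plan_commands("in.bam", ["chr1", "chr2"], "out.bam"):
    print(shlex.join(cmd))
```

The final step uses samtools cat rather than samtools merge because merge would interleave the inputs according to its own sort order, while cat keeps each per-reference block intact.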
OK, thank you. It takes a long time to split a BAM with bamtools split, more time than I was anticipating for a BAM already sorted by position. So I was hoping for an alternative, but it seems there's no reason for anyone to implement such a sorting scheme, since no tool requires it except the UBU sam-xlate tool, which I'm trying to use (https://github.com/mozack/ubu/wiki).