Indexing a BAM file without read names
2.2 years ago
Michael 55k

I am trying to index a BAM file that was generated by bwa mem from a PacBio run (FASTQ) that did not contain any read names. I don't want to run the alignment again. Edit: I had assumed that the purpose of the index is to find reads by name, which is not the case. Under that assumption this undertaking would be somewhat silly, but some tools require the BAM file to be indexed. Also, I cannot just number the rows in the BAM file because some may come from the same read. Is it possible?

$ samtools index -@50 SMRT1_Crogercresseyi.bam
[E::hts_idx_push] NO_COOR reads not in a single block at the end 7 -1
samtools index: failed to create index for "SMRT1_Crogercresseyi.bam"
samtools bam • 897 views
2.2 years ago
jkbonfield ★ 1.3k

I don't understand this. The purpose of the BAM index is to find reads when queried by location, not by name. Read names should have no impact on indexing.

Is your data aligned, and sorted by chromosome/position order?


I am sorry, I should have checked that I was working on the right file. I ran the same command on the sorted file and it created the index. Indeed, the file wasn't sorted; that was the problem, and it has nothing to do with read names.

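So the error message really means that the BAM file must be coordinate-sorted before indexing: reads without coordinates (NO_COOR) were not in a single block at the end of the file, which is where a coordinate sort puts them.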
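For anyone hitting the same error, here is a minimal sketch of the sort-then-index workflow. The function names, the output file name and the thread count are placeholders of mine, not anything from the original post. The header check only reports what the `@HD` line declares (`SO:coordinate`), and that line may be missing or wrong, so treat it as a hint rather than proof.

```shell
#!/usr/bin/env bash
# Sketch only: helper names, file names and thread counts are illustrative.

# Return 0 if the BAM header *declares* coordinate sort order.
# This only inspects the @HD line; it does not verify the records.
is_coordinate_sorted() {
    samtools view -H "$1" | grep -q '^@HD.*SO:coordinate'
}

# Coordinate-sort a BAM file into a new file, then index the sorted copy.
sort_and_index() {
    local in_bam="$1"
    local out_bam="$2"
    local threads="${3:-4}"

    # samtools sort orders by coordinate by default; unmapped reads
    # without coordinates end up in one block at the end.
    samtools sort -@ "$threads" -o "$out_bam" "$in_bam" || return 1

    # Index the sorted file (writes "$out_bam.bai" by default).
    samtools index -@ "$threads" "$out_bam"
}

# Example invocation (hypothetical output name):
# is_coordinate_sorted SMRT1_Crogercresseyi.bam || \
#     sort_and_index SMRT1_Crogercresseyi.bam SMRT1_Crogercresseyi.sorted.bam 8
```

The sorted copy is written to a separate file so that the original alignment is left untouched.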
