Header too long for bam format
3 months ago
selplat21 ▴ 20

I have about 20,000 BAM files in a directory called intermediate_bams.

They all have the same @HD and @SQ lines. There are about 860 @SQ lines.

When I try to merge these bams, I get the following output:

samtools merge -h custom_header.sam -o test.bam intermediate_bams/*bam

[E::bam_hdr_write] Header too long for BAM format

Here are the head and tail of my custom_header.sam:

@HD	VN:1.6	SO:coordinate
@SQ	SN:NC_088602.1	LN:212386202
@SQ	SN:NC_088603.1	LN:163726572
@SQ	SN:NC_088604.1	LN:122092291
@SQ	SN:NC_088605.1	LN:78855516
...
@SQ	SN:NW_027043814.1	LN:173419
@SQ	SN:NW_027043815.1	LN:151889
@SQ	SN:NW_027043816.1	LN:151339
@SQ	SN:NW_027043817.1	LN:234180
@SQ	SN:NW_027043818.1	LN:593964
samtools alignments • 409 views
3 months ago
Ram 44k

This error refers to the size of the header as a whole (for example, a very large number of header lines), not the length of any single line. See: https://github.com/samtools/samtools/issues/1613

What is the output of wc -l custom_header.sam and of command ls -lh custom_header.sam?
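
A breakdown by record type can be more informative than a bare line count. The sketch below is only a diagnostic aid: header_stats is a hypothetical helper name, and it simply counts lines per header record type (@HD, @SQ, @RG, @PG, @CO) and the total number of bytes in whatever header text it is given. One possibility, not confirmed in this thread, is that per-file @RG or @PG lines from the 20,000 inputs add up in the merged header even though custom_header.sam itself is small.

```shell
# Summarise SAM header text read from stdin: one line per record type
# with its line count, followed by the total byte count.
header_stats() {
    LC_ALL=C awk -F'\t' '
        { count[$1]++; bytes += length($0) + 1 }   # +1 for the newline
        END {
            for (type in count) printf "%s\t%d\n", type, count[type]
            printf "bytes\t%d\n", bytes
        }
    ' | LC_ALL=C sort
}

# Usage (example.bam is a placeholder for one of the input files):
#   header_stats < custom_header.sam
#   samtools view -H intermediate_bams/example.bam | header_stats
```

Running it on the custom header and on a few of the input BAMs shows whether the inputs carry header lines that the custom header does not.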


The output of wc -l custom_header.sam is 870

and the output of command ls -lh custom_header.sam is -rw-r--r-- 1 nicolas nicolas 27K Aug 2 21:02 custom_header.sam

The large line count reflects the large number of scaffolds in my genome. I've never run into this problem with previous genomes.


That header should not cause a problem. Please add a comment on the issue I have linked to.
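
To put the 27K figure in context, here is a minimal sketch that compares a header file's size with the 32-bit header-length field (l_text) defined in the BAM specification. It assumes the error corresponds to that field overflowing, which this thread does not confirm. header_fits and BAM_HEADER_MAX are hypothetical names introduced for illustration.

```shell
# Report whether a header file fits the 32-bit length field (l_text) that
# the BAM format uses for header text. Assumed limit: 2^32 - 1 bytes.
BAM_HEADER_MAX=4294967295

header_fits() {
    size=$(wc -c < "$1") || return 2   # unreadable file
    size=$((size))                     # strip padding some wc versions emit
    if [ "$size" -le "$BAM_HEADER_MAX" ]; then
        echo "ok: $size bytes"
    else
        echo "too long: $size bytes"
        return 1
    fi
}

# Usage:
#   header_fits custom_header.sam
```

A file of roughly 27K is many orders of magnitude below that limit, which is consistent with the comment above that this header alone should not trigger the error.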
