Query Name too long - samtools
8 months ago
marco.barr ▴ 150

I'm trying to convert a SAM file to BAM. The SAM file was generated by BBMap, and I don't know how to fix the following error:

[E::sam_parse1] query name too long 
[W::sam_read1_sam] Parse error at line 4
samtools view: error reading file

My SAM file has 10682 lines, and this is line 4:

47e5a65b-37ad-4c9f-8de7-77f7060b40d8 runid=bbd211d87320da0fc271ec77cd3351e83823cf03 read=18 ch=125 start_time=2024-03-01T17:07:31.807069+01:00 flow_cell_id=ALJ911 protocol_group_id=PCR_PI4KA-2 sample_id=PCR_PI4KA-2 barcode=barcode19 barcode_alias=barcode19 parent_read_id=47e5a65b-37ad-4c9f-8de7-77f7060b40d8 basecall_model_version_id=2021-05-17_dna_r9.4.1_minion_384_d37a2ab9_1    4       *       0       0       *       *       0       0 TTGTACTTCGTTTAGTTACGTATTGCTAAGGTTAAGTTCCTCGTGCAGTGTCCAGAAGATCAACACCTATGGCGGCCCCGGCCAGGTGGCGGCAGCTCCTCAGATGGGAGAAGAGTTTGTCCATGTTCACACTGGGTGCAGGAAGAAGCTGAAACCACAGACATGACTGAGTCCTCCATGAGCTGGCCTCCACCCTGCTGGACGCCATCACCGATGGGGACCCCCTGGTGCAGGAGCAGGTCTGCAGTGCCCTGTGCTCCCTCGGGAGGTGCGGCCAGTGGAGACGCTCCGATGCCTGCAGGGGTATCTGCGGCAGCATGACAAGCTGGCACACCCATACCGAGCAGCAGTCCTGAGGAGACCTGAAGGGTCCTGAGCAGTCGCGCCAGTGAGCTGGACAAGGACACAGCCAGCACCATCATCCTCCTGGCCTCCAGCGAGATGACCAAGACGAGGACCTGGTCTGGGACTGGCAGCAGGCGGCGGTGGGCATCCTGGTGGCCGTGGGAAGACAGTTCATCAACAAGGTGATGGAGGAGCTGCTGCCAGGGCTGCACCCTGGGACCCTGCCTGCCGCCGTCTGCACCTCGCCAGCCTCTCGGTGGCCAACGCGTTCGGCGTAGTCCCTTCCTGCCATCCGTCCTGAGCTCTGCCACCCAGGTGCCGGGCGTGGCTCAGCCTGGGACACAGGTGCCGCGTGGCCTTCTGCTCAGCTCTGCAGCGCTTCATGAGGGTGCCCTGGGTCTCTGTGGCCTGGTTGGACCGAGCCCCAGACCCCACGGTCAAGGAGGACTTGCTTTGCCACCGACATCTTCAGCGCCTACGATGTTCTCTTCCATCAGGTGGCTGCACAGAGTCCGAGCCAAGTGGGCCAGATCCTCGAGGCAGCTGTGAGTGTGGGCAGCCGCATGGAGACCCAGCTGGATGCCCTCTTGGCTGCACTGCACTCCCAGATCTGTAGCGCGCCAGCAGGTCCTCGGCCACCTGGTGATGAGTAACA   
&)()())*9100-(''',.-**././33.-4579506;589:5002.--011''''(&(446,,,-66:8545102(''',33*(''+-.---(())5544245521/01./,,''*'->@?8788@@88J>44;+1.)''().++(()+200666..../1-,,+*))--..7,13112*%%',001236''&';7<<>===<>?><@>?>;<=21189<;;9016;;>>7877;;00//66:<3221442:>449>:?667(''**)'*22369)(()=4((((7797,*)))88==)((,)''024==96557766667:6000133234560-+*)**7756690...,&''&+%''''&,'%)*()-)),002288878389666:4/..19999>=A@C1242<<4331344:>::<;<>=5<=>=<>768@96667@?==93131+)'(+577:<;?20324<.--763310/0('(&''((%'&&'&(27779;:-689+4+++,()++<888-)'$&()26???A6@==>>>:9200&$$%'(13333880-3349831)('&()'',-*+(&%(&%%(-++764554:AI43336873,,,,15==544582223>33889*..-.78769;924,('+-''('&&&'(&%'))*$&&)/++('&&%%&%%'(')/)(''&&'''%''%36:;79888:42)(()-1130..)'')*(%%%&*0.--.)..*&&((&%&(('(&&&*''(*.67'),./0.4.099=@?>5445.+-..,-,1))(''))'(*)--.275688>@))8;;<)(((.58:?@>JJ@@7-43/*(()+79882/*(**-0.(('%)+''%&%(*,)***;*)+&%%'(%%&%%042245985784314683')+-+***,/2213==A=::<?>?A@>@><<///09778:>7;21;966.)((&&&%%$%&%&%(,()*'%&'',-)0221272((((+,*

Can anyone help? Thanks

8 months ago
GenoMax 147k

You could have added the option trd=t to your original bbmap mapping command. It truncates the read names after the first space (ONT reads carry those long names).

Since the mapping is already done, you can fix the names afterwards with reformat.sh from BBTools:

reformat.sh -Xmx4g in=aligned.bam out=fixed.bam trd=t

BBMap is perhaps one of the rare aligners that keep the entire FASTQ header when mapping. Many other programs automatically truncate the names after the first space.
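If BBTools is not at hand, the same truncation can be sketched in a few lines of Python. This is only an illustration of the idea, not what trd=t does internally: it assumes the SAM fields are tab-separated, that header lines start with "@", and that everything after the first whitespace in the first field can be discarded. The function name truncate_qname and the script name used below are made up for this example.

```python
import sys


def truncate_qname(line: str) -> str:
    """Return a SAM line with its QNAME cut at the first whitespace character."""
    if line.startswith("@"):
        # Header lines are passed through unchanged.
        return line
    # QNAME is the first tab-separated field; it may contain spaces here.
    qname, sep, rest = line.partition("\t")
    parts = qname.split(None, 1)
    short = parts[0] if parts else qname
    return short + sep + rest


def main() -> None:
    # Read SAM text on stdin and write the fixed SAM text to stdout.
    for line in sys.stdin:
        sys.stdout.write(truncate_qname(line))


if __name__ == "__main__":
    main()
```

Saved as, say, truncate_qnames.py, it could be used as a filter in front of samtools: python truncate_qnames.py < aligned.sam | samtools view -b -o fixed.bam -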

8 months ago

The query name cannot be longer than 255 (the SAM specification itself limits QNAME to 254 characters); see the check in htslib: https://github.com/samtools/htslib/blob/7d3efee742cd13a5b23c057ee29a71a51c6f94a6/sam.c#L2895

echo -n '47e5a65b-37ad-4c9f-8de7-77f7060b40d8 runid=bbd211d87320da0fc271ec77cd3351e83823cf03 read=18 ch=125 start_time=2024-03-01T17:07:31.807069+01:00 flow_cell_id=ALJ911 protocol_group_id=PCR_PI4KA-2 sample_id=PCR_PI4KA-2 barcode=barcode19 barcode_alias=barcode19 parent_read_id=47e5a65b-37ad-4c9f-8de7-77f7060b40d8 basecall_model_version_id=2021-05-17_dna_r9.4.1_minion_384_d37a2ab9_1' |wc -c
378

see Modify query names longer than 252
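To find out how many records are affected before fixing anything, a small check along these lines may help. It is a sketch under assumptions: the limit of 254 is the QNAME length allowed by the SAM specification, the input is plain SAM text with tab-separated fields, and the function name long_qnames is invented for this example.

```python
MAX_QNAME = 254  # QNAME length limit in the SAM specification


def long_qnames(lines, limit=MAX_QNAME):
    """Yield (line_number, qname_length) for records whose QNAME exceeds limit."""
    for number, line in enumerate(lines, start=1):
        if line.startswith("@"):
            # Skip header lines.
            continue
        # QNAME is everything before the first tab.
        qname = line.split("\t", 1)[0].rstrip("\n")
        if len(qname) > limit:
            yield number, len(qname)
```

For a file on disk, something like `with open("aligned.sam") as fh: print(list(long_qnames(fh)))` would list the offending line numbers together with the name lengths.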
