I am getting an error that is new to me, and I couldn't find any useful tips.
I am trying to convert a SAM file generated with bowtie (version 1) into a BAM file using samtools view -b. The command prints the following error and produces a 27-byte BAM file; see the error below. Does anyone know what the issue might be or how I can fix it?
[E::sam_hrecs_error] Header line does not have a two character key at line 246:
"@A00881:421:HM7W5DRXX:1:2101:1118:1016 16 NC_005109.4 73902262 255 23M GTCAACATCAGTCTGATAAGCTA IIIIIIIIIIIIIIIIIIIIIII XA:i:0 MD:Z:23 NM:i:0 XM:i:2"
samtools view: failed to add PG line to the header ^Z
I guess the problem is that the read names start with @, so samtools interprets those lines as header lines. This probably happens if you run, for example, bowtie2 with a FASTA instead of a FASTQ file, i.e. for FASTA the @ in read names does not get removed. Use sed to edit the file, for example:
sed 's/^@A00881/A00881/' yoursamfile > newsamfile
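If the read names do not all share one prefix, a more general variant might help. The following is only a sketch on made-up data: the file names example.sam and fixed.sam and the @SQ length are assumptions for illustration. It assumes that real header lines always start with @, a two-letter record type, and a tab, and it strips the leading @ from every other line that starts with @.

```shell
# Build a tiny example SAM file (hypothetical data): one real header line
# and one alignment whose read name wrongly starts with '@'.
printf '@SQ\tSN:NC_005109.4\tLN:100000000\n' > example.sam
printf '@A00881:421:HM7W5DRXX:1:2101:1118:1016\t16\tNC_005109.4\t73902262\t255\t23M\t*\t0\t0\tGTCAACATCAGTCTGATAAGCTA\tIIIIIIIIIIIIIIIIIIIIIII\n' >> example.sam

# Strip the leading '@' only from lines that do not look like header records,
# i.e. lines where '@' is not followed by two letters and a tab.
awk '/^@/ && !/^@[A-Za-z][A-Za-z]\t/ { sub(/^@/, "") } { print }' example.sam > fixed.sam

# Count the lines that still start with '@'; only real headers should remain.
grep -c '^@' fixed.sam
```

A read whose name is exactly two letters long would be mistaken for a header record by this pattern, so it is worth checking the result before converting to BAM.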
It seems that samtools is becoming stricter about the format of SAM files, in particular about tags in header lines. I had a similar error to the one reported here simply because I did not properly set the SM tag in the @RG header line when calling bowtie2.
wrong: --rg-id idxx --rg text
right: --rg-id idxx --rg SM:text
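To see whether an existing SAM file has this kind of @RG problem, something like the following sketch might work. The two @RG lines are made up to mimic the wrong and right calls above, and the file names rg_example.sam and rg_report.txt are assumptions. It reports every @RG field that does not have the TAG:VALUE shape.

```shell
# Two hypothetical @RG lines: the first mimics "--rg text",
# the second mimics "--rg SM:text".
printf '@RG\tID:idxx\ttext\n@RG\tID:idxx\tSM:text\n' > rg_example.sam

# Report any @RG field that is not a two-character tag, a colon, and a value.
awk -F'\t' '/^@RG\t/ {
    for (i = 2; i <= NF; i++)
        if ($i !~ /^[A-Za-z][A-Za-z0-9]:./)
            print "line " NR ": malformed field \"" $i "\""
}' rg_example.sam > rg_report.txt

cat rg_report.txt
```

On this example only the first line is reported, because its last field has no tag in front of the value.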
See if this applies: https://github.com/samtools/samtools/issues/1237
The error may be fixable by editing the SAM file header.
@ Istvan Albert I saw that I can edit the header with
samtools reheader
but I am not sure what the new header should read (what to include). Thank you. The link recommended adding --no-PG to the command. This did not really help. It produced the error
What is the output of
That would tell you which headers seem malformed.
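One possible way to list suspicious header lines without samtools (not necessarily the command meant above) is sketched below. It runs on made-up data, the file names check.sam and suspect_lines.txt are assumptions, and it treats any line that starts with @ but is not followed by two letters and a tab as suspect.

```shell
# Hypothetical three-line SAM file: two valid header lines and one
# alignment whose read name starts with '@'.
printf '@HD\tVN:1.0\tSO:unsorted\n@SQ\tSN:NC_005109.4\tLN:100000000\n' > check.sam
printf '@A00881:421:HM7W5DRXX:1:2101:1118:1016\t16\tNC_005109.4\t73902262\t255\t23M\t*\t0\t0\tGTCAACATCAGTCTGATAAGCTA\tIIIIIIIIIIIIIIIIIIIIIII\n' >> check.sam

# Print the line number and first field of every '@' line that is not
# a two-letter record type followed by a tab.
awk -F'\t' '/^@/ && !/^@[A-Za-z][A-Za-z]\t/ { print NR ": " $1 }' check.sam > suspect_lines.txt

cat suspect_lines.txt
```

The line numbers it prints should correspond to the "at line N" part of the samtools error message, which makes it easier to see whether the problem is in the real header or in the read names.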