9.9 years ago
noushin.farnoud
Hi,
I am using the line below to convert SAM to BAM and then sort them:
samtools view -bS $mySAM | samtools sort -n -m 6000000000 - $myBAM
Even though the process seems to be generating output, I receive the following log:
[bam_header_read] EOF marker is absent. The input is probably truncated.
[samopen] SAM header is present: 211 sequences.
My input is an aligned RNA-Seq SAM file, and the log indicates that 211 sequences have a SAM header. Is that something I need to worry about?
I appreciate your feedback,
Thanks,
Noushin
Thanks a lot Brian! Is it also normal that only 211 sequences in the SAM file appear to have SAM headers?
The number of sequences with headers depends on the reference, not the reads. So for example if you are working with the human genome, excluding auxiliary stuff like unplaced contigs, samtools will report 25 sequences - 22 autosomes, 2 sex chromosomes, and one mitochondrial sequence. This is a result of the reference, and independent of your reads.
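To make the point above concrete, here is a minimal sketch of how that count can be reproduced by hand. It assumes the number samtools reports corresponds to the `@SQ` lines in the SAM header, one per reference sequence. The function name `count_reference_sequences` and the example header (sequence names and lengths) are hypothetical and only for illustration.

```python
def count_reference_sequences(header_lines):
    """Count @SQ records (one per reference sequence) in SAM header lines."""
    return sum(1 for line in header_lines if line.startswith("@SQ\t"))


# Hypothetical header: two reference sequences, no matter how many reads follow.
example_header = [
    "@HD\tVN:1.0\tSO:unsorted",
    "@SQ\tSN:chr1\tLN:1000",
    "@SQ\tSN:chrM\tLN:16571",
    "@PG\tID:aligner",
]

print(count_reference_sequences(example_header))
```

For a real file, the header lines could be taken from the output of `samtools view -H`, which prints only the header. The count stays the same whether the file holds ten reads or ten million, since it reflects the reference the reads were aligned to.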
Thank you very much for clarifying! That totally makes sense now.