Hello everybody,
I am dealing with BAM files for the first time, and I have had some problems handling them. My goal is to obtain the whole consensus sequence (across the sequencing folds of a genomic range from the same sample) in FASTA format. My first approach was to look for a ready-to-go script or one-line command to do the conversion (e.g. I had a look at "Convert bam file to fasta file"), but I failed. This is what I got when trying to use Samtools:
[main_samview] fail to open "filename.bam" for reading.
[bam_header_read] EOF marker is absent. The input is probably truncated.
[bam_header_read] invalid BAM binary header (this is not a BAM file).
[main_samview] fail to read the header from "R_2:164876791-164886791.bam".
I tried to figure out how a BAM file is structured, but looking at it with a text editor I saw that the only explicit information is the unassembled reads, so it would take a while to obtain the consensus starting from there (and, I guess, there would no longer be any point in using a BAM file).
As a second choice, I tried some more graphically oriented tools, such as Artemis. So far I have had problems even getting one of these programs to work (but these are hopefully teething issues that I can probably solve myself); furthermore, I did not find a way to extract a consensus from one of the samples listed here: http://www.sanger.ac.uk/resources/software/artemis/ngs/
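Since the errors above say "this is not a BAM file" while the content is readable in a text editor, a quick check of the first bytes can tell the two formats apart. The following is only a minimal sketch: it assumes that a real BAM file is gzip-compatible (BGZF) and decompresses to a stream starting with the bytes "BAM\1", and that a SAM file is tab-delimited text with at least 11 fields per read. The function name sniff_alignment_format is made up for illustration.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b"   # first two bytes of any gzip/BGZF file
BAM_MAGIC = b"BAM\x01"     # first four bytes of the decompressed BAM stream


def sniff_alignment_format(path):
    """Guess the format of `path`: 'BAM', 'SAM', 'headerless SAM' or 'unknown'.

    Hypothetical helper for illustration; it only looks at the start of the file.
    """
    with open(path, "rb") as handle:
        first_two = handle.read(2)

    if first_two == GZIP_MAGIC:
        # Compressed: check whether the decompressed data starts with the BAM magic.
        try:
            with gzip.open(path, "rb") as handle:
                if handle.read(4) == BAM_MAGIC:
                    return "BAM"
        except (OSError, EOFError, zlib.error):
            pass  # truncated or corrupt compressed data
        return "unknown"

    # Not compressed: look at the first line as text.
    with open(path, "rb") as handle:
        first_line = handle.readline().rstrip(b"\r\n")

    if first_line.startswith(b"@"):
        return "SAM"             # header lines start with '@'
    if len(first_line.split(b"\t")) >= 11:
        return "headerless SAM"  # looks like an alignment record, no header
    return "unknown"
```

Calling sniff_alignment_format on a file with a ".bam" name that returns "headerless SAM" would suggest the file is really SAM text that was saved without its header, whatever the extension says.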
What would you suggest? Is there an easy way to obtain what I need, while I get more confident with the format? I hope the question is clear enough.
So, if I understood correctly, my data are currently in SAM format. Indeed, opening them in Editra shows tab-delimited fields, which may correspond to the descriptive fields of each read. Also, the file command gives me this output: "ASCII text, with very long lines", so I am not dealing with binary files. Still, I don't see any header lines and, by the way, trying to convert my files to BAM (samtools view -b -S myfile.sorted.bam) leads to this error: "[samopen] no @SQ lines in the header. [sam_read1] missing header? Abort!".
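The "no @SQ lines in the header" error points at the missing reference dictionary: each @SQ header line names one reference sequence (SN) and gives its length (LN). As a rough sketch of one way around it, the script below prepends @SQ lines to a headerless SAM file, taking names and lengths from the first two columns of a FASTA index (.fai) as written by samtools faidx. It assumes the input has no header at all and that the reference names in the index match the third column of the reads; the function names and file names are hypothetical.

```python
def read_fai(fai_path):
    """Return (name, length) pairs from the first two columns of a .fai index."""
    references = []
    with open(fai_path) as handle:
        for line in handle:
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 2:
                references.append((fields[0], int(fields[1])))
    return references


def sq_header_lines(references):
    """Build one @SQ header line per (name, length) pair."""
    return ["@SQ\tSN:{}\tLN:{}".format(name, length) for name, length in references]


def add_sq_header(sam_in, sam_out, references):
    """Write the @SQ lines followed by the unchanged alignment records.

    Assumes `sam_in` is a headerless SAM file (no line starts with '@').
    """
    with open(sam_in) as src, open(sam_out, "w") as dst:
        for header_line in sq_header_lines(references):
            dst.write(header_line + "\n")
        for record in src:
            dst.write(record)
```

For example, add_sq_header("myfile.sam", "myfile.with_header.sam", read_fai("reference.fa.fai")) would produce a file that the conversion to BAM can read a header from. The samtools view manual also describes a -t option that takes a tab-delimited list of reference names and lengths for inputs lacking @SQ lines, which may avoid rewriting the file; check the documentation of your installed version.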
This was recently discussed right here: "How to extract unaligned sequences from BAM files obtained from BWA"
OK, thank you. I think this particular issue is solved; I still have some trouble, though, and I will open a new question if I cannot sort it out.