BAM to fast[aq] while retaining alignment?
7.6 years ago
dmathog ▴ 40

A BAM file contains a description of reads aligned against a reference sequence, plus other information. When converting to fasta/fastq formats these two tools

samtools fasta in.bam > out.fasta
picard SamToFastq I=in.bam F=out.fastq

drop all of the alignment information. Is there another tool that can do this conversion and produce a multiple sequence alignment in FASTA format directly? It is acceptable, indeed preferable, if insertions in some reads make the output look something like this:

ACGTT-ACGTTGCA 
ACGT--ACGTTGCA
ACGT--ACGTTGGA
ACGT--ACGTTGCA
ACGT--ACGTTGCA
ACGTAAACGTTGCA
ACGT--ACGTTGCA reference sequence

as opposed to this (same alignment, all insertions dropped):

ACGTACGTTGCA
ACGTACGTTGCA
ACGTACGTTGGA
ACGTACGTTGCA
ACGTACGTTGCA
ACGTACGTTGCA
ACGTACGTTGCA reference sequence

Yes, one could realign the fasta file against the reference sequence, but since it would not be with the same alignment tool as was used to build the BAM file, the two representations would in most cases not end up with exactly the same alignment.
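
To make the requested output concrete, here is a minimal Python sketch of the padding step, not an existing tool. It assumes each read has already been extracted from the BAM as a (name, 0-based start, CIGAR, sequence) tuple; SAM text stores POS as 1-based, so subtract 1 when parsing `samtools view` output. The function names `decompose`, `padded_msa` and `to_fasta` are made up for illustration. Positions a read does not cover are written as `-`, so they are indistinguishable from deletions in this sketch.

```python
import re
from collections import defaultdict

CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def decompose(pos, cigar, seq):
    """Split one read into reference-anchored pieces.

    Returns (aligned, inserts):
      aligned maps reference index -> read base, or '-' for a deleted base
      inserts maps reference index -> bases inserted just before that index
    """
    aligned, inserts = {}, defaultdict(str)
    r, q = pos, 0  # r: reference cursor, q: query (read) cursor
    for n, op in ((int(n), op) for n, op in CIGAR_RE.findall(cigar)):
        if op in "M=X":  # consumes both reference and read
            for i in range(n):
                aligned[r + i] = seq[q + i]
            r += n
            q += n
        elif op == "I":  # consumes read only
            inserts[r] += seq[q:q + n]
            q += n
        elif op in "DN":  # consumes reference only
            for i in range(n):
                aligned[r + i] = "-"
            r += n
        elif op == "S":  # soft clip: skip the clipped read bases
            q += n
        # H and P consume neither reference nor read
    return aligned, inserts


def padded_msa(ref, reads):
    """Return [(name, padded_row), ...] with the reference as the last row."""
    parts = [decompose(pos, cigar, seq) for _, pos, cigar, seq in reads]

    # Widest insertion seen before each reference index, across all reads.
    width = defaultdict(int)
    for _, inserts in parts:
        for r, s in inserts.items():
            width[r] = max(width[r], len(s))

    def render(aligned, inserts):
        out = []
        for r in range(len(ref) + 1):  # +1 allows an insertion after the last base
            out.append(inserts.get(r, "").ljust(width.get(r, 0), "-"))
            if r < len(ref):
                out.append(aligned.get(r, "-"))
        return "".join(out)

    rows = [(read[0], render(a, i)) for read, (a, i) in zip(reads, parts)]
    rows.append(("reference", render(dict(enumerate(ref)), {})))
    return rows


def to_fasta(rows):
    """Format rows as aligned FASTA text."""
    return "".join(f">{name}\n{row}\n" for name, row in rows)


if __name__ == "__main__":
    demo_reads = [
        ("read1", 0, "4M1I8M", "ACGTTACGTTGCA"),
        ("read2", 0, "12M", "ACGTACGTTGCA"),
        ("read3", 0, "4M2I8M", "ACGTAAACGTTGCA"),
    ]
    print(to_fasta(padded_msa("ACGTACGTTGCA", demo_reads)), end="")
```

With the three demo reads, the rows come out as `ACGTT-ACGTTGCA`, `ACGT--ACGTTGCA` and `ACGTAAACGTTGCA`, with the reference padded to `ACGT--ACGTTGCA`, which matches the first layout above. Shorter insertions are left-justified within their column block; that is a choice of this sketch, not something the BAM specifies.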

alignment sequence • 2.0k views

What's your goal? Why would you want this format?


Among other things, I prefer other alignment viewers to IGV or Tablet, and these will accept the aligned FASTA format but not BAM.


But for, say, a human genome, every line would be as long as chr1 (about 250 Mbp), plus the padding needed to handle the insertions?


The alignments in question here are only up to tens of thousands of base pairs, and some of them have only tens of reads.


But what about other regions having hundreds or thousands of reads? You are likely to crash multi-alignment viewers or at least make them very slow. Anyway, what you are asking is quite tricky to implement and barely useful to others. You should try to find other solutions.


Are you just trying to visualize the alignments? You could open the BAM in IGV (Integrative Genomics Viewer) to get a nice visualization of the read alignment.


See if the answer in this thread helps.
