To make a long story short, I have a bam file that is aligned to the human reference plus a specific contig (human gene + viral vector introduced into a mouse).
The fasta file that was used for alignment got inadvertently deleted, but I have good coverage across the contig. Is there a straightforward way to use the reads aligning to that contig to recreate the contig's fasta file? (All the info should be there!)
Interesting question! I'd love to see what people think of this!
reformat.sh in=your.bam out=fasta.fa
(from the BBMap suite) will create a fasta file. I am not sure if it will be the original reference you wish to recover. You could test with a small(ish) example.
That gives each read in fasta format, not the entire contig.
I guess the reads could be assembled in subsequent steps or something.
You are right. Sorry, a mental lapse on my part.
Do you have the index for whatever aligner was used? I wonder if there is a way to recreate the fasta from it.
Nope - the directory with the fasta and the indices was nuked. FWIW, I'm pretty sure that I can recreate this fasta by digging up the vector and gene sequences and spending an hour or two doing manual surgery. It just seems like there oughta be a better way!
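Since the reads are already placed on the contig, one option is to skip assembly and take a per-position majority vote over the aligned bases. Below is a minimal sketch of that idea in plain Python, working on SAM text records (the kind `samtools view` prints). The function name `consensus_from_sam`, the `min_depth` parameter, and the contig name `transgene` are all made up for illustration. Note that this gives the consensus of the sample, not necessarily the original reference: real variants in the sample will show up in the output, and uncovered positions come out as `N`. Recent samtools releases also ship a `consensus` subcommand that may be worth trying first.

```python
import re
from collections import Counter, defaultdict

# One CIGAR operation: a length followed by an operation letter.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")


def consensus_from_sam(sam_lines, contig, min_depth=1):
    """Majority-vote consensus for one contig from SAM text records.

    Hypothetical helper, not part of any library. Positions with fewer
    than min_depth aligned bases are reported as 'N'.
    """
    counts = defaultdict(Counter)  # 0-based reference position -> base counts
    for line in sam_lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        flag, rname, pos = int(fields[1]), fields[2], int(fields[3])
        cigar, seq = fields[5], fields[9]
        # Skip other contigs, unmapped reads, and records without CIGAR/SEQ.
        if rname != contig or flag & 0x4 or cigar == "*" or seq == "*":
            continue
        ref_pos = pos - 1  # SAM POS is 1-based
        read_pos = 0
        for length, op in CIGAR_RE.findall(cigar):
            length = int(length)
            if op in "M=X":  # aligned bases: consume read and reference
                for i in range(length):
                    counts[ref_pos + i][seq[read_pos + i].upper()] += 1
                ref_pos += length
                read_pos += length
            elif op in "IS":  # insertion / soft clip: consume read only
                read_pos += length
            elif op in "DN":  # deletion / skip: consume reference only
                ref_pos += length
            # H and P consume neither read nor reference
    if not counts:
        return ""
    bases = []
    for p in range(max(counts) + 1):
        column = counts.get(p)
        if column is None or sum(column.values()) < min_depth:
            bases.append("N")
        else:
            bases.append(column.most_common(1)[0][0])
    return "".join(bases)


# Tiny made-up example: four reads on 'transgene', one on another contig.
demo_sam = [
    "@SQ\tSN:transgene\tLN:10",
    "r1\t0\ttransgene\t1\t60\t4M\t*\t0\t0\tACGT\tIIII",
    "r2\t0\ttransgene\t3\t60\t2M1I2M\t*\t0\t0\tGTAAC\tIIIII",
    "r3\t16\ttransgene\t3\t60\t2S4M\t*\t0\t0\tTTGTAC\tIIIIII",
    "r4\t0\ttransgene\t8\t60\t3M\t*\t0\t0\tGGA\tIII",
    "r5\t0\tchr1\t100\t60\t4M\t*\t0\t0\tTTTT\tIIII",
]

demo_consensus = consensus_from_sam(demo_sam, "transgene")
print(">transgene")
print(demo_consensus)  # ACGTACNGGA (position 7 has no coverage)
```

On real data the records could be streamed in, e.g. by piping `samtools view your.bam <contig name>` into a script that passes `sys.stdin` as `sam_lines`. Insertions relative to the lost reference are dropped and deletions show up as `N` in this sketch, so it is worth checking the result against the known vector and gene sequences before realigning to it.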