Hi. I created a metagenome assembly using SPAdes, mapped the assembly using Bowtie2 and Samtools (creating BAM and the indexed BAI files), then binned the assemblies. Somewhere in the file backup process the SPAdes assembly file was overwritten. I'd like to recreate my SPAdes assembly fasta without having to remap and rebin the sample.
So my questions are: 1) Can I recreate the SPAdes assembly from the BAM file? I tried running "samtools fasta file.bam > file.fasta" but this produced a fasta of the reads rather than of the scaffolds. Is there something I'm missing? Or 2) If I rerun SPAdes with all the same parameters (k-mers, cutoffs, etc.), will it generate the exact same assembly file as the first time? E.g., will the new SPAdes assembly work with the pre-existing BAM file and bins, or will I need to redo the whole mapping and binning process?
Thanks for your input! Cheers!
From the BAM file you should be able to reproduce the input file to the alignment. The "samtools fasta" command is how you are supposed to do it. What seems to be the problem with that output?
When I ran the "samtools fasta" command it produced a fasta file of the reads. What I'm looking to do is recreate a fasta of the scaffolds from the original assembly that was used (along with the read files) to generate the BAM file.
Is there something I'm missing in the "samtools fasta file.bam > file.fasta" command?
No, you can't recreate the original reference assembly that was used for the alignments from the alignment BAM file. The BAM stores the reads and where they aligned, not the reference sequences themselves.
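What the BAM does still hold is the name and length of every scaffold it was aligned against, in the @SQ lines of its header (printed by "samtools view -H file.bam"). That is not enough to rebuild the sequences, but it gives you something to check a re-run assembly against. A minimal sketch of pulling those out in Python follows; the function name and the example header are made up for illustration, and it assumes you have already captured the header as text.

```python
def parse_sq_lines(header_text):
    """Return {scaffold_name: length} from the @SQ lines of a SAM/BAM header."""
    scaffolds = {}
    for line in header_text.splitlines():
        if not line.startswith("@SQ"):
            continue
        # Header fields are tab-separated TAG:VALUE pairs; SN = name, LN = length.
        fields = dict(f.split(":", 1) for f in line.split("\t")[1:] if ":" in f)
        scaffolds[fields["SN"]] = int(fields["LN"])
    return scaffolds


# Hypothetical header text, in the shape "samtools view -H" prints it.
example_header = (
    "@HD\tVN:1.6\tSO:coordinate\n"
    "@SQ\tSN:NODE_1_length_5000_cov_12.5\tLN:5000\n"
    "@SQ\tSN:NODE_2_length_3200_cov_8.1\tLN:3200\n"
    "@PG\tID:bowtie2\tPN:bowtie2\n"
)

for name, length in parse_sq_lines(example_header).items():
    print(name, length)
```

If the scaffold names and lengths of a new assembly do not match this list, the old BAM will not line up with it.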
Thanks, I was afraid that might be the case.
The wording of the post is imprecise; it is not clear what was aligned to what.
Have you aligned the original raw reads to the contigs of the assembly? If so, you could perhaps generate consensus calls of the reads relative to the reference and reproduce the reference that way.
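To illustrate what a consensus call means here, and why it may fall short of the original scaffolds, below is a toy Python sketch. It assumes ungapped reads with known 0-based start positions on one scaffold of known length; the function name and the data are invented, and real tools work from the full alignments (indels, base qualities, and so on).

```python
from collections import Counter


def toy_consensus(ref_length, placed_reads, min_depth=1):
    """Majority-vote consensus from (start, sequence) pairs; 'N' where depth is too low."""
    columns = [Counter() for _ in range(ref_length)]
    for start, seq in placed_reads:
        for offset, base in enumerate(seq):
            pos = start + offset
            if 0 <= pos < ref_length:
                columns[pos][base] += 1

    consensus = []
    for column in columns:
        depth = sum(column.values())
        if depth < min_depth:
            consensus.append("N")  # no read evidence at this position
        else:
            consensus.append(column.most_common(1)[0][0])
    return "".join(consensus)


# Made-up reads covering only the first 7 bases of a 10-base scaffold.
reads = [(0, "ACGT"), (2, "GTTA"), (3, "TTAC")]
print(toy_consensus(10, reads))  # ACGTTACNNN
```

Positions with no coverage come out as N, and positions where the reads disagree with the original scaffold take the majority base, so the result is only an approximation of the lost file.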
Regardless, it is probably best if you recreate the assembly.
Sorry for the confusion and thanks for your feedback!
Yes, to make the BAM file, the raw reads were mapped/aligned to the assembly fasta (scaffolds file from SPAdes).
I think I'll have SPAdes recreate the assembly, then QC it to see if the assembled scaffolds came out the same (e.g. if they correctly map to the bins that were made with the previous, lost, assembly fasta). It seems like with the exact same input SPAdes would produce the same result, but I guess I'll find out. If they're different, then I'll just have to re-bin the sample.
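For that QC step, here is a rough Python sketch of one way to check it: hash every sequence in a bin fasta and look for a scaffold with the same name and the same hash in the new assembly. It assumes the binning tool kept the original scaffold names in the bin fasta headers, which I have not verified for your setup; the function names and the example records are made up.

```python
import hashlib


def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of FASTA lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # Keep only the first word of the header as the scaffold name.
            name, chunks = (line[1:].split() or [""])[0], []
        else:
            chunks.append(line.upper())
    if name is not None:
        yield name, "".join(chunks)


def digest_by_name(lines):
    """Map each scaffold name to a SHA-256 digest of its sequence."""
    return {
        name: hashlib.sha256(seq.encode()).hexdigest()
        for name, seq in read_fasta(lines)
    }


def compare_to_assembly(bin_lines, assembly_lines):
    """Sort bin scaffolds into identical / changed / missing relative to the assembly."""
    assembly = digest_by_name(assembly_lines)
    report = {"identical": [], "changed": [], "missing": []}
    for name, digest in digest_by_name(bin_lines).items():
        if name not in assembly:
            report["missing"].append(name)
        elif assembly[name] == digest:
            report["identical"].append(name)
        else:
            report["changed"].append(name)
    return report


# Made-up records; with real files, pass open("bin.1.fa") and open("scaffolds.fasta").
bin_fa = [">NODE_1_length_8_cov_10.0", "ACGTACGT", ">NODE_3_length_4_cov_5.0", "GGCC"]
new_fa = [">NODE_1_length_8_cov_10.0", "ACGT", "ACGT", ">NODE_2_length_6_cov_7.0", "TTTTTT"]
print(compare_to_assembly(bin_fa, new_fa))
```

If every bin scaffold lands in "identical", the new assembly matches what was binned, at least for the scaffolds that made it into a bin. Since SPAdes puts the length and coverage into the scaffold name, even a small difference between runs would probably show up as "missing" rather than "changed".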