Hey there guys.
We use the BLAST command-line toolkit to BLAST populations against each other. We created a database from the transcriptomes of one population and then used the BLAST command-line tools to compare the other transcriptome against it. I should mention that we extracted the longest ORFs from each transcriptome and compared those, not the full transcriptomes. When we do this we obtain a text file with the results (% identity and so on). I was wondering if there is any way to convert this to an alignment FASTA file, in the following format:
>title
ATCGCTGCATCGATCGACTTTTCGATCGATC---CGCGCGCGTAGAGCTAGCTAGCT
>title2
ATCGCTGCATCGATCGACTTTTCGATCGATCTTTCGCGCGCGTAGAGCTAGCTAGCT
>title3
GTAGATGATAGATAGATGAAGATAGATAGATAG-GTAGATCGATC----GTCATGAC
>title4
GTAGATGATAGATAGATGAAGATAGATAGATAG-GTAGATCGATC-GC-GTCATGAC
Each pair has a single ORF (from the database) first and the matching query hit second. I need this specific format because I want to retrieve the dN/dS ratio using a script I edited. The script works, but it requires that format, and I have no idea how to get it from any of the BLAST output formats. So, in summary (TL;DR): how do I convert any BLAST outfmt to an alignment FASTA as shown above (sequence, similar sequence, sequence, similar sequence, etc.)? If someone could walk me through this, I'd greatly appreciate it.
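One possible route, as a minimal sketch rather than a definitive answer: BLAST+ can write the aligned sequences themselves (gaps included) into tabular output through the `qseq` and `sseq` format specifiers, for example `-outfmt "6 qseqid sseqid qseq sseq"`. Assuming the results were produced with exactly those four columns in that order, the hypothetical helper `blast_tab_to_fasta` below turns each hit into a pair of FASTA records, database ORF first and query second. The sequence IDs and sequences in the example are made up for illustration.

```python
import csv
import io


def blast_tab_to_fasta(handle):
    """Yield one FASTA pair per hit: subject (database ORF) first, query second.

    Assumes tab-separated columns in the order: qseqid, sseqid, qseq, sseq.
    """
    for row in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and the '#' comment lines that outfmt 7 adds.
        if not row or row[0].startswith("#"):
            continue
        qseqid, sseqid, qseq, sseq = row[:4]
        yield f">{sseqid}\n{sseq}\n>{qseqid}\n{qseq}\n"


# Made-up example line standing in for a real BLAST results file.
example = io.StringIO(
    "orfB_1\torfA_7\tATCGCTGCATCG---CGCG\tATCGCTGCATCGTTTCGCG\n"
)
fasta = "".join(blast_tab_to_fasta(example))
print(fasta, end="")
```

For a real file, replace `example` with `open("results.tsv")` (file name assumed). Note that `qseq` and `sseq` cover only the aligned region of each HSP, not the full ORFs, and a query/subject pair with several HSPs gives several record pairs. The gaps also need not fall on codon boundaries, so it is worth checking that the dN/dS script can handle that.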
Brilliant! I looked into the SearchIO module and was able to write a script to do it. This post led me there!
Thank you very much!