3.3 years ago
sapuizait
Dear all,
Apologies if this has been asked before, but I could not find a useful answer.
I am using kma to align reads against a set of genes, and I am looking for a method, script, or software to convert its detailed alignment format to FASTA format.
Example input format:
# VFG000863(gb|BAA94855) (astA)
template: ATGCCATCAACACAGTATATCCGAAGGCCCGCATCCAGTTATGCATCGTGCATATGGTGC
||||||||||||||||||||_||_||_|||_|||||||||||||||||||||||||||||
query: ATGCCATCAACACAGTATATTCGGAGACCCACATCCAGTTATGCATCGTGCATATGGTGC
template: GCAACAGCCTGCGCTTCGTGTCATGGAAGGACTACAAAGCCGTCACTCGCGACCTGA
|||||||_|||||||||||||||||||||||||||||||||||||||||||||||||
query: GCAACAGTCTGCGCTTCGTGTCATGGAAGGACTACAAAGCCGTCACTCGCGACCTGA
# VFG000924(gb|NP_752610) (fepB)
template: GTGAGACTCGCCCCGCTCTACCGCAACGCCCTTCTATTAACAGGACTTTTGCTTTCAGGA
||||||||||||||||||||||||||||||||||||||||||||||||||_|||||||||
query: GTGAGACTCGCCCCGCTCTACCGCAACGCCCTTCTATTAACAGGACTTTTACTTTCAGGA
template: ATAGCCGCAGTTCAGGCCGCCGACTGGCCGCGTCAGATTACTGACAGCCGTGGCACTCAT
||||||||||||||||||||_|||||||||||||||||||||||||||||||||||_|||
query: ATAGCCGCAGTTCAGGCCGCTGACTGGCCGCGTCAGATTACTGACAGCCGTGGCACACAT
Thanks
An example of the output is needed.
Sorry, the output would look like this:
With seqkit and sed:
seqkit is used to sort the sequences by header and to print each sequence on a single line.
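For anyone who would rather avoid sed, here is a rough Python sketch of the same idea. It is not the seqkit/sed approach above. It assumes the desired output is one FASTA record per `#` header, built by joining that gene's `query:` lines (pass `which="template"` to get the reference instead). It also assumes gaps, if present, are written as `-`. The function names `parse_kma_alignment` and `format_fasta` are made up for this example, and the sample input is a shortened toy version of the format shown in the question.

```python
def parse_kma_alignment(lines, which="query"):
    """Yield (header, sequence) pairs from kma-style alignment lines.

    Assumption: each gene starts with a '#' header line, followed by
    'template:' / match / 'query:' line triplets.
    """
    prefix = which + ":"
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if line.startswith("#"):
            # A new gene starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line.lstrip("#").strip(), []
        elif header is not None and line.startswith(prefix):
            # Keep only the sequence part after 'query:' (or 'template:').
            chunks.append(line[len(prefix):].strip())
    if header is not None:
        yield header, "".join(chunks)


def format_fasta(records, strip_gaps=False):
    """Return FASTA text; optionally drop '-' gap characters (assumed gap symbol)."""
    parts = []
    for header, seq in records:
        if strip_gaps:
            seq = seq.replace("-", "")
        parts.append(f">{header}\n{seq}\n")
    return "".join(parts)


# Toy input in the same layout as the question (not real kma output).
SAMPLE = """\
# geneA
template: ACGT
||_|
query: ACTT
template: GG
||
query: GG
# geneB
template: TTAA
||_|
query: TT-A
"""

if __name__ == "__main__":
    records = parse_kma_alignment(SAMPLE.splitlines())
    print(format_fasta(records), end="")
```

To run it on a real file, replace `SAMPLE.splitlines()` with an open file handle. Each sequence is written on a single line, and the records stay in input order rather than being sorted by header.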
Thanks! Very cool use of sed!