Greetings!!!
I have barrnap output for 100 Pseudomonas aeruginosa genomes.
The output looks like this (sequences have been trimmed to avoid huge lines on Biostars):
>16S_rRNA::Pseudomonas_aeruginosa_PAOC_Seq_1:6516148-6517679(-)
TGAAGAGTTTGATCATGGCTCAGATTGAACGCTGGCGGCAGGCCTAACACATGCAAGTCGAGCGGATGAAGGGAGCTTGCTCCTGGATTCAGCGGCGGACGGGTGAG
>23S_rRNA::Pseudomonas_aeruginosa_PAOC_Seq_1:6512786-6515674(-)
TCAAGTGAAGAAGCGCATACGGTGGATGCCTTGGCAGTCAGAGGCGATGAAAGACGTGGTAGCCTGCGAAAAGCT
>5S_rRNA::Pseudomonas_aeruginosa_PAOC_Seq_1:6512529-6512639(-)
TGACGATCATAGAGCGTTGGAACCACCTGATCCCTTCCCGAACTCAGAAGTGA
I want to extract only the 16S rRNA headers and sequences from all the outputs.
The resulting output should look like this:
>16S_rRNA::Pseudomonas_aeruginosa_PAOC_Seq_1:6516148-6517679(-)
TGAAGAGTTTGATCATGGCTCAGATTGAACGCTGGCGGCAGGCCTAACACATGCAAGTCGAGCGGATGAAGGGAGCTTGCTCCTGGATTCAGCGGCGGACGGGTGAG
How can I get this output?
Thank you all.
What have you tried so far?
Did you have a look at, for instance, seqtk or SeqKit?
filter_fasta.py has worked.
With awk and a flattened FASTA:
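For comparison, the same filtering idea can be sketched in Python. This is only an illustration, not the awk command from this answer: the function name `filter_16s` and the default `>16S_rRNA` prefix are assumptions based on the barrnap headers shown in the question. It keeps a record when its header starts with the prefix, so it should work whether or not the FASTA has been flattened.

```python
import sys
from typing import Iterable, Iterator


def filter_16s(lines: Iterable[str], prefix: str = ">16S_rRNA") -> Iterator[str]:
    """Yield header and sequence lines of records whose header starts with `prefix`.

    `prefix` is an assumption taken from the barrnap headers in the question.
    """
    keep = False
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new record begins: decide whether to keep it.
            keep = line.startswith(prefix)
        if keep and line:
            yield line


if __name__ == "__main__":
    # Each command-line argument is treated as one barrnap FASTA output file.
    for path in sys.argv[1:]:
        with open(path) as handle:
            for kept in filter_16s(handle):
                print(kept)
```

Saved under a hypothetical name such as extract_16s.py, it could be run as `python extract_16s.py genome1.fasta genome2.fasta > all_16S.fasta`, where the file names are placeholders for the barrnap outputs.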
Hi optimist,
Can you share your script? How did you use barrnap on 100 P. aeruginosa genomes?
Thank you so much!