Extract DNA sequences based on header in a multi-FASTA file
4.1 years ago
Optimist ▴ 190

Hello all,

My multifasta file with rRNA sequences looks like this.

>16S_rRNA::1:4-522(-)
TGCCTTCGGGAACTCTGAGACAGGTGCTGCATGGCTGTCGTCAGCTCGTGTTGTGAAATGTTGGGTTAAGTCCCGCAACGAGCGCAACCCTTATCCTTTGTTGCCAGCGCGTGATGGCGGGAACTCAAAGGAGACTGCCGGTGATAAACCGGAGGAAGGTGGGGATGACGTCAAGTCATCATGGCCCTTACGAGTAGGGCTACACACGTGCTACAATGGCGTATACAAAGGGAAGCGACCCCGCGAGGGCAAGCGGAACTCATAAAGTACGTCGTAGTCCGGATTGGAGTCTGCAACTCGACTCCATGAAGTCGGAATCGCTAGTAATCGTAGATCAGAATGCTACGGTGAATACGTTCCCGGGCCTTGTACACACCGCCCGTCACACCATGGGAGTGGGTTGCAAAAGAAGTAGGTAGCTTAACCTCCGGGAGGGCGCTTACCACTTTGTGATTCATGACTGGGGTGAAGTCGTAACAAGGTAACCGTAGGGGAACCTGCGGTTGGATCACCTCCTT
>16S_rRNA::0-508(-)
TTGACGTTACCGACAGAAGAAGCACCGGCTAACTCCGTGCCAGCAGCCGCGGTAATACGGAGGGTGCAAGCGTTAATCGGAATTACTGGGCGTAAAGCGCACGCAGGCGGTTGATTGAGTCAGATGTGAAATCCCCGGGCTTAACCCGGGAATTGCATCTGATACTGGTCAGCTAGAGTCTTGTAGAGGGGGGTAGAATTCCATGTGTAGCGGTGAAATGCGTAGAGATGTGGAGGAATACCGGTGGCGAAGGCGGCCCCCTGGACAAAGACTGACGCTCAGGTGCGAAAGCGTGGGGAGCAAACAGGATTAGATACCCTGGTAGTCCACGCTGTAAACGATGTCGACTTGGAGGTTGTGCCCTTGAGGCGTGGCTTCCGGAGCTAACGCGTTAAGTCGACCGCCTGGGGAGTACGGCCGCAAGGTTAAAACTCAAATGAATTGACGGGGGCCCGCACAAGCGGTGGAGCATGTGGTTTAATTCGATGCAACGCGAAGAACCTTACCT
>5S_rRNA::1-80(-)
TGGCGGCCGTAGCGCGGTGGTCCCACCTGACCCCATGCCGAACTCAGAAGTGAAACGCCGTAGCGCCGATGGTAGTGTG
>5S_rRNA::1-77(-)
TGGCGGCCGTAGCGCGGTGGTCCCACCTGACCCCATGCCGAACTCAGAAGTGAAACGCCGTAGCGCCGATGGTAGT
>5S_rRNA::1:4731-4814(-)
TGACCCCATGCCGAACTCAGAAGTGAAACGCCGTAGCGCCGATGGTAGTGTGGGGTCTCCCCATGTGAGAGTAGGGAACCGCC

I want to extract only the 16S_rRNA sequences into a new file. Is there a way to do this? I need to apply it to a large dataset of 500 files.

Thank you.

fasta sequence extraction • 734 views
4.1 years ago

Try:

$ awk '/^>16S_rRNA/ {getline seq; print $0; print seq}' test.fa
$ sed -n '/^>16S_rRNA/{N;p}' test.fa

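Both commands assume each sequence sits on a single line, as in your example. Don't use sed -i here, because it would overwrite the input file in place; redirect the output to a new file instead.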
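
For the 500 files mentioned in the question, something along these lines might work. It is only a sketch: the function names `extract_by_prefix` and `extract_all`, the `.fa` extension, and the output naming are my assumptions, not anything from your setup. Unlike the one-liners, it keeps every line of a matching record, so it should also cope with sequences wrapped over several lines.

```shell
# Hypothetical helper: print only the records whose header starts with ">" + prefix.
extract_by_prefix() {
    prefix="$1"
    file="$2"
    awk -v p="$prefix" '
        # On each header line, decide whether the record that follows is kept.
        /^>/ { keep = (index($0, ">" p) == 1) }
        # Print header and sequence lines while the current record is kept.
        keep
    ' "$file"
}

# Hypothetical wrapper: run the extraction on every .fa file in a directory
# (assumed layout) and write one output file per input file.
extract_all() {
    prefix="$1"
    indir="$2"
    outdir="$3"
    mkdir -p "$outdir"
    for f in "$indir"/*.fa; do
        [ -e "$f" ] || continue   # skip if the glob matched nothing
        extract_by_prefix "$prefix" "$f" > "$outdir/$(basename "$f" .fa).${prefix}.fa"
    done
}
```

Usage would be something like `extract_all 16S_rRNA . 16S_only`, which writes the results into a separate directory and leaves the input files untouched.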


Thanks for the trick. awk has worked for me.
