I have a long FASTA file like this:
>Eha-Novel-38_5p
ACCCATTTTCGTCTGAGGATAAT
>Eha-Novel-38_3p*
TTTCCCAGACCCAAATGGGTGC
>Eha-Novel-46_3p
AATGGCGGCCTGATATCCCGGA
>Eha-Novel-46_5p*
TGGGGTATTAAGCCGCGATTGT
>Eha-Novel-44_3p
TATCACAGTCATTTACGGGTAC
>Eha-Novel-44_5p*
TCCCGTATTTGACTGTGACTGAG
I want to print only the header lines that do not end in "*", each together with the sequence line that follows it.
Desired output:
>Eha-Novel-38_5p
ACCCATTTTCGTCTGAGGATAAT
>Eha-Novel-46_3p
AATGGCGGCCTGATATCCCGGA
>Eha-Novel-44_3p
TATCACAGTCATTTACGGGTAC
I tried using grep "*" -v -A 1 FILE, but that did not work.
Thanks for your help.
Maybe you can try:
This way you wouldn't have the sequence, just the identifier.
You are right. Maybe sed works:
sed -n -e '/p$/{p;n;p;}' FILE > OUTPUT
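A variation on the same idea, as a rough sketch: instead of printing the records you want, delete the ones you don't. It assumes every header is followed by exactly one sequence line, as in the question. The temporary file and the variable names (tmp, sed_out) are only there to make the example self-contained; they are not part of the original post.

```shell
# Hypothetical sample file reproducing a few records from the question.
tmp=$(mktemp)
printf '%s\n' \
  '>Eha-Novel-38_5p' 'ACCCATTTTCGTCTGAGGATAAT' \
  '>Eha-Novel-38_3p*' 'TTTCCCAGACCCAAATGGGTGC' \
  '>Eha-Novel-46_3p' 'AATGGCGGCCTGATATCCCGGA' \
  '>Eha-Novel-46_5p*' 'TGGGGTATTAAGCCGCGATTGT' > "$tmp"

# On a header ending in "*", append the next line (N) and delete both (d).
# Every other line falls through to sed's default auto-print.
sed_out=$(sed '/\*$/{N;d;}' "$tmp")
printf '%s\n' "$sed_out"

rm -f "$tmp"
```

This matches on the "*" itself, so it does not rely on the wanted headers ending in "p".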
Everyone needs to learn the before and after flags for grep! Just do -A 1 on the grep:
The OP knew about -A, it just doesn't work well with -v.
There's no need to do an inverse search; just search for lines ending in 'p'.
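A minimal sketch of that suggestion, assuming (as in the sample) that every wanted header ends in "p" and that no sequence line does. When matches are not adjacent, grep -A prints a "--" separator between groups, so the second grep strips it; with GNU grep the --no-group-separator option should do the same. The temporary file and the names tmp and grep_out are made up for the example.

```shell
# Hypothetical sample file reproducing a few records from the question.
tmp=$(mktemp)
printf '%s\n' \
  '>Eha-Novel-38_5p' 'ACCCATTTTCGTCTGAGGATAAT' \
  '>Eha-Novel-38_3p*' 'TTTCCCAGACCCAAATGGGTGC' \
  '>Eha-Novel-46_3p' 'AATGGCGGCCTGATATCCCGGA' \
  '>Eha-Novel-46_5p*' 'TGGGGTATTAAGCCGCGATTGT' > "$tmp"

# Match headers ending in "p", keep one line of trailing context (-A 1),
# then drop the "--" group separators that grep inserts between groups.
grep_out=$(grep -A 1 'p$' "$tmp" | grep -v '^--$')
printf '%s\n' "$grep_out"

rm -f "$tmp"
```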