8.8 years ago
oussama.badad
Dear all,
I am trying to remove the sequences that have no functional information (headers tagged ---NA---) from a functional annotation FASTA file:
>Oeu043104.1|---NA---
MIESNFWDACWPHCLLRVLLSLLAESASQPLCPPLQRYNPKYLEDDYGVNQATEWLFYTPRDRENEIENIRNGVAVDGY
>Oeu043107.1|g7lir3_medtr alpha beta-hydrolase superfamily protein os=medicago truncatula gn=mtr_8g086260 pe=4 sv=1
MSTTQGTSPRGNINVKDEPDHLLVLVHGIMGSPSDWTYFEADLKRRLGKRFLIYASSCNTYTKTFTGIDGAGKRLAEEVMEIVRNTESLKKISFLAHSLG
GLFSRYAIAVLYMPNTSSDDSSVIAGSTNTSLKTSCYSNTGLIAGLEPSNFITLATPHLGVRGKKQVNPFSIILIDGPVLPFLLGLPFLEKIAAPLAPIF
TGRTGSQLFLTDGQPDRPPLLLRMASDCKDGKFVSALGAFRCRLLYANVSYDHMVGWRTSSIRRETELIKPPLQSLDGYKHVVSVEYCPPVSSEGPHFPE
EAAKAKQAAQNEPNNQNTVEYHETMEEEMIRGLQRLGWKKVDVSFHSAFWPFFAHNNINVKNEWLYNAGVGVVAHVADNIKQQENQQGSTYVAASL
I was wondering if someone could help me with a Python script or a shell script.
Thank you
Oussama
What have you tried?
You may get some ideas from these posts:
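You could convert the file to single-line FASTA (one header line followed by one sequence line per record) and use a simple grep to drop each header containing ---NA--- together with the sequence line that follows it, or use Biopython to have more control.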
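
The idea above can be sketched in plain Python, with no external libraries. This is only a minimal sketch: it assumes that unannotated records are exactly those whose header contains the literal string ---NA---, as in the example in the question, and the function names (read_fasta, filter_annotated) and the command-line usage are made up for illustration.

```python
import sys

NA_TAG = "---NA---"  # assumption: this marker flags records without annotation


def read_fasta(lines):
    """Yield (header, sequence) pairs; wrapped sequence lines are joined."""
    header, chunks = None, []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None and line.strip():
            chunks.append(line.strip())
    # Emit the last record of the file.
    if header is not None:
        yield header, "".join(chunks)


def filter_annotated(lines, tag=NA_TAG):
    """Keep only the records whose header does not contain the tag."""
    for header, seq in read_fasta(lines):
        if tag not in header:
            yield header, seq


def main(in_path, out_path):
    # Writes single-line FASTA: one header line, one sequence line per record.
    with open(in_path) as fin, open(out_path, "w") as fout:
        for header, seq in filter_annotated(fin):
            fout.write(f">{header}\n{seq}\n")


if __name__ == "__main__":
    # Hypothetical usage: python filter_na.py input.fasta output.fasta
    if len(sys.argv) == 3:
        main(sys.argv[1], sys.argv[2])
```

Parsing whole records rather than grepping single lines means a multi-line sequence is removed along with its header. With Biopython installed, the same filter could be written around SeqIO.parse(handle, "fasta") by testing record.description for the tag.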