3.3 years ago
Maliha
I want to remove the header lines from a FASTA file that contains multiple protein sequences. I need to count the number of amino acids.
>Os12t0641500-03 Similar to RecF/RecN/SMC N terminal domain containing protein, expressed.
MAAAAAGKGGGGQGRIHRLEVENFKSYKGTQTIGPFFDFTAIIGPNGAGKSNLMDAISFV
LIKVPLL*
>Os12t0597800-01 Similar to Helix-loop-helix DNA-binding domain containing protein, expressed.
MMSFPYSSGDLGEATTAAAAAVDMITLDQMFRDYDASTGDDLFELVWESCGGGEIDSGAG
LGRQ*
>Os12t0598600-00 Similar to H0315A08.1 protein.
MKRSMNYSGIECFTFGDDNKLRIFPPNSYKFKPKDHIILDEVQECILDNFWYQYNNKREE
FSDLDTMDLGGHGQPDE*
Hello,
It's not clear what you are trying to achieve. Please add an example of your desired output and explain how it relates to the input.
fin swimmer
Check Biopython for that.
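If the goal is a residue count per sequence, the header lines do not need to be removed first; they only need to be skipped while counting. Below is a minimal plain-Python sketch of that idea, not a definitive solution. The function name `count_residues` and the `strip_stop` option are made up for illustration, and it assumes the trailing `*` (stop symbol) should not be counted as an amino acid. With Biopython installed, `SeqIO.parse(handle, "fasta")` yields records whose `id` and `seq` give the same information.

```python
import io


def count_residues(handle, strip_stop=True):
    """Return {sequence_id: residue_count} for a FASTA file handle.

    Assumption: a '*' marks a stop and is not counted when strip_stop is True.
    """
    counts = {}
    name = None
    for line in handle:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        if line.startswith(">"):
            # Header line: keep only the ID (text up to the first space).
            name = line[1:].split()[0]
            counts[name] = 0
        elif name is not None:
            # Sequence line: add its length to the current record.
            if strip_stop:
                line = line.replace("*", "")
            counts[name] += len(line)
    return counts


# Example using the first record exactly as posted in the question.
example = io.StringIO(
    ">Os12t0641500-03 Similar to RecF/RecN/SMC N terminal domain containing protein, expressed.\n"
    "MAAAAAGKGGGGQGRIHRLEVENFKSYKGTQTIGPFFDFTAIIGPNGAGKSNLMDAISFV\n"
    "LIKVPLL*\n"
)
for seq_id, n in count_residues(example).items():
    print(seq_id, n)
```

For a real file, pass an open handle instead of the `io.StringIO` object, e.g. `with open(path) as fh: count_residues(fh)`, where `path` is your FASTA file.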