Hi, I am doing a whole-genome analysis of a Dalbergia species. After genome assembly, I ran the BRAKER3 gene prediction tool available on the Galaxy web server, with the Viridiplantae protein sequences given as the training set. I then used the output GTF file along with the reference genome to generate the protein FASTA file (using this script: https://github.com/Gaius-Augustus/Augustus/blob/master/scripts/getAnnoFastaFromJoingenes.py). However, I noticed that each protein sequence in the FASTA file ends with an asterisk (*). Is it acceptable to remove it with the 'sed' command, or how should I handle this? Your expertise and insights on this would be greatly appreciated.
Protein sequence output:
>g1.t1
MEGLVRSGINPVRVSGGRRHQSRFLDASTLHLRKRKSGFAVGIGNMKLSSPLVVAAASVG
GSKVVHFENTLPSKETLELWREGDAVCFDVDSTVCLDEGIDELAEFCGAGKAVAEWTARA
MGGSVPFEEALAARLKLFNPSLSQLQNFLEQKPPRLSPGIQELVKKLKANHIDVYLISGG
FRQMINPVASILGIPKENIFANQLLFGSSGEFLGFDENEPTSRSGGKATAVQQIKKAHGY
KALTMIGDGATDLEARRPGGADLFICYAGVQLREAVAAKADWLVFNFKDLINSLG*
>g2.t1
MQGLRRYPNDINPLATIRVYPTVNESDDHEIAALWNRTPALFIGGACVGWLESLVALHVS
GHLVSKLIQVGALWV*
>g3.t1
MVQACYDSFNYNPYCGSCKYPPEELFEALDLGHLGIWSERTNWEGYVTISDDEMSRKLGM
RDVAIVWRGTTPYTE*
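
For context, a trailing `*` in translated protein FASTA output conventionally marks the stop codon, and some downstream tools do not accept it. Below is a minimal Python sketch of one way to strip it, as an alternative to 'sed'. It assumes the asterisk should only be removed when it is the very last character of a record, so any internal `*` (a possible in-frame stop) stays visible for inspection. The function names `read_fasta`, `strip_terminal_stop`, and `clean_fasta` are hypothetical helpers made up for this illustration, not part of BRAKER3 or Augustus.

```python
from typing import Iterable, Iterator, List, Tuple


def read_fasta(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs, joining wrapped sequence lines."""
    header, parts = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(parts)
            header, parts = line[1:], []
        else:
            parts.append(line)
    if header is not None:
        yield header, "".join(parts)


def strip_terminal_stop(seq: str) -> str:
    """Remove a single '*' only if it is the last character of the sequence."""
    return seq[:-1] if seq.endswith("*") else seq


def clean_fasta(lines: Iterable[str], width: int = 60) -> List[str]:
    """Return FASTA lines with the terminal '*' removed, re-wrapped at `width`."""
    out = []
    for header, seq in read_fasta(lines):
        seq = strip_terminal_stop(seq)
        out.append(f">{header}")
        out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return out


# Two records copied from the output above.
example = """>g2.t1
MQGLRRYPNDINPLATIRVYPTVNESDDHEIAALWNRTPALFIGGACVGWLESLVALHVS
GHLVSKLIQVGALWV*
>g3.t1
MVQACYDSFNYNPYCGSCKYPPEELFEALDLGHLGIWSERTNWEGYVTISDDEMSRKLGM
RDVAIVWRGTTPYTE*
""".splitlines()

cleaned = clean_fasta(example)
print("\n".join(cleaned))
```

Working on whole records rather than on individual lines avoids accidentally removing an internal `*` that happens to fall at the end of a wrapped 60-character line.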