Hi!!!
I have a multi-FASTA file with headers like:
>trnN-GUU_INIA601-ARAGORN_v1.2.38 ccsA_INIA601-blatX
>rpl16_INIA601-blatX ndhF_INIA601-blatX psbJ_INIA601-blatX
>trnW-CCA-I_INIA601-ARAGORN_v1.2.38 trnL-UAG_INIA601-ARAGORN_v1.2.38
>psaC_INIA601-blatX trnR-UCU_INIA601-ARAGORN_v1.2.38 ndhA_INIA601-blatX
>trnC-ACA_INIA601-ARAGORN_v1.2.38 trnW-CCA-II_INIA601-ARAGORN_v1.2.38
I would like a way to keep only the name of the first gene in each header, like:
>rpl16
>trnW
>psaC
>trnC
Thank you so much for your kind help :)
With seqkit:
Check whether it makes sense to remove "_INIA601" and everything after it from the FASTA headers.
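
If seqkit is not at hand, the same idea can be sketched in plain Python. This is only a minimal sketch, not a tested pipeline: the names `gene_name`, `rename_fasta` and the `strip_anticodon` switch are made up for illustration, and it assumes every header starts with the gene of interest immediately followed by "_INIA601". By default it keeps the anticodon (e.g. "trnW-CCA-I"); with `strip_anticodon=True` it also drops everything after the first "-", which matches the ">trnW" style asked for in the question.

```python
def gene_name(header, tag="_INIA601", strip_anticodon=False):
    """Return the leading gene name of a FASTA header.

    header: header line, with or without the leading '>'.
    tag: assumed marker that follows the gene name (taken from the question).
    strip_anticodon: if True, also cut at the first '-' (trnW-CCA-I -> trnW).
    """
    # Keep only what precedes the first occurrence of the tag.
    name = header.lstrip(">").split(tag, 1)[0]
    if strip_anticodon:
        name = name.split("-", 1)[0]
    return name


def rename_fasta(lines, **kwargs):
    """Yield FASTA lines (without trailing newlines) with shortened headers."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            yield ">" + gene_name(line, **kwargs)
        else:
            # Sequence lines are passed through unchanged.
            yield line


# Hypothetical usage; "input.fasta" and "renamed.fasta" are placeholder names:
# with open("input.fasta") as fin, open("renamed.fasta", "w") as fout:
#     for out_line in rename_fasta(fin, strip_anticodon=True):
#         fout.write(out_line + "\n")
```

Note that cutting at the first "-" can make headers collide (for example "trnW-CCA-I" and "trnW-CCA-II" both become "trnW"), so check for duplicate names afterwards if downstream tools need unique IDs.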