3.5 years ago
wang-yanfang
Dear all,
I want to shorten the headers in my FASTA file. They currently look like the two example sequences below. At the same time, I want to keep all the sequences exactly as they are.
>lcl|VSMA01000001.1_prot_KAB5584702.1_1 [locus_tag=GE09DRAFT_1165795] [db_xref=InterPro:IPR002198,JGIDB:Conioc1_1165795] [protein=tetrahydroxynaphthalene reductase] [protein_id=KAB5584702.1] [location=join(1826..1931,1988..2458,2736..2863,2927..3064)] [gbkey=CDS]
MPGLTTNTGKYDQIPGPLGLASASLEGKVALVTGAGRGIGREMAQELGRRGAKVIVNYANSQESAEEVVQAIKKSGSDAA SIKANVSDVDQIVRMFDEAVKVFGKLDIVCSNSGVVSFGHVKDVTPEEFDRVFNINTRGQFFVAREAYKHLEVGGRLILM GSITGQAKGVPKHAVYSGSKGTIETFVRCMAIDFGDKKITVNAVAPGGIKTDMYHAVCREYIPNGINLTDDEVDEYACTW SPLHRVGLPIDIARVVCFLASQDGEWINGKVLGIDGAACM
>lcl|VSMA01000001.1_prot_KAB5584703.1_2 [locus_tag=GE09DRAFT_1165796] [db_xref=InterPro:IPR021840,JGIDB:Conioc1_1165796] [protein=hypothetical protein] [protein_id=KAB5584703.1] [location=complement(join(3193..3215,3871..4374,4440..5628,5725..5886,5941..5989,6050..6066,6130..6234,6286..6495,6547..6561,6622..6728,6843..7103,7155..7719))] [gbkey=CDS]
MFHPSRRRAEQTAYEYNIQATEDHEHDHGVVNLSAEKRRRPRGKRPNYKPTALKWPFIVAQILVLVIAMGLIIWAEKAMP DSDSTAIIDPLPSKGLPERSVKPEFGKHFRRDNTSGVVETATSQLDVQETTLTGGDGLITPGLGSTNGPADNVKTAVTDD
I only want to keep the locus tag in each header, like this:
>GE09DRAFT_1165795
MPGLTTNTGKYDQIPGPLGLASASLEGKVALVTGAGRGIGREMAQELGRRGAKVIVNYANSQESAEEVVQAIKKSGSDAA SIKANVSDVDQIVRMFDEAVKVFGKLDIVCSNSGVVSFGHVKDVTPEEFDRVFNINTRGQFFVAREAYKHLEVGGRLILM GSITGQAKGVPKHAVYSGSKGTIETFVRCMAIDFGDKKITVNAVAPGGIKTDMYHAVCREYIPNGINLTDDEVDEYACTW SPLHRVGLPIDIARVVCFLASQDGEWINGKVLGIDGAACM
>GE09DRAFT_1165796
MFHPSRRRAEQTAYEYNIQATEDHEHDHGVVNLSAEKRRRPRGKRPNYKPTALKWPFIVAQILVLVIAMGLIIWAEKAMP DSDSTAIIDPLPSKGLPERSVKPEFGKHFRRDNTSGVVETATSQLDVQETTLTGGDGLITPGLGSTNGPADNVKTAVTDD
I would be super grateful for any help.
Thanks, Yanfang
See if this works:
Hey GenoMax,
Thank you so much. I managed to do this by adapting your code with some code I read somewhere else.
I have posted it here in the hope that it is useful for others.
Thanks so much for your help. Yanfang
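For anyone who wants to do the same thing in Python, here is a minimal sketch of the header rewrite described in the question. It is not the code from the replies above. It assumes every header carries a `[locus_tag=...]` field as in the two examples, and it leaves a header unchanged if that field is missing. The function names and the file names in the usage line are made up for illustration.

```python
import re
import sys

# Matches the [locus_tag=...] field seen in the example headers.
LOCUS_TAG = re.compile(r"\[locus_tag=([^\]]+)\]")


def shorten_header(line):
    """Return '>locus_tag' for a header line; return any other line unchanged."""
    if not line.startswith(">"):
        return line  # sequence line: keep exactly as it is
    match = LOCUS_TAG.search(line)
    if match is None:
        return line  # assumption: headers without a locus_tag are kept as-is
    newline = "\n" if line.endswith("\n") else ""
    return ">" + match.group(1) + newline


def shorten_fasta(lines):
    """Yield the lines of a FASTA file with shortened headers."""
    for line in lines:
        yield shorten_header(line)


# Only runs when called with an input path and an output path.
if __name__ == "__main__" and len(sys.argv) == 3:
    with open(sys.argv[1]) as src, open(sys.argv[2], "w") as dst:
        dst.writelines(shorten_fasta(src))
```

Usage would be something like `python shorten_headers.py proteins.faa proteins_short.faa`. Only lines starting with `>` are touched, so the sequences stay exactly as they were.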