Please I would like to find out how to remove "-T" from the name in my GFF3 file
i have pasted a sample
For example I will like the ID ID=C1_00010W_A-T to become ID=C1_00010W_A in otherwords the ID section to match the Parent section
CaO19.6115,IPF21113.1,IPF27828.1,orf19.13534,orf19.6115
Ca22chr1A_C_albicans_SC5314 CGD mRNA 4059 4397 . + . ID=C1_00010W_A-T;Parent=C1_00010W_A;Name=C1_00010W_A;Note=%28orf19.6115%29%20Dubious%20open%20reading%20frame;orf_classification=Dubious;Alias=C1_00010W,C1_00010W_B,CaO19.11880,CaO19.13534,CaO19.4402,CaO19.6115,IPF21113.1,IPF27828.1,orf19.13534,orf19.6115
Ca22chr1A_C_albicans_SC5314 CGD exon 4059 4397 . + . ID=C1_00010W_A-T-E1;Parent=C1_00010W_A-T
Ca22chr1A_C_albicans_SC5314 CGD CDS 4059 4397 . + 0 ID=C1_00010W_A-P;Parent=C1_00010W_A-T;orf_classification=Dubious;parent_feature_type=ORF
Ca22chr1B_C_albicans_SC5314 CGD gene 4059 4397 . + . ID=C1_00010W_B;Name=C1_00010W_B;Note=%28orf19.6115%29%20Dubious%20open%20reading%20frame;orf_classification=Dubious
Ca22chr1B_C_albicans_SC5314 CGD mRNA 4059 4397 . + . ID=C1_00010W_B-T;Parent=C1_00010W_B;Name=C1_00010W_B;Note=%28orf19.6115%29%20Dubious%20open%20reading%20frame;orf_classification=Dubious
Ca22chr1B_C_albicans_SC5314 CGD exon 4059 4397 . + . ID=C1_00010W_B-T-E1;Parent=C1_00010W_B-T
Ca22chr1A_C_albicans_SC5314 CGD mRNA 4409 4720 . - . ID=C1_00020C_A-T;Parent=C1_00020C_A;Name=C1_00020C_A;Note=%28orf19.6114%29%20Protein%20of%20unknown%20function%3B%20transcript%20detected%20on%20high-resolution%20tiling%20arrays;orf_classification=Uncharacterized;Alias=C1_00020C,C1_00020C_B,CAWG_03102,CaO19.13533,CaO19.6114,IPF21135.1,IPF27840.1,orf19.13533,orf19.6114,orf6.6227
Ca22chr1A_C_albicans_SC5314 CGD exon 4409 4720 . - . ID=C1_00020C_A-T-E1;Parent=C1_00020C_A-T
Ca22chr1A_C_albicans_SC5314 CGD CDS 4409 4720 . - 0 ID=C1_00020C_A-P;Parent=C1_00020C_A-T;orf_classification=Uncharacterized;parent_feature_type=ORF
Ca22chr1B_C_albicans_SC5314 CGD gene 4409 4720 . - . ID=C1_00020C_B;Name=C1_00020C_B;Note=%28orf19.6114%29%20Protein%20of%20unknown%20function%3B%20transcript%20detected%20on%20high-resolution%20tiling%20arrays;orf_classification=Uncharacterized
Ca22chr1B_C_albicans_SC5314 CGD mRNA 4409 4720 . - . ID=C1_00020C_B-T;Parent=C1_00020C_B;Name=C1_00020C_B;Note=%28orf19.6114%29%20Protein%20of%20unknown%20function%3B%20transcript%20detected%20on%20high-resolution%20tiling%20arrays;orf_classification=Uncharacterized
Ca22chr1B_C_albicans_SC5314 CGD exon 4409 4720 . - . ID=C1_00020C_B-T-E1;Parent=C1_00020C_B-T
Ca22chr1B_C_albicans_SC5314 CGD CDS 4409 4720 . - 0 ID=C1_00020C_B-P;Parent=C1_00020C_B-T;orf_classification=Uncharacterized;parent_feature_type=ORF
Ca22chr1A_C_albicans_SC5314 CGD gene 8597 8908
I'm not sure that's valid according to the GFF3 specs? (ID should be unique ?)
Please use the formatting bar (especially the
code
option) to present your post better. I've done it for you this time.Thank you!