2.2 years ago
a.mostafa5050
Hi,
I have a table like this:
>Feature gnl|XXX|IFEJKLFI_1 gnl|XXX|IFEJKLFI_1
locus_tag IFEJKLFI_00001 IFEJKLFI_00001
locus_tag IFEJKLFI_00001 IFEJKLFI_00001
protein_id gnl|XXX|IFEJKLFI_00001 gnl|XXX|IFEJKLFI_00001
locus_tag IFEJKLFI_00002 IFEJKLFI_00002
locus_tag IFEJKLFI_00002 IFEJKLFI_00002
protein_id gnl|XXX|IFEJKLFI_00002 gnl|XXX|IFEJKLFI_00002
locus_tag IFEJKLFI_00419 IFEJKLFI_00419
locus_tag IFEJKLFI_00419 IFEJKLFI_00419
protein_id gnl|XXX|IFEJKLFI_00419 gnl|XXX|IFEJKLFI_00419
>Feature gnl|XXX|IFEJKLFI_2 gnl|XXX|IFEJKLFI_2
locus_tag IFEJKLFI_00423 IFEJKLFI_00423
locus_tag IFEJKLFI_00423 IFEJKLFI_00423
protein_id gnl|XXX|IFEJKLFI_00423 gnl|XXX|IFEJKLFI_00423
>Feature gnl|XXX|IFEJKLFI_3 gnl|XXX|IFEJKLFI_3
>Feature gnl|XXX|IFEJKLFI_4 gnl|XXX|IFEJKLFI_4
>Feature gnl|XXX|IFEJKLFI_5 gnl|XXX|IFEJKLFI_5
I want to remove the rows that do not contain locus_tag, and also remove the >Feature rows that have nothing under them.
So I want the table to look like this:
>Feature gnl|XXX|IFEJKLFI_1 gnl|XXX|IFEJKLFI_1
locus_tag IFEJKLFI_00001 IFEJKLFI_00001
locus_tag IFEJKLFI_00001 IFEJKLFI_00001
locus_tag IFEJKLFI_00002 IFEJKLFI_00002
locus_tag IFEJKLFI_00002 IFEJKLFI_00002
locus_tag IFEJKLFI_00419 IFEJKLFI_00419
locus_tag IFEJKLFI_00419 IFEJKLFI_00419
>Feature gnl|XXX|IFEJKLFI_2 gnl|XXX|IFEJKLFI_2
locus_tag IFEJKLFI_00423 IFEJKLFI_00423
locus_tag IFEJKLFI_00423 IFEJKLFI_00423
Could you please help me with that using awk or any other method?
Thanks!!
Hi, Thanks for the quick reply!
This command removed the protein_id rows, but the other rows that do not contain locus_tag were kept as they are.
To elaborate more, the output was like that
I need to remove the other >Feature lines that do not contain locus_tag.
Ah, ok, that's a bit trickier. Without a file to test with, it's a bit hard to write reliable code. I'll assume that there is a tab before "locus_tag" for this bit of code, so you may need to tweak it if there is other whitespace instead:
The first perl line moves the "locus_tag" lines onto the same line as the ">Feature" they belong to. Then we grab only the lines that still have "locus", so that ">Feature" lines w/o anything else get dropped. Then I re-use a perl one-liner to put the locus_tag lines back after their ">Feature" lines.
Like I said, without a test file I can't be 100% sure this will work but hopefully that will give you some code to tweak to get what you want.
Note that there must be no spaces after the "\" at the end of each line, or else the backslash won't work properly.
Good luck!
Matt
Hi, Thanks for the reply.
The output looked like this (see below), without the >Feature line, and I need to know which locus_tag belongs to which Feature.
Also, I can send you part of the file, if that is possible.
ok, try #3:
Try that. Double check that I have the number of spaces right in the last perl -pe call.
Matt
It worked.
Thanks a bundle!!!!!!!
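For anyone landing here later: the commands from the replies above did not survive in this copy of the thread, so the following is not Matt's code. It is a minimal awk sketch of the same idea (keep locus_tag rows, and keep a >Feature line only if at least one locus_tag row follows it). It assumes the first whitespace-separated field of each row is either ">Feature" or the row type, as in the sample table; the function name keep_locus_tags and the file names in the usage line are made up for illustration.

```shell
# Hypothetical helper: filter a feature table given as file arguments or on stdin.
keep_locus_tags() {
  awk '
    # Remember the most recent >Feature line, but do not print it yet.
    $1 == ">Feature" { header = $0; next }

    # On a locus_tag row, print the pending header once, then the row itself.
    $1 == "locus_tag" {
      if (header != "") { print header; header = "" }
      print
    }
    # Every other row (e.g. protein_id) matches no rule and is dropped.
  ' "$@"
}
```

Usage would be something like `keep_locus_tags features.tbl > filtered.tbl`. Because the header is held back until a locus_tag row shows up, a >Feature line that is followed directly by another >Feature line (or by the end of the file) is simply overwritten and never printed. Comparing `$1` rather than matching on a leading tab means it should not matter whether the rows are indented with tabs or spaces.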