I have 5000 files (named like tni00001.keg, eco00001.keg, etc.). In many of them the 2nd column contains an underscore (e.g. W909_00110); the number after the underscore (here 00110) represents the enzyme ID. Some of the 5000 files, however, do not contain this underscore in their IDs. I want to extract the names of the files (like abc00001.keg) that do not contain an underscore (_) in the 2nd column.
Example: a .keg file looks like this inside:
D W909_00110 glk; glucokinase K00845 glk; glucokinase [EC:2.7.1.2]
D W909_17905 pgi; glucose-6-phosphate isomerase K01810 GPI; glucose-6-phosphate isomerase [EC:5.3.1.9]
D W909_19315 6-phosphofructokinase K00850 pfkA; 6-phosphofructokinase 1 [EC:2.7.1.11]
Is the absence of _ consistent for all lines in these files, or may only some records lack the _number part?
There are 5000 files in total, and some of them do not have this underscore in their IDs. I want to know how many files lack it and to extract the names of all those files.
The clarification I was asking for is: do all records in a file of interest lack a _, or only some?
Crossposted: https://stackoverflow.com/q/57799730/680068
Are these fields all tab separated?
May I ask what the ultimate goal is, i.e. why are you specifically interested in files where the underscore is missing?
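Since the questions above (tab separation, all records vs. only some) are still open, here is a minimal Python sketch that covers both readings rather than a definitive answer. It assumes that gene lines start with "D" as in the example, that fields are separated by any whitespace (tabs or spaces), and that the files sit in one directory. The function names (`second_column_ids`, `classify`, `files_without_underscore`) are made up for illustration. Each reported file is labelled "all" if none of its IDs contains an underscore, or "some" if only part of them lack it.

```python
from pathlib import Path


def second_column_ids(path):
    """Yield the 2nd whitespace-separated field of every gene line.

    Assumption: gene lines are the ones whose first field is 'D',
    as in the example shown in the question.
    """
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            fields = line.split()
            if len(fields) >= 2 and fields[0] == "D":
                yield fields[1]


def classify(path):
    """Return 'all', 'some' or 'none': how many IDs in the file lack '_'."""
    ids = list(second_column_ids(path))
    missing = sum("_" not in gene_id for gene_id in ids)
    # Assumption: a file without any 'D' lines is not reported.
    if not ids or missing == 0:
        return "none"
    return "all" if missing == len(ids) else "some"


def files_without_underscore(directory=".", pattern="*.keg"):
    """Map each matching file name to 'all' or 'some' if any ID lacks '_'."""
    result = {}
    for path in sorted(Path(directory).glob(pattern)):
        status = classify(path)
        if status != "none":
            result[path.name] = status
    return result


if __name__ == "__main__":
    hits = files_without_underscore()
    for name, status in hits.items():
        print(name, status, sep="\t")
    print(f"{len(hits)} file(s) with at least one ID lacking '_'")
```

Run from the directory that holds the .keg files, this prints one file name per line with its label, followed by the count. If only files where every ID lacks the underscore are wanted, keep the lines labelled "all".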