I have 3 files : File1
WP_082277722.10 gi|37527581|ref|NP_930925.1| 59.420 138 46 3 141 275 2895 3025 1.34e-41 157 49
WP_082277722.10 gi|91779924|ref|YP_555132.1| 49.254 134 56 3 151 276 2074 2203 6.02e-27 115 46
WP_082277722.10 gi|212640077|ref|YP_002316597.1| 33.981 103 57 5 180 275 403 501 0.008 42.7 35
WP_082277722.10 gi|384147149|ref|YP_005529965.1| 37.681 69 35 3 210 276 2019 2081 0.032 41.2 24
WP_082277722.10 gi|212640328|ref|YP_002316848.1| 32.323 99 56 5 184 275 469 563 0.052 40.0 33
WP_082277722.10 gi|338738658|ref|YP_004675620.1| 26.761 71 43 1 205 275 4 65 0.99 35.0 26
File 2
158333741,Acaryochloris_marina_MBIC11017_uid58167,158333741,432,1,432,COG0001,0,
158339504,Acaryochloris_marina_MBIC11017_uid58167,158339504,491,1,491,COG0001,0,
379012832,Acetobacterium_woodii_DSM_1030_uid88073,379012832,430,1,430,COG0001,0,
302391336,Acetohalobium_arabaticum_DSM_5501_uid51423,302391336,441,1,441,COG0001,0,
311103820,Achromobacter_xylosoxidans_A8_uid59899,311103820,425,1,425,COG0001,0,
332795879,Acidianus_hospitalis_W1_uid66875,332795879,369,1,369,COG0001,0,
332796307,Acidianus_hospitalis_W1_uid66875,332796307,416,1,416,COG0001,0,
37527581,Acetohalobium_arabaticum_DSM_5501_uid51423,302391336,441,1,441,COG0001,0
,
File3
COG0001 H Glutamate-1-semialdehyde aminotransferase
COG0002 E N-acetyl-gamma-glutamylphosphate reductase
COG0003 P Anion-transporting ATPase, ArsA/GET3 family
COG0004 P Ammonia channel protein AmtB
COG0005 F Purine nucleoside phosphorylase
COG0006 E Xaa-Pro aminopeptidase
COG0007 H Uroporphyrinogen-III methylase (siroheme synthase)
COG0008 J Glutamyl- or glutaminyl-tRNA synthetase
COG0009 J tRNA A37 threonylcarbamoyladenosine synthetase subunit TsaC/SUA5/YrdC
COG0010 E Arginase family enzyme
COG0011 S Uncharacterized conserved protein YqgV, UPF0045/DUF77 family
COG0012 J Ribosome-binding ATPase YchF, GTP1/OBG family
Expected outcome :
WP_082277722.10 gi|37527581|ref|NP_930925.1| 37527581,Acetohalobium_arabaticum_DSM_5501_uid51423,302391336,441,1,441,COG0001,0 COG0001 H Glutamate-1-semialdehyde aminotransferase
what have you tried ?
i can do this in excel ..... but i have 2500 files for which i need to repeat the procedure .
Stealing that.... :P
Write a macro? You have to get into a programming mindset, end of the day it doesn't matter what tool you use, only matters how you use that tool and how well you know it. Excel is deceptively complex. You will encounter problems that you cannot explain, get to the root of or even reproduce at times. Excel does not preserve state or document logic. When it comes to problems such as yours, it's just a glorified calculator.
Use R or python, something that helps you learn the programming mindset and document your actions, as you will need to do this again in 6 months' time and programming means you'll have to spend only a fraction of this time then.
Excel is a hammer. The problem before you is a screw. Yes, it's a big hammer, and if you swing hard enough, you can make Excel do what you want, but it would be smarter to find a screwdriver. R (or Rstudio might be a bit easier to work with) can do this.
This question is presented without a proper explanation in body of the post. Please edit the original question and add this information there.
Instead of just posting parts of three files you should spell out which parts of the three are supposed to contain a
key
that can be used to relate the records. I tried to match a couple three numbers from file 1 but could not do it visually.WP_082277722.10 gi|384147149|ref|YP_005529965.1|
Is thegi
number384147149
supposed to be infile 2
? Then one needs to match theCOG
id from that line infile 2
with the list infile 3
?The examples posted here seem incomplete, which OP should have mentioned. While OP does give an example of desired output, they should also point out the logic.
I suspect this is part of their learning the mindset though, so once they do learn they will no longer need any help with this sort of problem :-)
Don't forget to follow up on your previous threads, multiple posts are without accepted answer:
If an answer was helpful, you should upvote it; if the answer resolved your question, you should mark it as accepted. You can accept more than one if they work.