Hi there, I have a list of gene names in a CSV file (OrthoFinder output). I want to replace these names with their respective GO or KEGG terms. Any suggestions? I tried GeneSCF but got the following error:
[zillur@genomics Results_Nov15]$ ./../../../genescf/geneSCF-master-source-v1.1-p2/geneSCF -m=update -i=Orthogroups.csv -t=sym -db=KEGG -o=genescf_out -p=yes -org=pfa,pyo,pcb,pbe,pkn,pvx,pcy,cpv,cho,tgo,bbo,beq,tan,tpv
Error:background gene set information missing --background
Since you have selected 'update' mode. It will take a while to prepare new updated database
Connecting remote RUD..
processing started....Sun Feb 26 08:54:42 AST 2017
Retreiving 0 KEGG pathways for pfa,pyo,pcb,pbe,pkn,pvx,pcy,cpv,cho,tgo,bbo,beq,tan,tpv
Do not panic. The processing is going on...
Database retreived..You are now ready to use geneSCF with organism pfa,pyo,pcb,pbe,pkn,pvx,pcy,cpv,cho,tgo,bbo,beq,tan,tpv from --database KEGG
Done....Sun Feb 26 08:54:43 AST 2017
=>processing in update started....Sun Feb 26 08:54:43 AST 2017
=> Finished retriving database...
=> Calculating statistics...
find: ‘pfa,pyo,pcb,pbe,pkn,pvx,pcy,cpv,cho,tgo,bbo,beq,tan,tpv/class/lib/db/yes/kegg_database.txt’: No such file or directory
Note:Only KEGG and Geneontology supports multiple organisms (GeneSCF-xx/org_codes_help). If you choose REACTOME/NCG database please specify organism as 'Hs'. Currently REACTOME and NCG in GeneSCF only supports Human (Hs).
KEGG last updated
Example input types
gid | sym
=> Retreving gene list for yes from KEGG
sh: pfa,pyo,pcb,pbe,pkn,pvx,pcy,cpv,cho,tgo,bbo,beq,tan,tpv/mapping/DB/Orthogroups.csv_gene_list.txt: No such file or directory
curl: (23) Failed writing body (2717 != 2896)
=> Mapping user list
Can't open perl script "pfa,pyo,pcb,pbe,pkn,pvx,pcy,cpv,cho,tgo,bbo,beq,tan,tpv/class/scripts/mappingIDS.pl": No such file or directory
sh: pfa,pyo,pcb,pbe,pkn,pvx,pcy,cpv,cho,tgo,bbo,beq,tan,tpv/mapping/Orthogroups.csv_input_list.txt: No such file or directory
Note: There were 0 genes mapped from 15068 user provided unique genes (0 %)
Please cross-check your gene identifier.Sun Feb 26 08:54:45 AST 2017
finished processing
I also tried eggNOG-mapper on my orthogroup FASTA files (~50,000 files), but it takes forever.
Here is my sample input file:
CryptoDB-29_CparvumIowaII_AnnotatedProteins PiroplasmaDB-28_BmicrotiRI_AnnotatedProteins PiroplasmaDB-29_TparvaMuguga_AnnotatedProteins PlasmoDB-28_PbergheiANKA_A$
OG0000000 PBANKA_0000600, PBANKA_0000701, PBANKA_0000801, PBANKA_0001001, PBANKA_0001101, PBANKA_0001201, PBANKA_0001301, PBANKA_0001401, PBANKA_000$
OG0000001 PmUG01_00010100.1-p1, PmUG01_00010200.1-p1, PmUG01_00010400.1-p1, PmUG01_0$
OG0000002 PF3D7_0100200, PF3D7_0100400, PF3D7_0100600, PF3D7_0100800, PF3D7_0100900, PF3D7_0101000, PF3D7_0101600, PF3D7_010$
OG0000003 PBANKA_0000901, PBANKA_0001200, PBANKA_0001601, PBANKA_0007501, PBANKA_0008101, PBANKA_0100100, PBANKA_0112661.1, PBANKA_0112701, PBANKA_0$
OG0000004 TP03_0403-t26_1-p1 PCYB_001410, PCYB_001660, PCYB_005410, PCYB_006920, PCYB_101490 PKNH_0000100, PKNH_0000200, PKNH_0$
OG0000005 PCYB_001700, PCYB_002110, PCYB_002240 PF3D7_0113100, PF3D7_0115000, PF3D7_0402200, PF3D7_0424400, PF3D7_0800700, PF3D7_0$
OG0000006 PCYB_001280, PCYB_001550, PCYB_001690, PCYB_002300, PCYB_002310, PCYB_002420, PCYB_002630, PCYB_002840, PCYB_003590, PCYB_$
OG0000007 PCYB_001020, PCYB_001140, PCYB_001270, PCYB_001290, PCYB_001300, PCYB_001310, PCYB_001370, PCYB_001400, PCYB
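To make clear what I am after, here is a rough Python sketch of the transformation I have in mind. It assumes the file is tab-separated with one column per species and comma-separated gene IDs in each cell, which is how I understand the OrthoFinder table to be laid out. The names flatten_orthogroups, annotate and gene_to_terms are my own placeholders, and the gene-to-term mapping is exactly the part I do not know how to obtain.

```python
import csv

def flatten_orthogroups(handle):
    """Yield (orthogroup, species_column, gene_id) for every gene in the table.

    Assumes a tab-separated table whose first column is the orthogroup ID
    and whose remaining columns hold comma-separated gene IDs per species.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    species = header[1:]  # first header cell belongs to the orthogroup column
    for row in reader:
        if not row:
            continue  # skip blank lines
        orthogroup, cells = row[0], row[1:]
        for column, cell in zip(species, cells):
            for gene in cell.split(","):
                gene = gene.strip()
                if gene:
                    yield orthogroup, column, gene

def annotate(rows, gene_to_terms):
    """Collect the set of terms seen in each orthogroup.

    gene_to_terms is a hypothetical dict {gene_id: [term, ...]} that would
    have to be built from a real GO or KEGG annotation source.
    """
    terms = {}
    for orthogroup, _species, gene in rows:
        terms.setdefault(orthogroup, set()).update(gene_to_terms.get(gene, ()))
    return terms
```

With something like this, `annotate(flatten_orthogroups(open("Orthogroups.csv")), gene_to_terms)` would give one set of terms per orthogroup, and the flattened gene IDs could also be written one per line if a tool expects a plain gene list.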
Any help with this would be appreciated.
Best regards,
Zillur