After spending a lot of time on the 13 steps of OrthoMCL, I finally managed to run it successfully.
The difficulty for me was not OrthoMCL itself but working out the usage of each command, because the documentation is not clear enough.
Here are the commands and parameters I used, following the OrthoMCL User's Guide: http://orthomcl.org/common/downloads/software/v2.0/UserGuide.txt.
(1) I am using MySQL.
You first need to create a database named orthomcl.
This is my orthomcl.config.template file:
# this config assumes a mysql database named 'orthomcl'. adjust according
#to your situation.
dbVendor=mysql
dbConnectString=dbi:mysql:orthomcl
dbLogin=your_login
dbPassword=your_password
similarSequencesTable=SimilarSequencesorthomcl
orthologTable=Orthologorthomcl
inParalogTable=InParalogorthomcl
coOrthologTable=CoOrthologorthomcl
interTaxonMatchView=InterTaxonMatchorthomcl
percentMatchCutoff=50
evalueExponentCutoff=-5
oracleIndexTblSpc=NONE
(2) and (3) are obvious.
(4) On the command line, change to the root of the OrthoMCL directory and run this command; it will create the tables in your orthomcl database:
orthomclSoftware-v2.0.9$ bin/orthomclInstallSchema my_orthomcl_dir/orthomcl.config.template my_orthomcl_dir/install_schema.log
(5) As the OrthoMCL guide says, change to the compliantFasta directory:
orthomclSoftware-v2.0.9$ cd my_orthomcl_dir/compliantFasta/
and then run orthomclAdjustFasta on each proteome, starting with cow.fasta:
orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ ../../bin/orthomclAdjustFasta cow /home/....../cow.fasta 1
orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ ../../bin/orthomclAdjustFasta hsa /home/....../human.fasta 1
orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ ../../bin/orthomclAdjustFasta mus /home/....../mouse.fasta 1
These commands will create 3 files in the compliantFasta directory, named cow.fasta, hsa.fasta and mus.fasta.
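The last argument of orthomclAdjustFasta (1 in the commands above) is the number of the header field that holds the unique protein ID. As far as I understand it, the header is split on spaces and '|', so if that field is not unique the step stops with a duplicate-ID error. Here is a minimal sketch for checking this beforehand; the helper name dup_ids and the demo headers are made up for illustration, and the field-splitting rule is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of OrthoMCL): print every value of header
# field N that occurs more than once in a FASTA file.
# Assumption: header fields are separated by whitespace or '|'.
dup_ids() {
    grep '^>' "$1" \
        | sed 's/^>//' \
        | awk -F'[ \t|]+' -v f="$2" '{ print $f }' \
        | sort | uniq -d
}

# Demo on a tiny made-up file with Trinity-like headers.
demo=$(mktemp)
printf '>TR1|c5_g1_i1 len=3\nMKV\n>TR3|c0_g1_i1 len=3\nMAA\n>TR3|c0_g2_i1 len=3\nMLL\n' > "$demo"
dups_field1=$(dup_ids "$demo" 1)
dups_field2=$(dup_ids "$demo" 2)
echo "field 1 duplicates: ${dups_field1:-none}"
echo "field 2 duplicates: ${dups_field2:-none}"
rm -f "$demo"
```

If the function prints nothing for a given field, that field should be safe to pass as the last argument.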
(6)
orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ ../../bin/orthomclFilterFasta . 10 20
(7) Step 7: All-v-all BLAST
You first need to create a local BLAST database from goodProteins.fasta:
orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ makeblastdb -in goodProteins.fasta -dbtype prot -out my_prot_blast_db
and
orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ blastall -p blastp -F 'm S' -v 100000 -b 100000 -z 55 -e 1e-5 -d my_prot_blast_db -i goodProteins.fasta -o out.tab -m 8
(8) In my case, I created a directory named 'blast' inside the compliantFasta directory and copied cow.fasta, hsa.fasta and mus.fasta into it:
orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ mkdir blast
and
orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ cp cow.fasta blast
orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ cp hsa.fasta blast
orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ cp mus.fasta blast
and
orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ ../../bin/orthomclBlastParser out.tab blast/ >> similarSequences.txt
(9)
orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ ../../bin/orthomclLoadBlast ../orthomcl.config.template similarSequences.txt
(10)
orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ ../../bin/orthomclPairs ../orthomcl.config.template ../orthomcl_pairs.log cleanup=no
(11)
orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ ../../bin/orthomclDumpPairsFiles ../orthomcl.config.template
(12)
orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ mcl mclInput --abc -I 1.5 -o mclOutput
(13)
orthomclSoftware-v2.0.9/my_orthomcl_dir/compliantFasta$ ../../bin/orthomclMclToGroups cluster 1000 < mclOutput > groups.txt
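To check which genomes actually made it into groups.txt, you can count members per taxon prefix. This is only a sketch: it assumes each line looks like "cluster1000: cow|A1 hsa|B1 ...", and the helper name taxon_counts and the demo IDs are made up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count group members per taxon prefix in groups.txt.
# Assumption: each line is "<group_id>: taxon|protein taxon|protein ...".
taxon_counts() {
    awk '{
             for (i = 2; i <= NF; i++) {
                 split($i, parts, "|")      # parts[1] is the taxon code
                 count[parts[1]]++
             }
         }
         END { for (t in count) print t "\t" count[t] }' "$1" | sort
}

# Demo on a tiny made-up groups file.
demo=$(mktemp)
printf 'cluster1000: cow|A1 hsa|B1 mus|C1\ncluster1001: cow|A2 cow|A3 hsa|B2\n' > "$demo"
counts=$(taxon_counts "$demo")
echo "$counts"
rm -f "$demo"
```

A taxon that is missing from this list never reached the final groups, which usually points back to an earlier step (the BLAST input or the loaded similarSequences.txt).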
This is what I did to run it successfully.
I hope this helps you.
It is for those who, like me, love copy and paste.
If you like it, please vote, like or comment.
Hi Durelle. I would like to run OrthoMCL too, but I am quite confused by the analysis itself. I am still new to this, so a few things are unclear to me. Below are my questions and the error I ran into when I tried running OrthoMCL.
1.) DATA. I'm not really sure about the input data in the pre-processing stage (orthomclAdjustFasta). I've read in an article that the input file is a protein sequence file in FASTA format. What I have is my de novo assembled transcriptome. Can I use that in OrthoMCL? I would like to find the orthologs. Or do I need to run TransDecoder, convert the .pep file into FASTA format, and use that as my input file?
2.) I tried using my de novo transcriptome in orthomclAdjustFasta but I hit an error while running it. If I can actually use my transcriptomic data, it would be nice if you could shed some light on this error.
My assembled transcriptome contains >300,000 transcripts. A few of the first transcripts are:
When I ran orthomclAdjustFasta I got an error stating that TR3 is repeated (or something like that; please see the example above), and the run terminated. I am running over ssh and that's what I got in my .err output.
If I can use my data, what should I do to solve this error? I am thinking of renaming those repeated transcripts, but I am afraid that would jeopardize the result. Would that interfere with the BLAST run? What tool would you recommend for renaming those repeated transcripts quickly? Renaming them individually is very tedious.
Hello stephravelo7!
I don't know if you have already fixed your problem. You must filter your .pep (protein) file and keep only one isoform per gene (e.g. the longest), in FASTA format. The file extension doesn't matter as long as the file is correctly formatted. That is:
That's what you should use as input for orthomclAdjustFasta.
Hope it helps :-)
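Keeping only the longest isoform per gene can be sketched with awk. This is not the exact procedure from the answer above, just one possible way, and it assumes Trinity/TransDecoder-like headers where the gene ID is the first word of the header with the trailing _i<number> isoform suffix removed. The function name longest_isoforms and the demo sequences are made up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep the longest sequence per gene from a protein FASTA.
# Assumption: gene ID = first word of the header minus the "_i<number>" suffix.
longest_isoforms() {
    awk '
        function keep_best() {
            if (hdr == "") return
            gene = hdr
            sub(/^>/, "", gene)
            sub(/[ \t].*$/, "", gene)       # keep only the first word
            sub(/_i[0-9]+.*$/, "", gene)    # drop the isoform suffix
            if (!(gene in best_len) || length(seq) > best_len[gene]) {
                if (!(gene in best_len)) order[++n] = gene
                best_len[gene] = length(seq)
                best_hdr[gene] = hdr
                best_seq[gene] = seq
            }
        }
        /^>/ { keep_best(); hdr = $0; seq = ""; next }
        { seq = seq $0 }                    # join wrapped sequence lines
        END {
            keep_best()
            for (i = 1; i <= n; i++) {
                print best_hdr[order[i]]
                print best_seq[order[i]]
            }
        }
    ' "$1"
}

# Demo: two isoforms of one gene plus a second gene (made-up sequences).
demo=$(mktemp)
printf '>TR3|c0_g1_i1 len=3\nMKV\n>TR3|c0_g1_i2 len=6\nMKVLAA\n>TR3|c0_g2_i1 len=4\nMAAL\n' > "$demo"
filtered=$(longest_isoforms "$demo")
echo "$filtered"
rm -f "$demo"
```

Sequences are written on a single line each. The headers are kept unchanged, so you still have to pick (or create) a header field that is unique before running orthomclAdjustFasta.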
Hi,
I am following the instructions and my analysis ran well until step 9.
Hi Santiago,
Thanks for the answer. However, I ran into this error in orthomclDumpPairsFiles:
Do you have any suggestion on how to solve this? Thanks.
This tutorial was really useful for me! Thanks!
Great. Thanks for posting your experience
Hi, thanks for posting the streamlined help.
I was able to run everything up to step 8, and then, after submitting my command for step 9, I got the following error:
I am not very familiar with SQL; can someone please help me figure out what is wrong?
My command line was :
And my config scheme looks like:
Please use the formatting bar (especially the code option) to present your post better. I've done it for you this time.
Can you post a head of your similarSequences.txt? It seems that you are missing some fields in that file. Perhaps also add the BLAST command you executed in one of the previous steps.
Thanks a lot! I could spot the error by looking at the similarSequences.txt file: I had a wrong header (several lines of "processing XX genome" just before the columns start). That was because I used nohup to redirect the output. Now the head of my similarSequences.txt looks like:
Everything worked fine now but at the end only 2 of the 6 genomes I added are in the final tables (Only Atra and Prap). My config file looks like this now, could that be the issue?
Many thanks again!
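For anyone who ends up with the same stray "processing XX genome" lines in similarSequences.txt, here is a minimal sketch for stripping them. It assumes that a valid row has exactly 8 tab-separated columns (query ID, subject ID, query taxon, subject taxon, e-value mantissa, e-value exponent, percent identity, percent match); the function name clean_similar and the demo row are made up.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep only rows that have exactly 8 tab-separated fields.
# Assumption: orthomclLoadBlast expects 8 columns per row.
clean_similar() {
    awk -F'\t' 'NF == 8' "$1"
}

# Demo: one valid row surrounded by log lines that do not belong in the file.
demo=$(mktemp)
printf 'processing cow genome\ncow|A1\thsa|B1\tcow\thsa\t1\t-50\t90.5\t100\nnohup: ignoring input\n' > "$demo"
cleaned=$(clean_similar "$demo")
kept=$(printf '%s\n' "$cleaned" | wc -l | tr -d ' ')
echo "kept $kept row(s)"
rm -f "$demo"
```

On a real file you would redirect the output to a new file and load that one in step 9 instead.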
I am also stuck at step 8 with a MySQL problem: DBD::mysql::st execute failed: Loading local data is disabled; this must be enabled on both the client and server sides at ../orthomclLoadBlast line 39, <F> line 9.
This sounds like a configuration issue on your MySQL server. Most likely you will need to ask your local sysadmin to open this up for you, so that you are able and allowed to load data into the databases.
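The error message says local loading has to be enabled on both sides. A possible fix, which I have not verified with OrthoMCL itself: on the client side, DBD::mysql accepts a mysql_local_infile=1 option in the connect string; on the server side, an administrator can switch on the local_infile variable. The sketch below only rewrites the dbConnectString line of a demo config; the temp files are for illustration.

```shell
#!/usr/bin/env bash
# Sketch: enable LOAD DATA LOCAL on the client side by adding
# mysql_local_infile=1 to the DBI connect string (a DBD::mysql option).
# Assumption: OrthoMCL passes dbConnectString to DBI unchanged.
old_cfg=$(mktemp)
new_cfg=$(mktemp)
printf 'dbVendor=mysql\ndbConnectString=dbi:mysql:orthomcl\ndbLogin=your_login\n' > "$old_cfg"

sed 's|^dbConnectString=.*|dbConnectString=dbi:mysql:database=orthomcl;mysql_local_infile=1|' \
    "$old_cfg" > "$new_cfg"

connect_line=$(grep '^dbConnectString=' "$new_cfg")
echo "$connect_line"
rm -f "$old_cfg" "$new_cfg"

# Server side (needs admin rights, run inside the mysql client):
#   SET GLOBAL local_infile = 1;
```

If you do not have admin rights on the server, the second part is what you need to ask your sysadmin for.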
Could you please let me know if you have fixed this problem? I have the same issue here. Thank you!
Thank you very much! It was very useful and easy to follow