OrthoMCL: orthologs.txt is empty; all output is in inParalogs.txt
9.2 years ago

Hi,

I am trying to use OrthoMCL to compare 10 strains of a bacterial species. I went through the pipeline after installing MySQL, OrthoMCL and MCL locally.

The problem is that all the entries end up in inParalogs.txt, while orthologs.txt is empty.

The input for the pipeline was the amino acid FASTA files (taken from the RAST output for the assembled contigs).

I processed each file through the pipeline individually.

I ran orthomclAdjustFasta, orthomclFilterFasta, makeblastdb, blastall, orthomclBlastParser, orthomclLoadBlast, orthomclPairs, etc. They all ran to completion without errors.

My adjusted FASTA headers look like this:

>sampleID|Protein_id
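A quick way to confirm that every header really follows this `taxon|protein_id` pattern is to count the sequences per taxon prefix. The following is only a sketch: `count_taxa` is a hypothetical helper name, and it assumes the taxon code is everything before the first `|`.

```shell
# Hypothetical helper: count FASTA records per taxon prefix.
# Assumes headers of the form ">taxon|protein_id".
count_taxa() {
  awk -F'|' '
    /^>/ { sub(/^>/, "", $1); n[$1]++ }          # strip ">" and tally the prefix
    END  { for (t in n) print t "\t" n[t] }      # one line per taxon: name, count
  ' "$1" | sort
}

# Example call (file name is a placeholder):
# count_taxa goodProteins.fasta
```

A header without a `|` would show up as its own "taxon" with a count of 1, which makes malformed headers easy to spot. A file that holds all strains should list one line per strain.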

The makeblastdb and blastall commands I used for each sample are:

makeblastdb -in mySample.goodProteins.fasta -dbtype prot -out mySample_blastDB
blastall -p blastp -i mySample.goodProteins.fasta -d mySample_blastDB -o mySample_blast.csv -e 1 -m 8 -a 2 -v 1000 -b 1000
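One thing that may be worth checking on the tabular (`-m 8`) output is whether it contains any hits between different samples. As far as I understand OrthoMCL's definitions, ortholog pairs come from reciprocal best hits between proteins of different taxa, while in-paralog pairs come from hits within one taxon. If that is right, a BLAST of one sample against only itself cannot produce ortholog pairs. The sketch below counts both kinds of hit; `count_hit_types` is a hypothetical helper, and it assumes the IDs in columns 1 and 2 use the `taxon|protein_id` form.

```shell
# Hypothetical helper: count within-taxon vs cross-taxon hits
# in a tab-separated BLAST file (legacy -m 8 / BLAST+ -outfmt 6).
# Assumes query and subject IDs look like "taxon|protein_id".
count_hit_types() {
  awk -F'\t' '
    {
      split($1, q, "|")                 # q[1] = query taxon
      split($2, s, "|")                 # s[1] = subject taxon
      if (q[1] == s[1]) within++; else cross++
    }
    END { printf "within_taxon=%d cross_taxon=%d\n", within, cross }
  ' "$1"
}

# Example call (file name taken from the command above):
# count_hit_types mySample_blast.csv
```

If `cross_taxon` is 0, the input to orthomclBlastParser has no between-strain hits at all, which would be consistent with an empty orthologs.txt.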

Then I ran orthomclBlastParser to produce a similarSequences.txt for each file.

I would appreciate any help in understanding why my orthologs.txt is empty while inParalogs.txt has about 21000 rows of data.

Kind regards,
Brindha

orthologs orthomcl orthomcl-orthogroup • 2.6k views

Typically, you want to include an outgroup in the clustering, so consider how closely related the strains are. Also, I'm not sure that mixing a BLAST+ database with legacy BLAST tools is a good idea, though it may work.


Thanks for your quick response @SES.

Regarding the outgroup, should it be a related species, or can it be a totally random bacterial species?

Also, could you expand on what you mean by "I'm not sure mixing blast+ database and legacy blast tools is a good idea, though it may work"? I was following one of the online tutorials, which uses makeblastdb and blastall. Earlier I tried formatdb, as another tutorial suggested, but it didn't work with blastall, so I used makeblastdb.

Look forward to hearing your suggestions.


The outgroup should be closely related, not random. By mixing the tools I mean this: formatdb and blastall work together (both from legacy BLAST), and makeblastdb and blastp work together (both from BLAST+). If you mix them, you are using programs from different toolkits that are not designed to work together (they may, but it is not advisable).
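For illustration, a matched BLAST+ pair could look roughly like the sketch below. The wrapper name `blastplus_all_vs_all` and its three arguments are hypothetical. The option mapping is my best reading of the legacy flags in the question: `-outfmt 6` for `-m 8`, `-evalue` for `-e`, `-num_threads` for `-a`, and `-max_target_seqs` as an approximate stand-in for `-v`/`-b` in tabular output.

```shell
# Hypothetical wrapper: build a protein database and search it,
# using only BLAST+ programs (makeblastdb + blastp).
blastplus_all_vs_all() {
  fasta=$1   # protein FASTA, e.g. goodProteins.fasta (placeholder)
  db=$2      # database name prefix (placeholder)
  out=$3     # tabular output file (placeholder)

  # Stop if the database step fails, otherwise run the search.
  makeblastdb -in "$fasta" -dbtype prot -out "$db" &&
  blastp -query "$fasta" -db "$db" -out "$out" \
         -evalue 1 -outfmt 6 -num_threads 2 -max_target_seqs 1000
}

# Example call (names are placeholders):
# blastplus_all_vs_all goodProteins.fasta goodProteins_db all_vs_all.tsv
```

The legacy pair would be formatdb followed by blastall; the point is only that both steps come from the same toolkit.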


Yes. I used another strain from the same phylum, but a different family, as the outgroup. The orthologs file is still empty, while all the information is in the inparalogs file, and I am not sure why. (These strains, by the way, are from a published comparative genomics paper, whose authors managed to identify orthologs using OrthoMCL and Synergy2. I don't know why I am not able to replicate that.)

I will try blastp with makeblastdb and see if it makes a difference.

(Separately, I also ran into trouble with Amphora2, which Synergy2 uses, and I have posted another question in this forum about that.)

Thanks much
