ProteinOrtho output help required
7.0 years ago
majeedaasim ▴ 60

I installed and ran proteinortho successfully. Next, I used grab_proteins.pl to extract the sequences of the identified orthologous groups. I get 32 FASTA files on the test data provided with the software. Does this mean that the software detected 32 orthologous groups? I also need to find the sequences common to all species in order to construct a phylogenetic tree. Which tool can I use to find the orthologs shared among the different species?

orthology proteinortho • 3.1k views
7.0 years ago
Jon ▴ 360

You need to parse the tab-delimited output file with the extension .proteinortho, or .poff if you used synteny. Per the manual, the first column is the number of species in which the orthologous group is found. So let's say you had 4 species (i.e. you ran with 4 proteome fasta files); you could then filter the tab-delimited .proteinortho or .poff output as follows:

#this will show orthologous groups containing proteins from all 4 species
#(comparing the whole first column avoids also matching counts such as 40)
awk -F'\t' '$1 == 4' output.proteinortho

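However, since it seems like you may be looking for single-copy orthologs, you could filter those like:

#this will get orthologous groups containing all 4 species and exactly 4 genes (second column), i.e. no duplications in any genome
#(awk is used because plain grep does not treat \t as a tab)
awk -F'\t' '$1 == 4 && $2 == 4' output.proteinortho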
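
If the number of species changes between runs, the same filter could be wrapped in a small shell function. This is only a sketch: the function name single_copy_groups is made up for illustration, and it assumes the layout described above (tab-delimited, species count in column 1, gene count in column 2) plus a header line that starts with '#'.

```shell
# Hypothetical helper (not part of proteinortho): print the single-copy
# orthologous groups that are present in all species.
# Usage: single_copy_groups <number_of_species> <file.proteinortho>
single_copy_groups() {
    awk -F'\t' -v n="$1" '
        /^#/ { next }                 # assumption: the header line starts with "#"
        $1 == n && $2 == n { print }  # n species and n genes: one gene per species
    ' "$2"
}
```

For the 4-species case this would be called as `single_copy_groups 4 output.proteinortho > single_copy.tsv`, where single_copy.tsv is just an example file name. The header line is dropped, so the output contains only the group rows.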

I have the same problem: I cannot extract the orthologs that contain all species with no duplications in each genome to build the phylogeny. Do you have a tool that can extract the orthologous sequences?
