Phylogeny based on multiple sequences
8.7 years ago
Mehmet ▴ 820

Dear All:

I have 113 ortholog groups across 12 species, and I want to build a phylogenomic tree based on these orthologs. I was wondering how to do this. After aligning each ortholog group separately, should I concatenate the alignments and then infer a single tree, or infer a tree for each ortholog group separately and then combine the per-group trees?

Thank you.

genome gene alignment next-gen • 2.5k views
8.7 years ago
Naren ▴ 1000

You can either align each ortholog group first and merge the alignments into one, or concatenate all representative sequences and align them as one sequence per ortholog group. MUSCLE is my choice for alignment as well as tree generation.


A really fancy way would be to estimate the best substitution model for each alignment after gap removal but before concatenation, and then give that information to RAxML along with the superalignment.
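
As a rough sketch of what handing that information to RAxML could look like: RAxML takes a partition file (passed with `-q`) in which each line gives a model, a partition name, and a column range in the superalignment. The snippet below builds such a file from per-gene alignment lengths. The directory, file names, toy sequences, and the `models.tsv` layout are all made up for illustration, and the model names are placeholders for whatever your model-selection step reports.

```shell
# Toy setup: two aligned genes (hypothetical names) in a scratch directory.
d=partition_demo
mkdir -p "$d"
printf '>sp1\nMKT-A\n>sp2\nMKTLA\n' > "$d/gene1.aln.fa"
printf '>sp1\nGG-\n>sp2\nGGC\n' > "$d/gene2.aln.fa"

# models.tsv: gene name <TAB> model, listed in concatenation order.
# The models are placeholders, not real model-selection results.
printf 'gene1\tWAG\ngene2\tJTT\n' > "$d/models.tsv"

start=1
: > "$d/partitions.txt"
while IFS="$(printf '\t')" read -r gene model; do
    # Alignment length = length of the first record (all rows are equal
    # in a valid alignment); works for wrapped FASTA too.
    len=$(awk '/^>/ { if (n++) exit; next } { l += length($0) } END { print l }' \
        "$d/$gene.aln.fa")
    end=$((start + len - 1))
    printf '%s, %s = %d-%d\n' "$model" "$gene" "$start" "$end" >> "$d/partitions.txt"
    start=$((end + 1))
done < "$d/models.tsv"

cat "$d/partitions.txt"
```

The column ranges only make sense if the genes are concatenated in the same order as they are listed in `models.tsv`.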


Hello:

I aligned each ortholog group one by one and merged them using the "cat" command, but I received an error during trimming. The error says the sequences are not the same length. I want to use the alignments in NEXUS format for MrBayes.
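
That error is what one would expect after cat: every species now appears once per gene, and each copy has that gene's alignment length. A small sketch with awk that lists the length of every record makes this visible. The directory, file names, and toy sequences are made up for illustration.

```shell
# Toy setup: two aligned genes of different lengths (hypothetical names).
d=length_check_demo
mkdir -p "$d"
printf '>sp1\nACGT-A\n>sp2\nACGTTA\n' > "$d/gene1.aln.fa"
printf '>sp1\nGG-\n>sp2\nGGC\n' > "$d/gene2.aln.fa"

# What cat does: stacks the files, so sp1 and sp2 each appear twice.
cat "$d/gene1.aln.fa" "$d/gene2.aln.fa" > "$d/stacked.fa"

# Print "header<TAB>sequence length" for every record (handles wrapped FASTA).
seq_lengths() {
    awk '/^>/ { if (name != "") print name "\t" len
                name = substr($0, 2); len = 0; next }
         { len += length($0) }
         END { if (name != "") print name "\t" len }' "$1"
}

seq_lengths "$d/stacked.fa" > "$d/lengths.tsv"
cat "$d/lengths.tsv"
```

In a correctly merged alignment each species would be listed once, and all lengths would be identical.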


cat will merge them vertically (one file stacked after another), but you need to merge them horizontally, i.e. join each species' aligned sequences end to end.
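
A minimal sketch of a horizontal merge with awk, assuming every alignment is in FASTA format and uses identical species names as headers (the directory, file names, and toy sequences are made up). Sequences are matched by header, so the species do not need to be in the same order in every file.

```shell
# Toy setup: two aligned genes; note the different species order in gene2.
d=hconcat_demo
mkdir -p "$d"
printf '>sp1\nACGT-A\n>sp2\nACGTTA\n' > "$d/gene1.aln.fa"
printf '>sp2\nGGC\n>sp1\nGG-\n' > "$d/gene2.aln.fa"

awk '
    /^>/ {
        name = substr($0, 2)
        if (!seen[name]++) order[++n] = name   # remember first-seen order
        next
    }
    { seq[name] = seq[name] $0 }               # append across lines and files
    END {
        for (i = 1; i <= n; i++)
            print ">" order[i] "\n" seq[order[i]]
    }
' "$d/gene1.aln.fa" "$d/gene2.aln.fa" > "$d/supermatrix.fa"

cat "$d/supermatrix.fa"
```

This sketch does no padding: if a species is missing from one gene, its row comes out shorter than the others and would need to be filled with gaps before the file is usable as an alignment.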


I don't know what format your alignments are in, but it's almost certain that you can't merge them with cat.


Could you please tell us how to merge alignment files?


I used a script (FASconCAT), which is available at https://www.zfmk.de/en/research/research-centres-and-groups/fasconcat


Instead of cat, the paste command can concatenate the sequences side by side. But you have to make sure the FASTA sequences are in one-line format (each sequence on a single line) before concatenating them with paste. After pasting, remove the delimiter and you will get your concatenated sequences. Pick the delimiter carefully: choose a character that does not occur in your headers or in your sequences. Otherwise, removing the delimiter later will be a disaster. E.g.:

paste -d' ' seq1 seq2 | sed 's/ //g' >> concatenate.fa

The command uses a space as the delimiter, and sed removes the spaces after pasting. There are probably better ways to concatenate sequences on the command line. This is certainly not the best one, but it works.
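
A sketch of the same paste idea with two additions. First, the FASTA files are converted to one-line-per-sequence format. Second, the pasted header lines are cut back to a single header, because plain paste joins the headers as well (giving e.g. `>sp1>sp1`). It assumes both files list the same species in the same order and that headers contain no spaces. The directory, file names, and toy sequences are placeholders.

```shell
# Toy setup: seq1.fa has wrapped sequence lines, seq2.fa does not.
d=paste_demo
mkdir -p "$d"
printf '>sp1\nACG\nT-A\n>sp2\nACG\nTTA\n' > "$d/seq1.fa"
printf '>sp1\nGG-\n>sp2\nGGC\n' > "$d/seq2.fa"

# Join wrapped sequence lines so each record is: header line, sequence line.
linearize() {
    awk '/^>/ { if (NR > 1) print ""; print; next }
         { printf "%s", $0 }
         END { print "" }' "$1"
}

linearize "$d/seq1.fa" > "$d/seq1.oneline.fa"
linearize "$d/seq2.fa" > "$d/seq2.oneline.fa"

# Paste side by side with a space delimiter, then:
#   on header lines, drop everything from the first space (keeps one header);
#   on sequence lines, delete the delimiter.
paste -d' ' "$d/seq1.oneline.fa" "$d/seq2.oneline.fa" |
    sed -e '/^>/s/ .*//' -e 's/ //g' > "$d/concatenate.fa"

cat "$d/concatenate.fa"
```

Writing with `>` rather than `>>` avoids appending to an old output file if the command is run twice.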

