I have generated a phylogenetic tree based on the amino acid sequence (28 genes been extracted from 30 bacterial strains and concatenated), which is obtained by AMPHORA tool, the tree topology is proper as I expected. Then, I have extracted the nucleotide sequences of the same genes based on the coordinates mentioned in AMPHORA output amino acid sequences. Most of the coordinates were reverse in order but, I did not make reverse complement of the same. It is because, I strongly rely on alignment tool such as mafft
. mafft
has --adjustdirection
option, which can reverse complement the sequences if it needed. I have opted this option and aligned the nucleotide sequences and generated the tree, but it topology completely vary from the amino acid based tree.
I need to have nucleotide sequences based phylogenetic tree for my further analysis. But due to this issue I could not move further in my analysis. Hence, kindly help me to fix this issue.
Thank you in advance.
Help you fix what issue?
If you believe your data to be correct, then the tree is correct.
DNA sequences and Amino Acid sequences evolve at different rates so there are considerations in evolutionary model etc to be made, but there is no way to 'fix' your data, so I'm not really sure what the question is?
Can you elaborate on how exactly your DNA sequences were retrieved? It sounds to me like there's the possibility of some confusion in the reverse complementing etc that you mentioned. If they are concatenated alignments, I would expect that the signal is such that the correct topology would be returned, and be the same as the amino acids, so my first port of call would be to scrutinse your sequences closer.
Dear healey, I have extracted the nucleotide sequences from the bacterial genome in reverse order, for example
likewise the concatenated sequences (extracted from the above coordinates) were mixed with proper sequences and reverse order sequences as well. I did not make the reverse complement of those sequences been extracted in reverse order
>gene1 (contig1:520-350)
. However, I have aligned the same withmafft --adjustdirection
. As per my opinion, mafft failed to make reverse complement of those sequences mentioned above. I guess the tree topology changes because of this issue. That is why I seek any alternative to reverse complement of the same. I mean is there any tool available better than mafft to do the same.So you have 'normal' (5' -> 3') amino acid sequences which you aligned, and they look correct?
And you have DNA sequences but some of them are backwards? Why?
I would advise that you simply extract the DNA sequences properly in the correct orientation in the first place. If you're trying to align a set of sequences which are mixed up between reverse and 'normal', of course the tree will change.
I agree with you healey, But, Amphora output fasta header is shown below,
I have made the above showed output as
ACPX01000193.1:57682-58245
for xarg samtools extraction. If it, less number of sequences, I could do these reverse complementing process effectively. But, I have more than 200 sequences to be reverse complemented. This would be a nightmare. That is why I am in need of any tool, which can sense the reverse order sequences and do the necessary changes. If you know any of the tool which can perform the same, please suggest me.Sorry, I'm not familiar with AMPHORA, so I'm not sure what it does.
Are the sequences which are reversed on their own in a single file, or mixed in with the sequences which are 'forward'?
Please post a small representative example of the fasta file with the headers and reversed sequences.
Amphora output are shown below,
So, I formated the fasta header for
samtools xarg
extraction as shown below,In this two header, the first one was in reverse sense, So I simply reversed the coordinates. Whereas the second one is proper one, no need to reverse it. likewise AMPHORA gives mixed output.
So, like this I have more than 200 sequences to be reverse complemented. This is where I am facing the problem. So, could you please suggest me any way to do the same.