How can I build a phylogenetic tree from a few selected bacterial strains whose sequences I downloaded from NCBI? For example, I picked the Listeria monocytogenes strains 'J1776', 'ScottA', and 'R2-502' and downloaded three FASTA (.fna) files using the following links:
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/438/585/GCF_000438585.1_ASM43858v1/GCF_000438585.1_ASM43858v1_genomic.fna.gz
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/009/866/905/GCF_009866905.1_ASM986690v1/GCF_009866905.1_ASM986690v1_genomic.fna.gz
ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/438/705/GCF_000438705.2_ASM43870v2/GCF_000438705.2_ASM43870v2_genomic.fna.gz
I then concatenated the *.fna files into one .fasta file and tried MEGA X and BEAUti, but both give me errors indicating that the sequence lengths are not equal. Am I doing anything wrong?
Yes, something is going wrong, but without more details there could be many reasons. My guess is that you are not aligning the sequences before trying to build the tree. You need to generate a multiple sequence alignment first; MEGA X can do that too. Also, you posted this with the wrong tags: it is not related to assembly.
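To see why the tree builders complain, it can help to list the records in the concatenated file and compare their lengths. Below is a minimal sketch in plain Python (standard library only); the function names `read_fasta` and `is_aligned` are made up for illustration, and the check only tests that all records have the same length, which is a necessary condition for an alignment, not proof that the alignment is good.

```python
import gzip


def read_fasta(path):
    """Yield (header, sequence) pairs from a plain or gzipped FASTA file."""
    opener = gzip.open if str(path).endswith(".gz") else open
    header, chunks = None, []
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue  # skip blank lines
            if line.startswith(">"):
                # A new header closes the previous record, if there was one.
                if header is not None:
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def is_aligned(records):
    """Return True if every record has the same length (gaps included)."""
    lengths = {len(seq) for _, seq in records}
    return len(lengths) <= 1
```

For example, `records = list(read_fasta("combined.fasta"))` (a hypothetical file name) followed by printing `len(seq)` for each record shows how many sequences the file really contains and whether `is_aligned(records)` holds. Unaligned genomes will almost always differ in length.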
Thanks for the reply. I tried aligning the three sequences with MUSCLE in MEGA X, but I got "Error-Alignment Failed: MUSCLE Log file did not end properly, suggesting an unhandled exception." Then I tried Mauve. Although I did not get any errors with Mauve, the resulting FASTA file has more than three rows of sequences, and I do not understand why. I was expecting to see three aligned sequences. Do you have any comments on this?
Yep, these sequences are too large to be aligned with common multiple-alignment software. I was just focusing on the error you were getting from MEGA X. You can do what Mensur says, or you can select a shorter random region, something between 5,000 and 20,000 bp, and align that. You will definitely get a phylogenetic tree. How informative will it be? I don't know; I guess it depends on what you need it for. You can also make trees from different chunks of the data and later make an average tree. I guess that will be easier, but I'm not sure whether it will be much less informative.
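A rough sketch of the "shorter random region" idea, in plain Python, might look like the following. It assumes `records` is a list of (name, sequence) pairs with one sequence per strain; the helper names `cut_window` and `to_fasta` are hypothetical. It also makes a strong assumption: it cuts the same coordinates from every genome, which are not guaranteed to be homologous across strains, so the output still has to be aligned and inspected before building a tree.

```python
import random


def cut_window(records, length=10000, seed=0):
    """Cut one window of `length` bp at the same random start from every record."""
    shortest = min(len(seq) for _, seq in records)
    if length > shortest:
        raise ValueError("window is longer than the shortest sequence")
    # A seeded generator makes the chosen window reproducible.
    start = random.Random(seed).randint(0, shortest - length)
    return start, [(name, seq[start:start + length]) for name, seq in records]


def to_fasta(records, width=70):
    """Format (name, sequence) pairs as FASTA text with wrapped sequence lines."""
    lines = []
    for name, seq in records:
        lines.append(">" + name)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"
```

Calling `cut_window` with several different `seed` values gives several chunks, each of which could be aligned and turned into its own tree, in the spirit of the "different chunks" suggestion above.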
Thanks for your suggestion.