Generating a phylogeny from 5 different species' nuclear de novo assemblies
9.2 years ago

I have five de novo nuclear assemblies, one per species, and I'm trying to build a phylogeny from them, but I'm running into a brick wall. I initially tried MrBayes, but I can't figure out how to format the NEXUS file. My .fasta files are set up like this:

>scaffold3 Locus_16_0 16.4 COMPLEX
GAGTTTGCAAGGAAGCCTCCCAAACAGGATGTAAGCACAAAGAAGCAGAAACAGAAGAATGTGAACTTGGTGGATGAGCAGAGGGCAAAGAGATTGAAGT
TAGGACCTGGTATGAAGGTGAAATATGATCAGGTCAAAGGAGGTTATTACATAGAGGTTGGTTTCTGTTATATAATATGTGTTGTTGATTAATAACAAGT
TAAGTATGANNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNGTAGGATAAACATGGGAATGGGAATCCAGTGGGTATTGCAGGCACTTCATCCTCAATGC
CCCATACTGAAGTTGAAACATCAAATGATCATGAGGTTAGTATATACAATATAATCTGGTTAGACTATATTTAAATAAAATGTTGCTTCTTGCACTGATA
TACATCATCTTGTTATCATTTGTAGGATGAATCTGTGAAAGCAATNTAACAATGTAGGAACATCATCATCACTTAAAATACCAGGGCTGCAACTTAGGAG
AAGTAGGAGACTGTTAGTTCAGCCTACAAATCTATCAGATCCTGCTCCTTCACAGGTCCCAGAGCCTGTTATTTCTAAATGCCCAAATCCTGTTTTTTCA
AAGCTAGCAGAACCAGTGTCCTTATCACATCATGTGGAAATACCAGCACCAAAAGTGGATCCTACAAGGAAGTTAAAGTCCTCTACACAGGCACAACCAG
TGTTGCAACCCAGGAGAAGTAGCAGACTNNNNNNNNNNNNNNNNNNNNNNNNNNTAGAAAACAATCTTTTGGTGAAGGTTGTGTTGGGATGTTTATGACC
TTGGTACTTTGTTTGGTTAACTGTGTATAACAAACCTATCAAACTTTTGTACTTNNNNNNNNNNNNNNNNNNATGGTTCAAAGTACAGTATATGAAAAAT
TTGTCTTGGTGCACAAAAATTGGATTGTTTCATAATGTGCAGATATGTAACAAGTGTAGATGATCCAAAACTTTCATCATTGACATCAATTGTACATTTA
CCATACATAAACAAGTGCATATTTCTATATTTTGTTGGTCATTATCATTTTAGGGAATATTACAATACCAACTAAATCCAAAG

Each file is just a list of loci and scaffolds that I got from a SOAPdenovo assembly. The NEXUS file seems to require a single FASTA file in which every species is represented, and I'm not sure how to produce that.

Do I need to figure out which scaffolds are from the same location on each species so I can compare them, or can I compare them with what I have here? Will MrBayes work for this, or do I need to work with something else?

9.2 years ago
h.mon 35k

You have to align homologous regions to infer a phylogeny, at least with the most commonly used maximum likelihood and Bayesian methods. For genomes, I guess you could align the draft assemblies with Mauve and build the phylogeny from those alignments. Another way would be to annotate the genomes for genes, cluster the orthologous genes, and infer the phylogeny from the concatenated gene set.
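
To make the concatenation route a bit more concrete, here is a minimal Python sketch. It assumes you already have one aligned FASTA per orthologous gene, with one sequence per species and the species label as the first word of each header. The function names (`read_fasta`, `concatenate`, `to_nexus`) and the species labels `spA`, `spB`, `spC` are made up for illustration. It joins the per-gene alignments into one matrix, pads species that lack a gene with `?`, and writes a NEXUS data block of the kind MrBayes reads.

```python
def read_fasta(text):
    """Parse FASTA text into {name: sequence}; name = first word of the header."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].split()[0]
            records[name] = []
        elif name is not None:
            records[name].append(line)
    return {n: "".join(parts) for n, parts in records.items()}


def concatenate(alignments, taxa):
    """Join per-gene alignments ({taxon: aligned_seq}) into one supermatrix.

    A taxon missing from a gene is filled with '?' for that gene's length.
    """
    matrix = {t: [] for t in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("all sequences in one alignment must have equal length")
        n = lengths.pop()
        for t in taxa:
            matrix[t].append(aln.get(t, "?" * n))
    return {t: "".join(parts) for t, parts in matrix.items()}


def to_nexus(matrix):
    """Format {taxon: sequence} as a NEXUS data block (DNA, '-' gap, '?' missing)."""
    ntax = len(matrix)
    nchar = len(next(iter(matrix.values())))
    width = max(len(t) for t in matrix)
    lines = [
        "#NEXUS",
        "begin data;",
        f"  dimensions ntax={ntax} nchar={nchar};",
        "  format datatype=dna missing=? gap=-;",
        "  matrix",
    ]
    for taxon, seq in matrix.items():
        lines.append(f"  {taxon.ljust(width)}  {seq}")
    lines += ["  ;", "end;"]
    return "\n".join(lines) + "\n"


# Toy data: two hypothetical gene alignments across three hypothetical species.
gene1 = read_fasta(">spA gene1\nACGT\n>spB gene1\nAC-T\n")
gene2 = read_fasta(">spA gene2\nGG\n>spC gene2\nGA\n")
supermatrix = concatenate([gene1, gene2], ["spA", "spB", "spC"])
print(to_nexus(supermatrix))
```

The hard part is still upstream of this: deciding which sequences are orthologous and aligning them. The raw scaffolds from each assembly cannot be pasted into the matrix as they are, because every column has to represent a homologous site.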
