Hi everyone,
I have an MSA file of 11000 genomes (30k in size each) aligned with MAFFT. I want to construct a phylogenetic tree from this large MSA file (500 MB). I tried MEGAX and RAxML, but both took very long and eventually crashed on my workstation (Ubuntu 16.04, 8 GB RAM, 1 TB HD). Can anyone suggest how to accomplish this? Thanks
You should reduce redundancy, since it is unlikely that all 11K genomes are completely unique at the sequence level. Are these SARS genomes?
A multiple sequence alignment of that many sequences which are that long is highly likely to be spurious, unless all the genomes are incredibly similar (in which case you might as well remove redundant identical sequences).
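To illustrate removing identical sequences, here is a minimal Python sketch, not a definitive tool. It assumes the alignment is in FASTA format and treats two records as redundant only when their aligned sequences match exactly (ignoring case). The function names `read_fasta` and `dedupe_alignment` are made up for this example, and the demo uses a tiny in-memory alignment rather than the real file.

```python
import io


def read_fasta(handle):
    """Yield (header, sequence) pairs from an open FASTA file handle."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif line:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def dedupe_alignment(records):
    """Keep the first record for each distinct aligned sequence.

    Returns (unique, duplicates): `unique` is a list of (header, sequence)
    pairs, and `duplicates` maps each kept header to the headers of the
    records that were dropped as identical to it.
    """
    seen = {}        # upper-cased sequence -> header of its representative
    unique = []
    duplicates = {}
    for header, seq in records:
        key = seq.upper()
        if key in seen:
            duplicates[seen[key]].append(header)
        else:
            seen[key] = header
            unique.append((header, seq))
            duplicates[header] = []
    return unique, duplicates


if __name__ == "__main__":
    # Toy alignment standing in for the real MAFFT output (assumption).
    demo = io.StringIO(">g1\nAC-GTT\n>g2\nac-gtt\n>g3\nAC-GTA\n")
    unique, duplicates = dedupe_alignment(read_fasta(demo))
    for header, seq in unique:
        print(f">{header}\n{seq}")
    for kept, dropped in duplicates.items():
        if dropped:
            print(f"# {kept} also represents: {', '.join(dropped)}")
```

Keeping the `duplicates` mapping means the dropped genomes can be placed back on the tree next to their representative afterwards. For the real file, open it with `open(path)` and pass the handle to `read_fasta` instead of the in-memory demo.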
Along the lines of what genomax and Joe already alluded to, you can reduce redundancy in two ways:
Then submit this reduced representation to RAxML at the CIPRES Gateway; that might help.