I have sampled and aligned a set of 7000 proteomes. I tried to use RAxML to build maximum likelihood trees, but I get an error saying "unknown character "U" is at position xyz". How do I remove the U characters from my sampled file, and what should I replace them with? Or is there other software for building maximum likelihood trees that recognizes U?
Proteins normally contain no U, but U is the one-letter code for selenocysteine, a rare amino acid. First, make sure you are not aligning RNA sequences by mistake. If these really are proteins, you can either remove the whole offending sequence (6999 proteomes should still be a pretty good dataset) or replace U with C (cysteine, the standard amino acid that selenocysteine most closely resembles).
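If you go the replacement route, something like the following would do it. This is only a minimal sketch, assuming your alignment is in FASTA format (header lines start with ">"); the function name and the example file names are made up for illustration. It leaves header lines alone so that a U in a sequence name is not touched. It would not be safe for PHYLIP files, where names and sequences share a line.

```python
def replace_selenocysteine(fasta_text, replacement="C"):
    """Replace U/u in the sequence lines of FASTA text, keeping headers as-is.

    Assumes FASTA input; `replacement` is the single residue letter to
    substitute (C by default, matching the suggestion above).
    """
    # Map both cases so lowercase residues keep their case.
    table = str.maketrans({"U": replacement.upper(), "u": replacement.lower()})
    fixed = []
    for line in fasta_text.splitlines(keepends=True):
        if line.startswith(">"):
            fixed.append(line)  # header line: leave untouched
        else:
            fixed.append(line.translate(table))  # sequence line: swap U
    return "".join(fixed)


# Example usage (hypothetical file names):
# with open("alignment.fasta") as src:
#     cleaned = replace_selenocysteine(src.read())
# with open("alignment.noU.fasta", "w") as dst:
#     dst.write(cleaned)
```

Because this is a one-for-one character swap, the alignment columns stay the same length. Passing a different letter as `replacement` works the same way if you would rather not use C.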
Out of curiosity, why are you aligning that many proteomes? It is almost impossible to get a meaningful view of a tree that size. Besides, if this is for prokaryotes, it is almost guaranteed that a better and larger tree already exists here.