This may be a silly question, but I found no clear information about it. I generated 3D models of the COL4A5 protein with a missense mutation, c.2546G>A p.(Gly849Glu), using Phyre2 (link to my results here); the FASTA sequence I submitted is ~1690 residues long. The primary model that was generated covers only residues 1464 to 1691. I checked, and the model on the COL4A5 UniProt page is identical and covers the same small portion.
My question is: why is the rest of the amino acid sequence not shown in the model? If I try to open the corresponding model from the Phyre2 results, it is just a long sequence without structure. Is there a theoretical reason for this that I am missing? How can I visualize the impact of my mutation on the protein if it does not fall within this small window?
Thanks so much in advance.
Phyre specifically only models the regions for which it finds strong homology to a known structure.
If you want a full-length model, I suggest I-TASSER, but bear in mind that the regions without good homology will be of more dubious quality.
But it has a limit of 1500 amino acids per sequence. I suppose I cannot just cut out a portion of the protein and hope that the results will be reliable, right?
Yeah, the hard limit of 1500 is a bit of a pain. I think, though don't quote me on this, that it is possible to edit the source code to remove the limit; the models just become more and more likely to be spurious. You can indeed split the sequence and model the domains separately. It's tricky, though not impossible, to join them up again later on.
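To make the splitting idea concrete, here is a minimal Python sketch that cuts a long sequence into overlapping fragments that each fit under the 1500-residue limit, and reports which fragments contain a given residue (e.g. 849). The function names, the 200-residue overlap, and the fixed-window cutting are my own assumptions for illustration; in practice you would rather cut at real domain boundaries than at arbitrary positions. The sequence below is a placeholder, not the actual COL4A5 sequence.

```python
def split_sequence(seq, max_len=1500, overlap=200):
    """Split seq into overlapping fragments of at most max_len residues.

    Returns a list of (start, end, fragment) tuples with 1-based,
    inclusive residue numbering. Window size and overlap are
    illustrative assumptions, not recommendations from any tool.
    """
    if max_len <= 0 or not 0 <= overlap < max_len:
        raise ValueError("need max_len > 0 and 0 <= overlap < max_len")
    if not seq:
        return []
    fragments = []
    step = max_len - overlap
    start = 0  # 0-based start of the current window
    while True:
        end = min(start + max_len, len(seq))
        fragments.append((start + 1, end, seq[start:end]))
        if end == len(seq):
            break
        start += step
    return fragments


def fragments_containing(fragments, position):
    """Return the (start, end) ranges that include a 1-based residue position."""
    return [(s, e) for s, e, _ in fragments if s <= position <= e]


# Placeholder sequence of the same length as the modelled range end (1691);
# replace with the real FASTA sequence.
placeholder_seq = "G" * 1691
frags = split_sequence(placeholder_seq)
for s, e, frag in frags:
    print(f"residues {s}-{e} ({len(frag)} aa)")
print("Residue 849 is in:", fragments_containing(frags, 849))
```

The overlap is there so that neighbouring fragment models share residues you can superimpose on when joining them up again; remember to keep track of the offset so that residue 849 is numbered correctly in each fragment model.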
You could also try MODELLER or another tool. I'm less familiar with these, though, so I don't know whether they return full-length models or just regions like Phyre, or whether they have a larger length cutoff.