I am studying models for constructing phylogenetic trees. I have read that all other models are 'nested' within the GTR model, i.e. they can be obtained by constraining some of GTR's parameters. Why, then, should we run a model test at all rather than use GTR every single time, given that, as I understand it, GTR includes the other models within itself? Some papers state that "HKY/(other nucleotide model) was found to be optimal based on AIC values".
Read David W's comment about overfitting to start, but be aware that model selection in phylogenetics has quite a bit of history. In short, you need some principled way to select among models. A model with more free parameters will never fit the data worse than a simpler model nested within it, so the raw likelihood alone will always favor GTR; you need model-selection criteria that take the number of parameters into account. Because GTR's submodels are nested, you can select among them using a likelihood ratio test, but the Akaike information criterion (AIC), the Bayesian information criterion (BIC), and decision theory (DT) are also often used. I recommend this paper by Ripplinger and Sullivan in Systematic Biology to get you up to speed.
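To make the parameter penalty concrete, here is a minimal sketch of how AIC, BIC, and a likelihood ratio test can be computed from maximized log-likelihoods. The log-likelihood values and the alignment length are invented for illustration and do not come from any real dataset; the names `FREE_PARAMS`, `LOG_LIK`, and `N_SITES` are likewise just placeholders. Only the free parameters of the substitution model are counted, on the assumption that all models are fitted on the same tree.

```python
import math

# Free substitution-model parameters (branch lengths are excluded: on a fixed
# tree they are shared by every model, so they do not change the ranking).
FREE_PARAMS = {"JC69": 0, "K80": 1, "HKY85": 4, "GTR": 8}

# HYPOTHETICAL maximized log-likelihoods, invented for illustration only.
LOG_LIK = {"JC69": -5230.0, "K80": -5180.0, "HKY85": -5150.0, "GTR": -5148.5}
N_SITES = 1000  # hypothetical alignment length, used as the sample size in BIC


def aic(log_lik, k):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * k - 2 * log_lik


def bic(log_lik, k, n):
    """Bayesian information criterion: k ln(n) - 2 ln L (lower is better)."""
    return k * math.log(n) - 2 * log_lik


def lrt(simple, general):
    """Likelihood ratio statistic and degrees of freedom for nested models."""
    stat = 2 * (LOG_LIK[general] - LOG_LIK[simple])
    df = FREE_PARAMS[general] - FREE_PARAMS[simple]
    return stat, df


aic_scores = {m: aic(LOG_LIK[m], FREE_PARAMS[m]) for m in LOG_LIK}
bic_scores = {m: bic(LOG_LIK[m], FREE_PARAMS[m], N_SITES) for m in LOG_LIK}

best_by_lnl = max(LOG_LIK, key=LOG_LIK.get)        # raw likelihood favors GTR
best_by_aic = min(aic_scores, key=aic_scores.get)  # penalized fit
best_by_bic = min(bic_scores, key=bic_scores.get)

# Compare the statistic with a chi-square critical value for `df` degrees of
# freedom (9.488 for df = 4 at alpha = 0.05).
hky_vs_gtr = lrt("HKY85", "GTR")

print(best_by_lnl, best_by_aic, best_by_bic, hky_vs_gtr)
```

With these made-up numbers GTR has the highest likelihood, as it must, but its four extra parameters buy only 1.5 log-likelihood units over HKY85, so both AIC and BIC prefer HKY85 and the likelihood ratio test does not reject it. In practice, programs such as jModelTest or ModelFinder estimate the likelihoods and do this bookkeeping for you.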