I don't really understand how the PCA was done; the authors don't seem to give much detail on it. Can someone tell me more about how PCA was done in this diagram?
Principal component analysis with health status as instrumental
variables, based on the abundance of 155 species with ≥1% genome coverage
by the Illumina reads in at least 1 individual of the cohort, was carried
out with 14 healthy individuals and 25 IBD patients (21 ulcerative colitis
and 4 Crohn’s disease) from Spain...
Individuals (represented by points) were clustered and centre of gravity computed for
each class; P-value of the link between health status and species abundance was
assessed using a Monte-Carlo test (999 replicates)
The authors ran PCA on a matrix. One axis is the human subjects (39 of them). The other axis is species, meaning microbial species sampled from each subject's gut (155 of them). The values in the matrix are the abundance of each species in each subject. The labels come from the health status of the patient.
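To make the matrix layout concrete, here is a minimal sketch of plain PCA on a 39 x 155 subjects-by-species matrix. The abundance values are random placeholders, not the paper's data, and the names `abundance`, `status`, and `pca_scores` are mine. It also ignores the "instrumental variables" part of the caption, so treat it as an illustration of the data shape rather than the authors' exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Placeholder data standing in for the real table: 39 subjects x 155 species.
n_subjects, n_species = 39, 155
abundance = rng.lognormal(mean=0.0, sigma=1.0, size=(n_subjects, n_species))

# Health status labels from the caption: 14 healthy, 21 ulcerative colitis (UC),
# 4 Crohn's disease (CD). The ordering of subjects here is arbitrary.
status = np.array(["healthy"] * 14 + ["UC"] * 21 + ["CD"] * 4)


def pca_scores(X, n_components=2):
    """Return subject coordinates on the leading principal components."""
    Xc = X - X.mean(axis=0)  # centre each species column
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]  # one row per subject
    explained = (S ** 2) / np.sum(S ** 2)  # fraction of variance per component
    return scores, explained[:n_components]


scores, explained = pca_scores(abundance)
print(scores.shape)  # one 2-D point per subject
print(explained)     # variance fraction carried by PC1 and PC2
```

Each row of `scores` is one subject, which is what gets drawn as a point in the figure; `status` is only used afterwards to colour or group the points.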
Often when you see a PCA presented, the authors plot all of the points for the first two components. In this case, the authors decided there were three clusters of points, corresponding to three health conditions, and then calculated a center of gravity for each one. The ellipses indicate confidence regions, though the authors don't state this and don't indicate what level of confidence is represented (e.g. 95%).
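Here is a rough sketch of the two remaining pieces of the caption: a center of gravity per class, and a Monte-Carlo test with 999 replicates. The caption does not say which statistic was permuted. I am assuming it is the share of total variance that lies between the class means, tested by shuffling the health labels; that is a common choice for this kind of between-class analysis, but it is my guess. The data are random placeholders and all function names are mine.

```python
import numpy as np


def class_centroids(points, labels):
    """Centre of gravity (mean point) of each class."""
    return {lab: points[labels == lab].mean(axis=0) for lab in np.unique(labels)}


def between_class_ratio(X, labels):
    """Fraction of total variance explained by class means (assumed statistic)."""
    Xc = X - X.mean(axis=0)
    total = np.sum(Xc ** 2)
    between = 0.0
    for lab in np.unique(labels):
        members = Xc[labels == lab]
        between += len(members) * np.sum(members.mean(axis=0) ** 2)
    return between / total


def monte_carlo_p(X, labels, n_rep=999, seed=0):
    """Permutation p-value: shuffle labels and recompute the statistic."""
    rng = np.random.default_rng(seed)
    observed = between_class_ratio(X, labels)
    hits = sum(
        between_class_ratio(X, rng.permutation(labels)) >= observed
        for _ in range(n_rep)
    )
    # Count the observed arrangement as one of the replicates.
    return observed, (hits + 1) / (n_rep + 1)


# Placeholder data: 39 subjects x 155 species, labels as in the caption.
rng = np.random.default_rng(0)
abundance = rng.lognormal(size=(39, 155))
status = np.array(["healthy"] * 14 + ["UC"] * 21 + ["CD"] * 4)

# Project onto the first two principal components for plotting.
centred = abundance - abundance.mean(axis=0)
U, S, _ = np.linalg.svd(centred, full_matrices=False)
scores = U[:, :2] * S[:2]

centroids = class_centroids(scores, status)
observed, p_value = monte_carlo_p(abundance, status)
print(centroids)
print(observed, p_value)
```

With random placeholder data there is no real group structure, so the p-value should usually be unremarkable. On real data, a small p-value would say that the separation between health groups is larger than what shuffled labels typically produce.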
The point of the figure is to suggest that there are systematically differing abundances of microbial species when comparing the three health conditions.
Nice to know! Would you be so kind to tell us what your question is then?