Hello,
I have a general question about finding outliers in microarray data. For my normalised datasets, I have generated PCA plots and heatmaps with sample clustering. On the heatmap, the triplicates cluster together. On the PCA plot, however, one replicate can sit much further along PC1 than the other two, for example two replicates at +60 and the third at -20. PC1 explains at least 55% of the variance, and all the replicates have similar positions on PC2. My question is: which of the two plots, PCA or heatmap, is more reliable for deciding which samples to exclude as outliers, and why?
Your opinions are much appreciated. Thank you.
Thanks a lot, Kevin. I went through the PCAtools tutorial, but I couldn't reproduce the same stat ellipse on the tutorial dataset. I also ran the tool on my own data. Can I say that GSM4910611 is an outlier in the sessile group? Also, I can't tell whether all the planktonic samples fall within their ellipse (which would mean there is no outlier), because that ellipse is cut in half and looks odd. Thank you in advance.
My data, stat ellipse: https://ibb.co/Rjd56CM
Tutorial, stat ellipse: https://ibb.co/PYRJ79s
Oh, you need to widen the axis ranges so that the ellipses are drawn in full. I do not see any outliers in your data.
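To illustrate why the axis range matters: with only three replicates per group, a 95% ellipse is usually far wider than the points themselves, so default plot limits based on the data can cut it off. The sketch below is not PCAtools code. It is a plain-Python approximation that assumes a normal-theory ellipse with a chi-square radius (the ellipses drawn by PCAtools use a different, wider calculation), and the PC1/PC2 scores are made-up values loosely echoing the numbers in the question. The function names `covariance_ellipse` and `limits_containing` are hypothetical.

```python
import math


def covariance_ellipse(points, level=0.95, n_vertices=72):
    """Boundary vertices of a normal-theory confidence ellipse for 2-D points."""
    n = len(points)
    if n < 3:
        raise ValueError("need at least 3 points to estimate a 2-D covariance")
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    # Sample covariance matrix [[sxx, sxy], [sxy, syy]]
    sxx = sum((x - mx) ** 2 for x, _ in points) / (n - 1)
    syy = sum((y - my) ** 2 for _, y in points) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in points) / (n - 1)
    # Chi-square quantile with 2 degrees of freedom has a closed form.
    # Assumption: this ignores the small-sample correction a real tool applies.
    scale = -2.0 * math.log(1.0 - level)
    # Eigenvalues and major-axis angle of the symmetric 2x2 matrix
    half_trace = (sxx + syy) / 2.0
    root = math.hypot((sxx - syy) / 2.0, sxy)
    lam_major = half_trace + root
    lam_minor = max(half_trace - root, 0.0)  # guard against tiny negatives
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    a = math.sqrt(scale * lam_major)
    b = math.sqrt(scale * lam_minor)
    vertices = []
    for k in range(n_vertices):
        t = 2.0 * math.pi * k / n_vertices
        ex, ey = a * math.cos(t), b * math.sin(t)
        vertices.append((
            mx + ex * math.cos(theta) - ey * math.sin(theta),
            my + ex * math.sin(theta) + ey * math.cos(theta),
        ))
    return vertices


def limits_containing(groups, level=0.95, pad=0.05):
    """Axis limits that enclose every point and every group's ellipse."""
    xs, ys = [], []
    for pts in groups.values():
        for x, y in list(pts) + covariance_ellipse(pts, level):
            xs.append(x)
            ys.append(y)

    def widen(lo, hi):
        margin = (hi - lo) * pad
        return (lo - margin, hi + margin)

    return widen(min(xs), max(xs)), widen(min(ys), max(ys))


# Hypothetical (PC1, PC2) scores, three replicates per group
scores = {
    "sessile": [(60.0, 5.0), (58.0, -3.0), (-20.0, 2.0)],
    "planktonic": [(-40.0, 10.0), (-45.0, -8.0), (-38.0, 1.0)],
}

xlim, ylim = limits_containing(scores)
print("xlim:", xlim)
print("ylim:", ylim)
```

The limits returned here extend well beyond the range of the points. In PCAtools itself, I believe the equivalent step is passing wider `xlim` and `ylim` values to `biplot()`, as the tutorial does for its ellipse example.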