5.7 years ago
prasundutta87
When you perform an XP-EHH analysis involving several populations, you usually end up with many pairwise XP-EHH calculations. In that situation, what is the best way to detect outliers? Currently, for every pairwise comparison, the XP-EHH scores were calculated per chromosome. The scores were converted to Z-scores and then to absolute Z-scores. All chromosomes were then joined to produce the whole-genome XP-EHH plot.
What is the correct way to assign a threshold to detect outliers?
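For anyone following along, the preprocessing described in the question might look roughly like the sketch below. It assumes the per-chromosome scores are pooled first and standardized genome-wide; the post does not say whether standardization was done per chromosome or genome-wide, so that choice is an assumption. The function name `abs_z_scores` and the toy values are hypothetical, not real XP-EHH output.

```python
from statistics import mean, pstdev


def abs_z_scores(scores_by_chrom):
    """Pool per-chromosome scores, standardize genome-wide, return (chrom, |z|) pairs.

    Assumption: the mean and standard deviation are computed over all
    chromosomes together, not separately for each chromosome.
    """
    pooled = [(chrom, s) for chrom, scores in scores_by_chrom.items() for s in scores]
    values = [s for _, s in pooled]
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("all scores are identical; Z-scores are undefined")
    return [(chrom, abs((s - mu) / sigma)) for chrom, s in pooled]


# Toy input: made-up numbers for two chromosomes.
example = {"chr1": [0.2, -1.1, 3.4], "chr2": [0.5, -0.3, 0.1]}
result = abs_z_scores(example)
```

Each entry of `result` keeps its chromosome label, so the joined list can be plotted as a single whole-genome track.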
The 1.5×IQR rule is a rule of thumb from John Tukey for determining outliers (generally a good one, although it may not always be appropriate). Outliers are defined as observations that fall below Q1 − 1.5*IQR or above Q3 + 1.5*IQR, where IQR = Q3 − Q1, Q1 is the first quartile and Q3 is the third quartile. In a boxplot, the highest and lowest values within these limits are marked by the whiskers of the box, and any outliers are drawn as individual points.

Thanks for this. The thing is, there does not seem to be one appropriate way to detect outliers for this kind of analysis. Many papers have used different techniques, and I could not find a consensus.
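For reference, the 1.5×IQR rule from the answer can be sketched in plain Python as below. This is only an illustration of Tukey's fences, not a recommended threshold for XP-EHH. The function names and the toy scores are hypothetical, and the quartiles use the `inclusive` method of `statistics.quantiles`; other quartile conventions can give slightly different fences.

```python
from statistics import quantiles


def tukey_fences(values, k=1.5):
    """Return (lower, upper) fences: Q1 - k*IQR and Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def tukey_outliers(values, k=1.5):
    """Return the observations that fall outside the Tukey fences."""
    low, high = tukey_fences(values, k)
    return [v for v in values if v < low or v > high]


# Toy absolute Z-scores (made-up values, not real XP-EHH output).
scores = [0.1, 0.3, 0.4, 0.5, 0.6, 0.8, 0.9, 1.1, 4.2]
low, high = tukey_fences(scores)
outliers = tukey_outliers(scores)
print(outliers)
```

Because absolute Z-scores are non-negative, the upper fence is likely the only one that matters in this setting.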