Check outliers by z-score transformation
202307061152
Status:
Tags: Statistics Hypothesis testing
Convert raw scores on a dependent variable into z-scores to check for outliers. In a normal distribution, about 95% of observations lie within the interval [-1.96; +1.96]. If more than 5% of observations have an absolute z-score larger than 1.96 (roughly 2), there is reason to suspect serious outliers.
An observation with an absolute z-score greater than 3 is very unlikely: under normality only about 0.3% of observations fall that far from the mean.
Implementation
In MATLAB: zscore(X).
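A minimal sketch of the same check in Python, using only the standard library. It standardizes with the sample mean and the sample standard deviation (n - 1 in the denominator), then reports the share of observations beyond 1.96 and the positions of observations beyond 3. The function names (`z_scores`, `outlier_report`), the returned dictionary keys, and the `scores` data are hypothetical and only serve the illustration.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize values with the sample mean and sample standard deviation."""
    m = mean(values)
    s = stdev(values)  # sample standard deviation (n - 1 in the denominator)
    return [(v - m) / s for v in values]


def outlier_report(values, share_cutoff=1.96, extreme_cutoff=3.0):
    """Summarize how many observations lie beyond the two cutoffs from the note."""
    z = z_scores(values)
    # Share of observations with |z| above the 95% bound.
    share = sum(abs(zi) > share_cutoff for zi in z) / len(z)
    # Positions of observations with |z| above 3.
    extreme = [i for i, zi in enumerate(z) if abs(zi) > extreme_cutoff]
    return {"z": z, "share_beyond_cutoff": share, "extreme_indices": extreme}


# Hypothetical scores: 19 ordinary values and one value far from the rest.
scores = [10, 11, 9, 10, 12, 10, 11, 9, 10, 11,
          10, 9, 11, 10, 12, 10, 9, 11, 10, 40]
report = outlier_report(scores)
```

Here `report["share_beyond_cutoff"]` can be compared against 0.05, and `report["extreme_indices"]` lists the observations that would be treated as very unlikely under normality. Note that a single extreme value inflates the standard deviation, so in small samples the largest attainable z-score is limited.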
Further Testing
Code suspected outliers as 1 and all other observations as 0, then run a logistic regression with this Outlier(0,1) indicator as the binary dependent variable and all independent variables as predictors. If no predictor has a significant effect, there is no reason to exclude a suspected outlier whose absolute z-score is smaller than 3.
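The logistic regression check described above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not a full implementation: it handles a single independent variable plus an intercept (the note calls for all independent variables), fits by Newton-Raphson, and judges significance with a Wald test, which is one common choice the note does not specify. The function name `logistic_wald` and the data are hypothetical; for real analyses a statistics package would be the more robust route.

```python
import math
from statistics import NormalDist


def logistic_wald(x, y, iterations=25):
    """Fit y ~ intercept + x by Newton-Raphson.

    Returns (b0, b1, p_value), where p_value is the two-sided Wald
    p-value for the slope b1. Assumes the data are not perfectly separated.
    """
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Accumulate the gradient (g) and information matrix (h).
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: add the inverse information matrix times the gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Standard error of the slope from the information matrix of the final
    # iteration (evaluated just before the last step; the difference is
    # negligible once the fit has converged).
    se_b1 = math.sqrt(h00 / det)
    z = b1 / se_b1
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return b0, b1, p_value


# Hypothetical data: one independent variable and the 0/1 outlier indicator.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
is_outlier = [0, 0, 1, 0, 0, 1, 0, 0, 1, 0]
b0, b1, p_value = logistic_wald(x, is_outlier)
```

If `p_value` is above the chosen significance level (for example 0.05), the independent variable does not predict outlier status, which by the reasoning above gives no grounds to exclude suspected outliers with an absolute z-score below 3.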
First-pass statistical exploration