An outlier is a value that lies far from the main group of the data. A missing value is a value that is blank. We often meet both when we analyze large data sets.
Outliers and missing values are also called "abnormal values", "noise", "trash", "bad data", or "incomplete data". Some people dislike them because a data set that contains them may not yield a clean statistical model, or may cause the software to return an error.
Outliers and missing values are often removed as unnecessary data. But removing them can also remove important information, because outliers and missing values express facts about how the data was produced. In some cases we need to understand the mechanism behind such data.
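As a small illustration of keeping such values instead of deleting them, the sketch below labels each entry as "missing", "outlier", or "ok" and leaves the data itself untouched. It is only one possible reading of "far from the main group": it assumes the modified z-score based on the median and the median absolute deviation (MAD), with the commonly quoted cutoff of 3.5. The function name `flag_values`, the use of `None` for a blank, and the sample readings are all hypothetical.

```python
import statistics


def flag_values(values, threshold=3.5):
    """Label each entry as 'missing', 'outlier', or 'ok' without removing any."""
    present = [v for v in values if v is not None]
    if not present:
        # Nothing to compare against: every entry is blank.
        return ["missing"] * len(values)

    med = statistics.median(present)
    # Median absolute deviation: a spread measure that outliers barely affect.
    mad = statistics.median(abs(v - med) for v in present)

    labels = []
    for v in values:
        if v is None:
            labels.append("missing")
        elif mad > 0 and 0.6745 * abs(v - med) / mad > threshold:
            # Modified z-score above the (assumed) cutoff.
            labels.append("outlier")
        else:
            labels.append("ok")
    return labels


# Hypothetical sensor readings; None stands for a blank entry.
readings = [10.1, 9.8, None, 10.3, 55.0, 10.0]
labels = flag_values(readings)
for value, label in zip(readings, labels):
    print(value, label)
```

Because the labels are kept next to the original values, the flagged entries remain available for later investigation of their cause.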
In some cases, an outlier and a missing value have the same cause. So the rule "An outlier is a value far from the main group. A missing value is a value that is blank." does not cover every case.
For example, a missing value can be the effect of an outlier, and an outlier can be the effect of a missing value.
In some cases, an outlier or a missing value is written as "Invalid" or "No data".
In other cases, an outlier is written as "Infinity".
The case of "0" is difficult.
One cause of "0" is that the computer interpreted a blank as "0". For example, in Excel, when the value of a cell is set from a blank cell, the cell shows "0".
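The cases above can be sketched as a small classifier for raw text cells. This is a minimal sketch under stated assumptions, not a definitive rule set: the marker spellings in `MARKERS`, the label names, and the function `classify_cell` are hypothetical, and a real data set may use different markers. A "0" is not treated as a normal value or as a blank; it is only flagged as ambiguous so that it can be checked against the source of the data.

```python
import math

# Assumed spellings of text markers that stand in for an outlier or a missing value.
MARKERS = {"invalid", "no data"}


def classify_cell(raw):
    """Return a (label, value) pair for one raw text cell."""
    text = raw.strip()
    if text == "":
        return ("missing", None)
    if text.lower() in MARKERS:
        # The marker alone does not say whether this was an outlier or a blank.
        return ("marker", None)
    try:
        value = float(text)  # float() also accepts "Infinity" and "nan"
    except ValueError:
        return ("unparsed", None)
    if math.isnan(value):
        return ("missing", None)
    if math.isinf(value):
        return ("outlier", value)
    if value == 0:
        # May be a true zero, or a blank that was converted to 0 upstream.
        return ("ambiguous_zero", value)
    return ("value", value)


for cell in ["12.5", "", "No data", "Invalid", "Infinity", "0"]:
    print(repr(cell), classify_cell(cell))
```

Whether an "ambiguous_zero" is a real measurement or a converted blank cannot be decided from the cell alone; it needs knowledge of how the file was created.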