# Robust Analysis

Three problems can seriously hamper data analysis, especially with big data: "outliers and missing values prevent the analysis", "the distribution is far from the normal distribution", and "a small difference in the data changes the output greatly".

This page collects robust methods that mitigate these problems.

In many cases, robust methods also speed up the analysis.

## Using Category for Quantity Data

Decision Tree is one example of this approach.
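As a minimal sketch of why replacing quantities with categories helps, the example below bins numbers by quantile and compares a clean sample with one containing an extreme outlier. The helper `to_categories`, the choice of quantile cut points, and the sample data are assumptions made for illustration; they are not taken from this page. A decision tree does something similar in spirit, since each split only asks whether a value lies above or below a threshold.

```python
import statistics

def to_categories(values, n_bins=4):
    """Replace each number with the index of the quantile bin it falls in."""
    # Cut points are quantiles of the data itself (an assumption of this sketch).
    cuts = statistics.quantiles(values, n=n_bins)
    # The bin index is the number of cut points the value exceeds.
    return [sum(v > c for c in cuts) for v in values]

clean = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
dirty = clean[:-1] + [8000.0]  # last value replaced by an extreme outlier

cats_clean = to_categories(clean)
cats_dirty = to_categories(dirty)

# The mean is dragged far away by the outlier; the categories are not.
mean_clean = statistics.mean(clean)
mean_dirty = statistics.mean(dirty)
```

In this sample the outlier lands in the same top category as the value it replaced, so any analysis run on the categories sees identical input, while the mean changes by orders of magnitude.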

The page Analysis Using Category Data covers this approach.

## Using Only Important Data

Support Vector Machine and k-NN can work from only the important data.
This makes them good at avoiding the influence of outliers.
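The idea of using only the important data can be sketched with a small k-NN regression in plain Python. The function `knn_predict` and the sample points are hypothetical, written only for this illustration. The prediction is built from the k training points nearest to the query, so an outlier located far from the query never enters the calculation. An outlier that sits close to the query would still affect the result.

```python
def knn_predict(train, query, k=3):
    """Average the targets of the k training points closest to `query`."""
    # Sort (x, y) pairs by distance from the query and keep the k nearest.
    nearest = sorted(train, key=lambda xy: abs(xy[0] - query))[:k]
    return sum(y for _, y in nearest) / k

train = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9), (5.0, 5.1)]
with_outlier = train + [(50.0, -900.0)]  # a distant, extreme point

pred_clean = knn_predict(train, 2.5)
pred_dirty = knn_predict(with_outlier, 2.5)

# For comparison, a global average of the targets uses every point.
global_clean = sum(y for _, y in train) / len(train)
global_dirty = sum(y for _, y in with_outlier) / len(with_outlier)
```

Here the two k-NN predictions are identical, while the global average shifts strongly once the outlier is added.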

## Using Intermediate Layer

Not every method in Analysis Using Intermediate Layer is robust to outliers.

Principal Component Regression Analysis is robust to outliers.

Support Vector Machine also uses an intermediate layer, but the purpose of that layer is not to handle outliers.

Strength and Weakness of Big Data

Statistical Way of Making Hypothesis

Selection of Methods

Prediction by Statistical Model

Outlier and Missing Value

Difference from Hopes
