
Robust Analysis

Three problems can seriously hinder data analysis, especially for big data: outliers and missing values block the analysis, the distribution is far from the normal distribution, and a small difference in the data changes the output greatly.

This page collects robust methods that reduce these problems.

In many cases, robust methods also make the analysis faster.

Using Category for Quantity Data

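A Decision Tree is one example of this approach: it splits quantitative data at thresholds, so only the side of the threshold a value falls on matters, not how extreme the value is.

The page Analysis Using Category Data covers this approach.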
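
The idea of replacing a quantity with a category can be sketched as follows. This is a minimal illustration, not the method of any particular page on this site: the function name quartile_category and the sample values are hypothetical. Each value is replaced by the label of the quartile it falls in, and an extreme value simply lands in the top category.

```python
import statistics

def quartile_category(values):
    """Replace each value with a quartile label from 0 (lowest) to 3 (highest)."""
    q1, q2, q3 = statistics.quantiles(values, n=4)

    def label(x):
        if x <= q1:
            return 0
        if x <= q2:
            return 1
        if x <= q3:
            return 2
        return 3

    return [label(x) for x in values]

# Hypothetical sample: the second list replaces the largest value with an outlier.
clean = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
with_outlier = clean[:-1] + [8000.0]

labels_clean = quartile_category(clean)
labels_outlier = quartile_category(with_outlier)

# The mean moves a lot, but the category labels are compared below.
print(statistics.mean(clean), statistics.mean(with_outlier))
print(labels_clean == labels_outlier)
```

In this sample the outlier changes the mean greatly, while the category of every row stays the same, which is the kind of robustness this approach aims for.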

Using Only Important Data

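Support Vector Machine and k-NN use only the important data: the support vectors for the former and the nearest neighbors of the query point for the latter. This helps to avoid the influence of outliers that lie far from those points.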
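
A minimal k-NN sketch, assuming one-dimensional inputs and a simple average of the neighbors' targets, shows why using only nearby data helps. The function knn_predict and the sample points are hypothetical and written only for this illustration.

```python
def knn_predict(train, query, k=3):
    """Average the targets of the k training points closest to `query`.

    `train` is a list of (x, y) pairs with a one-dimensional x.
    """
    nearest = sorted(train, key=lambda p: abs(p[0] - query))[:k]
    return sum(y for _, y in nearest) / k

# Hypothetical training data that roughly follows y = x.
train = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9), (5.0, 5.1)]

# An outlier far away from the query point.
far_outlier = (100.0, -500.0)

pred_clean = knn_predict(train, 2.6)
pred_outlier = knn_predict(train + [far_outlier], 2.6)

# The far outlier is never among the 3 nearest points, so it is ignored.
print(pred_clean == pred_outlier)
```

Note the limit of this sketch: an outlier that happens to lie close to the query point would still be selected as a neighbor and would affect the prediction.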

Using Intermediate Layer

Not every method in Analysis Using Intermediate Layer is robust to outliers.

Principal Component Regression Analysis is robust to outliers.

Support Vector Machine also uses an intermediate layer, but the purpose of that layer is not to deal with outliers.
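
For Principal Component Regression, the intermediate layer is the set of principal component scores that sit between the inputs and the regression. The following is a rough sketch of those mechanics only, assuming NumPy is available. The function pcr_fit and the generated data are hypothetical, and the sketch does not by itself demonstrate robustness to outliers.

```python
import numpy as np

def pcr_fit(X, y, n_components):
    """Regress y on the first `n_components` principal component scores of X.

    Returns (coef, intercept) expressed in the original feature space.
    """
    x_mean = X.mean(axis=0)
    y_mean = y.mean()
    Xc = X - x_mean

    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    V = vt[:n_components].T

    # Intermediate layer: scores of each sample on the kept components.
    scores = Xc @ V
    gamma, *_ = np.linalg.lstsq(scores, y - y_mean, rcond=None)

    coef = V @ gamma
    intercept = y_mean - x_mean @ coef
    return coef, intercept

# Hypothetical data: two nearly collinear features driven by one factor t.
rng = np.random.default_rng(0)
t = rng.normal(size=50)
X = np.column_stack([t, t + 0.01 * rng.normal(size=50)])
y = 2.0 * t + 0.1 * rng.normal(size=50)

coef, intercept = pcr_fit(X, y, n_components=1)
print(coef, intercept)
```

Keeping only the first component means the small, noisy direction that separates the two features is dropped before the regression, so the two coefficients come out close to each other instead of being driven by that noise.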

Strength and Weakness of Big Data

Statistical Way of Making Hypothesis

Selection of Methods

Prediction by Statistical Model

Outlier and Missing Value

Difference from Hopes
