# Analysis Using Intermediate Layer

In the general type of Multi-Variable Analysis (M-VA), the direct relationship between X and Y is studied with a mathematical model.

Some higher-level M-VA methods place intermediate layers between X and Y.

The process, merits, and demerits of this approach are useful knowledge for Data Literacy, so I introduce them on this page.

## Strong Point

If it is difficult to study the relationship between X and Y with a simple model, intermediate layer models are one of the solutions.

### Get Useful Information by Intermediate Layer

One reason for the difficulty is noise.
In other cases, the reason is that the information is complicated.

An intermediate layer brings out the useful information.

## Process to Make Intermediate Layer

### One Step Method

A single model that includes all of the Xs, the Y, and the intermediate variables (Zs) is built in one step.

Neural Network methods are an example.
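As a minimal sketch of the one step method, the following trains a tiny network with one hidden layer, where the hidden units play the role of the Zs. The synthetic data, the layer size, the learning rate, and all names (`forward`, `W1`, `n_hidden`, and so on) are assumptions for illustration, not part of any specific software.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumption): Y depends on two Xs in a nonlinear way.
X = rng.uniform(-1.0, 1.0, size=(200, 2))
y = np.sin(2.0 * X[:, :1]) + 0.5 * X[:, 1:] ** 2

n_hidden = 8  # number of intermediate variables (Zs)
W1 = rng.normal(0.0, 0.5, size=(2, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(0.0, 0.5, size=(n_hidden, 1))
b2 = np.zeros(1)


def forward(X, W1, b1, W2, b2):
    """Return the intermediate layer Z and the prediction of Y."""
    Z = np.tanh(X @ W1 + b1)
    return Z, Z @ W2 + b2


lr = 0.1
losses = []
for _ in range(2000):
    Z, y_hat = forward(X, W1, b1, W2, b2)
    err = y_hat - y
    losses.append(float(np.mean(err ** 2)))

    # Backpropagation of the mean squared error.
    grad_out = 2.0 * err / len(X)
    grad_W2 = Z.T @ grad_out
    grad_b2 = grad_out.sum(axis=0)
    grad_pre = (grad_out @ W2.T) * (1.0 - Z ** 2)  # derivative of tanh
    grad_W1 = X.T @ grad_pre
    grad_b1 = grad_pre.sum(axis=0)

    # X -> Z and Z -> Y are updated together: this is the "one step".
    W1 = W1 - lr * grad_W1
    b1 = b1 - lr * grad_b1
    W2 = W2 - lr * grad_W2
    b2 = b2 - lr * grad_b2

Z, y_hat = forward(X, W1, b1, W2, b2)
print("first loss:", losses[0], "last loss:", losses[-1])
```

The point of the sketch is that the Zs are never built separately. They are shaped only by how well the whole model predicts Y.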

### Two Steps Method

In the first step, the Zs are made.
In the second step, the relationship between the Zs and Y is studied.

The first step uses unsupervised learning.
The second step uses supervised learning.

Principal Component Regression Analysis (PCR) is one example.
In the first step, Principal Component Analysis (PCA) is used.
In the second step, Multi-Regression Analysis (MRA) is used.
The principal components are used as the Zs.
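The two steps of PCR can be sketched as below with NumPy only. The synthetic data, the number of components, and the variable names are assumptions for illustration. PCA is done here with a singular value decomposition of the centered Xs, which is one common way to compute it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumption): 5 correlated Xs driven by 2 hidden factors.
latent = rng.normal(size=(300, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.05 * rng.normal(size=(300, 5))
y = 3.0 * latent[:, 0] - 2.0 * latent[:, 1] + 0.1 * rng.normal(size=300)

# Step 1 (unsupervised): PCA on the centered Xs. Y is not used here.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
n_components = 2
Z = Xc @ Vt[:n_components].T  # principal component scores = Zs

# Step 2 (supervised): multiple regression of Y on the Zs.
A = np.column_stack([np.ones(len(Z)), Z])  # add an intercept column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
y_hat = A @ coef

r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
print("R^2 of PCR with", n_components, "components:", r2)
```

Because step 1 does not look at Y, the choice of `n_components` decides which information reaches step 2.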

For complicated data, the Self-Organizing Map and the kernel method are useful for making the Zs.

Classification methods produce categorical data, which can also be used as Zs.
Categorical Zs are often used in the form of dummy variables.
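As a sketch of categorical Zs, the following groups the samples with a small k-means routine, converts the group labels into dummy variables, and then regresses Y on them. The data, the group effects, and the `kmeans` helper (with a simple farthest-point start) are assumptions written for this illustration. Any classification method could take its place.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (assumption): three groups of samples, and Y depends on the group.
true_centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
group = rng.integers(0, 3, size=150)
X = true_centers[group] + 0.3 * rng.normal(size=(150, 2))
y = np.array([1.0, 4.0, -2.0])[group] + 0.1 * rng.normal(size=150)


def kmeans(X, k, n_iter=20):
    """Very small k-means; starting centers are picked far from each other."""
    centers = [X[0]]
    for _ in range(k - 1):
        d = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        centers = np.array([X[labels == j].mean(axis=0) for j in range(k)])
    return labels


# Step 1: classification gives a categorical Z.
labels = kmeans(X, 3)

# The categorical Z is expanded into dummy variables (one column per group).
Z = np.eye(3)[labels]

# Step 2: regression on the dummies. All three dummies are kept and no
# intercept is added, so each coefficient is the mean of Y in that group.
coef, *_ = np.linalg.lstsq(Z, y, rcond=None)
y_hat = Z @ coef
r2 = 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)
print("group means of Y:", coef, "R^2:", r2)
```

If an intercept is added, one dummy column should be dropped so that the columns do not become linearly dependent.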

### Make Intermediate Layer with Meta Knowledge

Statistics software helps us make the intermediate layer, but it is not very powerful.

One of the reasons is that it includes only statistical models, so it cannot use background knowledge about the data.

I often make intermediate layers with meta knowledge of the data.
Models from physics, chemistry, and other fields are used to make the Zs.
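A hypothetical example of a Z made from a physical model: suppose the Xs are the voltage and current of a heater and Y is its temperature rise. The relation "power = voltage × current" is background knowledge that a purely statistical model does not have. The data, the coefficients, and the names below are invented for this sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical process data (assumption): temperature rise follows electric power.
voltage = rng.uniform(1.0, 10.0, size=200)
current = rng.uniform(0.1, 2.0, size=200)
temp_rise = 0.8 * voltage * current + 0.2 * rng.normal(size=200)


def r2_linear(features, y):
    """R^2 of an ordinary least-squares fit of y on the given features."""
    A = np.column_stack([np.ones(len(y)), features])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)


# Direct model: Y explained by the raw Xs.
r2_raw = r2_linear(np.column_stack([voltage, current]), temp_rise)

# Model with a Z from meta knowledge: power P = V * I.
power = voltage * current
r2_meta = r2_linear(power, temp_rise)

print("R^2 with raw Xs:", r2_raw)
print("R^2 with Z = power:", r2_meta)
```

In this sketch the single Z fits better than the two raw Xs, and it also has a physical meaning that is easy to explain.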

## Weak Point

Analysis using an intermediate layer also has weak points.

### Loss of Important Information

Whatever is treated as noise is lost when the Zs are made.
So if important information is contained in that noise, we cannot analyze the relationship between the important information and Y.
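This loss can be reproduced with a small PCR sketch. In the synthetic data below (an assumption for illustration), Y depends only on a low-variance X. PCA treats that X as minor, so it is lost if only the first principal component is kept as Z.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic data (assumption): the high-variance X has nothing to do with Y,
# and the low-variance X, which looks like noise, determines Y.
big = 10.0 * rng.normal(size=n)
small = 0.5 * rng.normal(size=n)
X = np.column_stack([big, small])
y = 4.0 * small + 0.1 * rng.normal(size=n)

Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)


def pcr_r2(n_components):
    """R^2 of a regression of y on the first n_components PCA scores."""
    Z = Xc @ Vt[:n_components].T
    A = np.column_stack([np.ones(n), Z])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)


r2_one = pcr_r2(1)  # Z keeps only the high-variance direction
r2_two = pcr_r2(2)  # Z keeps everything
print("R^2 with 1 component:", r2_one)
print("R^2 with 2 components:", r2_two)
```

The first component explains almost all of the variance of the Xs, yet it explains almost none of Y.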

For example, if the useful information lies in the very fact that a value is an outlier or is missing, that information is lost.
This often happens in process analysis for abnormal conditions.
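The following sketch imitates such a case. It assumes a hypothetical sensor that stops reporting during abnormal conditions, so the missing values themselves carry the information. Filling the gaps with the mean removes that information, while an extra flag column keeps it. All data and names are invented for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400

# Hypothetical data (assumption): the sensor drops out in abnormal conditions,
# and Y shifts only in those conditions.
abnormal = rng.random(n) < 0.2
sensor = rng.normal(50.0, 5.0, size=n)
sensor[abnormal] = np.nan
y = np.where(abnormal, 10.0, 0.0) + rng.normal(size=n)


def r2_linear(features, y):
    """R^2 of an ordinary least-squares fit of y on the given features."""
    A = np.column_stack([np.ones(len(y)), features])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return 1.0 - resid @ resid / np.sum((y - y.mean()) ** 2)


# Usual preprocessing: replace missing values with the mean.
filled = np.where(np.isnan(sensor), np.nanmean(sensor), sensor)
r2_filled = r2_linear(filled, y)

# Keep the fact "this value was missing" as its own variable.
was_missing = np.isnan(sensor).astype(float)
r2_flag = r2_linear(np.column_stack([filled, was_missing]), y)

print("R^2 after mean filling only:", r2_filled)
print("R^2 with missing-value flag:", r2_flag)
```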

### Difficult to Consider the Phenomena

It is better if the relationship between X and Y can be explained directly, because a simple story is easy to understand.

When there is an intermediate layer (Z), however, we need to plan actions for "(1) X and Z", "(2) Z and Y", and "(3) the combination of (1) and (2)".
