Dummy Variable

A "dummy variable" is a way to transform category data into quantity data. After the transformation, analysis methods made for quantity data can be used to analyze category data.

Using 1 and 0

Generally, a variable made by transforming category data into the quantity data 1 and 0 is called a "dummy variable".

It is mainly used for X.

Quantification Methods 1 to 4 are methods of Multi-Variable Analysis. These methods transform the category data X into the quantity data 0 and 1.

Strong Points

If a value is multiplied by 1, it stays the same value. If a value is multiplied by 0, it becomes 0.

This property is used in a formula to express whether a variable is used or not.

It is also convenient in programming, because computers are machines that work with 0 and 1.
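The switching role of a 0/1 value can be sketched as follows. The formula y = a + b*x + c*d, the function name predict, and the coefficient values are hypothetical and chosen only for illustration; d plays the role of the dummy variable.

```python
def predict(x, d, a=1.0, b=2.0, c=5.0):
    """Hypothetical linear formula y = a + b*x + c*d with a 0/1 dummy d."""
    # When d == 1 the term c*d equals c; when d == 0 the term vanishes.
    return a + b * x + c * d

y_on = predict(3.0, 1)   # the c term is used
y_off = predict(3.0, 0)  # the c term is switched off
print(y_on, y_off)
```

The difference between the two results is exactly c, which is how a dummy variable turns a term on or off.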

Process of Transformation

Below is an example of this transformation for a column that contains 4 kinds of data.

Make 4 columns and give each column one of the 4 names.
If the column name and the data value are the same, put 1 in the cell.
If the column name and the data value are not the same, put 0 in the cell.
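The steps above can be sketched in plain Python. The function name to_dummy_columns and the sample values "A" to "D" are assumptions made for illustration, not part of the original example.

```python
def to_dummy_columns(values):
    """Turn a list of category names into one 0/1 column per distinct name."""
    names = sorted(set(values))  # one column name per kind of data
    # Put 1 where the data value equals the column name, otherwise 0.
    return {name: [1 if v == name else 0 for v in values] for name in names}

data = ["A", "B", "C", "D", "B", "A"]  # made-up column with 4 kinds of data
columns = to_dummy_columns(data)
for name, column in columns.items():
    print(name, column)
```

Each row has exactly one 1 across the 4 columns, because every data value matches exactly one column name.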

Weak Points

The dummy variable method also has weak points.

One is the multicollinearity problem: in every row the dummy columns add up to 1, so the columns are not independent of each other. A simple way to avoid this problem is to leave one column out of the analysis.

Another is that if there are many kinds of names, the analysis becomes difficult, because we need to prepare as many columns as there are names.

Binary Number Transformation was made to cover these weak points, but it has a weak point of its own.
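A minimal sketch of why one column can be left out, using made-up data: in every row the dummy columns sum to 1, so any one column can be rebuilt from the others. The variable names here are illustrative only.

```python
data = ["A", "B", "C", "D", "B", "A"]  # made-up example column
names = sorted(set(data))

# Full set of dummy columns, stored row by row.
rows = [[1 if v == name else 0 for name in names] for v in data]

# Every row sums to 1, so the columns are linearly dependent.
row_sums = [sum(row) for row in rows]

# Drop the last column; it can be rebuilt as 1 minus the sum of the rest.
reduced = [row[:-1] for row in rows]
rebuilt_last = [1 - sum(row) for row in reduced]
print(row_sums, rebuilt_last)
```

Since the dropped column is fully determined by the remaining ones, no information is lost by removing it.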

Using 1 and -1

The transformation into 1 and -1 is used for Y in many cases.

Strong Points

For example, suppose Y is category data with the values "A" and "B", and it is transformed into "1" and "-1".

If the calculated Y is larger than 0, it means that the calculated Y is "A". If the calculated Y is smaller than 0, it means that the calculated Y is "B".

Using 0 as the boundary for the decision is also convenient in programming.
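The decision rule can be sketched as follows. The names ENCODE and decode and the calculated Y values are hypothetical; the text does not say what to do when the calculated Y is exactly 0, so this sketch returns None in that case as an assumption.

```python
ENCODE = {"A": 1, "B": -1}  # category data to 1 and -1

def decode(y_calculated):
    """Map a calculated Y back to a category using 0 as the boundary."""
    if y_calculated > 0:
        return "A"
    if y_calculated < 0:
        return "B"
    return None  # exactly 0: not defined in the text (assumption)

labels = ["A", "B", "B", "A"]
y = [ENCODE[label] for label in labels]
predicted = [decode(v) for v in [0.8, -0.3, -1.2, 0.1]]  # made-up calculated Y
print(y, predicted)
```

Only the sign of the calculated Y matters for the decision, so no extra threshold has to be chosen.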

Software

An example in R is on the page "Variable conversion by R".
