
Independence Test

The independence test is a method for judging whether the row items and the column items of a contingency table are correlated.

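If there is no correlation, every cell should be close to the value expected from its row and column totals. When those totals are all equal, this means all values should be equal. The independence test uses this idea.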
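To make the "no correlation" baseline concrete, here is a minimal sketch in R that computes the counts expected under independence as (row total * column total) / grand total. The table `observed` is hypothetical and is not taken from this page. It is chosen so that the cells are far from equal and yet match the expected counts exactly, because the second row is simply twice the first.

```r
# Hypothetical 2*3 table of observed counts (illustration only)
observed <- matrix(c(10, 20, 30,
                     20, 40, 60),
                   nrow = 2, byrow = TRUE)

# Expected count of each cell when rows and columns are independent:
# (row total * column total) / grand total
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

print(observed)
print(expected)
```

When all row totals are equal and all column totals are equal, the expected counts are the same in every cell, which is the "all values equal" case.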

Index for Test

Several indexes are available. They are calculated as follows.
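As a sketch of one such calculation, assume the index is Pearson's chi-square statistic, which is the one reported by R's chisq.test. The code below computes it by hand from a hypothetical table and compares it with the built-in function. The table values and the variable names are assumptions made for illustration.

```r
# Hypothetical 2*3 table of observed counts (illustration only)
observed <- matrix(c(12, 18, 30,
                     28, 22, 10),
                   nrow = 2, byrow = TRUE)

# Expected counts under independence
expected <- outer(rowSums(observed), colSums(observed)) / sum(observed)

# Pearson's chi-square statistic: sum of (observed - expected)^2 / expected
chi_sq <- sum((observed - expected)^2 / expected)

# Degrees of freedom: (rows - 1) * (columns - 1)
df <- (nrow(observed) - 1) * (ncol(observed) - 1)

# Upper-tail probability of the chi-square distribution
p_value <- pchisq(chi_sq, df = df, lower.tail = FALSE)

# The built-in test should agree with the hand calculation
result <- chisq.test(observed)
print(c(by_hand = chi_sq, built_in = unname(result$statistic)))
print(c(by_hand = p_value, built_in = result$p.value))
```

Note that for a 2*2 table chisq.test applies a continuity correction by default, so the hand calculation above would match only with correct = FALSE in that case.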

Cross Tabulation

Calculation example of R

The following is an example of using R's chi-square test as an independence test. It can be copied and pasted as is. The example assumes that a folder named "Rtest" on the C drive contains a data file named "Data.csv".

setwd("C:/Rtest") # Set the working directory
Data <- read.table("Data.csv", header=FALSE, sep=",") # Read the contingency table (no header row)
chisq.test(Data) # Chi-square test of independence
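The commands above need a file on disk. As a self-contained sketch, the code below writes a hypothetical contingency table to a temporary CSV file, reads it back in the same way, and then looks at the parts of the result. The file contents are an assumption about what "Data.csv" looks like (counts only, no header row, comma separated). They are not the original data.

```r
# Write a hypothetical 2*3 contingency table to a temporary CSV file
csv_path <- tempfile(fileext = ".csv")
writeLines(c("10,12,11",
             "11,10,12"), csv_path)

# Read it the same way as in the example above
Data <- read.table(csv_path, header = FALSE, sep = ",")

# Run the test and keep the result so its parts can be inspected
result <- chisq.test(Data)

print(result$statistic)  # chi-square statistic
print(result$parameter)  # degrees of freedom
print(result$p.value)    # p-value
print(result$expected)   # expected counts under independence
```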

In the case of a 2*3 contingency table Part 1

Try it with a 2*3 contingency table.

First, the six values in the table on the left do not vary much. In this case, the p-value is almost 1.

In the table on the right, only one piece of data is particularly large. In this case, the p-value is considerably smaller.
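The original tables are not reproduced here, so the following is a sketch with hypothetical values that follow the same pattern: `left` has six similar counts, and `right` is the same table with one cell made much larger.

```r
# Hypothetical table with six similar counts
left <- matrix(c(10, 11, 10,
                 11, 10, 11),
               nrow = 2, byrow = TRUE)

# Same table, but one cell is much larger than the others
right <- matrix(c(10, 11, 10,
                  11, 10, 50),
                nrow = 2, byrow = TRUE)

p_left  <- chisq.test(left)$p.value
p_right <- chisq.test(right)$p.value

print(c(left = p_left, right = p_right))
```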

In the case of a 2*3 contingency table Part 2

Try different types of 2*3 contingency tables.

The left and right tables differ only in that the numbers in the rightmost column are swapped, but the right table has a much smaller p-value. If there is any correlation between the row and column items, the p-value becomes small.
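A sketch of this comparison with hypothetical values: in `left` the second row is exactly twice the first, so the rows and columns are independent. In `right` only the two numbers in the rightmost column are swapped.

```r
# Hypothetical table in which the second row is twice the first row
left <- matrix(c(10, 20, 30,
                 20, 40, 60),
               nrow = 2, byrow = TRUE)

# Same table with the two values in the rightmost column swapped
right <- left
right[, 3] <- rev(left[, 3])

p_left  <- chisq.test(left)$p.value
p_right <- chisq.test(right)$p.value

print(right)
print(c(left = p_left, right = p_right))
```

The values in the two tables are the same set of numbers. Only their arrangement differs, which is what the test responds to.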

In the case of a 9*7 contingency table Part 1

Try increasing the size of the table. The further to the right a table is, the greater the variability of its data.

The greater the variability of the data, the smaller the p-value. The magnitude of the p-value changes with the amount of variation.

If the amount of variability is what matters, the test may be a good way to measure it.

If there are only a few items in the row and column directions and the table is small, the independence test can be used as a way to see the correlation. If the table is large, it is hard to find regularity in the values, so the test is better used as a measure of how much the values vary.
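The 9*7 tables themselves are not available here, so the following sketch generates hypothetical ones. It assumes counts centered on 100 with the same random noise pattern scaled by three different amounts. The seed, the center value, and the scale factors in `spreads` are arbitrary choices for illustration.

```r
set.seed(1)

# One fixed 9*7 noise pattern, reused with different scale factors
noise <- matrix(rnorm(9 * 7), nrow = 9, ncol = 7)

# Larger values mean more variability in the table
spreads <- c(1, 10, 20)

# p-value of the independence test for each amount of variability
p_values <- sapply(spreads, function(s) {
  counts <- round(100 + s * noise)
  chisq.test(counts)$p.value
})
names(p_values) <- paste0("spread_", spreads)

print(p_values)
```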

In the case of a 9*7 contingency table Part 2

The table size is the same 9*7, but this time some cells have extremely high values.

The table on the left does not vary much. In this case the p-value is exactly 1. In the center table, only one row has higher values. Again, the p-value is exactly 1. The table on the right rearranges the numbers of the center table. Its p-value is fairly close to 0.

In Part 1, I described the test as a "scale of variation". However, when the variation is large only because one specific row has high values, the test does not detect it, so this kind of variation cannot be measured with this scale.
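A sketch of these three cases with hypothetical values: `left` is constant, `center` has one row of high values, and `right` holds the same numbers as `center` but with the high values moved to different rows and columns. The specific numbers (10 and 100) and the positions are assumptions for illustration.

```r
# Left: every cell has the same value
left <- matrix(10, nrow = 9, ncol = 7)

# Center: one row (row 5) has much higher values than the rest
center <- left
center[5, ] <- 100

# Right: the same numbers as the center table, but the seven high
# values are placed in different rows, one per column
right <- left
right[cbind(1:7, 1:7)] <- 100

p_left   <- chisq.test(left)$p.value
p_center <- chisq.test(center)$p.value
p_right  <- chisq.test(right)$p.value

print(c(left = p_left, center = p_center, right = p_right))
```

In `center` every row has the same proportions across the columns, so the observed counts equal the expected counts even though one row is ten times larger. That is why the high row does not lower the p-value.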

R-EDA1

Analysis set with a bar graph

In R-EDA1, select "Stratifeid_graph" and then "bar" to run an independence test on the variables selected for drawing the graph. The steps of creating a contingency table from two qualitative variables and performing the test are automatic.
[Figure: Independence Test in R-EDA1]
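For reference, the same two steps can be sketched by hand in plain R. This is not R-EDA1's internal code. The data frame `records` and its columns `line` and `outcome` are hypothetical names made up for this example.

```r
# Hypothetical raw data: one row per observation, two qualitative variables
records <- data.frame(
  line    = rep(c("A", "B"), each = 50),
  outcome = c(rep(c("ok", "ng"), times = c(40, 10)),
              rep(c("ok", "ng"), times = c(25, 25)))
)

# Step 1: cross-tabulate the two variables into a contingency table
tab <- table(records$line, records$outcome)
print(tab)

# Step 2: run the independence test on the contingency table
# (for a 2*2 table, chisq.test applies a continuity correction by default)
test_result <- chisq.test(tab)
print(test_result$p.value)
```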

Analyze the contingency table

If you want to analyze data that is already in the form of a contingency table, the function is in a different location from the one above.
[Figure: Independence Test in R-EDA1]



