Data Analysis by R

Generalized linear mixed model by R

This is an R example of a Generalized Linear Mixed Model.

Generalized linear model

Multiple regression analysis and logistic regression analysis, which are usually introduced on their own, are both special cases of the generalized linear model (and therefore of the generalized linear mixed model, with no random effects).

All of the examples below call glm in the same way; only the family and link arguments differ. Some of the links shown are the defaults for their family and could be omitted, but they are written out here for clarity.
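As a quick check of the point about default links, the sketch below fits the same Gaussian model with the link written out, with the link omitted, and with lm. The data is simulated purely for illustration; the names d, x1, x2 and y are not from Data.csv.

```r
# Simulated data for illustration only
set.seed(1)
d <- data.frame(x1 = rnorm(40), x2 = rnorm(40))
d$y <- 1 + 2 * d$x1 - d$x2 + rnorm(40)

fit_explicit <- glm(y ~ ., data = d, family = gaussian(link = "identity"))
fit_default  <- glm(y ~ ., data = d, family = gaussian)  # link omitted
fit_lm       <- lm(y ~ ., data = d)                      # ordinary multiple regression

# Default links of the three families used on this page
gaussian()$link
poisson()$link
binomial()$link

# The coefficients should agree in all three fits
all.equal(coef(fit_explicit), coef(fit_default))
all.equal(coef(fit_explicit), coef(fit_lm))
```

The identity, log and logit links are the defaults for gaussian, poisson and binomial respectively, so omitting them does not change the fit.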

If you want to calculate predicted values at arbitrary positions, create a separate file called Data2.csv and enter those positions there. The variable names in Data2.csv must match those in Data.csv. The Y column may be present or absent in Data2.csv; it is not used for prediction.
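One way to prepare Data2.csv is to generate a regular grid of positions in R instead of typing them by hand. This is only a sketch: it assumes the explanatory variables are named X01 and X02 (the names used in the plotting code on this page), and the ranges 0 to 10 are hypothetical values to be replaced with the ranges in your own data.

```r
# Hypothetical grid of positions; column names must match those in Data.csv
grid <- expand.grid(
  X01 = seq(0, 10, length.out = 21),
  X02 = seq(0, 10, length.out = 21)
)

# Write the grid to the working directory without row names
write.csv(grid, "Data2.csv", row.names = FALSE)

nrow(grid)  # 21 x 21 = 441 positions
```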

Multiple regression analysis

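setwd("C:/Rtest")                             # folder that contains Data.csv
Data1 <- read.csv("Data.csv", header=TRUE)
# Fit with all explanatory variables, then let step() select variables by AIC
# (step() is in the stats package, so MASS does not need to be loaded)
gm <- step(glm(Y~., data=Data1, family=gaussian(link="identity")))
summary(gm)

# Prediction procedure from here
library(ggplot2)
Data2 <- read.csv("Data2.csv", header=TRUE)
s2 <- predict(gm, Data2)                      # predicted Y for each row of Data2
Data2s2 <- cbind(Data2, s2)
# Plot the positions colored by predicted value (assumes columns X01 and X02)
ggplot(Data2s2, aes(x=X01, y=X02)) + geom_point(aes(colour=s2)) + scale_color_viridis_c(option = "D")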

Poisson regression analysis

Poisson regression analysis is often used when the objective variable is a count (frequency data).

setwd("C:/Rtest")
Data1 <- read.csv("Data.csv", header=TRUE)
# Poisson family with log link; step() selects variables by AIC
gm <- step(glm(Y~., data=Data1, family=poisson(link="log")))
summary(gm)

# Prediction procedure from here
library(ggplot2)
Data2 <- read.csv("Data2.csv", header=TRUE)
s2 <- predict(gm, Data2, type="response")     # predicted counts (not log counts)
Data2s2 <- cbind(Data2, s2)
ggplot(Data2s2, aes(x=X01, y=X02)) + geom_point(aes(colour=s2)) + scale_color_viridis_c(option = "D")
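The role of type="response" in the Poisson case can be seen in a small self-contained sketch. The data here is simulated for illustration (d, x, y and new are hypothetical names, not Data.csv). Without type="response", predict returns values on the log scale of the link.

```r
# Simulated count data for illustration only
set.seed(2)
d <- data.frame(x = runif(100, 0, 2))
d$y <- rpois(100, lambda = exp(0.3 + 0.8 * d$x))

fit <- glm(y ~ x, data = d, family = poisson(link = "log"))
new <- data.frame(x = c(0.5, 1.5))

eta <- predict(fit, new)                      # default type = "link": log of the expected count
mu  <- predict(fit, new, type = "response")   # expected count

# Back-transforming the link-scale prediction gives the response-scale one
all.equal(exp(eta), mu)
```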

Log-linear analysis

Log-linear analysis is a type of Poisson regression analysis, but it generally also includes terms of degree 2 and above (interaction terms). An example is given on the Log-Linear Analysis by R page.
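As a minimal sketch of what a term of degree 2 means here, the code below uses R's built-in UCBAdmissions table, collapsed to a 2 x 2 table of counts, rather than Data.csv. It compares a Poisson model with main effects only against one that adds the Admit:Gender interaction. The choice of dataset is an assumption made for illustration.

```r
# Collapse the built-in UCBAdmissions table to Admit x Gender counts
tab <- as.data.frame(margin.table(UCBAdmissions, c(1, 2)))

# Main effects only: Admit and Gender are treated as independent
fit_main <- glm(Freq ~ Admit + Gender, data = tab, family = poisson(link = "log"))

# Adding the degree-2 term Admit:Gender (saturated for a 2 x 2 table)
fit_int <- glm(Freq ~ Admit * Gender, data = tab, family = poisson(link = "log"))

# Likelihood-ratio test of the interaction term
anova(fit_main, fit_int, test = "Chisq")
```

A large drop in deviance when the interaction is added suggests that the two factors are associated.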

Logistic regression analysis

In logistic regression analysis, Y should be coded as the two numbers 0 and 1. If Y is read in as text such as "OK" and "NG", glm stops with an error even though there are only two kinds of value. In that case, convert the values to 0 and 1 before fitting. (glm also accepts a two-level factor, in which case the first level is treated as 0.)
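The conversion could look like the following sketch. The small data frame is hypothetical and stands in for Data.csv; which label becomes 1 is a choice you make, and here "NG" is coded as 1.

```r
# Hypothetical data with a text response, standing in for Data.csv
d <- data.frame(
  Y   = c("OK", "NG", "OK", "OK", "NG", "NG", "OK", "NG"),
  X01 = c(1.2, 3.4, 0.8, 2.9, 3.9, 2.1, 1.5, 4.2)
)

# Recode the labels: 1 = "NG", 0 = "OK"
d$Y <- ifelse(d$Y == "NG", 1, 0)
table(d$Y)

# The recoded column can now be used as the response
fit <- glm(Y ~ X01, data = d, family = binomial(link = "logit"))
coef(fit)
```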

setwd("C:/Rtest")
Data1 <- read.csv("Data.csv", header=TRUE)
# Binomial family with logit link; Y must already be coded as 0 and 1
gm <- step(glm(Y~., data=Data1, family=binomial(link="logit")))
summary(gm)

# Prediction procedure from here
library(ggplot2)
Data2 <- read.csv("Data2.csv", header=TRUE)
s2 <- predict(gm, Data2, type="response")     # predicted probability that Y is 1
Data2s2 <- cbind(Data2, s2)
ggplot(Data2s2, aes(x=X01, y=X02)) + geom_point(aes(colour=s2)) + scale_color_viridis_c(option = "D")

If you specify type="response", the prediction is a value between 0 and 1. It can be interpreted as the probability that Y is "1" for that sample, or equivalently as the expected value of Y.
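To see where the values between 0 and 1 come from, the sketch below compares the two prediction scales on simulated data (d, x, y and new are hypothetical names). With the logit link, the default prediction is the log-odds, and applying the inverse logit, plogis in R, gives the probability.

```r
# Simulated 0/1 data for illustration only
set.seed(3)
d <- data.frame(x = rnorm(200))
d$y <- rbinom(200, size = 1, prob = plogis(-0.5 + 1.5 * d$x))

fit <- glm(y ~ x, data = d, family = binomial(link = "logit"))
new <- data.frame(x = c(-2, 0, 2))

eta <- predict(fit, new)                      # log-odds (can be any real number)
p   <- predict(fit, new, type = "response")   # probability that y is 1

# The inverse logit of the log-odds equals the response-scale prediction
all.equal(plogis(eta), p)
range(p)
```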

Linear mixed model

This is an example of a linear mixed model.

Install and load the package "lme4" in advance.

The code below can be copied and pasted as is. It assumes that a folder named "Rtest" on the C drive contains a data file named "Data.csv".

The data is assumed to have the objective variable in the first column, named "Y1", the explanatory variable in the second column, named "X1", and the category (group) in the third column, named "C1".

library(lme4)
setwd("C:/Rtest")
Data <- read.csv("Data.csv", header=TRUE)
# (1 + X1|C1) already contains the random intercept, so a separate (1|C1) term is not added
fit <- lmer(Y1 ~ X1 + (1 + X1|C1), data=Data) # Linear mixed model (random effects on intercept and slope)
summary(fit)

You can also put a random effect on only the intercept or only the slope.

fit <- lmer(Y1 ~ X1 + (1|C1), data=Data) # Linear mixed model (random effect on intercept only)

fit <- lmer(Y1 ~ X1 + (0 + X1|C1), data=Data) # Linear mixed model (random effect on slope only)
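The following sketch shows how the per-group regression lines can be read off a fitted model. It assumes lme4 is installed and uses simulated data with the same column names (Y1, X1, C1) in place of Data.csv. The group count, effect sizes and noise level are arbitrary values chosen for illustration.

```r
library(lme4)  # assumes the lme4 package is installed

# Simulated data: 8 groups, each with its own intercept and slope
set.seed(4)
n_group <- 8
n_each  <- 15
Data <- data.frame(
  C1 = factor(rep(paste0("G", 1:n_group), each = n_each)),
  X1 = runif(n_group * n_each, 0, 10)
)
b0 <- rnorm(n_group, 0, 2)     # group deviations of the intercept
b1 <- rnorm(n_group, 0, 0.5)   # group deviations of the slope
g  <- as.integer(Data$C1)      # group index of each row
Data$Y1 <- (5 + b0[g]) + (1 + b1[g]) * Data$X1 + rnorm(nrow(Data))

# Random effects on both intercept and slope
fit <- lmer(Y1 ~ X1 + (1 + X1 | C1), data = Data)

fixef(fit)      # overall (fixed-effect) intercept and slope
coef(fit)$C1    # intercept and slope for each group (fixed + random part)
```

Each row of coef(fit)$C1 defines one straight line, which is the information needed to draw a separate regression line per group.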


The graph on this page with a different regression line for each group was drawn in Excel, but R can draw multiple straight lines at once. See the ggplot2 page for how to do this.