Data Analysis by R

High-dimensional regression analysis using intervals by R

This page shows examples of high-dimensional regression analysis using intervals in R.

High-dimensional regression analysis using intervals


A simpler way to write the code is to use a generalized linear mixture model, as on the Generalized Linear Mixture Model with R page, or an interaction model. With that approach, however, the original x enters the model as a term of its own, which makes the coefficients harder to interpret.
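To illustrate the point about the interaction model, here is a minimal sketch on synthetic data. The data, the seed, and the names `toy`, `interval`, and `fit_int` are assumptions made for illustration and are not part of the original example. The formula `Y ~ X * interval` expands to `X + interval + X:interval`, so the original `X` appears as a term of its own, and the interaction coefficients are differences from the slope in the first interval rather than slopes in their own right.

```r
# Synthetic data, assumed for illustration only
set.seed(1)
X <- runif(90, 0, 30)
Y <- ifelse(X < 10, 2 * X, ifelse(X < 20, 20 - X, 3 * X - 50)) + rnorm(90)
toy <- data.frame(X = X, Y = Y)

# Three equal-width intervals of X, as produced by cut(..., breaks = 3)
toy$interval <- cut(toy$X, breaks = 3, include.lowest = TRUE)

# X * interval expands to X + interval + X:interval
fit_int <- lm(Y ~ X * interval, data = toy)

# The term labels show that the original X is a term in the model
term_names <- attr(terms(fit_int), "term.labels")
print(term_names)
print(coef(fit_int))
```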

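library(MASS)
library(dummies)

setwd("C:/Rtest")

# Read the data; the code below assumes columns named X and Y
Data <- read.csv("Data.csv", header = TRUE)

# DataY keeps only the response Y
DataY <- Data
DataY$X <- NULL

# Data1 and DataX keep only the explanatory variable X
Data1 <- Data
Data1$Y <- NULL
DataX <- Data1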

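# Divide the quantitative variable in column 1 into three equal-width intervals
Data1[,1] <- droplevels(cut(Data1[,1], breaks = 3, include.lowest = TRUE))

# Convert the interval factor into 0/1 dummy variables, one per interval
Data2 <- dummy.data.frame(Data1)

# Multiply each dummy by the original X to get one slope term per interval
Data3 <- Data2 * DataX[,1]
colnames(Data3) <- paste0("X:", colnames(Data3))

# Combine Y, the interval dummies, and the slope terms
Data4 <- cbind(DataY, Data2, Data3)

# Fit a linear model and select variables stepwise by AIC
gm <- step(glm(Y~., data=Data4, family= gaussian(link = "identity")))

summary(gm)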
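The same construction can also be sketched with base R only, which may help if the dummies package is not installed. This is a sketch under assumptions, not the code used on this page: the data are synthetic, `model.matrix()` stands in for `dummy.data.frame()`, and the column names `I1`..`I3` and `X_I1`..`X_I3` are made up for readability.

```r
# Synthetic data standing in for Data.csv (assumed for illustration)
set.seed(1)
toy <- data.frame(X = runif(90, 0, 30))
toy$Y <- ifelse(toy$X < 10, 2 * toy$X, 25 - toy$X) + rnorm(90)

# Three equal-width intervals of X
interval <- cut(toy$X, breaks = 3, include.lowest = TRUE)

# One 0/1 column per interval (the "- 1" drops the intercept column)
dummies <- as.data.frame(model.matrix(~ interval - 1))
colnames(dummies) <- paste0("I", seq_len(ncol(dummies)))

# Dummy times X: one slope column per interval
slopes <- dummies * toy$X
colnames(slopes) <- paste0("X_", colnames(dummies))

design <- cbind(Y = toy$Y, dummies, slopes)

# "- 1" removes the global intercept, so each interval gets its own
# intercept (I1..I3) and its own slope (X_I1..X_I3)
fit <- lm(Y ~ . - 1, data = design)
print(coef(fit))
```

Dropping the global intercept is a design choice of this sketch: it keeps all three interval dummies estimable, so each coefficient can be read directly as the intercept or slope within one interval.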



library(ggplot2)
s2 <- predict(gm,Data4)

Data4s2 <- cbind(Data4,s2)

ggplot(Data4s2, aes(x=Y, y=s2)) + geom_point() + labs(x="Y",y="predicted Y")


High-dimensional regression analysis using clusters

This example uses data from the Cluster Higher-Dimensionalization Regression page.


Here, the k-means method is used for cluster analysis. For other methods, you can refer to the Cluster Analysis with R page.

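library(MASS)
library(dummies)

setwd("C:/Rtest")

Data <- read.csv("Data.csv", header = TRUE)

# Keep only the explanatory variables
Data10 <- Data
Data10$Y <- NULL

# Min-max normalize each explanatory variable to the range 0 to 1
Data11 <- Data10
for (i in 1:ncol(Data10)) {
  Data11[,i] <- (Data10[,i] - min(Data10[,i]))/(max(Data10[,i]) - min(Data10[,i]))
}

# Cluster the normalized data into three groups with k-means
# (the starting centers are random, so the clusters can differ between runs)
km <- kmeans(Data11,3)
cluster <- km$cluster

# Convert the cluster numbers into 0/1 dummy variables
cluster <- as.character(cluster)
cluster <- as.data.frame(cluster)
cluster <- dummy.data.frame(cluster)

# Add the cluster dummies to the original data
Data4 <- cbind(Data,cluster)

# Fit a linear model with all main effects and two-way interactions,
# then select variables stepwise by AIC
gm <- step(glm(Y~.^2, data=Data4, family= gaussian(link = "identity")))

summary(gm)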
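The clustering step can be tried out on synthetic data as in the sketch below. The data, the seed, `nstart = 10`, and the helper name `min_max` are assumptions added for illustration. Because `kmeans()` starts from random centers, fixing the seed makes the cluster labels reproducible between runs; `model.matrix()` is used here as a base R stand-in for `dummy.data.frame()`.

```r
# Synthetic explanatory variables on very different scales (assumed)
set.seed(1)
toyX <- data.frame(X1 = c(rnorm(30, 0), rnorm(30, 5), rnorm(30, 10)),
                   X2 = c(rnorm(30, 0), rnorm(30, 500), rnorm(30, 0)))

# Hypothetical helper: rescale a numeric vector to the range [0, 1]
min_max <- function(v) (v - min(v)) / (max(v) - min(v))
scaled <- as.data.frame(lapply(toyX, min_max))

# k-means with 3 clusters; nstart = 10 tries several random starts
km <- kmeans(scaled, centers = 3, nstart = 10)

# Turn the cluster labels into 0/1 dummy columns with base R
cluster <- factor(km$cluster)
cluster_dummies <- as.data.frame(model.matrix(~ cluster - 1))

print(table(km$cluster))
print(head(cluster_dummies))
```

Without the normalization, `X2` would dominate the distance calculation simply because its values are larger, which is why each variable is rescaled before clustering.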

With this code, the model contains the original explanatory variables, which makes interpreting the results a bit cumbersome.



library(ggplot2)
s2 <- predict(gm,Data4)

Data4s2 <- cbind(Data4,s2)

ggplot(Data4s2, aes(x=Y, y=s2)) + geom_point() + labs(x="Y",y="predicted Y")


Since the points lie almost on a straight line, the predicted values agree closely with the observed Y, so the model fits the data well.
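The visual impression from the plot can be backed by numbers. The following sketch, on synthetic data assumed for illustration, computes the root mean squared error and the squared correlation between observed and predicted values. Both are computed on the data used for fitting, so they describe how well the model fits that data.

```r
# Synthetic data and model, assumed for illustration only
set.seed(1)
toy <- data.frame(X = runif(100, 0, 30))
toy$Y <- 2 * toy$X + rnorm(100)
fit <- glm(Y ~ X, data = toy, family = gaussian(link = "identity"))

# Predicted values for the data used for fitting
pred <- predict(fit, toy)

# Root mean squared error and squared correlation of observed vs. predicted
rmse <- sqrt(mean((toy$Y - pred)^2))
r2 <- cor(toy$Y, pred)^2
print(c(RMSE = rmse, R2 = r2))
```

With the objects from this page, `toy$Y` and `pred` would correspond to `Data4$Y` and `s2`.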