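1 Combining variables with c, cbind, and rbind
Earlier we created four variables: Wingcrd, Tarsus, Head, and Wt.
Each holds 8 values, and c can concatenate them into a single vector: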
> BirdData <- c(Wingcrd,Tarsus,Head,Wt)
> BirdData
[1] 59.0 55.0 53.5 55.0 52.5 57.5 53.0 55.0 22.3 19.7 20.8 20.3 20.8 21.5
[15] 20.6 21.5 31.2 30.4 30.6 30.3 30.3 30.8 32.5 NA 9.5 13.8 14.8 15.2
[29] 15.5 15.6 15.6 15.7
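BirdData is a single vector of length 32. The markers [1], [15], and [29] can be ignored: they show the index of the first element on each output line, and may differ between computers. Because this is just one vector, R does not record which variable each value belongs to. To keep track of that, we can build an identifier vector in any of these ways: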
Id <- c(1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4)
Id <- rep(c(1,2,3,4),each = 8)
Id <- rep(1:4,each = 8)
These three expressions produce the same values (the third stores them as integers rather than doubles):
> Id <- rep(1:4,each = 8)
> Id
[1] 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4
>
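The identifier becomes useful once it is paired with BirdData. The following is a minimal sketch, using the data values printed above, of how logical indexing on Id recovers one of the original variables; the name TarsusAgain is only illustrative.

```r
# Data values copied from the output shown above
Wingcrd <- c(59, 55, 53.5, 55, 52.5, 57.5, 53, 55)
Tarsus  <- c(22.3, 19.7, 20.8, 20.3, 20.8, 21.5, 20.6, 21.5)
Head    <- c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, NA)
Wt      <- c(9.5, 13.8, 14.8, 15.2, 15.5, 15.6, 15.6, 15.7)

BirdData <- c(Wingcrd, Tarsus, Head, Wt)
Id <- rep(1:4, each = 8)

# Keep only the values whose identifier equals 2, i.e. the Tarsus block
TarsusAgain <- BirdData[Id == 2]
identical(TarsusAgain, Tarsus)   # TRUE
```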
a <- seq(from =1 ,to = 4,by = 1)
rep(a,each = 8)
Result:
> a <- seq(from =1 ,to = 4,by = 1)
> rep(a,each = 8)
[1] 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 3 3 3 3 3 3 3 3 4 4 4 4 4 4 4 4
>
VarNames <- c("Wingcrd","Tarsus","Head","Wt")
Id2 <- rep(VarNames,each = 8)
> VarNames <- c("Wingcrd","Tarsus","Head","Wt")
> Id2 <- rep(VarNames,each = 8)
> Id2
[1] "Wingcrd" "Wingcrd" "Wingcrd" "Wingcrd" "Wingcrd" "Wingcrd" "Wingcrd"
[8] "Wingcrd" "Tarsus" "Tarsus" "Tarsus" "Tarsus" "Tarsus" "Tarsus"
[15] "Tarsus" "Tarsus" "Head" "Head" "Head" "Head" "Head"
[22] "Head" "Head" "Head" "Wt" "Wt" "Wt" "Wt"
[29] "Wt" "Wt" "Wt" "Wt"
>
Without each, rep(VarNames,8) repeats the whole vector 8 times instead:
> rep(VarNames,8)
[1] "Wingcrd" "Tarsus" "Head" "Wt" "Wingcrd" "Tarsus" "Head"
[8] "Wt" "Wingcrd" "Tarsus" "Head" "Wt" "Wingcrd" "Tarsus"
[15] "Head" "Wt" "Wingcrd" "Tarsus" "Head" "Wt" "Wingcrd"
[22] "Tarsus" "Head" "Wt" "Wingcrd" "Tarsus" "Head" "Wt"
[29] "Wingcrd" "Tarsus" "Head" "Wt"
>
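The difference between the two forms of rep is easier to see on a shorter example. This sketch uses 2 repetitions instead of 8; the names blocks and cycles are only illustrative.

```r
VarNames <- c("Wingcrd", "Tarsus", "Head", "Wt")

# each: every element is repeated in place before moving to the next one
blocks <- rep(VarNames, each = 2)
blocks
# "Wingcrd" "Wingcrd" "Tarsus" "Tarsus" "Head" "Head" "Wt" "Wt"

# times: the whole vector is repeated from start to end
cycles <- rep(VarNames, times = 2)
cycles
# "Wingcrd" "Tarsus" "Head" "Wt" "Wingcrd" "Tarsus" "Head" "Wt"
```

An unnamed second argument, as in rep(VarNames,8), is treated as times.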
The cbind function combines the variables as the columns of a matrix:
>
> Z <- cbind(Wingcrd,Tarsus,Head,Wt)
> Z
Wingcrd Tarsus Head Wt
[1,] 59.0 22.3 31.2 9.5
[2,] 55.0 19.7 30.4 13.8
[3,] 53.5 20.8 30.6 14.8
[4,] 55.0 20.3 30.3 15.2
[5,] 52.5 20.8 30.3 15.5
[6,] 57.5 21.5 30.8 15.6
[7,] 53.0 20.6 32.5 15.6
[8,] 55.0 21.5 NA 15.7
>
Accessing the first column of Z:
> Z[,1]
[1] 59.0 55.0 53.5 55.0 52.5 57.5 53.0 55.0
> Z[1:8,1]
[1] 59.0 55.0 53.5 55.0 52.5 57.5 53.0 55.0
Accessing the first row of Z:
> Z[1,]
Wingcrd Tarsus Head Wt
59.0 22.3 31.2 9.5
> Z[1,1:4]
Wingcrd Tarsus Head Wt
59.0 22.3 31.2 9.5
>
Other valid ways to index Z:
Z[1,]
Z[1,1:4]
Z[1,1]
Z[,2:3]
X <- Z[4,4]
Y <- Z[,4]
W <- Z[,3]          # the third column
D <- Z[,c(1,3,4)]   # all rows of columns 1, 3, and 4
E <- Z[,c(-1,-3)]   # the minus signs exclude columns 1 and 3
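Because cbind keeps the variable names as column names, columns can also be selected by name. The sketch below additionally shows the drop = FALSE argument, which keeps a single selected column as a one-column matrix instead of a plain vector. The names col1 and col1m are only illustrative.

```r
# Data values copied from the output shown above
Wingcrd <- c(59, 55, 53.5, 55, 52.5, 57.5, 53, 55)
Tarsus  <- c(22.3, 19.7, 20.8, 20.3, 20.8, 21.5, 20.6, 21.5)
Head    <- c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, NA)
Wt      <- c(9.5, 13.8, 14.8, 15.2, 15.5, 15.6, 15.6, 15.7)
Z <- cbind(Wingcrd, Tarsus, Head, Wt)

Z[, "Head"]               # one column, selected by name
Z[, c("Wingcrd", "Wt")]   # several columns, selected by name

col1  <- Z[, 1]                 # a plain vector: the matrix structure is dropped
col1m <- Z[, 1, drop = FALSE]   # still a matrix, with 8 rows and 1 column
dim(col1m)                      # 8 1
```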
Show the dimensions of Z:
dim(Z)
> dim(Z)
[1] 8 4
>
Get only the number of rows of Z:
> Nrows <- dim(Z)[1]
> Nrows
[1] 8
>
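As an alternative to indexing the result of dim, base R also provides nrow and ncol. A short sketch with the same matrix:

```r
# Data values copied from the output shown above
Wingcrd <- c(59, 55, 53.5, 55, 52.5, 57.5, 53, 55)
Tarsus  <- c(22.3, 19.7, 20.8, 20.3, 20.8, 21.5, 20.6, 21.5)
Head    <- c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, NA)
Wt      <- c(9.5, 13.8, 14.8, 15.2, 15.5, 15.6, 15.6, 15.7)
Z <- cbind(Wingcrd, Tarsus, Head, Wt)

Nrows <- nrow(Z)   # same value as dim(Z)[1]
Ncols <- ncol(Z)   # same value as dim(Z)[2]
Nrows              # 8
Ncols              # 4
```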
rbind works like cbind, except that it combines the variables as rows rather than as columns:
> Z2 <- rbind(Wingcrd,Tarsus,Head,Wt)
> Z2
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8]
Wingcrd 59.0 55.0 53.5 55.0 52.5 57.5 53.0 55.0
Tarsus 22.3 19.7 20.8 20.3 20.8 21.5 20.6 21.5
Head 31.2 30.4 30.6 30.3 30.3 30.8 32.5 NA
Wt 9.5 13.8 14.8 15.2 15.5 15.6 15.6 15.7
>
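One way to check the relationship between the two functions is to transpose the cbind result with t and compare it with the rbind result. This is a small sketch with the same data, not something the original notes run.

```r
# Data values copied from the output shown above
Wingcrd <- c(59, 55, 53.5, 55, 52.5, 57.5, 53, 55)
Tarsus  <- c(22.3, 19.7, 20.8, 20.3, 20.8, 21.5, 20.6, 21.5)
Head    <- c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, NA)
Wt      <- c(9.5, 13.8, 14.8, 15.2, 15.5, 15.6, 15.6, 15.7)

Z  <- cbind(Wingcrd, Tarsus, Head, Wt)   # 8 rows, 4 columns
Z2 <- rbind(Wingcrd, Tarsus, Head, Wt)   # 4 rows, 8 columns

dim(Z2)                 # 4 8
same <- all.equal(Z2, t(Z))   # compare the rbind result with the transpose
same
```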
Storing data with vector
> W <- vector(length = 8)
> W[1] <- 59
> W[2] <- 55
> W[3] <- 53.5
> W[4] <- 55
> W[5] <- 52.5
> W[6] <- 57.5
> W[7] <- 53
> W[8] <- 55
> W
[1] 59.0 55.0 53.5 55.0 52.5 57.5 53.0 55.0
>
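Notes:
Typing W immediately after W <- vector(length = 8) prints eight FALSE values, because vector creates a logical vector by default.
W can be inspected at any point, but until every element has been assigned it still contains placeholder values (they show as 0 once the first number is stored and the vector becomes numeric).
Elements can be viewed with W[1], W[1:4], W[2:6], W[-2], W[c(1,2,3)], and so on.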
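The behaviour of an unfilled vector can be seen on a shorter example. This sketch uses length 3; the names cls_before, W2, and W3 are only illustrative, and numeric() and NA_real_ are shown as possible alternatives rather than what the notes use.

```r
W <- vector(length = 3)
W                        # FALSE FALSE FALSE
cls_before <- class(W)   # "logical": the default mode of vector()

W[1] <- 59               # storing a number converts the whole vector to numeric
W                        # 59 0 0: the remaining FALSE values became 0

# Alternatives that make the intended type explicit
W2 <- numeric(3)         # 0 0 0
W3 <- rep(NA_real_, 3)   # NA NA NA, so unfilled slots are clearly missing
```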
Combining data with a matrix
> Dmat <- matrix(nrow = 8,ncol = 4)
> Dmat
[,1] [,2] [,3] [,4]
[1,] NA NA NA NA
[2,] NA NA NA NA
[3,] NA NA NA NA
[4,] NA NA NA NA
[5,] NA NA NA NA
[6,] NA NA NA NA
[7,] NA NA NA NA
[8,] NA NA NA NA
>
Filling in the matrix column by column:
> Dmat[,1] <- c(59,55,53.5,55,52.5,57.5,53,55)
> Dmat[,2] <- c(22.3,19.7,20.8,20.3,20.8,21.5,20.6,21.5)
> Dmat[,3] <- c(31.2,30.4,30.6,30.3,30.3,30.8,32.5,NA)
> Dmat[,4] <- c(9.5,13.8,14.8,15.2,15.5,15.6,15.6,15.7)
> Dmat
[,1] [,2] [,3] [,4]
[1,] 59.0 22.3 31.2 9.5
[2,] 55.0 19.7 30.4 13.8
[3,] 53.5 20.8 30.6 14.8
[4,] 55.0 20.3 30.3 15.2
[5,] 52.5 20.8 30.3 15.5
[6,] 57.5 21.5 30.8 15.6
[7,] 53.0 20.6 32.5 15.6
[8,] 55.0 21.5 NA 15.7
>
Use the colnames function to name the columns of the matrix Dmat:
> colnames(Dmat) <- c("Wingcrd","Tarsus","Head","Wt")
> Dmat
Wingcrd Tarsus Head Wt
[1,] 59.0 22.3 31.2 9.5
[2,] 55.0 19.7 30.4 13.8
[3,] 53.5 20.8 30.6 14.8
[4,] 55.0 20.3 30.3 15.2
[5,] 52.5 20.8 30.3 15.5
[6,] 57.5 21.5 30.8 15.6
[7,] 53.0 20.6 32.5 15.6
[8,] 55.0 21.5 NA 15.7
>
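If the data are already stored as separate variables, the matrix can be built directly (cbind already returns a matrix, so as.matrix is not strictly needed here):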
> Dmat2 <- as.matrix(cbind(Wingcrd,Tarsus,Head,Wt))
> Dmat2
Wingcrd Tarsus Head Wt
[1,] 59.0 22.3 31.2 9.5
[2,] 55.0 19.7 30.4 13.8
[3,] 53.5 20.8 30.6 14.8
[4,] 55.0 20.3 30.3 15.2
[5,] 52.5 20.8 30.3 15.5
[6,] 57.5 21.5 30.8 15.6
[7,] 53.0 20.6 32.5 15.6
[8,] 55.0 21.5 NA 15.7
>
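The create, fill, and name steps can also be collapsed into a single call to matrix. The sketch below is one possible way to do this, relying on the fact that matrix fills column by column by default; the name Dmat3 is only illustrative.

```r
# Data values copied from the output shown above
Wingcrd <- c(59, 55, 53.5, 55, 52.5, 57.5, 53, 55)
Tarsus  <- c(22.3, 19.7, 20.8, 20.3, 20.8, 21.5, 20.6, 21.5)
Head    <- c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, NA)
Wt      <- c(9.5, 13.8, 14.8, 15.2, 15.5, 15.6, 15.6, 15.7)

# The 32 values are laid out column-wise: the first 8 go to column 1, and so on
Dmat3 <- matrix(c(Wingcrd, Tarsus, Head, Wt), nrow = 8, ncol = 4,
                dimnames = list(NULL, c("Wingcrd", "Tarsus", "Head", "Wt")))

Dmat3[1, ]   # first row: 59.0 22.3 31.2 9.5
```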
Combining data with data.frame
A data frame combines variables of equal length; each row holds the different observations made on the same sample.
For example:
> Dfrm <- data.frame(WC = Wingcrd,TS = Tarsus,HD=Head,W=Wt)
> Dfrm
WC TS HD W
1 59.0 22.3 31.2 9.5
2 55.0 19.7 30.4 13.8
3 53.5 20.8 30.6 14.8
4 55.0 20.3 30.3 15.2
5 52.5 20.8 30.3 15.5
6 57.5 21.5 30.8 15.6
7 53.0 20.6 32.5 15.6
8 55.0 21.5 NA 15.7
>
An advantage of data frames: the data can be transformed without altering the original variables. For example:
> Dfrm2 <- data.frame(WC = Wingcrd,TS = Tarsus,HD=Head,Wsq=sqrt(Wt))
> Dfrm2
WC TS HD Wsq
1 59.0 22.3 31.2 3.082207
2 55.0 19.7 30.4 3.714835
3 53.5 20.8 30.6 3.847077
4 55.0 20.3 30.3 3.898718
5 52.5 20.8 30.3 3.937004
6 57.5 21.5 30.8 3.949684
7 53.0 20.6 32.5 3.949684
8 55.0 21.5 NA 3.962323
Wt and the column W in Dfrm are separate objects. To verify, remove Wt and check that Dfrm$W is still available:
> rm(Wt)
> Wt
Error: object 'Wt' not found
> Dfrm$W
[1] 9.5 13.8 14.8 15.2 15.5 15.6 15.6 15.7
>
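The independence of the data frame from the original vectors works in both directions. The sketch below, which recreates Wt for the purpose of the example, adds a transformed column to an existing data frame and then modifies the original vector to show that the stored copy is unaffected. The column name Wsq follows Dfrm2 above.

```r
# Data values copied from the output shown above
Wingcrd <- c(59, 55, 53.5, 55, 52.5, 57.5, 53, 55)
Tarsus  <- c(22.3, 19.7, 20.8, 20.3, 20.8, 21.5, 20.6, 21.5)
Head    <- c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, NA)
Wt      <- c(9.5, 13.8, 14.8, 15.2, 15.5, 15.6, 15.6, 15.7)

Dfrm <- data.frame(WC = Wingcrd, TS = Tarsus, HD = Head, W = Wt)

# Add a transformed column to the data frame; Wt itself is untouched
Dfrm$Wsq <- sqrt(Dfrm$W)

# Change the original vector; the copy inside the data frame does not follow
Wt[1] <- 0
Dfrm$W[1]   # 9.5
```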
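Typical use of a data frame:
After entering the data into R, make whatever changes are needed (remove extreme values, apply transformations, add categorical variables, and so on), then store the data in a data frame for later analysis.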
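A rough sketch of that workflow with the bird data follows. The thresholds, the column names logW and Size, and the choice to treat the lowest weight as an extreme value are all assumptions made purely for illustration; they are not taken from the original analysis.

```r
# Data values copied from the output shown above
Wingcrd <- c(59, 55, 53.5, 55, 52.5, 57.5, 53, 55)
Tarsus  <- c(22.3, 19.7, 20.8, 20.3, 20.8, 21.5, 20.6, 21.5)
Head    <- c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, NA)
Wt      <- c(9.5, 13.8, 14.8, 15.2, 15.5, 15.6, 15.6, 15.7)

Clean <- data.frame(WC = Wingcrd, TS = Tarsus, HD = Head, W = Wt)

# Apply a transformation (hypothetical choice: natural log of weight)
Clean$logW <- log(Clean$W)

# Add a categorical variable (hypothetical cut-off of 55 for wing length)
Clean$Size <- factor(ifelse(Clean$WC >= 55, "large", "small"))

# Remove an extreme value (hypothetical rule: weights of 10 or less)
Clean <- Clean[Clean$W > 10, ]

nrow(Clean)   # 7: the first bird (weight 9.5) was dropped
```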
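Combining data with list
Each element of a list can be a vector, a single value, a matrix, and so on, and the elements do not need to have the same length or dimensions.
For example: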
> X1 <- c(1,2,3)
> X2 <- c("a","b","c","d")
> X3 <- 3
> X4 <- matrix(nrow = 2 , ncol = 2)
> X4[,1] <- c(1,2)
> X4[,2] <- c(3,4)
> Y <- list(LX1=X1,LX2=X2,LX3=X3,LX4=X4)
> Y
$LX1
[1] 1 2 3
$LX2
[1] "a" "b" "c" "d"
$LX3
[1] 3
$LX4
[,1] [,2]
[1,] 1 3
[2,] 2 4
>
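Once the list exists, its elements can be reached by name or by position. The sketch below rebuilds the same list (filling X4 in one matrix call for brevity) and shows the common access forms; the name sub is only illustrative.

```r
X1 <- c(1, 2, 3)
X2 <- c("a", "b", "c", "d")
X3 <- 3
X4 <- matrix(c(1, 2, 3, 4), nrow = 2, ncol = 2)   # filled column by column
Y  <- list(LX1 = X1, LX2 = X2, LX3 = X3, LX4 = X4)

Y$LX2            # element by name
Y[["LX2"]]       # same element, using double brackets
Y[[4]][1, 2]     # row 1, column 2 of the matrix stored in position 4: 3

sub <- Y["LX1"]  # single brackets return a list of length one, not the vector
is.list(sub)     # TRUE
```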
Why lists matter:
The results of linear regression, generalized linear models, t-tests, and similar functions are generally stored in lists.
> M <- lm(WC ~ W,data = Dfrm)
> M
Call:
lm(formula = WC ~ W, data = Dfrm)
Coefficients:
(Intercept) W
65.5315 -0.7239
The detailed results of the analysis are stored in these components:
> names(M)
[1] "coefficients" "residuals" "effects" "rank"
[5] "fitted.values" "assign" "qr" "df.residual"
[9] "xlevels" "call" "terms" "model"
>
Individual values can be accessed as follows:
> M$coefficients
(Intercept) W
65.5315140 -0.7238731
> M$residuals
1 2 3 4 5 6 7
0.3452800 -0.5420659 -1.3181928 0.4713564 -1.8114817 3.2609056 -1.2390944
8
0.8332929
> M$effects
(Intercept) W
-155.7402686 4.0250694 -1.2416235 0.5887546 -1.6634618
3.4191327 -1.0808673 1.0017273
>
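Besides the $ operator, base R offers accessor functions such as coef and residuals that read the same components. The sketch below refits the model from the data values shown earlier and compares the two routes; the name slope is only illustrative.

```r
# Data values copied from the output shown above
Wingcrd <- c(59, 55, 53.5, 55, 52.5, 57.5, 53, 55)
Tarsus  <- c(22.3, 19.7, 20.8, 20.3, 20.8, 21.5, 20.6, 21.5)
Head    <- c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, NA)
Wt      <- c(9.5, 13.8, 14.8, 15.2, 15.5, 15.6, 15.6, 15.7)
Dfrm <- data.frame(WC = Wingcrd, TS = Tarsus, HD = Head, W = Wt)

M <- lm(WC ~ W, data = Dfrm)

is.list(M)                  # TRUE: the fitted model is stored as a list
coef(M)                     # same values as M$coefficients
residuals(M)                # same values as M$residuals

slope <- M$coefficients["W"]   # pick a single coefficient by name
slope
```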
Putting it all together:
> AllData <- list(BirdData = BirdData,Id = Id2,Z = Z,VarNames = VarNames)
> AllData
$BirdData
[1] 59.0 55.0 53.5 55.0 52.5 57.5 53.0 55.0 22.3 19.7 20.8 20.3 20.8 21.5
[15] 20.6 21.5 31.2 30.4 30.6 30.3 30.3 30.8 32.5 NA 9.5 13.8 14.8 15.2
[29] 15.5 15.6 15.6 15.7
$Id
[1] "Wingcrd" "Wingcrd" "Wingcrd" "Wingcrd" "Wingcrd" "Wingcrd" "Wingcrd"
[8] "Wingcrd" "Tarsus" "Tarsus" "Tarsus" "Tarsus" "Tarsus" "Tarsus"
[15] "Tarsus" "Tarsus" "Head" "Head" "Head" "Head" "Head"
[22] "Head" "Head" "Head" "Wt" "Wt" "Wt" "Wt"
[29] "Wt" "Wt" "Wt" "Wt"
$Z
Wingcrd Tarsus Head Wt
[1,] 59.0 22.3 31.2 9.5
[2,] 55.0 19.7 30.4 13.8
[3,] 53.5 20.8 30.6 14.8
[4,] 55.0 20.3 30.3 15.2
[5,] 52.5 20.8 30.3 15.5
[6,] 57.5 21.5 30.8 15.6
[7,] 53.0 20.6 32.5 15.6
[8,] 55.0 21.5 NA 15.7
$VarNames
[1] "Wingcrd" "Tarsus" "Head" "Wt"
Extracting individual elements:
> AllData$BirdData
[1] 59.0 55.0 53.5 55.0 52.5 57.5 53.0 55.0 22.3 19.7 20.8 20.3 20.8 21.5
[15] 20.6 21.5 31.2 30.4 30.6 30.3 30.3 30.8 32.5 NA 9.5 13.8 14.8 15.2
[29] 15.5 15.6 15.6 15.7
> AllData$Id
[1] "Wingcrd" "Wingcrd" "Wingcrd" "Wingcrd" "Wingcrd" "Wingcrd" "Wingcrd"
[8] "Wingcrd" "Tarsus" "Tarsus" "Tarsus" "Tarsus" "Tarsus" "Tarsus"
[15] "Tarsus" "Tarsus" "Head" "Head" "Head" "Head" "Head"
[22] "Head" "Head" "Head" "Wt" "Wt" "Wt" "Wt"
[29] "Wt" "Wt" "Wt" "Wt"
> AllData$Z
Wingcrd Tarsus Head Wt
[1,] 59.0 22.3 31.2 9.5
[2,] 55.0 19.7 30.4 13.8
[3,] 53.5 20.8 30.6 14.8
[4,] 55.0 20.3 30.3 15.2
[5,] 52.5 20.8 30.3 15.5
[6,] 57.5 21.5 30.8 15.6
[7,] 53.0 20.6 32.5 15.6
[8,] 55.0 21.5 NA 15.7
>
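For a list this large, printing every element is verbose. The sketch below rebuilds AllData from the values shown above and uses names, length, and str for a compact overview, then chains $ with matrix indexing to reach a single value.

```r
# Data values copied from the output shown above
Wingcrd <- c(59, 55, 53.5, 55, 52.5, 57.5, 53, 55)
Tarsus  <- c(22.3, 19.7, 20.8, 20.3, 20.8, 21.5, 20.6, 21.5)
Head    <- c(31.2, 30.4, 30.6, 30.3, 30.3, 30.8, 32.5, NA)
Wt      <- c(9.5, 13.8, 14.8, 15.2, 15.5, 15.6, 15.6, 15.7)

BirdData <- c(Wingcrd, Tarsus, Head, Wt)
VarNames <- c("Wingcrd", "Tarsus", "Head", "Wt")
Id2 <- rep(VarNames, each = 8)
Z <- cbind(Wingcrd, Tarsus, Head, Wt)

AllData <- list(BirdData = BirdData, Id = Id2, Z = Z, VarNames = VarNames)

names(AllData)        # "BirdData" "Id" "Z" "VarNames"
length(AllData)       # 4
str(AllData)          # one summary line per element

AllData$Z[1, "Wt"]    # weight of the first bird: 9.5
```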
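Note: inside list(), elements must be named with =, not <-. Writing <- does not raise an error, but it creates a variable in the workspace and leaves the list element unnamed.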
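The effect of the two operators inside list() can be checked directly. In this sketch the names good, bad, a, and b are only illustrative.

```r
# = names the elements of the list
good <- list(a = 1, b = 2)
names(good)    # "a" "b"
good$a         # 1

# <- assigns variables a and b in the workspace and passes their values unnamed
bad <- list(a <- 1, b <- 2)
names(bad)     # NULL
bad$a          # NULL: there is no element called a
a              # 1: the side effect of using <-
```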