R Code Notes
1. Loading a CSV file
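# stringsAsFactors = TRUE reads text columns as factors (the default is FALSE since R 4.0)
df = read.csv(file = "/Users/shlee/Dropbox/R/acs.csv", header = TRUE, stringsAsFactors = TRUE)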
2. Inspecting the data
head(): show the first rows of the data
str(): show the structure of the data
summary(): summarize each column
head(df)
age | sex | cardiogenicShock | entry | Dx | EF | height | weight | BMI | obesity | TC | LDLC | HDLC | TG | DM | HBP | smoking |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
62 | Male | No | Femoral | STEMI | 18.0 | 168 | 72 | 25.51020 | Yes | 215 | 154 | 35 | 155 | Yes | No | Smoker |
78 | Female | No | Femoral | STEMI | 18.4 | 148 | 48 | 21.91381 | No | NA | NA | NA | 166 | No | Yes | Never |
76 | Female | Yes | Femoral | STEMI | 20.0 | NA | NA | NA | No | NA | NA | NA | NA | No | Yes | Never |
89 | Female | No | Femoral | STEMI | 21.8 | 165 | 50 | 18.36547 | No | 121 | 73 | 20 | 89 | No | No | Never |
56 | Male | No | Radial | NSTEMI | 21.8 | 162 | 64 | 24.38653 | No | 195 | 151 | 36 | 63 | Yes | Yes | Smoker |
73 | Female | No | Radial | Unstable Angina | 22.0 | 153 | 59 | 25.20398 | Yes | 184 | 112 | 38 | 137 | Yes | Yes | Never |
str(df)
'data.frame': 857 obs. of 17 variables:
$ age : int 62 78 76 89 56 73 58 62 59 71 ...
$ sex : Factor w/ 2 levels "Female","Male": 2 1 1 1 2 1 2 2 1 2 ...
$ cardiogenicShock: Factor w/ 2 levels "No","Yes": 1 1 2 1 1 1 1 1 1 1 ...
$ entry : Factor w/ 2 levels "Femoral","Radial": 1 1 1 1 2 2 2 1 2 1 ...
$ Dx : Factor w/ 3 levels "NSTEMI","STEMI",..: 2 2 2 2 1 3 3 2 3 2 ...
$ EF : num 18 18.4 20 21.8 21.8 22 24.7 26.6 28.5 31.1 ...
$ height : num 168 148 NA 165 162 153 167 160 152 168 ...
$ weight : num 72 48 NA 50 64 59 78 50 67 60 ...
$ BMI : num 25.5 21.9 NA 18.4 24.4 ...
$ obesity : Factor w/ 2 levels "No","Yes": 2 1 1 1 1 2 2 1 2 1 ...
$ TC : num 215 NA NA 121 195 184 161 136 239 169 ...
$ LDLC : int 154 NA NA 73 151 112 91 88 161 88 ...
$ HDLC : int 35 NA NA 20 36 38 34 33 34 54 ...
$ TG : int 155 166 NA 89 63 137 196 30 118 141 ...
$ DM : Factor w/ 2 levels "No","Yes": 2 1 1 1 2 2 2 2 2 2 ...
$ HBP : Factor w/ 2 levels "No","Yes": 1 2 2 1 2 2 2 2 2 1 ...
$ smoking : Factor w/ 3 levels "Ex-smoker","Never",..: 3 2 2 2 3 2 1 1 2 3 ...
summary(df)
age sex cardiogenicShock entry
Min. :28.00 Female:287 No :805 Femoral:312
1st Qu.:55.00 Male :570 Yes: 52 Radial :545
Median :64.00
Mean :63.31
3rd Qu.:72.00
Max. :91.00
Dx EF height weight
NSTEMI :153 Min. :18.00 Min. :130.0 Min. : 30.00
STEMI :304 1st Qu.:50.45 1st Qu.:158.0 1st Qu.: 58.00
Unstable Angina:400 Median :58.10 Median :165.0 Median : 65.00
Mean :55.83 Mean :163.2 Mean : 64.84
3rd Qu.:62.35 3rd Qu.:170.0 3rd Qu.: 72.00
Max. :79.00 Max. :185.0 Max. :112.00
NA's :134 NA's :93 NA's :91
BMI obesity TC LDLC HDLC
Min. :15.62 No :567 Min. : 25.0 Min. : 15.0 Min. : 4.00
1st Qu.:22.13 Yes:290 1st Qu.:154.0 1st Qu.: 88.0 1st Qu.:32.00
Median :24.16 Median :183.0 Median :114.0 Median :38.00
Mean :24.28 Mean :185.2 Mean :116.6 Mean :38.24
3rd Qu.:26.17 3rd Qu.:213.0 3rd Qu.:141.0 3rd Qu.:45.00
Max. :41.42 Max. :493.0 Max. :366.0 Max. :89.00
NA's :93 NA's :23 NA's :24 NA's :23
TG DM HBP smoking
Min. : 11.0 No :553 No :356 Ex-smoker:204
1st Qu.: 68.0 Yes:304 Yes:501 Never :332
Median :105.5 Smoker :321
Mean :125.2
3rd Qu.:154.0
Max. :877.0
NA's :15
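The NA's rows in the summary() output can also be counted directly. The following is a minimal sketch on a made-up data frame; the name toy and its values are assumptions for illustration and are not taken from acs.csv.

```r
# Toy data frame standing in for df (values are made up for illustration)
toy <- data.frame(
  age    = c(62, 78, 76, 89),
  height = c(168, 148, NA, 165),
  TC     = c(215, NA, NA, 121)
)

# Number of missing values in each column
na_counts <- colSums(is.na(toy))
print(na_counts)

# Number of rows that contain no missing value at all
n_complete <- sum(complete.cases(toy))
print(n_complete)
```

Running the same two calls on df would show which variables drive the loss of rows before any are removed.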
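3. Cleaning the data
1. Removing missing values
na.omit(): remove rows that contain missing values
str(df) shows that the data set holds 857 patients.
After running df = na.omit(df) and calling str(df) again, every patient with at least one missing value has been dropped and 677 patients remain.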
df = na.omit(df)
str(df)
'data.frame': 677 obs. of 17 variables:
$ age : int 62 89 56 73 58 62 59 71 52 52 ...
$ sex : Factor w/ 2 levels "Female","Male": 2 1 2 1 2 2 1 2 2 1 ...
$ cardiogenicShock: Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
$ entry : Factor w/ 2 levels "Femoral","Radial": 1 1 2 2 2 1 2 1 2 2 ...
$ Dx : Factor w/ 3 levels "NSTEMI","STEMI",..: 2 2 1 3 3 2 3 2 3 3 ...
$ EF : num 18 21.8 21.8 22 24.7 26.6 28.5 31.1 31.1 31.1 ...
$ height : num 168 165 162 153 167 160 152 168 175 156 ...
$ weight : num 72 50 64 59 78 50 67 60 60 63 ...
$ BMI : num 25.5 18.4 24.4 25.2 28 ...
$ obesity : Factor w/ 2 levels "No","Yes": 2 1 1 2 2 1 2 1 1 2 ...
$ TC : num 215 121 195 184 161 136 239 169 272 184 ...
$ LDLC : int 154 73 151 112 91 88 161 88 212 123 ...
$ HDLC : int 35 20 36 38 34 33 34 54 32 43 ...
$ TG : int 155 89 63 137 196 30 118 141 52 72 ...
$ DM : Factor w/ 2 levels "No","Yes": 2 1 2 2 2 2 2 2 2 2 ...
$ HBP : Factor w/ 2 levels "No","Yes": 1 1 2 2 2 2 2 1 1 2 ...
$ smoking : Factor w/ 3 levels "Ex-smoker","Never",..: 3 2 3 2 1 1 2 3 1 2 ...
- attr(*, "na.action")= 'omit' Named int 2 3 16 18 29 72 87 89 102 108 ...
..- attr(*, "names")= chr "2" "3" "16" "18" ...
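The na.action attribute shown at the end of the str() output records which rows na.omit() removed. A minimal sketch on a made-up data frame (toy and its values are assumptions, not the acs.csv data) shows how to read it back:

```r
# Toy data frame with missing values in rows 2, 3 and 5 (made-up values)
toy <- data.frame(
  id = 1:5,
  EF = c(18, NA, 20, 21.8, NA),
  TG = c(155, 166, NA, 89, 63)
)

# Drop every row that has at least one NA
clean <- na.omit(toy)

# Row indices that were removed, taken from the "na.action" attribute
dropped <- as.integer(attr(clean, "na.action"))
print(dropped)
print(nrow(clean))
```

Keeping the list of dropped rows makes it possible to check later whether the excluded patients differ from the retained ones.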
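2. Converting to categorical variables
When data are entered in Excel, categories are often typed as numbers: for example 0 for survival and 1 for death,
or 0 for male and 1 for female.
A CSV file written this way stores those columns as integers (int), so they must be converted to categorical variables.
Otherwise R treats them as numbers and reports meaningless values, such as a sex of 0.5.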
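The problem described above can be reproduced in a few lines. This is a minimal sketch with made-up codes; the vector sex_code and the coding 0 = male, 1 = female are assumptions that mirror the example in the text.

```r
# Sex typed as numbers in a spreadsheet: 0 = male, 1 = female (made-up values)
sex_code <- c(0, 1, 1, 0)

# Treated as a number, the column yields a meaningless average
print(mean(sex_code))

# Converted to a factor, the same column is summarized as counts per category
sex <- factor(sex_code, levels = c(0, 1), labels = c("Male", "Female"))
print(table(sex))
```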
library(pROC)
data(aSAH)
aSAH$gos6 <- as.integer(aSAH$gos6)
Type 'citation("pROC")' for a citation.
Attaching package: ‘pROC’
The following objects are masked from ‘package:stats’:
cov, smooth, var
head(aSAH)
str(aSAH)
summary(aSAH)
row | gos6 | outcome | gender | age | wfns | s100b | ndka |
---|---|---|---|---|---|---|---|
29 | 5 | Good | Female | 42 | 1 | 0.13 | 3.01 |
30 | 5 | Good | Female | 37 | 1 | 0.14 | 8.54 |
31 | 5 | Good | Female | 42 | 1 | 0.10 | 8.09 |
32 | 5 | Good | Female | 27 | 1 | 0.04 | 10.42 |
33 | 1 | Poor | Female | 42 | 3 | 0.13 | 17.40 |
34 | 1 | Poor | Male | 48 | 2 | 0.10 | 12.75 |
'data.frame': 113 obs. of 7 variables:
$ gos6 : int 5 5 5 5 1 1 4 1 5 4 ...
$ outcome: Factor w/ 2 levels "Good","Poor": 1 1 1 1 2 2 1 2 1 1 ...
$ gender : Factor w/ 2 levels "Male","Female": 2 2 2 2 2 1 1 1 2 2 ...
$ age : int 42 37 42 27 42 48 57 41 49 75 ...
$ wfns : Ord.factor w/ 5 levels "1"<"2"<"3"<"4"<..: 1 1 1 1 3 2 5 4 1 2 ...
$ s100b : num 0.13 0.14 0.1 0.04 0.13 0.1 0.47 0.16 0.18 0.1 ...
$ ndka : num 3.01 8.54 8.09 10.42 17.4 ...
gos6 outcome gender age wfns s100b
Min. :1.000 Good:72 Male :42 Min. :18.0 1:39 Min. :0.030
1st Qu.:3.000 Poor:41 Female:71 1st Qu.:42.0 2:32 1st Qu.:0.090
Median :5.000 Median :51.0 3: 4 Median :0.140
Mean :3.726 Mean :51.1 4:16 Mean :0.247
3rd Qu.:5.000 3rd Qu.:61.0 5:22 3rd Qu.:0.330
Max. :5.000 Max. :81.0 Max. :2.070
ndka
Min. : 3.01
1st Qu.: 9.01
Median : 12.22
Mean : 19.66
3rd Qu.: 17.30
Max. :419.19
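The aSAH data set is loaded from the pROC package.
gos6 was originally stored as a categorical variable, but it was deliberately converted to an integer for this example.
str(aSAH) shows gos6 as int (integer).
summary(aSAH) therefore reports the mean, minimum, maximum and so on of gos6.
Now convert gos6 back to a categorical variable.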
aSAH$gos6 <- factor(aSAH$gos6, levels=c(1:5), labels=c("Good", "Moderate", "Severe", "Vegetative", "Death"))
levels=c(1:5) sets the order of the gos6 levels to 1, 2, 3, 4, 5.
labels=c("Good", "Moderate", "Severe", "Vegetative", "Death") then assigns a label to each level in that order.
The order can also be reversed with levels=c(5:1).
In that case the labels must be reversed as well, labels=c("Death", "Vegetative", "Severe", "Moderate", "Good"),
so that each value still receives the correct name.
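The point about reversing levels and labels together can be checked on a short vector. This is a minimal sketch; codes is a made-up vector, and the labels are the ones used in the text.

```r
# A few integer codes (made-up) and the labels from the text, in 1..5 order
codes <- c(5L, 5L, 1L, 4L, 1L)
labs  <- c("Good", "Moderate", "Severe", "Vegetative", "Death")

# Ascending levels with labels in the same order
f_up <- factor(codes, levels = 1:5, labels = labs)

# Descending levels need the labels reversed too
f_down <- factor(codes, levels = 5:1, labels = rev(labs))

# Each observation gets the same label either way...
same_labels <- identical(as.character(f_up), as.character(f_down))
print(same_labels)

# ...only the order of the levels differs
print(levels(f_up))
print(levels(f_down))
```

The level order matters later because it controls the row order in tables and the axis order in plots.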
head(aSAH)
str(aSAH)
summary(aSAH)
row | gos6 | outcome | gender | age | wfns | s100b | ndka |
---|---|---|---|---|---|---|---|
29 | Death | Good | Female | 42 | 1 | 0.13 | 3.01 |
30 | Death | Good | Female | 37 | 1 | 0.14 | 8.54 |
31 | Death | Good | Female | 42 | 1 | 0.10 | 8.09 |
32 | Death | Good | Female | 27 | 1 | 0.04 | 10.42 |
33 | Good | Poor | Female | 42 | 3 | 0.13 | 17.40 |
34 | Good | Poor | Male | 48 | 2 | 0.10 | 12.75 |
'data.frame': 113 obs. of 7 variables:
$ gos6 : Factor w/ 5 levels "Good","Moderate",..: 5 5 5 5 1 1 4 1 5 4 ...
$ outcome: Factor w/ 2 levels "Good","Poor": 1 1 1 1 2 2 1 2 1 1 ...
$ gender : Factor w/ 2 levels "Male","Female": 2 2 2 2 2 1 1 1 2 2 ...
$ age : int 42 37 42 27 42 48 57 41 49 75 ...
$ wfns : Ord.factor w/ 5 levels "1"<"2"<"3"<"4"<..: 1 1 1 1 3 2 5 4 1 2 ...
$ s100b : num 0.13 0.14 0.1 0.04 0.13 0.1 0.47 0.16 0.18 0.1 ...
$ ndka : num 3.01 8.54 8.09 10.42 17.4 ...
gos6 outcome gender age wfns s100b
Good :28 Good:72 Male :42 Min. :18.0 1:39 Min. :0.030
Moderate : 0 Poor:41 Female:71 1st Qu.:42.0 2:32 1st Qu.:0.090
Severe :13 Median :51.0 3: 4 Median :0.140
Vegetative: 6 Mean :51.1 4:16 Mean :0.247
Death :66 3rd Qu.:61.0 5:22 3rd Qu.:0.330
Max. :81.0 Max. :2.070
ndka
Min. : 3.01
1st Qu.: 9.01
Median : 12.22
Mean : 19.66
3rd Qu.: 17.30
Max. :419.19
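head(aSAH): 5 has been converted to Death and 1 to Good.
str(aSAH): the type of gos6 is now Factor, with levels in the order Good, Moderate, ...
summary(aSAH): the number of patients in each category is shown.
Note that this label order only illustrates the mechanics of factor(). On the Glasgow Outcome Scale 1 means death and 5 means good recovery, which is why rows labeled Death here have outcome Good; reverse the labels for a clinically correct coding.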
4. Comparing the means of two groups
0. Preliminary tests
1. Normality test
output = lm(age ~ cardiogenicShock, data=df)
shapiro.test(resid(output))
Shapiro-Wilk normality test
data: resid(output)
W = 0.99083, p-value = 0.0003219
2. Equal-variance test
var.test(age ~ cardiogenicShock, data=df)
F test to compare two variances
data: age by cardiogenicShock
F = 1.6109, num df = 645, denom df = 30, p-value = 0.11
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
0.8939231 2.5597701
sample estimates:
ratio of variances
1.610921
1. t-test
t.test(age ~ cardiogenicShock, data=df, var.equal=T)
Two Sample t-test
data: age by cardiogenicShock
t = 0.22149, df = 675, p-value = 0.8248
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-3.799331 4.765475
sample estimates:
mean in group No mean in group Yes
63.09598 62.61290
2. Welch's test
t.test(age ~ cardiogenicShock, data=df, var.equal=F)
Welch Two Sample t-test
data: age by cardiogenicShock
t = 0.27492, df = 34.808, p-value = 0.785
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-3.084807 4.050951
sample estimates:
mean in group No mean in group Yes
63.09598 62.61290
3. Wilcoxon rank-sum test
wilcox.test(age ~ cardiogenicShock, data=df)
Wilcoxon rank sum test with continuity correction
data: age by cardiogenicShock
W = 10361, p-value = 0.7438
alternative hypothesis: true location shift is not equal to 0
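The preliminary tests and the three tests above are often combined into a simple decision rule: use the Wilcoxon rank-sum test when normality is rejected, otherwise a t-test with or without the equal-variance assumption. The source does not state this rule explicitly, so the following is only a sketch of one common convention. The function choose_two_group_test and the 0.05 threshold are assumptions, and the built-in mtcars data stand in for df.

```r
# Hypothetical helper: picks a two-group test from the assumption checks.
# y = numeric outcome, g = grouping variable with exactly two groups.
choose_two_group_test <- function(y, g, alpha = 0.05) {
  g <- factor(g)

  # Normality of the residuals (Shapiro-Wilk), as in the section above
  normal_ok <- shapiro.test(resid(lm(y ~ g)))$p.value >= alpha
  if (!normal_ok) {
    # Normality rejected: fall back to the rank-based test
    return(wilcox.test(y ~ g))
  }

  # Equal variances (F test) decide between Student's and Welch's t-test
  equal_var <- var.test(y ~ g)$p.value >= alpha
  t.test(y ~ g, var.equal = equal_var)
}

# Example on built-in data: fuel economy by transmission type
res <- choose_two_group_test(mtcars$mpg, mtcars$am)
print(res$method)
```

Choosing a test from preliminary tests is a convenience rather than a requirement; reporting Welch's test by default is another widely used option.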
4. Boxplot
library(ggplot2)
# Helper that sets the plot size in a Jupyter (IRkernel) notebook
fig <- function(width, height){
  options(repr.plot.width = width, repr.plot.height = height)
}
fig(4, 4)
ggplot(df) +
  aes(x = cardiogenicShock, y = age) +
  geom_boxplot(fill = "#4682B4") +
  labs(x = "Cardiogenic Shock", y = "Age") +
  theme_minimal()
5. Comparing the means of three or more groups
0. Preliminary tests
1. Normality test
out = aov(LDLC ~ Dx, data=df)
shapiro.test(resid(out))
Shapiro-Wilk normality test
data: resid(out)
W = 0.96866, p-value = 7.479e-11
2. Equal-variance test
bartlett.test(LDLC ~ Dx, data=df)
Bartlett test of homogeneity of variances
data: LDLC by Dx
Bartlett's K-squared = 4.6984, df = 2, p-value = 0.09545
1. Comparing group means with analysis of variance (ANOVA)
out = aov(LDLC ~ Dx, data=df)
summary(out)
Df Sum Sq Mean Sq F value Pr(>F)
Dx 2 18525 9263 5.649 0.00369 **
Residuals 674 1105097 1640
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Post hoc test (multiple comparison)
TukeyHSD(out)
Tukey multiple comparisons of means
95% family-wise confidence level
Fit: aov(formula = LDLC ~ Dx, data = df)
$Dx
diff lwr upr p adj
STEMI-NSTEMI -11.632006 -21.88366 -1.380357 0.0214679
Unstable Angina-NSTEMI -14.071523 -24.05755 -4.085496 0.0028229
Unstable Angina-STEMI -2.439517 -10.60690 5.727869 0.7626228
2. Comparing groups with Welch's ANOVA
oneway.test(LDLC ~ Dx, data=df, var.equal=F)
One-way analysis of means (not assuming equal variances)
data: LDLC and Dx
F = 4.7883, num df = 2.00, denom df = 334.18, p-value = 0.008907
Post hoc test (Games-Howell)
games.howell <- function(grp, obs) {
#Create combinations
combs <- combn(unique(grp), 2)
# Statistics that will be used throughout the calculations:
# n = sample size of each group
# groups = number of groups in data
# Mean = means of each group sample
# std = variance of each group sample
n <- tapply(obs, grp, length)
groups <- length(tapply(obs, grp, length))
Mean <- tapply(obs, grp, mean)
std <- tapply(obs, grp, var)
statistics <- lapply(1:ncol(combs), function(x) {
mean.diff <- Mean[combs[2,x]] - Mean[combs[1,x]]
#t-values
t <- abs(Mean[combs[1,x]] - Mean[combs[2,x]]) / sqrt((std[combs[1,x]] / n[combs[1,x]]) + (std[combs[2,x]] / n[combs[2,x]]))
# Degrees of Freedom
df <- (std[combs[1,x]] / n[combs[1,x]] + std[combs[2,x]] / n[combs[2,x]])^2 / # Numerator Degrees of Freedom
((std[combs[1,x]] / n[combs[1,x]])^2 / (n[combs[1,x]] - 1) + # Part 1 of Denominator Degrees of Freedom
(std[combs[2,x]] / n[combs[2,x]])^2 / (n[combs[2,x]] - 1)) # Part 2 of Denominator Degrees of Freedom
#p-values
p <- ptukey(t * sqrt(2), groups, df, lower.tail = FALSE)
# Sigma standard error
se <- sqrt(0.5 * (std[combs[1,x]] / n[combs[1,x]] + std[combs[2,x]] / n[combs[2,x]]))
# Upper Confidence Limit
upper.conf <- lapply(1:ncol(combs), function(x) {
mean.diff + qtukey(p = 0.95, nmeans = groups, df = df) * se
})[[1]]
# Lower Confidence Limit
lower.conf <- lapply(1:ncol(combs), function(x) {
mean.diff - qtukey(p = 0.95, nmeans = groups, df = df) * se
})[[1]]
# Group Combinations
grp.comb <- paste(combs[1,x], ':', combs[2,x])
# Collect all statistics into list
stats <- list(grp.comb, mean.diff, se, t, df, p, upper.conf, lower.conf)
})
# Unlist statistics collected earlier
stats.unlisted <- lapply(statistics, function(x) {
unlist(x)
})
# Create dataframe from flattened list
results <- data.frame(matrix(unlist(stats.unlisted), nrow = length(stats.unlisted), byrow=TRUE))
# Select columns set as factors that should be numeric and change with as.numeric
results[c(2, 3:ncol(results))] <- round(as.numeric(as.matrix(results[c(2, 3:ncol(results))])), digits = 3)
# Rename data frame columns
colnames(results) <- c('groups', 'Mean Difference', 'Standard Error', 't', 'df', 'p', 'upper limit', 'lower limit')
return(results)
}
with(df, games.howell(Dx, LDLC))
groups | Mean Difference | Standard Error | t | df | p | upper limit | lower limit |
---|---|---|---|---|---|---|---|
STEMI : NSTEMI | 11.632 | 3.268 | 2.517 | 229.624 | 0.033 | 22.533 | 0.731 |
STEMI : Unstable Angina | -2.440 | 2.379 | 0.725 | 536.857 | 0.749 | 5.466 | -10.345 |
NSTEMI : Unstable Angina | -14.072 | 3.238 | 3.073 | 225.415 | 0.007 | -3.268 | -24.875 |
3. Kruskal-Wallis H test (Kruskal-Wallis rank sum test)
kruskal.test(LDLC ~ Dx, data=df)
Kruskal-Wallis rank sum test
data: LDLC by Dx
Kruskal-Wallis chi-squared = 8.9643, df = 2, p-value = 0.01131
Post hoc test
library(nparcomp)
Loading required package: multcomp
Loading required package: mvtnorm
Loading required package: survival
Loading required package: TH.data
Loading required package: MASS
Attaching package: ‘TH.data’
The following object is masked from ‘package:MASS’:
geyser
result = mctp(LDLC ~ Dx, data=df)
summary(result)
#----------------Nonparametric Multiple Comparisons for relative effects---------------#
- Alternative Hypothesis: True differences of relative effects are not equal to 0
- Estimation Method: Global Pseudo Ranks
- Type of Contrast : Tukey
- Confidence Level: 95 %
- Method = Fisher with 278 DF
#--------------------------------------------------------------------------------------#
#----------------Nonparametric Multiple Comparisons for relative effects---------------#
- Alternative Hypothesis: True differences of relative effects are not equal to 0
- Estimation Method: Global Pseudo ranks
- Type of Contrast : Tukey
- Confidence Level: 95 %
- Method = Fisher with 278 DF
#--------------------------------------------------------------------------------------#
#----Data Info-------------------------------------------------------------------------#
Sample Size Effect Lower Upper
1 NSTEMI 131 0.5500590 0.5204616 0.5793063
2 STEMI 251 0.4905976 0.4648875 0.5163576
3 Unstable Angina 295 0.4593433 0.4345728 0.4843164
#----Contrast--------------------------------------------------------------------------#
1 2 3
2 - 1 -1 1 0
3 - 1 -1 0 1
3 - 2 0 -1 1
#----Analysis--------------------------------------------------------------------------#
Estimator Lower Upper Statistic p.Value
2 - 1 -0.059 -0.130 0.011 -1.974 0.1196648
3 - 1 -0.091 -0.159 -0.022 -3.087 0.0061829
3 - 2 -0.031 -0.090 0.028 -1.247 0.4252763
#----Overall---------------------------------------------------------------------------#
Quantile p.Value
1 2.353006 0.0061829
#--------------------------------------------------------------------------------------#
4. Boxplot
# Helper that sets the plot size in a Jupyter (IRkernel) notebook
fig <- function(width, height){
  options(repr.plot.width = width, repr.plot.height = height)
}
fig(4, 4)
ggplot(df) +
  aes(x = Dx, y = LDLC) +
  geom_boxplot(fill = "#F2B091") +
  theme_minimal()
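6. Comparing proportions between groups
If 20% or more of the cells have an expected count (expected value) of 5 or less, use Fisher's exact test.
For an M x 2 table where M is 3 or more and the row categories are ordered, use the Cochran-Armitage trend test.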
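The expected-count rule above can be checked in code before choosing a test. This is a minimal sketch that rebuilds the Dx by cardiogenicShock counts reported later in this section; the names tab, share_small and use_fisher are assumptions for illustration.

```r
# Observed counts from the cross table in this section (677 patients)
tab <- matrix(
  c(130,  1,
    221, 30,
    295,  0),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    Dx = c("NSTEMI", "STEMI", "Unstable Angina"),
    cardiogenicShock = c("No", "Yes")
  )
)

# Expected counts under independence
expected <- suppressWarnings(chisq.test(tab)$expected)
print(round(expected, 3))

# Share of cells whose expected count is 5 or less
share_small <- mean(expected <= 5)

# Rule from the text: Fisher's exact test if that share is 20% or more
use_fisher <- share_small >= 0.2
print(use_fisher)
```

Here the smallest expected count is just under 6, so by this rule the chi-square test is acceptable even though two observed cells are very small.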
1. Chi-square test and Fisher's exact test
library(gmodels)
with(df,
CrossTable(Dx, cardiogenicShock, chisq = T, fisher = T, expected = T, sresid = T, format = "SPSS"))
Cell Contents
|-------------------------|
| Count |
| Expected Values |
| Chi-square contribution |
| Row Percent |
| Column Percent |
| Total Percent |
| Std Residual |
|-------------------------|
Total Observations in Table: 677
| cardiogenicShock
Dx | No | Yes | Row Total |
----------------|-----------|-----------|-----------|
NSTEMI | 130 | 1 | 131 |
| 125.001 | 5.999 | |
| 0.200 | 4.165 | |
| 99.237% | 0.763% | 19.350% |
| 20.124% | 3.226% | |
| 19.202% | 0.148% | |
| 0.447 | -2.041 | |
----------------|-----------|-----------|-----------|
STEMI | 221 | 30 | 251 |
| 239.507 | 11.493 | |
| 1.430 | 29.799 | |
| 88.048% | 11.952% | 37.075% |
| 34.211% | 96.774% | |
| 32.644% | 4.431% | |
| -1.196 | 5.459 | |
----------------|-----------|-----------|-----------|
Unstable Angina | 295 | 0 | 295 |
| 281.492 | 13.508 | |
| 0.648 | 13.508 | |
| 100.000% | 0.000% | 43.575% |
| 45.666% | 0.000% | |
| 43.575% | 0.000% | |
| 0.805 | -3.675 | |
----------------|-----------|-----------|-----------|
Column Total | 646 | 31 | 677 |
| 95.421% | 4.579% | |
----------------|-----------|-----------|-----------|
Statistics for All Table Factors
Pearson's Chi-squared test
------------------------------------------------------------
Chi^2 = 49.75095 d.f. = 2 p = 1.572966e-11
Fisher's Exact Test for Count Data
------------------------------------------------------------
Alternative hypothesis: two.sided
p = 1.206607e-12
Minimum expected frequency: 5.998523
2. Cochran-Armitage trend test
library(DescTools)
df$Dx_arr <- factor(df$Dx, levels=c("Unstable Angina", "NSTEMI", "STEMI"))
# Keep the 3 x 2 table for the test; add margins only for display
tab <- with(df,
  table(Dx_arr, cardiogenicShock))
addmargins(tab)
Dx_arr | No | Yes | Sum |
---|---|---|---|
Unstable Angina | 295 | 0 | 295 |
NSTEMI | 130 | 1 | 131 |
STEMI | 221 | 30 | 251 |
Sum | 646 | 31 | 677 |
CochranArmitageTest(tab)
CochranArmitageTest() accepts only an r x 2 table. Passing the table returned by addmargins(), which has an extra Sum row and column, stops with:
Error: Cochran-Armitage test for trend must be used with rx2-table
3. Mosaic plot
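The source gives no code for this step. As a hedged sketch, base R's mosaicplot() can draw the Dx by cardiogenicShock table; the counts below are copied from the table in the previous step, while the object name tab, the title, and the colors are assumptions.

```r
# Counts copied from the Dx_arr by cardiogenicShock table above
tab <- matrix(
  c(295,  0,
    130,  1,
    221, 30),
  nrow = 3, byrow = TRUE,
  dimnames = list(
    Dx = c("Unstable Angina", "NSTEMI", "STEMI"),
    cardiogenicShock = c("No", "Yes")
  )
)

# Tile widths follow the row totals; tile heights follow the share of shock
mosaicplot(
  tab,
  main  = "Cardiogenic shock by diagnosis",
  xlab  = "Dx",
  ylab  = "Cardiogenic Shock",
  color = c("#4682B4", "#F2B091")
)
```

With the real data frame, the same call works on table(df$Dx_arr, df$cardiogenicShock) instead of the hand-built matrix.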
Author and Source
Original post: https://velog.io/@shlee-ns/R-코드-정리 . Author information is available at the original URL, and copyright belongs to the original author.