R 코드 정리

28679 단어 R의학 통계R

1. CSV 파일 불러오기

df = read.csv(file="/Users/shlee/Dropbox/R/acs.csv", header= T)

2. 데이터 확인

head(): 데이터 확인
str(): 데이터 구조 확인
summary(): 데이터 요약

head(df)
agesexcardiogenicShockentryDxEFheightweightBMIobesityTCLDLCHDLCTGDMHBPsmoking
62 Male No Femoral STEMI 18.0 168 72 25.51020 Yes 215 154 35 155 Yes No Smoker
78 Female No Femoral STEMI 18.4 148 48 21.91381 No NA NA NA 166 No Yes Never
76 Female Yes Femoral STEMI 20.0 NA NA NA No NA NA NA NA No Yes Never
89 Female No Femoral STEMI 21.8 165 50 18.36547 No 121 73 20 89 No No Never
56 Male No Radial NSTEMI 21.8 162 64 24.38653 No 195 151 36 63 Yes Yes Smoker
73 Female No Radial Unstable Angina22.0 153 59 25.20398 Yes 184 112 38 137 Yes Yes Never
str(df)
'data.frame':	857 obs. of  17 variables:
 $ age             : int  62 78 76 89 56 73 58 62 59 71 ...
 $ sex             : Factor w/ 2 levels "Female","Male": 2 1 1 1 2 1 2 2 1 2 ...
 $ cardiogenicShock: Factor w/ 2 levels "No","Yes": 1 1 2 1 1 1 1 1 1 1 ...
 $ entry           : Factor w/ 2 levels "Femoral","Radial": 1 1 1 1 2 2 2 1 2 1 ...
 $ Dx              : Factor w/ 3 levels "NSTEMI","STEMI",..: 2 2 2 2 1 3 3 2 3 2 ...
 $ EF              : num  18 18.4 20 21.8 21.8 22 24.7 26.6 28.5 31.1 ...
 $ height          : num  168 148 NA 165 162 153 167 160 152 168 ...
 $ weight          : num  72 48 NA 50 64 59 78 50 67 60 ...
 $ BMI             : num  25.5 21.9 NA 18.4 24.4 ...
 $ obesity         : Factor w/ 2 levels "No","Yes": 2 1 1 1 1 2 2 1 2 1 ...
 $ TC              : num  215 NA NA 121 195 184 161 136 239 169 ...
 $ LDLC            : int  154 NA NA 73 151 112 91 88 161 88 ...
 $ HDLC            : int  35 NA NA 20 36 38 34 33 34 54 ...
 $ TG              : int  155 166 NA 89 63 137 196 30 118 141 ...
 $ DM              : Factor w/ 2 levels "No","Yes": 2 1 1 1 2 2 2 2 2 2 ...
 $ HBP             : Factor w/ 2 levels "No","Yes": 1 2 2 1 2 2 2 2 2 1 ...
 $ smoking         : Factor w/ 3 levels "Ex-smoker","Never",..: 3 2 2 2 3 2 1 1 2 3 ...
summary(df)
      age            sex      cardiogenicShock     entry    
 Min.   :28.00   Female:287   No :805          Femoral:312  
 1st Qu.:55.00   Male  :570   Yes: 52          Radial :545  
 Median :64.00                                              
 Mean   :63.31                                              
 3rd Qu.:72.00                                              
 Max.   :91.00                                              
                                                            
               Dx            EF            height          weight      
 NSTEMI         :153   Min.   :18.00   Min.   :130.0   Min.   : 30.00  
 STEMI          :304   1st Qu.:50.45   1st Qu.:158.0   1st Qu.: 58.00  
 Unstable Angina:400   Median :58.10   Median :165.0   Median : 65.00  
                       Mean   :55.83   Mean   :163.2   Mean   : 64.84  
                       3rd Qu.:62.35   3rd Qu.:170.0   3rd Qu.: 72.00  
                       Max.   :79.00   Max.   :185.0   Max.   :112.00  
                       NA's   :134     NA's   :93      NA's   :91      
      BMI        obesity         TC             LDLC            HDLC      
 Min.   :15.62   No :567   Min.   : 25.0   Min.   : 15.0   Min.   : 4.00  
 1st Qu.:22.13   Yes:290   1st Qu.:154.0   1st Qu.: 88.0   1st Qu.:32.00  
 Median :24.16             Median :183.0   Median :114.0   Median :38.00  
 Mean   :24.28             Mean   :185.2   Mean   :116.6   Mean   :38.24  
 3rd Qu.:26.17             3rd Qu.:213.0   3rd Qu.:141.0   3rd Qu.:45.00  
 Max.   :41.42             Max.   :493.0   Max.   :366.0   Max.   :89.00  
 NA's   :93                NA's   :23      NA's   :24      NA's   :23     
       TG          DM       HBP           smoking   
 Min.   : 11.0   No :553   No :356   Ex-smoker:204  
 1st Qu.: 68.0   Yes:304   Yes:501   Never    :332  
 Median :105.5                       Smoker   :321  
 Mean   :125.2                                      
 3rd Qu.:154.0                                      
 Max.   :877.0                                      
 NA's   :15                                         

3. 데이터 정리

1. 결측치 제거

na.omit(): 결측치 제거

str(df)를 보면 857명의 데이터가 있다.
na.omit(df) 실행 후 다시 str(df)를 실행하면 결측치가 있는 환자들은 제거되고 677명의 환자만 남았다.

df = na.omit(df)
str(df)
'data.frame':	677 obs. of  17 variables:
 $ age             : int  62 89 56 73 58 62 59 71 52 52 ...
 $ sex             : Factor w/ 2 levels "Female","Male": 2 1 2 1 2 2 1 2 2 1 ...
 $ cardiogenicShock: Factor w/ 2 levels "No","Yes": 1 1 1 1 1 1 1 1 1 1 ...
 $ entry           : Factor w/ 2 levels "Femoral","Radial": 1 1 2 2 2 1 2 1 2 2 ...
 $ Dx              : Factor w/ 3 levels "NSTEMI","STEMI",..: 2 2 1 3 3 2 3 2 3 3 ...
 $ EF              : num  18 21.8 21.8 22 24.7 26.6 28.5 31.1 31.1 31.1 ...
 $ height          : num  168 165 162 153 167 160 152 168 175 156 ...
 $ weight          : num  72 50 64 59 78 50 67 60 60 63 ...
 $ BMI             : num  25.5 18.4 24.4 25.2 28 ...
 $ obesity         : Factor w/ 2 levels "No","Yes": 2 1 1 2 2 1 2 1 1 2 ...
 $ TC              : num  215 121 195 184 161 136 239 169 272 184 ...
 $ LDLC            : int  154 73 151 112 91 88 161 88 212 123 ...
 $ HDLC            : int  35 20 36 38 34 33 34 54 32 43 ...
 $ TG              : int  155 89 63 137 196 30 118 141 52 72 ...
 $ DM              : Factor w/ 2 levels "No","Yes": 2 1 2 2 2 2 2 2 2 2 ...
 $ HBP             : Factor w/ 2 levels "No","Yes": 1 1 2 2 2 2 2 1 1 2 ...
 $ smoking         : Factor w/ 3 levels "Ex-smoker","Never",..: 3 2 3 2 1 1 2 3 1 2 ...
 - attr(*, "na.action")= 'omit' Named int  2 3 16 18 29 72 87 89 102 108 ...
  ..- attr(*, "names")= chr  "2" "3" "16" "18" ...

2. 범주형 변수로 변환

데이터를 엑셀에 정리 할 때 흔히 생존은 0, 사망은 1
또는 성별을 남자는 0, 여자는 1 이런식으로 숫자를 셀에 입력한다.

이렇게 작성한 csv 파일은 데이터가 정수형(int)로 담겨져 있기 때문에 범주형 자료로 변환을 해야한다.
그렇지 않으면 성별이 0.5 처럼 이상한 수치가 나오게 된다.

library(pROC)
data(aSAH)
aSAH$gos6 <- as.integer(aSAH$gos6)
Type 'citation("pROC")' for a citation.

Attaching package: ‘pROC’

The following objects are masked from ‘package:stats’:

    cov, smooth, var
head(aSAH)
str(aSAH)
summary(aSAH)
gos6outcomegenderagewfnss100bndka
295 Good Female42 1 0.13 3.01
305 Good Female37 1 0.14 8.54
315 Good Female42 1 0.10 8.09
325 Good Female27 1 0.04 10.42
331 Poor Female42 3 0.13 17.40
341 Poor Male 48 2 0.10 12.75
'data.frame':	113 obs. of  7 variables:
 $ gos6   : int  5 5 5 5 1 1 4 1 5 4 ...
 $ outcome: Factor w/ 2 levels "Good","Poor": 1 1 1 1 2 2 1 2 1 1 ...
 $ gender : Factor w/ 2 levels "Male","Female": 2 2 2 2 2 1 1 1 2 2 ...
 $ age    : int  42 37 42 27 42 48 57 41 49 75 ...
 $ wfns   : Ord.factor w/ 5 levels "1"<"2"<"3"<"4"<..: 1 1 1 1 3 2 5 4 1 2 ...
 $ s100b  : num  0.13 0.14 0.1 0.04 0.13 0.1 0.47 0.16 0.18 0.1 ...
 $ ndka   : num  3.01 8.54 8.09 10.42 17.4 ...



      gos6       outcome      gender        age       wfns       s100b      
 Min.   :1.000   Good:72   Male  :42   Min.   :18.0   1:39   Min.   :0.030  
 1st Qu.:3.000   Poor:41   Female:71   1st Qu.:42.0   2:32   1st Qu.:0.090  
 Median :5.000                         Median :51.0   3: 4   Median :0.140  
 Mean   :3.726                         Mean   :51.1   4:16   Mean   :0.247  
 3rd Qu.:5.000                         3rd Qu.:61.0   5:22   3rd Qu.:0.330  
 Max.   :5.000                         Max.   :81.0          Max.   :2.070  
      ndka       
 Min.   :  3.01  
 1st Qu.:  9.01  
 Median : 12.22  
 Mean   : 19.66  
 3rd Qu.: 17.30  
 Max.   :419.19  

aSAH 데이터는 pROC package에서 불러온 데이터다.
gos6가 애초에 범주형 변수로 저장되어 있었으나 임의로 정수형 변수로 변환 시켰다.

str(aSAH)를 보면 gos6가 int(정수형 자료)로 되어있다.
summary(aSAH)를 보면 gos6의 평균, 최소, 최대값 등이 나와있다.

이제 gos6를 다시 범주형 변수로 변환하겠다.

aSAH$gos6 <- factor(aSAH$gos6, levels=c(1:5), labels=c("Good", "Moderate", "Severe", "Vegetative", "Death"))

gos6를 levels=c(1:5)로 1, 2, 3, 4, 5로 순서를 주었다.
그리고 labels=c("Good", "Moderate", "Severe", "Vegetative", "Death")로 순서에 해당하는 labels을 부여했다.

반대로 leels=c(5:1)로 순서를 반대로 해도 된다.
그러면 labels=c("Death", "Vegetative", "Severe", "Moderate", "Good")으로
label도 순서를 반대로 해야 이름이 제대로 부여 된다.

head(aSAH)
str(aSAH)
summary(aSAH)
gos6outcomegenderagewfnss100bndka
29Death Good Female42 1 0.13 3.01
30Death Good Female37 1 0.14 8.54
31Death Good Female42 1 0.10 8.09
32Death Good Female27 1 0.04 10.42
33Good Poor Female42 3 0.13 17.40
34Good Poor Male 48 2 0.10 12.75
'data.frame':	113 obs. of  7 variables:
 $ gos6   : Factor w/ 5 levels "Good","Moderate",..: 5 5 5 5 1 1 4 1 5 4 ...
 $ outcome: Factor w/ 2 levels "Good","Poor": 1 1 1 1 2 2 1 2 1 1 ...
 $ gender : Factor w/ 2 levels "Male","Female": 2 2 2 2 2 1 1 1 2 2 ...
 $ age    : int  42 37 42 27 42 48 57 41 49 75 ...
 $ wfns   : Ord.factor w/ 5 levels "1"<"2"<"3"<"4"<..: 1 1 1 1 3 2 5 4 1 2 ...
 $ s100b  : num  0.13 0.14 0.1 0.04 0.13 0.1 0.47 0.16 0.18 0.1 ...
 $ ndka   : num  3.01 8.54 8.09 10.42 17.4 ...



         gos6    outcome      gender        age       wfns       s100b      
 Good      :28   Good:72   Male  :42   Min.   :18.0   1:39   Min.   :0.030  
 Moderate  : 0   Poor:41   Female:71   1st Qu.:42.0   2:32   1st Qu.:0.090  
 Severe    :13                         Median :51.0   3: 4   Median :0.140  
 Vegetative: 6                         Mean   :51.1   4:16   Mean   :0.247  
 Death     :66                         3rd Qu.:61.0   5:22   3rd Qu.:0.330  
                                       Max.   :81.0          Max.   :2.070  
      ndka       
 Min.   :  3.01  
 1st Qu.:  9.01  
 Median : 12.22  
 Mean   : 19.66  
 3rd Qu.: 17.30  
 Max.   :419.19  

head(aSAH): 5가 Death로, 1이 Good으로 변환되어 있다.
str(aSAH): gos6의 자료형이 Factor로 변환되었으며 Good, Moderate... 의 순서를 가진다.
summary(aSAH): 각 범주에 해당하는 환자의 수가 나온다.

4. 두 그룹의 평균 비교

0. 검정

1. 정규성 검정

output = lm(age ~ cardiogenicShock, data=df)
shapiro.test(resid(output))
	Shapiro-Wilk normality test

data:  resid(output)
W = 0.99083, p-value = 0.0003219

2. 등분산 검정

var.test(age ~ cardiogenicShock, data=df)
	F test to compare two variances

data:  age by cardiogenicShock
F = 1.6109, num df = 645, denom df = 30, p-value = 0.11
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
 0.8939231 2.5597701
sample estimates:
ratio of variances 
          1.610921 

1. t-검정

t.test(age ~ cardiogenicShock, data=df, var.equal=T)
	Two Sample t-test

data:  age by cardiogenicShock
t = 0.22149, df = 675, p-value = 0.8248
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -3.799331  4.765475
sample estimates:
 mean in group No mean in group Yes 
         63.09598          62.61290 

2. 웰치의 검정

t.test(age ~ cardiogenicShock, data=df, var.equal=F)
	Welch Two Sample t-test

data:  age by cardiogenicShock
t = 0.27492, df = 34.808, p-value = 0.785
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -3.084807  4.050951
sample estimates:
 mean in group No mean in group Yes 
         63.09598          62.61290 

3. 윌콕슨 순위합 검정 (Wilcox rank-sum test)

wilcox.test(age ~ cardiogenicShock, data=df)
	Wilcoxon rank sum test with continuity correction

data:  age by cardiogenicShock
W = 10361, p-value = 0.7438
alternative hypothesis: true location shift is not equal to 0

4. Boxplot

library(ggplot2)

fig <- function(width, heigth){
     options(repr.plot.width = width, repr.plot.height = heigth)
}

fig(4, 4)

ggplot(df) +
  aes(x = cardiogenicShock, y = age) +
  geom_boxplot(shape = "circle", fill = "#4682B4") +
  labs(x = "Cardiogenic Shock", y = "Age") +
  theme_minimal()

5. 세 그룹 이상의 평균 비교

0. 검정

1. 정규성 검정

out = aov(LDLC ~ Dx, data=df)
shapiro.test(resid(out))
	Shapiro-Wilk normality test

data:  resid(out)
W = 0.96866, p-value = 7.479e-11

2. 등분산 검정

bartlett.test(LDLC ~ Dx, data=df)
	Bartlett test of homogeneity of variances

data:  LDLC by Dx
Bartlett's K-squared = 4.6984, df = 2, p-value = 0.09545

1. 분산분석(ANOVA)를 통한 그룹 간의 평균 비교

out = aov(LDLC ~ Dx, data=df)
summary(out)
             Df  Sum Sq Mean Sq F value  Pr(>F)   
Dx            2   18525    9263   5.649 0.00369 **
Residuals   674 1105097    1640                   
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

사후 검정 (mutiple comparison)

TukeyHSD(out)
  Tukey multiple comparisons of means
    95% family-wise confidence level

Fit: aov(formula = LDLC ~ Dx, data = df)

$Dx
                             diff       lwr       upr     p adj
STEMI-NSTEMI           -11.632006 -21.88366 -1.380357 0.0214679
Unstable Angina-NSTEMI -14.071523 -24.05755 -4.085496 0.0028229
Unstable Angina-STEMI   -2.439517 -10.60690  5.727869 0.7626228

2. 웰치의 ANOVA를 이용한 그룹 간의 비교

oneway.test(LDLC ~ Dx, data=df, var.equal=F)
	One-way analysis of means (not assuming equal variances)

data:  LDLC and Dx
F = 4.7883, num df = 2.00, denom df = 334.18, p-value = 0.008907

사후 검정

games.howell <- function(grp, obs) {
  
  #Create combinations
  combs <- combn(unique(grp), 2)
  
  # Statistics that will be used throughout the calculations:
  # n = sample size of each group
  # groups = number of groups in data
  # Mean = means of each group sample
  # std = variance of each group sample
  n <- tapply(obs, grp, length)
  groups <- length(tapply(obs, grp, length))
  Mean <- tapply(obs, grp, mean)
  std <- tapply(obs, grp, var)
  
  statistics <- lapply(1:ncol(combs), function(x) {
    
    mean.diff <- Mean[combs[2,x]] - Mean[combs[1,x]]
    
    #t-values
    t <- abs(Mean[combs[1,x]] - Mean[combs[2,x]]) / sqrt((std[combs[1,x]] / n[combs[1,x]]) + (std[combs[2,x]] / n[combs[2,x]]))
    
    # Degrees of Freedom
    df <- (std[combs[1,x]] / n[combs[1,x]] + std[combs[2,x]] / n[combs[2,x]])^2 / # Numerator Degrees of Freedom
      ((std[combs[1,x]] / n[combs[1,x]])^2 / (n[combs[1,x]] - 1) + # Part 1 of Denominator Degrees of Freedom 
         (std[combs[2,x]] / n[combs[2,x]])^2 / (n[combs[2,x]] - 1)) # Part 2 of Denominator Degrees of Freedom
    
    #p-values
    p <- ptukey(t * sqrt(2), groups, df, lower.tail = FALSE)
    
    # Sigma standard error
    se <- sqrt(0.5 * (std[combs[1,x]] / n[combs[1,x]] + std[combs[2,x]] / n[combs[2,x]]))
    
    # Upper Confidence Limit
    upper.conf <- lapply(1:ncol(combs), function(x) {
      mean.diff + qtukey(p = 0.95, nmeans = groups, df = df) * se
    })[[1]]
    
    # Lower Confidence Limit
    lower.conf <- lapply(1:ncol(combs), function(x) {
      mean.diff - qtukey(p = 0.95, nmeans = groups, df = df) * se
    })[[1]]
    
    # Group Combinations
    grp.comb <- paste(combs[1,x], ':', combs[2,x])
    
    # Collect all statistics into list
    stats <- list(grp.comb, mean.diff, se, t, df, p, upper.conf, lower.conf)
  })
  
    # Unlist statistics collected earlier
    stats.unlisted <- lapply(statistics, function(x) {
      unlist(x)
    })
  
    # Create dataframe from flattened list
    results <- data.frame(matrix(unlist(stats.unlisted), nrow = length(stats.unlisted), byrow=TRUE))
  
    # Select columns set as factors that should be numeric and change with as.numeric
    results[c(2, 3:ncol(results))] <- round(as.numeric(as.matrix(results[c(2, 3:ncol(results))])), digits = 3)
  
    # Rename data frame columns
    colnames(results) <- c('groups', 'Mean Difference', 'Standard Error', 't', 'df', 'p', 'upper limit', 'lower limit')
  
    return(results)
  }

with(df, games.howell(Dx, LDLC))
groupsMean DifferenceStandard Errortdfpupper limitlower limit
STEMI : NSTEMI 11.632 3.268 2.517 229.624 0.033 22.533 0.731
STEMI : Unstable Angina -2.440 2.379 0.725 536.857 0.749 5.466 -10.345
NSTEMI : Unstable Angina-14.072 3.238 3.073 225.415 0.007 -3.268 -24.875

3. 크루스컬-왈리스 H 검정 (Kruskal-Wallis rank sum test)

kruskal.test(LDLC ~ Dx, data=df)
	Kruskal-Wallis rank sum test

data:  LDLC by Dx
Kruskal-Wallis chi-squared = 8.9643, df = 2, p-value = 0.01131

사후 검정

library(nparcomp)
Loading required package: multcomp
Loading required package: mvtnorm
Loading required package: survival
Loading required package: TH.data
Loading required package: MASS

Attaching package: ‘TH.data’

The following object is masked from ‘package:MASS’:

    geyser
result = mctp(LDLC ~ Dx, data=df)
summary(result)
 #----------------Nonparametric Multiple Comparisons for relative effects---------------# 
 
 - Alternative Hypothesis:  True differences of relative effects are not equal to 0 
 - Estimation Method:  Global Pseudo Ranks 
 - Type of Contrast : Tukey 
 - Confidence Level: 95 % 
 - Method = Fisher with 278 DF 
 
 #--------------------------------------------------------------------------------------# 
 

 #----------------Nonparametric Multiple Comparisons for relative effects---------------# 
 
 - Alternative Hypothesis:  True differences of relative effects are not equal to 0 
 - Estimation Method: Global Pseudo ranks 
 - Type of Contrast : Tukey 
 - Confidence Level: 95 % 
 - Method = Fisher with 278 DF 
 
 #--------------------------------------------------------------------------------------# 
 
 #----Data Info-------------------------------------------------------------------------# 
           Sample Size    Effect     Lower     Upper
1          NSTEMI  131 0.5500590 0.5204616 0.5793063
2           STEMI  251 0.4905976 0.4648875 0.5163576
3 Unstable Angina  295 0.4593433 0.4345728 0.4843164

 #----Contrast--------------------------------------------------------------------------# 
       1  2 3
2 - 1 -1  1 0
3 - 1 -1  0 1
3 - 2  0 -1 1

 #----Analysis--------------------------------------------------------------------------# 
      Estimator  Lower  Upper Statistic   p.Value
2 - 1    -0.059 -0.130  0.011    -1.974 0.1196648
3 - 1    -0.091 -0.159 -0.022    -3.087 0.0061829
3 - 2    -0.031 -0.090  0.028    -1.247 0.4252763

 #----Overall---------------------------------------------------------------------------# 
  Quantile   p.Value
1 2.353006 0.0061829

 #--------------------------------------------------------------------------------------# 

4. Boxplot 그래프

fig <- function(width, heigth){
     options(repr.plot.width = width, repr.plot.height = heigth)
}

fig(4, 4)

ggplot(df) +
  aes(x = Dx, y = LDLC) +
  geom_boxplot(shape = "circle", fill = "#F2B091") +
  theme_minimal()

6. 그룹 간의 비율 비교

기대 도수(expected values)가 5 이하인 셀이 전체 셀의 20% 이상이면 피셔의 정확한 검정을 시행
M x 2 표에서 M이 3이상이면서 서열이 있다면 코크란-아미티지 서열 검정 시행

1. Chi square and Fisher's exact test

library(gmodels)
with(df, 
     CrossTable(Dx, cardiogenicShock, chisq = T, fisher = T, expected = T, sresid = T, format = "SPSS"))
   Cell Contents
|-------------------------|
|                   Count |
|         Expected Values |
| Chi-square contribution |
|             Row Percent |
|          Column Percent |
|           Total Percent |
|            Std Residual |
|-------------------------|

Total Observations in Table:  677 

                | cardiogenicShock 
             Dx |       No  |      Yes  | Row Total | 
----------------|-----------|-----------|-----------|
         NSTEMI |      130  |        1  |      131  | 
                |  125.001  |    5.999  |           | 
                |    0.200  |    4.165  |           | 
                |   99.237% |    0.763% |   19.350% | 
                |   20.124% |    3.226% |           | 
                |   19.202% |    0.148% |           | 
                |    0.447  |   -2.041  |           | 
----------------|-----------|-----------|-----------|
          STEMI |      221  |       30  |      251  | 
                |  239.507  |   11.493  |           | 
                |    1.430  |   29.799  |           | 
                |   88.048% |   11.952% |   37.075% | 
                |   34.211% |   96.774% |           | 
                |   32.644% |    4.431% |           | 
                |   -1.196  |    5.459  |           | 
----------------|-----------|-----------|-----------|
Unstable Angina |      295  |        0  |      295  | 
                |  281.492  |   13.508  |           | 
                |    0.648  |   13.508  |           | 
                |  100.000% |    0.000% |   43.575% | 
                |   45.666% |    0.000% |           | 
                |   43.575% |    0.000% |           | 
                |    0.805  |   -3.675  |           | 
----------------|-----------|-----------|-----------|
   Column Total |      646  |       31  |      677  | 
                |   95.421% |    4.579% |           | 
----------------|-----------|-----------|-----------|

 
Statistics for All Table Factors


Pearson's Chi-squared test 
------------------------------------------------------------
Chi^2 =  49.75095     d.f. =  2     p =  1.572966e-11 


 
Fisher's Exact Test for Count Data
------------------------------------------------------------
Alternative hypothesis: two.sided
p =  1.206607e-12 

 
       Minimum expected frequency: 5.998523 

2. Cochran-Armitage trend test

library(DescTools)

df$Dx_arr <- factor(df$Dx, levels=c("Unstable Angina", "NSTEMI", "STEMI"))

table <- with(df, 
              table(Dx_arr, cardiogenicShock))
table <- addmargins(table)
table
NoYesSum
Unstable Angina295 0 295
NSTEMI130 1 131
STEMI22130 251
Sum64631 677
CochranArmitageTest(table)
Error: Cochran-Armitage test for trend must be used with rx2-table
Traceback:


1. CochranArmitageTest(table)

2. stop("Cochran-Armitage test for trend must be used with rx2-table", 
 .     call. = FALSE)

3. 모자이크 그래프

좋은 웹페이지 즐겨찾기