Principal Components Analysis with prcomp in R


Principal Components Analysis


Description


Performs a principal components analysis on the given data matrix and returns the results as an object of class prcomp.

Usage

prcomp(x, ...)

## S3 method for class 'formula'
prcomp(formula, data = NULL, subset, na.action, ...)

## Default S3 method:
prcomp(x, retx = TRUE, center = TRUE, scale. = FALSE,
       tol = NULL, ...)

## S3 method for class 'prcomp'
predict(object, newdata, ...)

Arguments

formula
a formula with no response variable, referring only to numeric variables.
data
an optional data frame (or similar: see model.frame) containing the variables in the formula formula. By default the variables are taken from environment(formula).
subset
an optional vector used to select rows (observations) of the data matrix x.
na.action
a function which indicates what should happen when the data contain NAs. The default is set by the na.action setting of options, and is na.fail if that is unset. The 'factory-fresh' default is na.omit.
...
arguments passed to or from other methods. If x is a formula one might specify scale. or tol.
x
a numeric or complex matrix (or data frame) which provides the data for the principal components analysis.
retx
a logical value indicating whether the rotated variables should be returned.
center
a logical value indicating whether the variables should be shifted to be zero centered. Alternatively, a vector of length equal to the number of columns of x can be supplied. The value is passed to scale.
scale.
a logical value indicating whether the variables should be scaled to have unit variance before the analysis takes place. The default is FALSE for consistency with S, but in general scaling is advisable. Alternatively, a vector of length equal to the number of columns of x can be supplied. The value is passed to scale.
tol
a value indicating the magnitude below which components should be omitted. (Components are omitted if their standard deviations are less than or equal to tol times the standard deviation of the first component.) With the default null setting, no components are omitted. Other settings for tol could be tol = 0 or tol = sqrt(.Machine$double.eps), which would omit essentially constant components.
object
an object of class inheriting from "prcomp".
newdata
an optional data frame or matrix in which to look for variables with which to predict. If omitted, the scores are used. If the original fit used a formula or a data frame or a matrix with column names, newdata must contain columns with the same names. Otherwise it must contain the same number of columns, to be used in the same order.
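
As an illustration of how center, scale., tol and newdata interact, here is a minimal sketch using the built-in USArrests data. The train/test split and the variable names (train, test, pca, manual) are assumptions made for the example, not part of the help page. It checks that predict() amounts to centring and scaling newdata with the values stored in the fit and then multiplying by the rotation matrix, and that a large tol keeps no more columns in rotation than the default.

```r
# Hypothetical split: fit on the first 40 states, project the remaining 10.
train <- USArrests[1:40, ]
test  <- USArrests[41:50, ]

pca <- prcomp(train, center = TRUE, scale. = TRUE)

# Scores for new observations via the predict method.
scores <- predict(pca, newdata = test)

# The same projection done by hand: apply the training centre and scale,
# then multiply by the loadings.
manual <- scale(test, center = pca$center, scale = pca$scale) %*% pca$rotation
same_projection <- isTRUE(all.equal(scores, manual, check.attributes = FALSE))

# tol omits components whose standard deviation is <= tol * sdev[1].
# tol = 0.5 is deliberately large so that the effect is visible.
pca_tol <- prcomp(train, scale. = TRUE, tol = 0.5)
kept_cols <- ncol(pca_tol$rotation)
all_cols  <- ncol(pca$rotation)
```

Because newdata is matched by column name, the test rows must carry the same column names as the data used for the fit.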

Details


The calculation is done by a singular value decomposition of the (centered and possibly scaled) data matrix, not by using eigen on the covariance matrix. This is generally the preferred method for numerical accuracy. The print method for these objects prints the results in a nice format and the plot method produces a scree plot.
Unlike princomp, variances are computed with the usual divisor N - 1.
Note that scale. = TRUE cannot be used if there are zero or constant (for center = TRUE) variables.
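
The relationship between the SVD route and the eigen route, and the N - 1 versus N divisor, can be checked numerically. The following is a sketch on USArrests; the names X, sv and ev are chosen for the example only. It compares prcomp's sdev with the singular values of the centred data divided by sqrt(N - 1), with the eigenvalues of the covariance matrix, and with princomp's sdev after adjusting for its divisor N.

```r
X <- scale(as.matrix(USArrests), center = TRUE, scale = FALSE)
n <- nrow(X)

pca <- prcomp(USArrests)
sv  <- svd(X)
ev  <- eigen(cov(USArrests), symmetric = TRUE)

# Standard deviations are the singular values divided by sqrt(N - 1) ...
sdev_from_svd <- sv$d / sqrt(n - 1)
# ... and their squares are the eigenvalues of the covariance matrix.
var_from_eigen <- ev$values

# Loadings agree with the right singular vectors up to sign.
same_loadings <- isTRUE(all.equal(abs(unname(pca$rotation)), abs(sv$v)))

# princomp divides by N, so its sdev is smaller by sqrt((N - 1) / N).
pc2 <- princomp(USArrests)
sdev_princomp_rescaled <- unname(pc2$sdev) * sqrt(n / (n - 1))
```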

Value

prcomp returns a list with class "prcomp" containing the following components:
sdev
the standard deviations of the principal components (i.e., the square roots of the eigenvalues of the covariance/correlation matrix, though the calculation is actually done with the singular values of the data matrix).
rotation
the matrix of variable loadings (i.e., a matrix whose columns contain the eigenvectors). The function princomp returns this in the element loadings.
x
if retx is true the value of the rotated data (the centred (and scaled if requested) data multiplied by the rotation matrix) is returned. Hence, cov(x) is the diagonal matrix diag(sdev^2). For the formula method, napredict() is applied to handle the treatment of values omitted by the na.action.
center, scale
the centering and scaling used, or FALSE.
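
To make the returned components concrete, the sketch below rebuilds x from center, scale and rotation, and verifies that the covariance of the scores is diag(sdev^2). The names Z and prop_var are assumptions for illustration; the proportion-of-variance line is the usual sdev^2 / sum(sdev^2) calculation that summary() reports.

```r
pca <- prcomp(USArrests, scale. = TRUE)

# Rebuild the scores: centre and scale the data, then rotate.
Z <- scale(USArrests, center = pca$center, scale = pca$scale)
scores_rebuilt <- unname(Z %*% pca$rotation)

# The scores are uncorrelated, with variances sdev^2.
score_cov <- unname(cov(pca$x))

# Proportion of total variance carried by each component.
prop_var <- pca$sdev^2 / sum(pca$sdev^2)
```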

Note


The signs of the columns of the rotation matrix are arbitrary, and so may differ between different programs for PCA, and even between different builds of R.
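
When results need to be compared across programs or builds, one option is to impose a sign convention after fitting. The sketch below uses a hypothetical convention (make the largest-magnitude loading in each column positive); this is not something prcomp does itself. Flipping a column of rotation together with the matching column of x leaves the reconstructed data unchanged.

```r
pca <- prcomp(USArrests, scale. = TRUE)

# Hypothetical convention: the loading with the largest absolute value
# in each component is made positive.
flip <- apply(pca$rotation, 2, function(v) sign(v[which.max(abs(v))]))

rotation_fixed <- sweep(pca$rotation, 2, flip, `*`)
x_fixed        <- sweep(pca$x, 2, flip, `*`)

# The reconstruction of the centred/scaled data does not depend on the signs.
recon_original <- pca$x %*% t(pca$rotation)
recon_fixed    <- x_fixed %*% t(rotation_fixed)
```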

References


Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988) The New S Language. Wadsworth & Brooks/Cole.
Mardia, K. V., J. T. Kent, and J. M. Bibby (1979) Multivariate Analysis, London: Academic Press.
Venables, W. N. and B. D. Ripley (2002) Modern Applied Statistics with S, Springer-Verlag.

See Also

biplot.prcomp, screeplot, princomp, cor, cov, svd, eigen.

Examples

## signs are random
require(graphics)

## the variances of the variables in the
## USArrests data vary by orders of magnitude, so scaling is appropriate
prcomp(USArrests)  # inappropriate: unscaled variables
prcomp(USArrests, scale. = TRUE)
prcomp(~ Murder + Assault + Rape, data = USArrests, scale. = TRUE)
plot(prcomp(USArrests))  # scree plot
summary(prcomp(USArrests, scale. = TRUE))
biplot(prcomp(USArrests, scale. = TRUE))
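
The comment about scaling in the examples above can be seen directly in the numbers. This is a small sketch; the names raw, scaled, top_raw, share_raw and share_scaled are chosen for the example. Without scaling, the first component is driven almost entirely by the variable with the largest variance.

```r
# Per-variable variances differ by orders of magnitude.
vars <- sapply(USArrests, var)

raw    <- prcomp(USArrests)
scaled <- prcomp(USArrests, scale. = TRUE)

# Variable with the largest absolute loading on PC1 in the unscaled fit.
top_raw <- names(which.max(abs(raw$rotation[, 1])))

# Share of total variance attributed to PC1 in each fit.
share_raw    <- raw$sdev[1]^2 / sum(raw$sdev^2)
share_scaled <- scaled$sdev[1]^2 / sum(scaled$sdev^2)
```

In the unscaled fit, top_raw is the same variable that has the largest entry in vars, which is why the first call in the examples is marked inappropriate.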
