Business Analytics II Summary
Regression Analysis
Linear Regression
Regression Assumptions
- No Multicollinearity
- Variance Inflation Factor (VIF) < 10
- Homoskedasticity
- Linearity: Residual Distribution
- Breusch-Pagan Test: if p-value > 0.05, the null hypothesis of homoskedasticity is not rejected.
- Correction (heteroskedasticity-robust standard errors):
sm.OLS(y, x).fit(cov_type="HC3")
- Normality of Error
- QQ Plot
- Normality Tests: if p-value > 0.05, the null hypothesis of normally distributed errors is not rejected.
(Kolmogorov-Smirnov, Shapiro-Wilk, Jarque-Bera etc.)
Non-Linear Regression
- Logistic Regression
- Probit Regression
Machine Learning
Decision Tree
Confusion Matrix
| | Predicted (y=1) | Predicted (y=0) |
|---|---|---|
| Actual (y=1) | True Positive | False Negative (Type II Error) |
| Actual (y=0) | False Positive (Type I Error) | True Negative |
- Accuracy = (TP + TN) / Total
- Precision = TP / (TP + FP)
- Recall = TP / (TP + FN)
- F1 Score = 2 × (Precision × Recall) / (Precision + Recall)
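The four formulas above can be checked with a small worked example. The helper name `classification_metrics` and the counts are hypothetical, chosen only so the arithmetic is easy to follow; the sketch assumes counts for which no denominator is zero.

```python
def classification_metrics(tp, fn, fp, tn):
    """Compute the four metrics from confusion-matrix counts.

    Assumes tp + fp > 0, tp + fn > 0, and tp > 0, so no division by zero occurs.
    """
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * (precision * recall) / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical counts: 40 TP, 10 FN, 20 FP, 30 TN (100 samples in total)
metrics = classification_metrics(tp=40, fn=10, fp=20, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
# accuracy  = (40 + 30) / 100 = 0.7000
# precision = 40 / (40 + 20)  = 0.6667
# recall    = 40 / (40 + 10)  = 0.8000
# f1        = 2 * (2/3 * 4/5) / (2/3 + 4/5) = 8/11 = 0.7273
```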
Random Forest
- Ensemble learning method: a multitude of decision trees
- Data Preprocessing (Encoding, Categorizing, Normalizing, Scaling)
- Balancing Dataset (Up/Down Sampling)
- Defining Variables (Dependent/Independent)
- Modeling (Supervised Learning) & Cross Validation
- Evaluation (Accuracy Scores, Feature Importances)
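The workflow above might look roughly like this in scikit-learn. It is a sketch under stated assumptions: the data are synthetic and already numeric (so the encoding and scaling steps are omitted), the minority class is up-sampled in the training split only, and all names and parameter values are illustrative choices, not ones taken from the notes.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.utils import resample

# Defining variables: X = independent variables, y = dependent variable.
# Synthetic, imbalanced data (assumption): roughly 80% class 0, 20% class 1.
X, y = make_classification(n_samples=600, n_features=6, n_informative=3,
                           weights=[0.8, 0.2], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Balancing dataset: up-sample the minority class (training data only)
minority = y_train == 1
X_up, y_up = resample(X_train[minority], y_train[minority], replace=True,
                      n_samples=int((~minority).sum()), random_state=0)
X_bal = np.vstack([X_train[~minority], X_up])
y_bal = np.concatenate([y_train[~minority], y_up])

# Modeling and cross validation
forest = RandomForestClassifier(n_estimators=100, random_state=0)
cv_scores = cross_val_score(forest, X_bal, y_bal, cv=5)

# Evaluation: accuracy on the untouched test split, plus feature importances
forest.fit(X_bal, y_bal)
test_accuracy = accuracy_score(y_test, forest.predict(X_test))
importances = forest.feature_importances_

print("CV accuracy per fold:", np.round(cv_scores, 3))
print("Test accuracy:", round(test_accuracy, 3))
print("Feature importances:", np.round(importances, 3))
```

Because up-sampling duplicates rows, cross-validation scores computed on the balanced set can be optimistic; the held-out test accuracy is the more trustworthy number in this sketch.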
Neural Networks
MLPClassifier(activation='relu', hidden_layer_sizes=(10,), max_iter=100)
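For context, a classifier like the one above could be trained as follows. This is only a sketch: the data are synthetic, the `StandardScaler` step is an added assumption (neural networks tend to train better on scaled inputs), and `max_iter` is raised to 1000 here, a deviation from the value in the notes, so that this toy run is less likely to stop before convergence.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data (assumption): 400 samples, 8 numeric features, binary target
X, y = make_classification(n_samples=400, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# One hidden layer with 10 ReLU units; inputs are standardized first
mlp = make_pipeline(
    StandardScaler(),
    MLPClassifier(activation="relu", hidden_layer_sizes=(10,),
                  max_iter=1000, random_state=0),
)
mlp.fit(X_train, y_train)

test_accuracy = mlp.score(X_test, y_test)
print("Test accuracy:", round(test_accuracy, 3))
```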
Support Vector Machine
- Linear SVM
SVC(kernel='linear')
- Non-linear SVM
- Kernel: Polynomial ('poly'), Gaussian Radial Basis Function ('rbf'), Sigmoid ('sigmoid')
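To see how the kernel choice matters, the kernels listed above can be compared on data that is not linearly separable. The following is a sketch on the synthetic two-moons dataset; the dataset, the scaling step, and the 5-fold setup are assumptions made for illustration.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic data (assumption): two interleaving half circles with some noise
X, y = make_moons(n_samples=300, noise=0.2, random_state=0)

# Mean 5-fold cross-validated accuracy for each kernel
kernel_scores = {}
for kernel in ("linear", "poly", "rbf", "sigmoid"):
    clf = make_pipeline(StandardScaler(), SVC(kernel=kernel))
    kernel_scores[kernel] = cross_val_score(clf, X, y, cv=5).mean()

for kernel, score in kernel_scores.items():
    print(f"{kernel}: {score:.3f}")
```

On curved class boundaries like these, a non-linear kernel such as 'rbf' would be expected to outperform the linear one, though the ranking depends on the data and on hyperparameters such as `C` and `gamma`.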
Naive Bayes
GaussianNB()
K-Nearest Neighbor
KNeighborsClassifier(n_neighbors=10)
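Both classifiers above share the same scikit-learn estimator interface, so they can be evaluated side by side. This sketch uses the bundled Iris dataset and 5-fold cross-validation, which are choices made for the example and not taken from the notes.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

# Iris: 150 samples, 4 numeric features, 3 classes
X, y = load_iris(return_X_y=True)

# Mean 5-fold cross-validated accuracy for each model
nb_score = cross_val_score(GaussianNB(), X, y, cv=5).mean()
knn_score = cross_val_score(KNeighborsClassifier(n_neighbors=10), X, y, cv=5).mean()

print("Gaussian Naive Bayes accuracy:", round(nb_score, 3))
print("K-Nearest Neighbor (k=10) accuracy:", round(knn_score, 3))
```

K-Nearest Neighbor relies on distances, so on datasets whose features have very different scales it is usually worth standardizing the inputs first; the Iris features are all in centimeters, so that step is skipped here.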
Author And Source
For more on this topic (Business Analytics II Summary), see the original post: https://velog.io/@yewonkim/Business-Analytics-II-Basics-Summary
Author attribution: information about the original author is available at the source URL, and copyright belongs to the original author.