How to use scikit-learn (1)

I carried out the same steps as the following article, using scikit-learn 0.23.1.
Machine learning library! What is scikit-learn? [For beginners]

Versions used
$ python
Python 3.8.5 (default, Jul 27 2020, 08:42:51) 
[GCC 10.1.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import sklearn
>>> print(sklearn.__version__)
0.23.1
>>>

Checking the data

show_data.py
#! /usr/bin/python
#
#   show_data.py
#
#                       Sep/03/2020
#
from sklearn import datasets
import matplotlib.pyplot as plt

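# Load the bundled handwritten-digits dataset (8x8 grayscale images)
digits = datasets.load_digits()

# Draw the first image as a grayscale matrix
plt.matshow(digits.images[0], cmap="Greys")
plt.show()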

Execution result
(Figure: the first sample, a handwritten 0, drawn as an 8x8 grayscale matrix.)
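
The plot covers only a single sample. To get a feel for how the dataset is laid out, a minimal sketch like the following prints the array shapes and checks how `digits.images` relates to `digits.data`. It uses only attributes returned by `load_digits()`; no particular output is assumed here.

```python
from sklearn import datasets

digits = datasets.load_digits()

# images: one 8x8 array per sample; data: the same pixels flattened to 64 features
print(digits.images.shape)   # (n_samples, 8, 8)
print(digits.data.shape)     # (n_samples, 64)
print(digits.target.shape)   # (n_samples,)

# The label that belongs to the image plotted by show_data.py
print(digits.target[0])

# data[i] is images[i] flattened row by row
print((digits.data[0] == digits.images[0].ravel()).all())
```

The classifiers below are trained on `digits.data`, the flattened form, rather than on the 8x8 images.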


SVM

svm01.py
#! /usr/bin/python
#
#   svm01.py
#
#                       Sep/03/2020
#
from sklearn import datasets
from sklearn import svm
import sklearn.metrics as metrics

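# Load the handwritten-digits dataset
digits = datasets.load_digits()

# Flattened 8x8 images (64 features per sample) and their labels
X = digits.data
y = digits.target

# Even-indexed samples for training, odd-indexed samples for testing
X_train, y_train = X[0::2], y[0::2]
X_test, y_test = X[1::2], y[1::2]

# SVC uses the RBF kernel by default; gamma is the kernel coefficient
clf = svm.SVC(gamma=0.001)

clf.fit(X_train, y_train)

# Mean accuracy on the test set ("正解率" is Japanese for "accuracy")
accuracy = clf.score(X_test, y_test)
print(f"正解率{accuracy}")

predicted = clf.predict(X_test)

# Per-class precision, recall and F1 score
print("classification report")
print(metrics.classification_report(y_test, predicted))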

Execution result
$ ./svm01.py 
正解率0.9866369710467706
classification report
              precision    recall  f1-score   support

           0       1.00      0.99      0.99        88
           1       0.98      1.00      0.99        89
           2       1.00      1.00      1.00        91
           3       1.00      0.98      0.99        93
           4       0.99      1.00      0.99        88
           5       0.98      0.97      0.97        91
           6       0.99      1.00      0.99        90
           7       0.99      1.00      0.99        91
           8       0.97      0.97      0.97        86
           9       0.98      0.97      0.97        91

    accuracy                           0.99       898
   macro avg       0.99      0.99      0.99       898
weighted avg       0.99      0.99      0.99       898
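
The scripts in this article split the data by slicing: even-indexed samples go to training and odd-indexed samples go to testing. The sketch below shows what that slicing does on a small array and then checks the sizes on the digits data. `split_even_odd` is a hypothetical helper name introduced for illustration; it is not part of scikit-learn.

```python
import numpy as np
from sklearn import datasets


def split_even_odd(X, y):
    """Hypothetical helper: even indices -> train, odd indices -> test."""
    return X[0::2], y[0::2], X[1::2], y[1::2]


# Toy example: the two slices interleave and never overlap
toy = np.arange(7)
print(toy[0::2])  # [0 2 4 6]
print(toy[1::2])  # [1 3 5]

digits = datasets.load_digits()
X_train, y_train, X_test, y_test = split_even_odd(digits.data, digits.target)

# Every sample ends up in exactly one of the two halves
print(len(X_train), len(X_test))
```

The size of the test half should agree with the total of 898 in the support column of the report above. Because the split is deterministic, rerunning the script gives the same partition each time.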

Logistic regression

logistic01.py
#! /usr/bin/python
#
#   logistic01.py
#
#                       Sep/03/2020
#
from sklearn import datasets
import sklearn.metrics as metrics

from sklearn.linear_model import LogisticRegression

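# Load the handwritten-digits dataset
digits = datasets.load_digits()

# Flattened 8x8 images (64 features per sample) and their labels
X = digits.data
y = digits.target

# Even-indexed samples for training, odd-indexed samples for testing
X_train, y_train = X[0::2], y[0::2]
X_test, y_test = X[1::2], y[1::2]

# max_iter is raised above the default of 100 to give the solver more iterations to converge
clf = LogisticRegression(max_iter=2000)

clf.fit(X_train, y_train)

# Mean accuracy on the test set ("正解率" is Japanese for "accuracy")
accuracy = clf.score(X_test, y_test)
print(f"正解率{accuracy}")

predicted = clf.predict(X_test)

# Per-class precision, recall and F1 score
print("classification report")
print(metrics.classification_report(y_test, predicted))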

Execution result
$ ./logistic01.py 
正解率0.9532293986636972
classification report
              precision    recall  f1-score   support

           0       1.00      0.98      0.99        88
           1       0.87      0.98      0.92        89
           2       0.97      1.00      0.98        91
           3       0.98      0.92      0.95        93
           4       0.93      0.98      0.96        88
           5       0.96      0.95      0.95        91
           6       0.97      0.99      0.98        90
           7       0.99      0.97      0.98        91
           8       0.95      0.88      0.92        86
           9       0.93      0.89      0.91        91

    accuracy                           0.95       898
   macro avg       0.95      0.95      0.95       898
weighted avg       0.95      0.95      0.95       898
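
The report gives per-class scores but does not say which digits are confused with which. One way to look at that is a confusion matrix. The following is a minimal sketch, assuming the same data split and model settings as logistic01.py; it uses `sklearn.metrics.confusion_matrix`, and the printed values are not reproduced here.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import confusion_matrix

digits = datasets.load_digits()
X, y = digits.data, digits.target
X_train, y_train = X[0::2], y[0::2]
X_test, y_test = X[1::2], y[1::2]

clf = LogisticRegression(max_iter=2000)
clf.fit(X_train, y_train)
predicted = clf.predict(X_test)

# Rows are true labels, columns are predicted labels
cm = confusion_matrix(y_test, predicted)
print(cm)

# Off-diagonal entries are the misclassifications
errors = cm.sum() - cm.trace()
print(f"misclassified: {errors} of {cm.sum()}")
```

Swapping `LogisticRegression(max_iter=2000)` for `svm.SVC(gamma=0.001)` would give the corresponding matrix for the SVM script.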

Related articles
How to use scikit-learn (2)
How to use scikit-learn (3)
How to use scikit-learn (4)
