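<Course> Machine Learning, Chapter 5: Algorithm 1 (k-Nearest Neighbors (kNN))

<Course> Machine Learning

Contents
Chapter 1: Linear Regression Models
Chapter 2: Nonlinear Regression Models
Chapter 3: Logistic Regression Models
Chapter 4: Principal Component Analysis
Chapter 5: Algorithm 1 (k-Nearest Neighbors (kNN))
Chapter 6: Algorithm 2 (k-means)
Chapter 7: Support Vector Machines

Chapter 5: Algorithm 1 (k-Nearest Neighbors (kNN))

k-Nearest Neighbors (kNN)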

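A machine learning technique for classification problems

Take the k training points nearest to the query point and assign the class to which most of them belong

Changing k changes the result

Increasing k makes the decision boundary smoother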
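
To make the voting rule concrete, here is a minimal sketch on a hand-made 1-D dataset. The data and the helper name knn_vote are assumptions chosen for illustration and are not part of the original exercise. The same query point gets a different label depending on k.

```python
import numpy as np

# Hypothetical 1-D toy data, chosen so the vote is easy to follow by hand.
x_toy = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y_toy = np.array([0, 1, 1, 0, 0])

def knn_vote(k, x_train, y_train, x_query):
    """Return the majority label among the k training points nearest to x_query."""
    nearest = np.argsort(np.abs(x_train - x_query))[:k]
    return int(np.bincount(y_train[nearest]).argmax())

label_k1 = knn_vote(1, x_toy, y_toy, 1.2)  # only x=1.0 (label 0) votes
label_k3 = knn_vote(3, x_toy, y_toy, 1.2)  # x=1.0, 2.0, 3.0 vote: labels 0, 1, 1
print(label_k1, label_k3)  # 0 1
```

With k=1 the single nearest point decides the label; with k=3 the two neighbors of class 1 outvote it.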

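(Exercise 5) Plot synthetic data and the classification results

Setup: classify synthetic (artificially generated) data

Task: plot the synthetic data together with the classification results.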

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

Generate the training data

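def gen_data():
    # Two 2-D Gaussian clusters of 25 points each, centered at (-1, -1) and (1, 1)
    x0 = np.random.normal(size=50).reshape(-1, 2) - 1
    x1 = np.random.normal(size=50).reshape(-1, 2) + 1.
    x_train = np.concatenate([x0, x1])
    # Labels: 0 for the first cluster, 1 for the second
    y_train = np.concatenate([np.zeros(25), np.ones(25)]).astype(int)
    return x_train, y_train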

X_train, ys_train = gen_data()
plt.scatter(X_train[:, 0], X_train[:, 1], c=ys_train)

There is no training step

Prediction

Assign to each point to be predicted the mode (most frequent value) of the labels of its 𝑘 nearest training points

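def distance(x1, x2):
    # Squared Euclidean distance; the square root is omitted because it does not change the ordering
    return np.sum((x1 - x2)**2, axis=1)

def knc_predict(n_neighbors, x_train, y_train, X_test):
    y_pred = np.empty(len(X_test), dtype=y_train.dtype)
    for i, x in enumerate(X_test):
        # Distances from the test point to every training point
        distances = distance(x, x_train)
        # Indices of the n_neighbors closest training points
        nearest_index = distances.argsort()[:n_neighbors]
        # Most frequent label among those neighbors
        mode, _ = stats.mode(y_train[nearest_index])
        y_pred[i] = np.ravel(mode)[0]
    return y_pred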
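
As an alternative sketch, the distances for all test points can be computed at once with NumPy broadcasting instead of one test point per loop iteration. The name knc_predict_vectorized and the small demo arrays are hypothetical, and the vote assumes non-negative integer labels; the majority vote itself still iterates over rows.

```python
import numpy as np

def knc_predict_vectorized(n_neighbors, x_train, y_train, x_test):
    """kNN prediction with all pairwise distances computed in one step (hypothetical helper)."""
    # Pairwise squared distances, shape (n_test, n_train), via broadcasting.
    diff = x_test[:, None, :] - x_train[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=2)
    # Indices of the n_neighbors closest training points for each test point.
    nearest = np.argsort(sq_dist, axis=1)[:, :n_neighbors]
    # Majority vote per test point; labels are assumed to be non-negative integers.
    votes = y_train[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])

# Tiny hand-made check: two well-separated clusters.
x_demo = np.array([[-1.0, -1.0], [-1.2, -0.8], [-0.9, -1.1],
                   [1.0, 1.0], [1.1, 0.9], [0.8, 1.2]])
y_demo = np.array([0, 0, 0, 1, 1, 1])
queries = np.array([[-1.0, -0.9], [0.9, 1.0]])
pred_demo = knc_predict_vectorized(3, x_demo, y_demo, queries)
print(pred_demo)  # [0 1]
```

The intermediate array holds n_test x n_train x n_features values, so this trades memory for speed; for the 100 x 100 grid and 50 training points used here that is small.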

def plt_resut(x_train, y_train, y_pred):
    # Same 100 x 100 grid over [-5, 5] x [-5, 5] on which the predictions were computed
    xx0, xx1 = np.meshgrid(np.linspace(-5, 5, 100), np.linspace(-5, 5, 100))
    # Training points, colored by class
    plt.scatter(x_train[:, 0], x_train[:, 1], c=y_train)
    # Predicted class regions as a translucent filled contour
    plt.contourf(xx0, xx1, y_pred.reshape(100, 100).astype(float), alpha=0.2, levels=np.linspace(0, 1, 3))

n_neighbors = 3

xx0, xx1 = np.meshgrid(np.linspace(-5, 5, 100), np.linspace(-5, 5, 100))
X_test = np.array([xx0, xx1]).reshape(2, -1).T

y_pred = knc_predict(n_neighbors, X_train, ys_train, X_test)
plt_resut(X_train, ys_train, y_pred)

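Also try the same classification with scikit-learn's KNeighborsClassifier, for comparison with the NumPy implementation.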

xx0, xx1 = np.meshgrid(np.linspace(-5, 5, 100), np.linspace(-5, 5, 100))
xx = np.array([xx0, xx1]).reshape(2, -1).T

from sklearn.neighbors import KNeighborsClassifier
knc = KNeighborsClassifier(n_neighbors=n_neighbors).fit(X_train, ys_train)
plt_resut(X_train, ys_train, knc.predict(xx))

n_neighbors = 15

xx0, xx1 = np.meshgrid(np.linspace(-5, 5, 100), np.linspace(-5, 5, 100))
X_test = np.array([xx0, xx1]).reshape(2, -1).T

y_pred = knc_predict(n_neighbors, X_train, ys_train, X_test)
plt_resut(X_train, ys_train, y_pred)

Raising n_neighbors from 3 to 15 made the decision boundary considerably smoother.
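
One rough way to see the effect of k numerically, rather than only in the plot, is to compare accuracy on the training points themselves. The sketch below is an illustration under assumptions: the fixed seed, the helper name predict, and the choice of training accuracy as the measure are not from the original exercise. With k=1 every training point is its own nearest neighbor, so the training accuracy is 1.0; larger k lets the surrounding points outvote isolated ones, which is what smooths the boundary.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, an assumption made for reproducibility
x0 = rng.normal(size=(25, 2)) - 1.0
x1 = rng.normal(size=(25, 2)) + 1.0
x_demo = np.concatenate([x0, x1])
y_demo = np.concatenate([np.zeros(25), np.ones(25)]).astype(int)

def predict(k, x_train, y_train, x_test):
    """Majority vote over the k nearest training points (squared Euclidean distance)."""
    sq = np.sum((x_test[:, None, :] - x_train[None, :, :]) ** 2, axis=2)
    nearest = np.argsort(sq, axis=1)[:, :k]
    return np.array([np.bincount(row).argmax() for row in y_train[nearest]])

# Fraction of training points whose predicted label matches the true label.
train_acc = {k: float(np.mean(predict(k, x_demo, y_demo, x_demo) == y_demo))
             for k in (1, 3, 15)}
print(train_acc)
```

A perfect score at k=1 says nothing about new data; it only reflects that the boundary bends around every single training point.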

Reference

Original article: https://qiita.com/matsukura04583/items/543719b44159322221ed
