Algorithms

k-nearest neighbors

k-means

k-nearest neighbors

k-nearest neighbors (kNN) is a machine learning algorithm: a supervised learning method for solving classification problems.

Generating training data

%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
from scipy import stats

def gen_data():
    # Class 0: 25 two-dimensional points drawn around (-1, -1)
    x0 = np.random.normal(size=50).reshape(-1, 2) - 1.
    # Class 1: 25 two-dimensional points drawn around (1, 1)
    x1 = np.random.normal(size=50).reshape(-1, 2) + 1.
    x_train = np.concatenate([x0, x1])
    # Integer class labels: 0 for the first 25 points, 1 for the rest
    y_train = np.concatenate([np.zeros(25), np.ones(25)]).astype(int)
    return x_train, y_train

X_train, ys_train = gen_data()
plt.scatter(X_train[:, 0], X_train[:, 1], c=ys_train)

Prediction

Each point to be predicted is assigned the mode (most frequent value) of the labels of the k training points nearest to it.
Changing k changes the result.

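def distance(x1, x2):
    # Squared Euclidean distance from the point x1 to every row of x2.
    # The square root is omitted because it does not change the ranking.
    return np.sum((x1 - x2)**2, axis=1)

def knc_predict(n_neighbors, x_train, y_train, X_test):
    y_pred = np.empty(len(X_test), dtype=y_train.dtype)
    for i, x in enumerate(X_test):
        # Use the x_train argument, not the global X_train
        distances = distance(x, x_train)
        # Indices of the n_neighbors closest training points
        nearest_index = distances.argsort()[:n_neighbors]
        mode, _ = stats.mode(y_train[nearest_index])
        # stats.mode returns a scalar or a length-1 array depending on the SciPy version
        y_pred[i] = np.ravel(mode)[0]
    return y_pred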

def plt_resut(x_train, y_train, y_pred):
    # Grid must match the 100 x 100 grid on which y_pred was computed
    xx0, xx1 = np.meshgrid(np.linspace(-5, 5, 100), np.linspace(-5, 5, 100))
    plt.scatter(x_train[:, 0], x_train[:, 1], c=y_train)
    # Shade the predicted class regions behind the training points
    plt.contourf(xx0, xx1, y_pred.reshape(100, 100).astype(float), alpha=0.2, levels=np.linspace(0, 1, 3))

n_neighbors = 3

xx0, xx1 = np.meshgrid(np.linspace(-5, 5, 100), np.linspace(-5, 5, 100))
X_test = np.array([xx0, xx1]).reshape(2, -1).T

y_pred = knc_predict(n_neighbors, X_train, ys_train, X_test)
plt_resut(X_train, ys_train, y_pred)
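
To make the point about k concrete, here is a minimal sketch on a one-dimensional toy data set. The function name `knn_predict_1d` and the data values are hypothetical and chosen only for illustration; they are not part of the article's code. A single class-1 point sits next to a class-0 region, so the prediction for a nearby query flips when k goes from 1 to 3.

```python
import numpy as np

def knn_predict_1d(k, x_train, y_train, x_query):
    """Majority vote among the k training points nearest to x_query."""
    distances = np.abs(x_train - x_query)
    # Indices of the k smallest distances
    nearest = np.argsort(distances, kind="stable")[:k]
    # Count the votes per class label (labels must be non-negative integers)
    votes = np.bincount(y_train[nearest])
    return int(votes.argmax())

# Hypothetical toy data: the class-1 point at 2.6 borders the class-0 points
x_toy = np.array([0.0, 1.0, 2.0, 2.6, 5.0, 6.0])
y_toy = np.array([0, 0, 0, 1, 1, 1])

pred_k1 = knn_predict_1d(1, x_toy, y_toy, 2.5)  # nearest point is 2.6 (class 1)
pred_k3 = knn_predict_1d(3, x_toy, y_toy, 2.5)  # neighbors 2.6, 2.0, 1.0 vote 1, 0, 0
print(pred_k1, pred_k3)
```

With k = 1 the single closest point decides the label; with k = 3 the two class-0 neighbors outvote it. Small k follows individual points closely, while larger k smooths the decision boundary.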

Using scikit-learn (for comparison with the NumPy implementation above)

xx0, xx1 = np.meshgrid(np.linspace(-5, 5, 100), np.linspace(-5, 5, 100))
xx = np.array([xx0, xx1]).reshape(2, -1).T

from sklearn.neighbors import KNeighborsClassifier
knc = KNeighborsClassifier(n_neighbors=n_neighbors).fit(X_train, ys_train)
plt_resut(X_train, ys_train, knc.predict(xx))

k-means

k-means is a clustering algorithm: an unsupervised learning method that partitions the given data into k clusters.
Clustering means grouping together items whose features are similar.

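Procedure

① Set an initial value for the center of each cluster.

② For each data point, compute the distance to every cluster center and assign the point to the nearest cluster.

③ Compute the mean vector of each cluster and use it as the new center.

④ Repeat ② and ③ until convergence.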
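
The four steps above can be sketched in NumPy as follows. This is only an illustrative sketch under several assumptions that the article does not specify: the function name `kmeans`, initializing the centers by picking k random data points, keeping the old center when a cluster ends up empty, and treating "convergence" as the centers no longer moving. The toy data is likewise hypothetical.

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    # Step 1 (assumed initialization): pick k distinct data points as centers
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Step 2: squared distance from every point to every center, shape (n, k)
        d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        # Step 3: new center = mean of the points assigned to the cluster
        # (an empty cluster keeps its previous center)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        # Step 4: stop once the centers no longer move
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Final assignment against the final centers
    d2 = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return centers, d2.argmin(axis=1)

# Hypothetical toy data: two well-separated blobs of 25 points each
rng = np.random.default_rng(0)
X_blobs = np.concatenate([
    rng.normal(-5.0, 0.5, size=(25, 2)),
    rng.normal(5.0, 0.5, size=(25, 2)),
])
centers, labels = kmeans(X_blobs, k=2)
print(centers)
```

Which cluster receives index 0 or 1 depends on the random initialization, so only the grouping itself is meaningful, not the label values. On less cleanly separated data the result can also depend on the initial centers.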

Reference

Algorithms (original article on Qiita): https://qiita.com/takuowake/items/4ab00c7cb0cdb67de2c6
