Dissecting the Practice Dataset

This post examines how scikit-learn's built-in practice datasets are organized by key, using the Iris dataset as an example.

Keys

A practice dataset is normally organized under the keys data, target, target_names, feature_names, and DESCR.

data : feature data (one row per sample)
target : labels for classification datasets, numeric values for regression datasets
target_names : names of the labels (classification only)
feature_names : names of the features
DESCR : description of the dataset and of each feature

Input

import sklearn
from sklearn.datasets import load_iris

iris_data = load_iris()
print(type(iris_data))

Output

<class 'sklearn.utils.Bunch'>
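
The Bunch type printed above behaves like a dictionary whose keys can also be read as attributes. The following is a minimal sketch of that behavior, assuming scikit-learn is installed; the variable names is_dict and same_object are illustrative only, and the exact module path shown for Bunch may differ between scikit-learn versions.

```python
from sklearn.datasets import load_iris

iris_data = load_iris()

# Bunch subclasses dict, so the usual dictionary tools work on it.
is_dict = isinstance(iris_data, dict)
print(is_dict)

# The same array is reachable as an attribute or as a dictionary item.
same_object = iris_data.data is iris_data['data']
print(same_object)
```

This is why the rest of the post can mix iris_data.data and iris_data['data'] freely.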

Input

keys = iris_data.keys()
print('Keys of Iris Data : ', keys)

Output

Keys of Iris Data :  dict_keys(['data', 'target', 'target_names', 'DESCR', 'feature_names', 'filename'])
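
Of the keys listed, DESCR is the only one that is not inspected later in this post. As a rough sketch (the names description and first_lines are assumptions made for illustration), it can be read as an ordinary string and trimmed to its first few lines:

```python
from sklearn.datasets import load_iris

iris_data = load_iris()

# DESCR is a plain string holding the dataset's documentation.
description = iris_data.DESCR
print(type(description))

# Print only the first few non-empty lines to keep the output short.
first_lines = [line for line in description.splitlines() if line.strip()][:5]
for line in first_lines:
    print(line)
```

The exact wording of the description depends on the installed scikit-learn version, so no output is shown here.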

Input

print('Type of Feature Names : ', type(iris_data.feature_names))
print('Shape of Feature Names : ', len(iris_data.feature_names))
print(iris_data.feature_names)

print('\n Type of Target Names : ', type(iris_data.target_names))
print('Shape of Target Names : ', len(iris_data.target_names))
print(iris_data.target_names)

print('\n Type of Data : ', type(iris_data.data))
print('Shape of Data : ', iris_data.data.shape)
print(iris_data['data'][:5])  # print the first five rows of feature data

print('Type of Target Data : ', type(iris_data.target))
print('Shape of Target Data : ', iris_data.target.shape)
print(iris_data.target)

Output

Type of Feature Names :  <class 'list'>
Shape of Feature Names :  4
['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']

 Type of Target Names :  <class 'numpy.ndarray'>
Shape of Target Names :  3
['setosa' 'versicolor' 'virginica']

 Type of Data :  <class 'numpy.ndarray'>
Shape of Data :  (150, 4)
[[5.1 3.5 1.4 0.2]
 [4.9 3.  1.4 0.2]
 [4.7 3.2 1.3 0.2]
 [4.6 3.1 1.5 0.2]
 [5.  3.6 1.4 0.2]]
Type of Target Data :  <class 'numpy.ndarray'>
Shape of Target Data :  (150,)
[0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2
 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 2 2]
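
The integers in target are indices into target_names, so the two keys can be combined to recover a readable label for every sample. The sketch below assumes NumPy is available alongside scikit-learn; species, class_counts, and counts_by_name are illustrative names, not part of the dataset.

```python
import numpy as np
from sklearn.datasets import load_iris

iris_data = load_iris()

# Index target_names with the integer labels to get one name per sample.
species = iris_data.target_names[iris_data.target]
print(species[:3])

# Count how many samples fall into each class.
class_counts = np.bincount(iris_data.target)
counts_by_name = {
    str(name): int(count)
    for name, count in zip(iris_data.target_names, class_counts)
}
print(counts_by_name)  # {'setosa': 50, 'versicolor': 50, 'virginica': 50}
```

The even split of 50 samples per class matches the three equal runs of 0, 1, and 2 in the printed target array.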

Author And Source

For more on this topic (Dissecting the Practice Dataset), see the original post: https://velog.io/@jiselectric/Dissecting-the-Dataset
