흰둥이는 코드를 짤 때 짖어 (왈!왈!왈!왈!왈!왈!왈!왈!왈!왈!왈!)

Python Machine Learning, Deep Learning

(Python) The Iris Dataset

흰둥아솜사탕 2023. 6. 12. 16:35

1. Iris Dataset

scikit-learn datasets page
Dataset: a collection of related data gathered for a specific task.

Load the Iris dataset that scikit-learn provides and assign it to a variable.

from sklearn.datasets import load_iris

iris = load_iris()

sepal length (cm): length of the sepal
sepal width (cm): width of the sepal
petal length (cm): length of the petal
petal width (cm): width of the petal

Exploring the dataset

data = iris['data']
data

array([[5.1, 3.5, 1.4, 0.2],
       [4.9, 3. , 1.4, 0.2],
       [4.7, 3.2, 1.3, 0.2],
       [4.6, 3.1, 1.5, 0.2],
       ...,
       [6.5, 3. , 5.2, 2. ],
       [6.2, 3.4, 5.4, 2.3],
       [5.9, 3. , 5.1, 1.8]])

target = iris['target']
target

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

feature_names = iris['feature_names']
feature_names

['sepal length (cm)',
 'sepal width (cm)',
 'petal length (cm)',
 'petal width (cm)']
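
The integer values in target are class codes. As a small sketch of how they can be read, the snippet below looks up the species names through the dataset's target_names entry; the variable name species is only illustrative and does not appear in the original post.

```python
from sklearn.datasets import load_iris

iris = load_iris()

# target_names holds the species label for each integer class (0, 1, 2)
target_names = iris['target_names']
print(list(target_names))

# Map every integer target to its species name with NumPy fancy indexing
species = target_names[iris['target']]
print(species[:3], species[-3:])
```

The first 50 rows are setosa, the next 50 versicolor, and the last 50 virginica, which matches the blocks of 0, 1, and 2 in the target array.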

import pandas as pd

Build a DataFrame from the Iris dataset's data, using feature_names as the column names.

df_iris = pd.DataFrame(data, columns=feature_names)
df_iris.head()

	sepal length (cm)	sepal width (cm)	petal length (cm)	petal width (cm)
0	5.1	3.5	1.4	0.2
1	4.9	3.0	1.4	0.2
2	4.7	3.2	1.3	0.2
3	4.6	3.1	1.5	0.2
4	5.0	3.6	1.4	0.2

Add a new target column to the DataFrame and fill it with the Iris target values.

df_iris['target'] = target
df_iris

	sepal length (cm)	sepal width (cm)	petal length (cm)	petal width (cm)	target
0	5.1	3.5	1.4	0.2	0
1	4.9	3.0	1.4	0.2	0
2	4.7	3.2	1.3	0.2	0
3	4.6	3.1	1.5	0.2	0
4	5.0	3.6	1.4	0.2	0
...	...	...	...	...	...
145	6.7	3.0	5.2	2.3	2
146	6.3	2.5	5.0	1.9	2
147	6.5	3.0	5.2	2.0	2
148	6.2	3.4	5.4	2.3	2
149	5.9	3.0	5.1	1.8	2

150 rows × 5 columns
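
As an optional cross-check, a similar DataFrame can probably be obtained directly from scikit-learn. The sketch below assumes scikit-learn 0.23 or later, where load_iris accepts as_frame=True; the names iris_frame and df_check are illustrative only.

```python
from sklearn.datasets import load_iris

# as_frame=True asks scikit-learn to return pandas objects (scikit-learn >= 0.23)
iris_frame = load_iris(as_frame=True)
df_check = iris_frame.frame  # the four feature columns plus 'target'

print(df_check.shape)

# Count how many rows belong to each class
print(df_check['target'].value_counts().sort_index())
```

Each of the three classes has 50 rows, so the dataset is balanced.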

Use train_test_split to divide the DataFrame into data for training the model and data for testing it.

from sklearn.model_selection import train_test_split

# train_test_split(features, target, test_size, random_state)
# test_size defaults to 0.25 when omitted; here 20% is held out for testing
X_train, X_test, y_train, y_test = train_test_split(df_iris.drop('target', axis=1),
                                                    df_iris['target'],
                                                    test_size=0.2,
                                                    random_state=10)

The shapes confirm that the data was split into training and test sets as expected.

X_train.shape, X_test.shape

((120, 4), (30, 4))

y_train.shape, y_test.shape

((120,), (30,))

X_train

	sepal length (cm)	sepal width (cm)	petal length (cm)	petal width (cm)
58	6.6	2.9	4.6	1.3
97	6.2	2.9	4.3	1.3
129	7.2	3.0	5.8	1.6
114	5.8	2.8	5.1	2.4
146	6.3	2.5	5.0	1.9
...	...	...	...	...
113	5.7	2.5	5.0	2.0
64	5.6	2.9	3.6	1.3
15	5.7	4.4	1.5	0.4
125	7.2	3.2	6.0	1.8
9	4.9	3.1	1.5	0.1

120 rows × 4 columns
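
A plain random split does not guarantee that each class appears in the same proportion in both parts. The post does not use stratification; the sketch below is one possible variation, passing stratify to train_test_split. The names X_tr, X_te, y_tr, and y_te are illustrative, and as_frame=True again assumes scikit-learn 0.23 or later.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load features and target as a pandas DataFrame and Series
X, y = load_iris(return_X_y=True, as_frame=True)

# stratify=y keeps the class proportions the same in both splits
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, random_state=10, stratify=y)

# Count test rows per class
print(y_te.value_counts().sort_index())
```

With 50 rows per class and a 20% test share, the stratified test set holds 10 rows of each class.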

from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

svc = SVC()

Fit the model on the training features and training target created above.

svc.fit(X_train, y_train)

SVC()

Then use the test features to generate predictions.

y_pred = svc.predict(X_test)

Use accuracy_score to compare the predictions against the test target held out earlier and compute the accuracy.

print('Accuracy: ', accuracy_score(y_test, y_pred))

Accuracy:  0.9666666666666667
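
A single accuracy figure on 30 test rows says little about which classes are confused or how stable the score is. As a hedged sketch that goes slightly beyond the post, the snippet below prints a confusion matrix for the same kind of split and a 5-fold cross-validation score. It reloads the data so that it runs on its own; the names cm and scores are illustrative, and the exact numbers may differ between scikit-learn versions.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=10)

# Train an SVC with default settings and predict on the held-out rows
model = SVC().fit(X_tr, y_tr)
pred = model.predict(X_te)

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(y_te, pred, labels=[0, 1, 2])
print(cm)

# 5-fold cross-validation gives one accuracy per fold on the full dataset
scores = cross_val_score(SVC(), X, y, cv=5)
print(scores.mean())
```

Off-diagonal entries in the matrix show which species the model mixes up, and the mean of the fold scores is a less split-dependent estimate than one hold-out run.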
