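If a classification problem is linearly separable, a simple perceptron can solve it. A classic example is building logic gates out of perceptrons, a topic covered separately.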

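For logic gates, consider the four points (0, 0), (0, 1), (1, 0), and (1, 1) in the plane. Each point corresponds to one pair of inputs to the gate, where 1 means a signal is flowing (true) and 0 means no signal (false). The task is to find a straight line that separates these points according to whether the gate's output is 0 or 1. Finding that line amounts to choosing the perceptron's weights (parameters). In that setting, however, it is a human who picks suitable weights.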
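As a minimal sketch of what "choosing the weights by hand" means, the snippet below evaluates a two-input perceptron on the four points. The function name `perceptron_gate` and the specific weight values are illustrative assumptions, not taken from the original; many other values would work equally well.

```python
def perceptron_gate(x1, x2, w1, w2, b):
    """Return 1 if the weighted sum w1*x1 + w2*x2 + b is non-negative, else 0."""
    return 1 if w1 * x1 + w2 * x2 + b >= 0 else 0

# Hand-picked (illustrative) parameters (w1, w2, b).
# The line x1 + x2 - 1.5 = 0 leaves only (1, 1) on the positive side: AND.
AND_PARAMS = (1.0, 1.0, -1.5)
# The line x1 + x2 - 0.5 = 0 leaves only (0, 0) on the negative side: OR.
OR_PARAMS = (1.0, 1.0, -0.5)

points = [(0, 0), (0, 1), (1, 0), (1, 1)]
and_outputs = [perceptron_gate(x1, x2, *AND_PARAMS) for x1, x2 in points]
or_outputs = [perceptron_gate(x1, x2, *OR_PARAMS) for x1, x2 in points]

print(and_outputs)  # [0, 0, 0, 1]
print(or_outputs)   # [0, 1, 1, 1]
```

Each parameter triple describes one separating line; shifting the bias slightly gives another valid line for the same gate.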

Ideally, suitable weights would be learned automatically rather than set by hand. So let's try out the perceptron learning algorithm on iris, the iris flower dataset bundled with scikit-learn.

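First, look at what the iris data contains. Visualizing it is the easiest way to get a feel for it, and seaborn is used for that here. Conveniently, seaborn also ships iris as a sample dataset. Since the goal at this point is only to understand what the data means, that copy is used for now.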

The visualization shows that each row describes one individual iris plant. The features are sepal length, sepal width, petal length, and petal width. Each row also carries a label (species) saying which species the plant belongs to: “setosa”, “versicolor”, or “virginica”.

In the scikit-learn copy of the data, however, the species are encoded as the integers 0, 1, and 2. The features are accessed through the key data and the labels through the key target.
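As a small illustration of that encoding, the sketch below maps integer labels back to species names. The name list mirrors the order of scikit-learn's `iris.target_names`; the `targets` list is a made-up sample for illustration, not a slice of the real dataset.

```python
# Same order as iris.target_names in scikit-learn: index 0, 1, 2.
target_names = ['setosa', 'versicolor', 'virginica']

# Hypothetical integer labels, as they would appear in iris.target.
targets = [0, 0, 1, 2]

# Decode each integer label into its species name.
species = [target_names[t] for t in targets]
print(species)  # ['setosa', 'setosa', 'versicolor', 'virginica']
```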

To keep the problem simple, only two species, “setosa” and “versicolor”, are separated here, and only sepal width and petal length are used as features.

Next comes the question of what rule to use for updating the perceptron's weights. Written in general form, it looks like this:

$$w^{(k+1)}_{ji} = w^{(k)}_{ji} + \eta\,(y_j - \hat{y}_j)\,x_i$$

$w^{(k)}_{ji}$ is the weight of the connection between the i-th input neuron and the j-th output neuron (after the k-th update)
$x_i$ is the i-th input
$\hat{y}_j$ is the j-th output
$y_j$ is the j-th target output (label)
$\eta$ is the learning rate

There is only one output here, so j can be ignored.

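The weights are updated a little at a time (scaled by the learning rate) so that the error shrinks. The size of each update is proportional to the error and to the input (the delta rule).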
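To make the rule concrete, here is a sketch of a single update step in plain Python. The helper names `step` and `update`, and the sample values (a point with sepal width 3.0, petal length 1.4, label 0), are assumptions chosen for illustration; the bias is stored as `weights[0]` with a constant input of 1.

```python
def step(z):
    """Heaviside step: 0 for negative input, 1 otherwise."""
    return 0 if z < 0 else 1

def update(weights, x, y, eta):
    """Apply one delta-rule update and return (new_weights, diff)."""
    y_hat = step(weights[0] + sum(w * xi for w, xi in zip(weights[1:], x)))
    diff = y - y_hat                      # error: target minus output
    inputs = [1.0] + list(x)              # constant 1 pairs with the bias
    new_weights = [w + eta * diff * xi for w, xi in zip(weights, inputs)]
    return new_weights, diff

weights = [0.0, 0.0, 0.0]                 # all-zero start
new_weights, diff = update(weights, x=(3.0, 1.4), y=0, eta=0.01)

print(diff)         # -1
print(new_weights)  # roughly [-0.01, -0.03, -0.014]
```

With all-zero weights the weighted sum is 0, so the step function outputs 1 while the label is 0. The error is therefore -1, and every weight moves by the learning rate times the error times its own input.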

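If the data is linearly separable, this algorithm converges to a solution; this is known as the perceptron convergence theorem. The solution is not unique, though: there are infinitely many.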
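Both points can be checked on a tiny example. The sketch below trains until one full pass produces no mistakes, then confirms that two other hand-picked weight vectors separate the same data. The four (sepal width, petal length) pairs, the helper names, and the hand-picked weights are illustrative assumptions, not values from the notebook run.

```python
def predict(weights, x):
    """Heaviside-step perceptron; weights[0] is the bias."""
    z = weights[0] + sum(w * xi for w, xi in zip(weights[1:], x))
    return 0 if z < 0 else 1

def train(X, Y, eta=0.01, max_epochs=100):
    """Repeat delta-rule updates until one full pass makes no mistakes."""
    weights = [0.0] * (len(X[0]) + 1)
    for epoch in range(max_epochs):
        errors = 0
        for x, y in zip(X, Y):
            diff = y - predict(weights, x)
            if diff != 0:
                errors += 1
                inputs = [1.0] + list(x)
                weights = [w + eta * diff * xi for w, xi in zip(weights, inputs)]
        if errors == 0:
            return weights, epoch
    return weights, None  # did not converge within max_epochs

# Illustrative (sepal width, petal length) pairs with labels 0 and 1.
X = [(3.5, 1.4), (3.2, 4.7), (3.0, 1.4), (3.2, 4.5)]
Y = [0, 1, 0, 1]

learned, epoch = train(X, Y)

# Two hand-picked alternatives describing different separating lines.
candidates = [learned, [-2.5, 0.0, 1.0], [0.0, -1.0, 1.0]]
for w in candidates:
    print(w, [predict(w, x) for x in X] == Y)
```

All three weight vectors classify the four points correctly, even though they correspond to different lines. Which one the learning rule ends up with depends on the starting weights, the learning rate, and the order of the samples.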

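Implementing and running it shows that it converges almost immediately: in the run below, the last misclassification occurs at sample index 13 of the first pass, before the whole training set has even been seen once. The problem may simply be too easy.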

In [1]:

import numpy as np
import pandas as pd
from sklearn import datasets, model_selection
import matplotlib.pyplot as plt
%matplotlib inline
import seaborn as sns

In [2]:

df = sns.load_dataset('iris')
df.head()

Out[2]:

	sepal_length	sepal_width	petal_length	petal_width	species
0	5.1	3.5	1.4	0.2	setosa
1	4.9	3.0	1.4	0.2	setosa
2	4.7	3.2	1.3	0.2	setosa
3	4.6	3.1	1.5	0.2	setosa
4	5.0	3.6	1.4	0.2	setosa

In [3]:

sns.set(style='ticks')

sns.pairplot(df,
             hue='species',
             markers=["o", "s", "x"])\
    .savefig('iris.png')

In [4]:

iris = datasets.load_iris()

type(iris)

Out[4]:

sklearn.utils.Bunch

In [5]:

iris.keys()

Out[5]:

dict_keys(['filename', 'target_names', 'data', 'target', 'feature_names', 'DESCR'])

In [6]:

iris_data = pd.DataFrame(data=iris.data, columns=iris.feature_names)
iris_data.head()

Out[6]:

	sepal length (cm)	sepal width (cm)	petal length (cm)	petal width (cm)
0	5.1	3.5	1.4	0.2
1	4.9	3.0	1.4	0.2
2	4.7	3.2	1.3	0.2
3	4.6	3.1	1.5	0.2
4	5.0	3.6	1.4	0.2

In [7]:

iris.target

Out[7]:

array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
       0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
       1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
       2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])

In [8]:

iris_df = iris_data.copy()
iris_df['species'] = iris.target
iris_df.head()

Out[8]:

	sepal length (cm)	sepal width (cm)	petal length (cm)	petal width (cm)	species
0	5.1	3.5	1.4	0.2	0
1	4.9	3.0	1.4	0.2	0
2	4.7	3.2	1.3	0.2	0
3	4.6	3.1	1.5	0.2	0
4	5.0	3.6	1.4	0.2	0

In [9]:

iris_df = iris_df.iloc[:, [1, 2, 4]]
iris_df = iris_df[iris_df['species'].isin([0, 1])]
iris_df.head()

Out[9]:

	sepal width (cm)	petal length (cm)	species
0	3.5	1.4	0
1	3.0	1.4	0
2	3.2	1.3	0
3	3.1	1.5	0
4	3.6	1.4	0

In [10]:

X = iris_df.iloc[:, :2].values
Y = iris_df.species.values

In [11]:

x_train, x_test, y_train, y_test = model_selection.train_test_split(X, Y)

In [12]:

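class Perceptron:
    """Single-output perceptron with a Heaviside step activation."""

    def __init__(self, n_iter=10, eta=0.01):
        self.n_iter = n_iter  # number of passes over the training data
        self.eta = eta        # learning rate

    def output(self, x):
        # weights[0] is the bias; weights[1:] multiply the input features
        weighted_sum = np.dot(x, self.__weights[1:]) + self.__weights[0]
        return self.__activate(weighted_sum)

    def fit(self, X, Y):
        # Start from all-zero weights (bias plus one weight per feature)
        self.__weights = np.zeros(X.shape[1] + 1)

        for i in range(self.n_iter):
            for j, (x, y) in enumerate(zip(X, Y)):
                y_output = self.output(x)
                diff = y - y_output
                if diff != 0:
                    # Log every misclassified sample
                    print('iter: {}, y_index: {}, diff: {}'.format(i, j, diff))
                # Delta rule; leaves the weights unchanged when diff == 0
                self.__weights += self.eta * diff * np.hstack((1, x))

    def __activate(self, weighted_sum):
        return self.__heaviside_step(weighted_sum)

    def __heaviside_step(self, z):
        # 0 for negative input, 1 otherwise
        return np.where(z < 0, 0, 1)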

In [13]:

perceptron = Perceptron()
perceptron.fit(x_train, y_train)

iter: 0, y_index: 0, diff: -1
iter: 0, y_index: 1, diff: 1
iter: 0, y_index: 2, diff: -1
iter: 0, y_index: 3, diff: 1
iter: 0, y_index: 13, diff: -1

In [14]:

perceptron.output(x_test)

Out[14]:

array([1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0,
       0, 0, 0])

In [15]:

y_test

Out[15]:

array([1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0,
       0, 0, 0])
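The two arrays above can be compared element by element rather than by eye. The sketch below copies them from Out[14] and Out[15] into plain lists and computes the fraction of matching entries; the variable names are illustrative.

```python
# Values copied from Out[14] (predictions) and Out[15] (true labels).
predicted = [1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0,
             0, 0, 0]
expected = [1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 0,
            0, 0, 0]

# Fraction of test samples where the prediction equals the label.
matches = sum(p == t for p, t in zip(predicted, expected))
accuracy = matches / len(expected)
print(matches, len(expected), accuracy)  # 25 25 1.0
```

Inside the notebook, something like `np.mean(perceptron.output(x_test) == y_test)` should give the same number directly from the arrays.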

Source code

https://github.com/aknd/machine-learning-samples
