from keras.models import Sequential
from keras.layers import Dense
from keras import initializers
from keras.optimizers import SGD
from keras.datasets import mnist
from keras.utils import to_categorical

# Load MNIST and flatten each 28x28 image into a 784-dimensional vector
(X_train, y_train), (X_test, y_test) = mnist.load_data()
X_train = X_train.reshape(60000, 784).astype('float32')
X_test = X_test.reshape(10000, 784).astype('float32')
X_train /= 255
X_test /= 255

# One-hot encode the labels (10 classes)
Y_train = to_categorical(y_train, 10)
Y_test = to_categorical(y_test, 10)

model = Sequential()
model.add(Dense(625, input_dim=784, activation='sigmoid'))
model.add(Dense(625, activation='sigmoid'))
model.add(Dense(10, activation='softmax'))
model.compile(optimizer=SGD(learning_rate=0.05), loss='categorical_crossentropy', metrics=['accuracy'])
model.summary()
history = model.fit(X_train, Y_train, epochs=100, batch_size=128, verbose=1)
_________________________________________________________________
Layer (type)                 Output Shape              Param #
=================================================================
dense_1 (Dense)              (None, 625)               490625
_________________________________________________________________
dense_2 (Dense)              (None, 625)               391250
_________________________________________________________________
dense_3 (Dense)              (None, 10)                6260
=================================================================
Total params: 888,135
Trainable params: 888,135
Non-trainable params: 0
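The parameter counts in the summary are easy to verify by hand: a Dense layer has fan_in × units weights plus one bias per unit. A quick check in plain Python (no Keras needed):

```python
def dense_params(fan_in, units):
    # weight matrix (fan_in x units) plus one bias per unit
    return fan_in * units + units

# (fan_in, units) for the three layers above
layers = [(784, 625), (625, 625), (625, 10)]
counts = [dense_params(i, u) for i, u in layers]
print(counts)       # [490625, 391250, 6260]
print(sum(counts))  # 888135
```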
This example trains a 3-layer fully-connected network on the MNIST dataset. The first two layers are sigmoid-activated dense layers, and they can take a parameter that is new here (it was removed from the code above, but it is worth a note): kernel_initializer, e.g. kernel_initializer=initializers.RandomNormal(). This is Keras's mechanism for initializing the trainable weights. For a Dense layer the default is glorot_uniform, the Glorot uniform initializer, also known as Xavier uniform initialization; the normal option used here is essentially RandomNormal, i.e. initialization from a normal distribution.
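To make the two schemes concrete, here is a NumPy sketch of their sampling rules (not the Keras implementation itself): glorot_uniform draws from U(-limit, limit) with limit = sqrt(6 / (fan_in + fan_out)), while Keras's RandomNormal draws from N(mean, stddev) with a default stddev of 0.05.

```python
import numpy as np

rng = np.random.default_rng(0)

def glorot_uniform(fan_in, fan_out):
    # Glorot/Xavier uniform: U(-limit, limit), limit = sqrt(6 / (fan_in + fan_out))
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

def random_normal(fan_in, fan_out, mean=0.0, stddev=0.05):
    # RandomNormal: N(mean, stddev); Keras's default stddev is 0.05
    return rng.normal(mean, stddev, size=(fan_in, fan_out))

# Weights for the first layer of the model above (784 -> 625)
W_glorot = glorot_uniform(784, 625)
W_normal = random_normal(784, 625)
print(np.abs(W_glorot).max())  # bounded by sqrt(6 / 1409)
print(W_normal.std())          # close to 0.05
```

Glorot scales the range by layer size so activations keep a similar variance from layer to layer, whereas RandomNormal uses one fixed spread regardless of layer width.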
The last new face in this example is the lr parameter of the SGD optimizer (learning_rate in recent Keras versions). It sets SGD's learning rate, i.e. how fast the weights are adjusted: a step that is too large may overshoot the optimum, while one that is too small makes the descent painfully slow.
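This trade-off is easy to see on a toy problem. The sketch below (my own illustration, not part of the original example) runs the SGD update rule w ← w − lr·∇f(w) on f(w) = w², whose gradient is 2w and whose minimum is at w = 0:

```python
def sgd_on_parabola(lr, steps=50, w0=1.0):
    # Minimize f(w) = w^2 with plain SGD; the gradient is 2w
    w = w0
    for _ in range(steps):
        w -= lr * 2 * w  # w <- w - lr * grad
    return w

print(abs(sgd_on_parabola(0.05)))   # moderate lr: close to the optimum 0
print(abs(sgd_on_parabola(0.001)))  # too small: barely moved after 50 steps
print(abs(sgd_on_parabola(1.1)))    # too large: each step overshoots, |w| blows up
```

Each update multiplies w by (1 − 2·lr), so anything with lr > 1 flips the sign and grows |w| every step; that is the overshooting the note above warns about.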