Homework Solutions: Python Example

Answer 3

Learning rate: 0.001, Momentum: 0.6, Batch size: 16, Hidden units: 50, Epochs: 100
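The setup above can be sketched as follows. This is a minimal illustration, not the assignment's actual code: it trains a one-hidden-layer network with minibatch SGD plus momentum using the hyperparameters listed above, but on small synthetic, linearly separable data (the real dataset and its dimensions are not shown in this writeup, so `n_in`, `n_out`, and the data are assumptions).

```python
import numpy as np

# Sketch of one-hidden-layer training with SGD + momentum.
# Hyperparameters match the writeup; data and dimensions are synthetic.
rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 20, 50, 2                 # toy dimensions (assumed)
lr, momentum, batch_size, epochs = 0.001, 0.6, 16, 100

X = rng.normal(size=(512, n_in))
y = (X @ rng.normal(size=n_in) > 0).astype(int)   # synthetic separable labels

params = {
    "W1": rng.normal(scale=0.1, size=(n_in, n_hidden)),
    "b1": np.zeros(n_hidden),
    "W2": rng.normal(scale=0.1, size=(n_hidden, n_out)),
    "b2": np.zeros(n_out),
}
vel = {k: np.zeros_like(v) for k, v in params.items()}  # momentum buffers

def forward(Xb):
    h = np.maximum(0.0, Xb @ params["W1"] + params["b1"])   # ReLU hidden layer
    logits = h @ params["W2"] + params["b2"]
    e = np.exp(logits - logits.max(axis=1, keepdims=True))
    return h, e / e.sum(axis=1, keepdims=True)              # softmax output

def accuracy():
    return float((forward(X)[1].argmax(1) == y).mean())

for _ in range(epochs):
    order = rng.permutation(len(X))
    for s in range(0, len(X), batch_size):
        idx = order[s:s + batch_size]
        Xb, yb = X[idx], y[idx]
        h, p = forward(Xb)
        d_logits = p.copy()
        d_logits[np.arange(len(yb)), yb] -= 1.0    # softmax cross-entropy grad
        d_logits /= len(yb)
        d_h = (d_logits @ params["W2"].T) * (h > 0)  # backprop through ReLU
        grads = {"W2": h.T @ d_logits, "b2": d_logits.sum(0),
                 "W1": Xb.T @ d_h,     "b1": d_h.sum(0)}
        for k in params:                           # v <- m*v - lr*g; p += v
            vel[k] = momentum * vel[k] - lr * grads[k]
            params[k] += vel[k]

print(f"final train accuracy: {accuracy():.2f}")
```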

Maximum training accuracy: 93%. Maximum testing accuracy: 82%.

Answer 4


Answer 5

Different learning rate

[Figure: Test Accuracy vs. Epoch (Epochs: 100, Batch size: 16, Hidden units: 50, LR: 0.001, Momentum: 0.6)]

[Figure: Test Accuracy vs. Epoch (Epochs: 100, Batch size: 16, Hidden units: 50, LR: 0.002, Momentum: 0.6)]

Different batch size


Different hidden units

[Figure: Test Accuracy vs. Epoch (Epochs: 100, Batch size: 16, Hidden units: 10, LR: 0.001, Momentum: 0.6)]

[Figure: Test Accuracy vs. Epoch (Epochs: 100, Batch size: 16, Hidden units: 30, LR: 0.001, Momentum: 0.6)]

Answer 6

The neural network in this assignment reached a maximum accuracy of about 82% on the test set, which is very good for a network with a single hidden layer trained on a small training set of 10,000 examples. I think that without using a convolutional neural network, the accuracy of a one-hidden-layer network would peak at around 85%.

Learning rate:

The learning rate seems to be a very sensitive parameter: increasing it beyond a certain limit causes the network to stop learning entirely, because the updates bounce/skip past the local optimum.
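The bouncing behaviour can be seen on a toy problem (this is an illustration, not the assignment's network): plain gradient descent on f(x) = x², where each step is x ← x − lr·2x. Below a critical step size the iterate contracts toward the optimum; above it, every step overshoots and the iterate moves further away.

```python
# Gradient descent on f(x) = x^2; gradient is 2x.
# Each step multiplies x by (1 - 2*lr), so |1 - 2*lr| < 1 converges
# and |1 - 2*lr| > 1 diverges (the "skipping past the optimum" regime).
def descend(lr, steps=50, x=1.0):
    for _ in range(steps):
        x -= lr * 2 * x
    return x

small = descend(0.1)   # contracts: x is scaled by 0.8 each step
large = descend(1.1)   # overshoots: x is scaled by -1.2 each step
print(small, large)
```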

Hidden units:

A very small number of hidden units (e.g. 10) causes both the training and test accuracy to be low, because the network is not expressive enough. However, increasing the number of hidden units beyond a certain point does not further increase the test/train accuracies. Moreover, the variance was larger when using a smaller number of hidden units.

Increasing the number of hidden units from 10 to 30 had a larger effect on the test accuracy than increasing it from 30 to 50.
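For scale, the parameter count of a one-hidden-layer network grows linearly in the hidden-unit count, so the accuracy plateau above is a case of diminishing returns rather than a capacity ceiling. The dimensions below (784 inputs, 10 outputs, MNIST-like) are an assumption, since the writeup does not state the dataset's shape.

```python
# Parameter count for a fully connected net with one hidden layer:
# weights + biases for the input->hidden and hidden->output layers.
def n_params(n_in, n_hidden, n_out):
    return (n_in * n_hidden + n_hidden) + (n_hidden * n_out + n_out)

for h in (10, 30, 50):
    print(h, n_params(784, h, 10))   # assumed MNIST-like dimensions
```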

Batch size:

A batch size of 16 worked best in my neural network implementation, while a batch size of 64 (the maximum tested) performed the worst. There also seems to be less variance in the test accuracy when using a smaller batch size.
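One way to see why batch size matters: with the training-set size fixed, it determines how many parameter updates happen per epoch. Smaller batches mean more (and noisier) gradient updates per epoch, which is one common explanation for the difference observed above. A quick check using the 10,000-example training set mentioned in Answer 6:

```python
import math

# Updates per epoch = ceil(training examples / batch size).
n_train = 10000                      # training-set size from Answer 6
for batch_size in (16, 64):
    updates = math.ceil(n_train / batch_size)
    print(batch_size, updates)       # batch 16 gives 4x as many updates
```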