MNIST Hello World Example Part 2

We will step through the data process steps. Steps 1-4 are explained in Part 1 of this project, located here. This part picks up after step 4, though we first briefly describe a difference in the Data Processing step.

Data Process Steps

  1. Goal or Hypothesis

  2. Data Retrieval

  3. Data Processing

  4. Data Exploration

  5. Model Data

  6. Present Results

Data Processing

The difference here is how we split and reshape the data so that we can use the convolution layers in Keras. We are using TensorFlow as the backend, so the images need TensorFlow's channels-last shape (height, width, channels).

In [43]:
#Steps 2 and 3 recap

from keras.datasets import mnist #Loads the Data Set
from keras.utils import to_categorical 

#load the data
(x_train, y_train), (x_test, y_test) = mnist.load_data()

#Change values from uint8 (0 to 255) to float32 (0 to 1)
x_train = x_train.astype('float32')/255   
x_test = x_test.astype('float32')/255

#Number of images, height, width, color dimension 
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
    
#Change labels to categorical format
y_test = to_categorical(y_test)
y_train = to_categorical(y_train)
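
To sanity-check the preprocessing, we can print the resulting array shapes. MNIST ships with 60,000 training images and 10,000 test images, so the expected shapes in the comments below assume the standard split.

In [ ]:
#Quick sanity check of the preprocessed arrays
print(x_train.shape) #expected (60000, 28, 28, 1)
print(x_test.shape)  #expected (10000, 28, 28, 1)
print(y_train.shape) #expected (60000, 10) - one-hot digits 0-9
print(y_test.shape)  #expected (10000, 10)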

Model Data

In this example, we use a convolutional neural network (CNN) to process our images; its convolution layers learn what the images' local features look like. We have two models: a basic one and a more complex one. Due to the limitations of the computer this code is running on, we keep the backend on the CPU rather than the GPU.

In [38]:
from keras import models #Loads models
from keras import layers #Loads Layers

def my_model_01():
    #This is a basic CNN. 
    # Train - epoch: 3,batch: 64 - loss: 0.0664 - acc: 0.9808 - val_loss: 0.0732 - val_acc: 0.9806
    # Test - epoch 3, batch - 64 - Loss 0.06305718464348466 Accuracy 0.979
    model = models.Sequential()
    model.add(layers.Conv2D(32,(3,3),activation='relu', input_shape=(28,28,1)))
    model.add(layers.Flatten())
    model.add(layers.Dense(10,activation='softmax'))
    model.compile(optimizer="rmsprop", loss="categorical_crossentropy", metrics=["accuracy"])
    return model
In [39]:
def my_model_02():
    #This is a deeper CNN with pooling and dropout. 
    # Train - epoch: 3, batch: 64 - loss: 0.0695 - acc: 0.9803 - val_loss: 0.0763 - val_acc: 0.9782
    # Test - epoch 3, batch - 64 - Loss 0.06257567940205336 Accuracy 0.9807
    model = models.Sequential()
    model.add(layers.Conv2D(32,(3,3),activation='relu', input_shape=(28,28,1)))
    model.add(layers.MaxPooling2D((2,2)))
    model.add(layers.Dropout(0.5))
    model.add(layers.Conv2D(64,(3,3),activation='relu'))
    model.add(layers.MaxPooling2D((2,2)))
    model.add(layers.Dropout(0.5))
    model.add(layers.Conv2D(64,(3,3),activation='relu'))

    model.add(layers.Flatten())
    model.add(layers.Dense(64,activation='relu'))
    model.add(layers.Dense(10,activation='softmax'))
    model.compile(optimizer="rmsprop", loss="categorical_crossentropy", metrics=["accuracy"])
    return model
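
Before training, it helps to inspect each architecture. Keras models expose a summary() method that prints every layer's output shape and parameter count, which makes it easy to compare the two models.

In [ ]:
#Print layer shapes and parameter counts for both models
my_model_01().summary()
my_model_02().summary()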
In [40]:
import numpy as np
#Arbitrarily pick 20 epochs and a batch size of 64
epochs = 20
batch_size = 64
model = my_model_01()
#model = my_model_02()
history = model.fit(x_train, y_train, epochs=epochs, batch_size=batch_size, validation_split=0.15)
Train on 51000 samples, validate on 9000 samples
Epoch 1/20
51000/51000 [==============================] - 31s 609us/step - loss: 0.2387 - acc: 0.9314 - val_loss: 0.1027 - val_acc: 0.9717
Epoch 2/20
51000/51000 [==============================] - 20s 388us/step - loss: 0.0897 - acc: 0.9737 - val_loss: 0.0783 - val_acc: 0.9782
Epoch 3/20
51000/51000 [==============================] - 20s 399us/step - loss: 0.0664 - acc: 0.9808 - val_loss: 0.0732 - val_acc: 0.9806
Epoch 4/20
51000/51000 [==============================] - 20s 394us/step - loss: 0.0546 - acc: 0.9841 - val_loss: 0.0690 - val_acc: 0.9817
Epoch 5/20
51000/51000 [==============================] - 25s 493us/step - loss: 0.0477 - acc: 0.9863 - val_loss: 0.0648 - val_acc: 0.9828
Epoch 6/20
51000/51000 [==============================] - 37s 723us/step - loss: 0.0424 - acc: 0.9882 - val_loss: 0.0752 - val_acc: 0.9804
Epoch 7/20
51000/51000 [==============================] - 36s 705us/step - loss: 0.0381 - acc: 0.9889 - val_loss: 0.0653 - val_acc: 0.9830
Epoch 8/20
51000/51000 [==============================] - 30s 583us/step - loss: 0.0344 - acc: 0.9905 - val_loss: 0.0661 - val_acc: 0.9833
Epoch 9/20
51000/51000 [==============================] - 31s 614us/step - loss: 0.0322 - acc: 0.9911 - val_loss: 0.0670 - val_acc: 0.9821
Epoch 10/20
51000/51000 [==============================] - 32s 621us/step - loss: 0.0303 - acc: 0.9915 - val_loss: 0.0667 - val_acc: 0.9834
Epoch 11/20
51000/51000 [==============================] - 32s 634us/step - loss: 0.0274 - acc: 0.9925 - val_loss: 0.0684 - val_acc: 0.9842
Epoch 12/20
51000/51000 [==============================] - 32s 622us/step - loss: 0.0257 - acc: 0.9932 - val_loss: 0.0659 - val_acc: 0.9843
Epoch 13/20
51000/51000 [==============================] - 32s 631us/step - loss: 0.0239 - acc: 0.9934 - val_loss: 0.0717 - val_acc: 0.9822
Epoch 14/20
51000/51000 [==============================] - 33s 650us/step - loss: 0.0217 - acc: 0.9940 - val_loss: 0.0763 - val_acc: 0.9820
Epoch 15/20
51000/51000 [==============================] - 33s 639us/step - loss: 0.0211 - acc: 0.9942 - val_loss: 0.0791 - val_acc: 0.9811
Epoch 16/20
51000/51000 [==============================] - 33s 642us/step - loss: 0.0199 - acc: 0.9945 - val_loss: 0.0794 - val_acc: 0.9831
Epoch 17/20
51000/51000 [==============================] - 32s 632us/step - loss: 0.0186 - acc: 0.9951 - val_loss: 0.0769 - val_acc: 0.9824
Epoch 18/20
51000/51000 [==============================] - 33s 642us/step - loss: 0.0168 - acc: 0.9956 - val_loss: 0.0893 - val_acc: 0.9782
Epoch 19/20
51000/51000 [==============================] - 33s 645us/step - loss: 0.0155 - acc: 0.9957 - val_loss: 0.0838 - val_acc: 0.9822
Epoch 20/20
51000/51000 [==============================] - 33s 646us/step - loss: 0.0158 - acc: 0.9953 - val_loss: 0.0806 - val_acc: 0.9807
In [41]:
import matplotlib.pyplot as plt

#Now let's visualize the training results
history_dict = history.history

loss_values = np.array(history_dict['loss'])
val_loss_values = np.array(history_dict['val_loss'])

epochs = range(1, len(loss_values) + 1)
plt.xticks(epochs)

plt.plot(epochs, loss_values, label='Training loss')
plt.plot(epochs, val_loss_values, label='Validation loss')
plt.title('Training and validation loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()

plt.clf()
plt.xticks(epochs)
acc_values = history_dict['acc']
val_acc_values = history_dict['val_acc']
plt.plot(epochs, acc_values, label='Training acc')
plt.plot(epochs, val_acc_values, label='Validation acc')
plt.title('Training and validation accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
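
The plots show the validation loss bottoming out after only a few epochs while the training loss keeps falling, a classic sign of overfitting; that is why the final fit below uses just 3 epochs. As an alternative to reading the epoch count off a plot, Keras provides an EarlyStopping callback. A minimal sketch, where the patience value of 2 is an arbitrary choice:

In [ ]:
from keras.callbacks import EarlyStopping

#Stop training once val_loss fails to improve for 2 epochs in a row
early_stop = EarlyStopping(monitor='val_loss', patience=2)

model = my_model_01()
model.fit(x_train, y_train, epochs=20, batch_size=64,
          validation_split=0.15, callbacks=[early_stop])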
In [42]:
#Fit/Train the model 
model = my_model_01()
#model = my_model_02()
model.fit(x_train, y_train, epochs = 3, batch_size=64)
 
#See how accurate the training model is against a test set
loss, accuracy = model.evaluate(x_test,y_test)
print("Loss " + str(loss))
print("Accuracy " + str(accuracy))
Epoch 1/3
60000/60000 [==============================] - 32s 536us/step - loss: 0.2177 - acc: 0.9367
Epoch 2/3
60000/60000 [==============================] - 32s 529us/step - loss: 0.0839 - acc: 0.97590s - loss: 0.0839
Epoch 3/3
60000/60000 [==============================] - 32s 541us/step - loss: 0.0637 - acc: 0.9816
10000/10000 [==============================] - 2s 240us/step
Loss 0.06305718464348466
Accuracy 0.979
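
Once we are happy with the evaluation, we can persist the trained model so it does not have to be retrained. A minimal sketch using Keras's save/load functions; the filename is illustrative:

In [ ]:
from keras.models import load_model

#Save the architecture, weights, and optimizer state to a single file
model.save('mnist_cnn_basic.h5')

#Later, reload it without retraining
model = load_model('mnist_cnn_basic.h5')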

Results

The basic CNN reaches 97.9% accuracy, and the more advanced CNN reaches 98.07%. With more tuning, we could likely get higher without overfitting. Adding more layers and using a more complex model could also increase our accuracy. The problem with a larger, more complex model is the amount of time it takes to run; this is where a GPU would come in handy for speeding up the process. But with 98% accuracy, we can call this solved.
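
To see the model in action on individual digits, we can ask for class probabilities and take the argmax. A quick sketch that predicts the first five test images and compares them against the true labels:

In [ ]:
import numpy as np

#Predicted vs. true digits for the first 5 test images
probs = model.predict(x_test[:5])
print(np.argmax(probs, axis=1))      #predicted digits
print(np.argmax(y_test[:5], axis=1)) #true digits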
