We will step through the Data Process. Steps 1-4 are explained in Part 1 of this project, located here. This post is a continuation of part 4. Below, we briefly describe one difference in the Data Processing step.
Data Process Steps
1. Goal or Hypothesis
2. Data Retrieval
3. Data Processing
4. Data Exploration
5. Model Data
6. Present Results
The difference here is how we split and reshape the data so we can use the convolutional layers in Keras. We are using TensorFlow as the backend, so the images need TensorFlow's expected shape.
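As a quick sanity check (an addition for illustration, not part of the original code), Keras reports which image layout the backend expects; with TensorFlow it is typically channels_last, which is why the color dimension is appended at the end of the reshape below:
from keras import backend as K
#TensorFlow uses channels_last ordering: (samples, height, width, channels)
print(K.image_data_format())  #expected: 'channels_last'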
#Part 2,3 recap
from keras.datasets import mnist #Loads the Data Set
from keras.utils import to_categorical
#load the data
(x_train, y_train), (x_test, y_test) = mnist.load_data()
#Change values from uint8 (0 to 255) to float32 (0 to 1)
x_train = x_train.astype('float32')/255
x_test = x_test.astype('float32')/255
#Number of images, height, width, color dimension
x_train = x_train.reshape(x_train.shape[0], 28, 28, 1)
x_test = x_test.reshape(x_test.shape[0], 28, 28, 1)
#Change labels to categorical format
y_test = to_categorical(y_test)
y_train = to_categorical(y_train)
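It can be helpful to confirm the reshape and encoding worked; this quick check (not in the original post) prints the resulting array shapes:
#Confirm the tensors have the shape the Conv2D layers expect
print(x_train.shape)  #(60000, 28, 28, 1)
print(x_test.shape)   #(10000, 28, 28, 1)
print(y_train.shape)  #(60000, 10) after one-hot encoding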
In this example, we use a convolutional neural network (CNN) to process our images; the convolutions learn what the images' attributes look like. We have two models: a basic one and a more complex one. Due to the limitations of the computer this code is running on, we keep the backend on the CPU rather than the GPU.
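The original post does not show how the backend was kept on the CPU, but one common approach (a sketch, assuming a CUDA-enabled TensorFlow build) is to hide the GPUs before TensorFlow is first imported:
import os
#Hide all GPUs so TensorFlow/Keras falls back to the CPU
#This must run before TensorFlow is first imported
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'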
from keras import models #Loads models
from keras import layers #Loads Layers
def my_model_01():
    #This is a basic CNN.
    # Train - epoch: 3, batch: 64 - loss: 0.0664 - acc: 0.9808 - val_loss: 0.0732 - val_acc: 0.9806
    # Test - epoch: 3, batch: 64 - Loss 0.06305718464348466 Accuracy 0.979
    model = models.Sequential()
    model.add(layers.Conv2D(32, (3,3), activation='relu', input_shape=(28,28,1)))
    model.add(layers.Flatten())
    model.add(layers.Dense(10, activation='softmax'))
    model.compile(optimizer="rmsprop", loss="categorical_crossentropy", metrics=["accuracy"])
    return model
def my_model_02():
    #This is a deeper CNN with pooling and dropout.
    # Train - epoch: 3, batch: 64 - loss: 0.0695 - acc: 0.9803 - val_loss: 0.0763 - val_acc: 0.9782
    # Test - epoch: 3, batch: 64 - Loss 0.06257567940205336 Accuracy 0.9807
    model = models.Sequential()
    model.add(layers.Conv2D(32, (3,3), activation='relu', input_shape=(28,28,1)))
    model.add(layers.MaxPooling2D((2,2)))
    model.add(layers.Dropout(0.5))
    model.add(layers.Conv2D(64, (3,3), activation='relu'))
    model.add(layers.MaxPooling2D((2,2)))
    model.add(layers.Dropout(0.5))
    model.add(layers.Conv2D(64, (3,3), activation='relu'))
    model.add(layers.Flatten())
    model.add(layers.Dense(64, activation='relu'))
    model.add(layers.Dense(10, activation='softmax'))
    model.compile(optimizer="rmsprop", loss="categorical_crossentropy", metrics=["accuracy"])
    return model
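To compare the two architectures at a glance (an optional check, not in the original post), Keras can print each model's layers, output shapes, and parameter counts:
#Inspect layer output shapes and parameter counts
my_model_01().summary()
my_model_02().summary()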
import numpy as np
#Arbitrarily pick 20 epochs and a batch_size of 64
epochs = 20
batch_size = 64
model = my_model_01()
#model = my_model_02()  #Uncomment to train the deeper CNN instead
history = model.fit(np.array(x_train), np.array(y_train), epochs=epochs, batch_size=batch_size, validation_split=0.15)
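The fit call returns a History object whose history attribute is a dict of per-epoch metrics; printing its keys (a quick check, not in the original post) shows exactly what is available to plot:
print(history.history.keys())  #e.g. dict_keys(['val_loss', 'val_acc', 'loss', 'acc'])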
import matplotlib.pyplot as plt
#Now let's plot the training and validation results
history_dict = history.history
loss_values = history_dict['loss']
loss_values = np.array(loss_values)
val_loss_values = history_dict['val_loss']
val_loss_values = np.array(val_loss_values)
epochs = range(1, len(loss_values) + 1)
plt.xticks(epochs)
plt.plot(epochs, loss_values, label='Training loss')
plt.plot(epochs, val_loss_values, label='Validation loss')
plt.title('Training and validation loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.legend()
plt.show()
plt.clf()
plt.xticks(epochs)
#Older Keras versions store these under 'acc'/'val_acc'; newer versions use 'accuracy'/'val_accuracy'
acc_values = history_dict['acc']
val_acc_values = history_dict['val_acc']
plt.plot(epochs, acc_values, label='Training acc')
plt.plot(epochs, val_acc_values, label='Validation acc')
plt.title('Training and validation accuracy')
plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.legend()
plt.show()
#Fit/train a fresh model on the full training set for 3 epochs, based on the curves above
model = my_model_01()
#model = my_model_02()
model.fit(x_train, y_train, epochs=3, batch_size=64)
#See how accurate the training model is against a test set
loss, accuracy = model.evaluate(x_test,y_test)
print("Loss " + str(loss))
print("Accuracy " + str(accuracy))
The basic CNN reaches 97.9% accuracy, and the more advanced CNN reaches 98.07%. With more tuning, we could likely push higher without overfitting, and adding more layers to build a more complex model could also increase our accuracy. The drawback of a larger, more complex model is how long it takes to run; this is where a GPU would come in handy to speed up the process. But with 98% accuracy, we can call this solved.
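As one example of the tuning mentioned above (a sketch, not something run in the original post), a Keras EarlyStopping callback halts training once the validation loss stops improving, which helps avoid overfitting while still allowing a larger epoch budget:
from keras.callbacks import EarlyStopping
#Stop once validation loss has not improved for 2 consecutive epochs
early_stop = EarlyStopping(monitor='val_loss', patience=2)
model = my_model_02()
model.fit(x_train, y_train, epochs=20, batch_size=64,
          validation_split=0.15, callbacks=[early_stop])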