<h1>Computer Vision by Learning</h1>

<h2>General lab information</h2>
For the five days of the course, you will receive five notebooks. For each notebook, we provide the outline of the topic of the day with basic code, as well as a set of assignments. The assignments are given in red.
<br><br>
<strong>IMPORTANT:</strong>
<br>
To pass the course, you must complete all the assignments. Submit the notebooks of all five days to <i>P.S.M.Mettes@uva.nl</i> no later than March 24th, 23:59.
<br><br>
Outline of the labs for the five days:
<ol>
<li>Installation and training your first networks</li>
<li>Visualization</li>
<li>Style transfer</li>
<li>Data augmentation</li>
<li>CIFAR challenge</li>
</ol>

<h2>Day 1: Installation and training your first networks</h2>
<h3>Installation: Tensorflow and Keras</h3>
Throughout this course, we will program in Keras, a high-level neural networks library written in Python. Here, we use Tensorflow in the background, i.e. we call functions in Keras, which in turn call Tensorflow.
<br>
<h4>Step 1: Installing Python</h4>
If Python is not yet installed, we suggest you install Anaconda. This is a Python distribution that comes with a wide collection of additional packages. Furthermore, it allows for easy installation of Tensorflow and Keras through 'pip'.
<br>To install Anaconda, follow the instructions from here: https://docs.continuum.io/anaconda/install.
<h4>Step 2: Installing Tensorflow</h4>
Tensorflow is an open source software library for machine learning, and especially deep learning. To install Tensorflow, follow the instructions from here: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/g3doc/get_started/os_setup.md.
<br>
<strong>Note:</strong> Keras also works on top of Theano instead of Tensorflow. If you have already installed Theano, you do not strictly need to install Tensorflow. To use Theano as the backend, follow the steps here: https://keras.io/backend/.
<h4>Step 3: Installing Keras</h4>
We will use Keras for the programming part. As we will find out throughout this week, Keras abstracts several elements of Tensorflow, making it possible to design, train, and test networks in minutes!
<br>
To install Keras, follow the steps here: https://keras.io/#installation.
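Once the three steps are done, a quick sanity check can confirm that the packages are importable. The snippet below is only a minimal sketch: the helper name check_packages is hypothetical and not part of any of the libraries, and a package that fails to import for any reason is simply reported as not available.

```python
import importlib


def check_packages(names=("numpy", "tensorflow", "keras")):
    """Try to import each package and report its version (None if the import fails)."""
    status = {}
    for name in names:
        try:
            module = importlib.import_module(name)
            # Not every package exposes __version__, so fall back to a placeholder.
            status[name] = getattr(module, "__version__", "unknown")
        except Exception:
            # Any failure at import time is reported as "not available".
            status[name] = None
    return status


for name, version in sorted(check_packages().items()):
    print("%-12s %s" % (name, version if version is not None else "NOT AVAILABLE"))
```

If one of the packages is reported as not available, revisit the corresponding installation step above.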

<h3>A look at the data</h3>
Throughout this first day, we will work on a subset of the MNIST digit dataset. Let's first load, subsample, and look at the data.

In [None]:
import numpy as np
from   keras.datasets import mnist

(trainx, trainy), (testx, testy) = mnist.load_data()
print "Data shape:", trainx.shape, trainy.shape, testx.shape, testy.shape
print "Unique labels:", np.unique(trainy)

The complete dataset has 60,000 training examples and 10,000 test examples. Each example is an image of 28 by 28 pixels. There are ten unique digits in total. Since we are mostly interested in setting up an initial model and since participants might only have access to a CPU, we will work with a smaller subset of the train and test data today.

In [None]:
# Randomly subsample train and test data given a seed.
seed = 500
np.random.seed(seed)

nr_train, nr_test = 1000, 1000
tridxs = np.random.choice(trainx.shape[0], nr_train, replace=False)
trainx = trainx[tridxs]
trainy = trainy[tridxs]
teidxs = np.random.choice(testx.shape[0], nr_test, replace=False)
testx  = testx[teidxs]
testy  = testy[teidxs]
print "Data shape:", trainx.shape, trainy.shape, testx.shape, testy.shape
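Fixing the seed makes the subsample reproducible: the same seed always selects the same examples, and sampling without replacement guarantees that no example is picked twice. The sketch below illustrates both properties. It assumes a hypothetical helper, sample_indices, that uses a local random generator instead of the global numpy seed used in the cell above.

```python
import numpy as np


def sample_indices(nr_total, nr_samples, seed):
    """Draw nr_samples unique indices from range(nr_total), reproducibly."""
    rng = np.random.RandomState(seed)
    return rng.choice(nr_total, nr_samples, replace=False)


first = sample_indices(60000, 1000, seed=500)
second = sample_indices(60000, 1000, seed=500)
print("Same indices on both runs: %s" % np.array_equal(first, second))
print("All indices unique: %s" % (len(np.unique(first)) == len(first)))
```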

The digits themselves look as follows:

In [None]:
import matplotlib.pyplot as plt

# Plot each digit.
for i in xrange(len(np.unique(trainy))):
    didx = np.where(trainy == i)[0][0]
    plt.subplot(2, 5, i+1)
    plt.title("Digit: %d" %(i))
    plt.imshow(trainx[didx,:,:], cmap='gray')
plt.show()

<h2>Training a multi-layered feed forward network</h2>
For our first model, we will look at a standard feed forward network consisting of fully connected layers. The first step is to flatten each digit from a 28 by 28 matrix into a vector of 784 values. We furthermore normalize the pixel values to the [0,1] range.

In [None]:
# Flatten.
nr_features = trainx.shape[1]*trainx.shape[2]
ttrainx     = trainx.reshape(trainx.shape[0], nr_features)
ttestx      = testx.reshape(testx.shape[0], nr_features)

# Make the data floats and normalize.
ttrainx  = ttrainx.astype('float32')
ttrainx /= 255.
ttestx   = ttestx.astype('float32')
ttestx  /= 255.

print "Data shape:", ttrainx.shape, trainy.shape, ttestx.shape, testy.shape

We have now collapsed the matrices into long vectors and normalized them.
<br><br>
Second, we need to convert the labels from an array of class indices into a binary (one-hot) matrix with one column per class, as follows:

In [None]:
from keras.utils import np_utils

nr_classes = len(np.unique(trainy))
trainy     = np_utils.to_categorical(trainy, nr_classes)
testy      = np_utils.to_categorical(testy, nr_classes)
print "Label shape:", trainy.shape, testy.shape
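To make the label conversion concrete, here is a rough numpy equivalent. It is a sketch for illustration only: one_hot is a hypothetical helper, not the Keras implementation.

```python
import numpy as np


def one_hot(labels, nr_classes):
    """Turn an array of integer class indices into a binary (one-hot) matrix."""
    labels = np.asarray(labels, dtype=int)
    encoded = np.zeros((labels.shape[0], nr_classes), dtype="float32")
    # Put a single 1 in each row, at the column given by the label.
    encoded[np.arange(labels.shape[0]), labels] = 1.0
    return encoded


example = one_hot([3, 0, 9], 10)
print(example.shape)  # (3, 10)
print(example[0])
```

Each row contains exactly one 1, and taking the argmax of a row recovers the original digit label.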

Now that all the preparations have been made, we can start building our network. Below, we show a very basic network, consisting of 3 dense (i.e. fully connected) layers, without any non-linear activations or regularization.

In [None]:
from keras.models import Sequential
from keras.layers.core import Dense, Activation

#
# Simple fixed function that returns a new MLP.
#
def get_mlp():
    # Initialize a new network.
    model = Sequential()

    # The first layer.
    model.add(Dense(256, input_shape=(nr_features,)))

    # The second layer.
    model.add(Dense(128))

    # The third layer.
    model.add(Dense(nr_classes))

    # Add the softmax layer.
    model.add(Activation('softmax'))

    return model

# Show what is in the model.
model = get_mlp()
model.summary()
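The parameter counts reported by model.summary() can be reproduced by hand. The sketch below assumes that every Dense layer has one weight per input-output pair plus one bias per output unit. The helper dense_params is hypothetical and only serves this illustration.

```python
def dense_params(nr_inputs, nr_outputs):
    """Weights plus one bias per output unit of a fully connected layer."""
    return nr_inputs * nr_outputs + nr_outputs


layer_sizes = [28 * 28, 256, 128, 10]
counts = [dense_params(n_in, n_out)
          for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
for index, count in enumerate(counts):
    print("Dense layer %d: %d parameters" % (index + 1, count))
print("Total: %d parameters" % sum(counts))
```

Almost all parameters sit in the first layer, because it connects all 784 input pixels to 256 units.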

We explicitly note that the above network is limited in multiple aspects:
<ul>
<li>It disregards the spatial layout of the images (to be discussed later today).</li>
<li>There are no non-linear activations.</li>
<li>There is no regularization, such as Dropout.</li>
<li>The number of layers and the number of nodes per layer are limited.</li>
</ul>
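The lack of non-linear activations is a severe limitation: a stack of purely linear layers can only express what a single linear layer can express. The numpy sketch below illustrates this with random weights of the same sizes as in get_mlp(). The final softmax is left out, and all variable names are chosen for this illustration only.

```python
import numpy as np

rng = np.random.RandomState(0)

# Random weights and biases with the same layer sizes as get_mlp().
W1, b1 = rng.randn(784, 256), rng.randn(256)
W2, b2 = rng.randn(256, 128), rng.randn(128)
W3, b3 = rng.randn(128, 10), rng.randn(10)

x = rng.rand(5, 784)  # five fake flattened images

# Forward pass through the three linear layers (softmax left out).
three_layers = np.dot(np.dot(np.dot(x, W1) + b1, W2) + b2, W3) + b3

# The same mapping expressed as one linear layer.
W = np.dot(np.dot(W1, W2), W3)
b = np.dot(np.dot(b1, W2) + b2, W3) + b3
one_layer = np.dot(x, W) + b

print("Identical outputs: %s" % np.allclose(three_layers, one_layer))
```

Adding a non-linearity between the layers breaks this equivalence, which is what makes depth useful.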
<br><br>
However, we can already train and test our network. This works as follows.

In [None]:
from keras.optimizers import SGD
import tensorflow as tf

#
# A function that trains a given model.
#
# model      - Keras model.
# x          - Train features.
# y          - Train labels.
# batch_size - The number of samples for each mini-batch.
# nb_epoch   - The number of training epochs.
# device     - Either '/cpu:0' or '/gpu:0'.
#
def train_model(model, x, y, batch_size=32, nb_epoch=10, device='/cpu:0'):
    with tf.device(device):
        # Compile the model with a specific optimizer.
        model.compile(loss='categorical_crossentropy', optimizer=SGD(), metrics=['accuracy'])

        # Train the model.
        history = model.fit(x, y, batch_size=batch_size, nb_epoch=nb_epoch, verbose=1)

    # Return the training history (loss and accuracy per epoch) for later inspection.
    return history

#
# Evaluate a trained model.
#
# model - The trained model.
# x     - Test features.
# y     - Test labels.
#
def test_model(model, x, y):
    return model.evaluate(x, y, verbose=0)

# Set a number of parameters for the training.
batch_size = 32
nb_epoch   = 10
model      = get_mlp()
# Train and test.
train_model(model, ttrainx, trainy, batch_size, nb_epoch)
score  = test_model(model, ttestx, testy)
# Show the results.
print "Test score: %.4f, test accuracy: %.4f" %(score[0], score[1])

How easy can life be?! A network trained and tested in just a few lines of code.
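To see what the softmax layer and the 'categorical_crossentropy' loss compute, here is a small numpy sketch. It is an illustration under assumptions, not the Keras implementation: the function names mirror the Keras strings, and the clipping constant eps is an assumed value used to avoid taking the logarithm of zero.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax; subtracting the row maximum keeps exp() numerically stable."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=1, keepdims=True)


def categorical_crossentropy(probabilities, one_hot_labels, eps=1e-7):
    """Mean negative log-probability assigned to the correct class."""
    clipped = np.clip(probabilities, eps, 1.0)
    return float(-np.mean(np.sum(one_hot_labels * np.log(clipped), axis=1)))


logits = np.array([[2.0, 1.0, 0.1],
                   [0.5, 0.5, 0.5]])
labels = np.array([[1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0]])
probabilities = softmax(logits)
print("Rows sum to one: %s" % np.allclose(probabilities.sum(axis=1), 1.0))
print("Loss: %.4f" % categorical_crossentropy(probabilities, labels))
```

The loss is low when the network assigns a high probability to the correct class, and grows as that probability shrinks.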

<h3>Improving the fully connected network</h3>
The model generated in the 'get_mlp()' function works out of the box, but can be directly improved in multiple ways. Here, we look at three ways, namely adding non-linear activations, adding Dropout, and using different optimizers.
<br><br>
<strong>1. Rectified Linear Unit activations</strong>
<br>
In Keras, a ReLU activation can be added, e.g. after a Dense layer in the get_mlp() function, with the command: 'model.add(Activation('relu'))'.
<br><br>
<strong>2. Dropout</strong>
<br>
Similarly, Dropout can be added, e.g. after a non-linear activation, with a single command: 'model.add(Dropout(r))', where r is a float between 0 and 1 specifying the fraction of units to drop. Dropout has to be imported alongside Dense and Activation.
<br><br>
<strong>3. Other optimizers</strong>
<br>
In the current train_model function, standard SGD is used. However, optimizers such as RMSprop, Adam, Adagrad, etc. are also available in Keras, see: https://keras.io/optimizers/. Each optimizer also has a number of tunable parameters.
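To build some intuition for the first two improvements, the numpy sketch below shows what a ReLU and a Dropout layer do to a batch of activations. It is an illustration only: the function names are hypothetical, and rescaling the kept units by 1 / (1 - rate) during training is an assumption of this sketch.

```python
import numpy as np


def relu(x):
    """Rectified Linear Unit: negative inputs become zero."""
    return np.maximum(x, 0.0)


def dropout(x, rate, rng):
    """Zero a random fraction `rate` of the entries and rescale the rest (training time only)."""
    keep = rng.rand(*x.shape) >= rate
    return x * keep / (1.0 - rate)


rng = np.random.RandomState(0)
activations = relu(np.array([[-1.5, 0.0, 2.0, 3.0]]))
print(activations)

dropped = dropout(np.ones((1000, 100)), rate=0.5, rng=rng)
print("Fraction of zeroed units: %.3f" % np.mean(dropped == 0.0))
print("Mean activation after rescaling: %.3f" % dropped.mean())
```

The rescaling keeps the expected activation unchanged, so no correction is needed at test time, when no units are dropped.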

<font color="red">
<h3>Assignment 1: Improving the MLP</h3>
Your first task is to investigate the effect of the potential improvements mentioned above.
<ul>
<li>Given a sensible batch size and number of epochs, investigate the effect of each potential improvement and their combination.</li>
<li>Which settings worked best?</li>
<li>Are the improvements significant and how would you evaluate this?</li>
</ul>
</font>

<font color="red">
<i>Your text here.</i>
</font>

<h2>Training a convolutional network</h2>
So far, we have disregarded the spatial layout of the digits for training. In this part, we will incorporate the spatial layout using convolutional neural networks. Below, we show how to reshape the features and how to design a basic convolutional network.

In [None]:
from keras import backend as K

# Copy data.
ctrainx  = trainx.astype('float32')
ctestx   = testx.astype('float32')

# Change the shape for convnets.
if K.image_dim_ordering() == 'th':
    ctrainx = ctrainx.reshape(ctrainx.shape[0], 1, ctrainx.shape[1], ctrainx.shape[2])
    ctestx  = ctestx.reshape(ctestx.shape[0], 1, ctestx.shape[1], ctestx.shape[2])
else:
    ctrainx = ctrainx.reshape(ctrainx.shape[0], ctrainx.shape[1], ctrainx.shape[2], 1)
    ctestx  = ctestx.reshape(ctestx.shape[0], ctestx.shape[1], ctestx.shape[2], 1)

# Normalize.
ctrainx /= 255.
ctestx  /= 255.
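The two branches above differ only in where the single grayscale channel is placed. The sketch below shows both layouts on a dummy array. It uses np.expand_dims instead of reshape, which is a choice of this illustration and gives the same shapes.

```python
import numpy as np

images = np.zeros((1000, 28, 28), dtype="float32")  # stand-in for trainx

# Channels-first layout ('th' ordering): (samples, channels, rows, columns).
channels_first = np.expand_dims(images, axis=1)
# Channels-last layout ('tf' ordering): (samples, rows, columns, channels).
channels_last = np.expand_dims(images, axis=3)

print(channels_first.shape)  # (1000, 1, 28, 28)
print(channels_last.shape)   # (1000, 28, 28, 1)
```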

In [None]:
from keras.layers import Convolution2D, Flatten

#
# Simple fixed function that returns a new ConvNet.
#
def get_convnet(kernel, nr_filters_l1, nr_filters_l2, input_shape):
    # Initialize a new network.
    model = Sequential()

    # First convolutional layer.
    model.add(Convolution2D(nr_filters_l1, kernel[0], kernel[1],
                            border_mode='valid', input_shape=input_shape))

    # Second convolutional layer.
    model.add(Convolution2D(nr_filters_l2, kernel[0], kernel[1]))
    
    # Move to fully connected layers.
    model.add(Flatten())
    model.add(Dense(nr_classes))

    # Add the softmax layer.
    model.add(Activation('softmax'))

    return model

# Yield an instance of the model.
convnet_model = get_convnet([5,5], 32, 32, ctrainx.shape[1:])
convnet_model.summary()
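The output shapes and parameter counts in the summary can again be checked by hand. The sketch below assumes 'valid' convolutions with stride 1 and a bias per filter in both convolutional layers. The helper functions are hypothetical and only serve this illustration.

```python
def conv_output_size(input_size, kernel_size):
    """Spatial output size of a 'valid' convolution with stride 1."""
    return input_size - kernel_size + 1


def conv_params(kernel_h, kernel_w, in_channels, nr_filters):
    """One kernel of kernel_h x kernel_w x in_channels weights plus a bias, per filter."""
    return kernel_h * kernel_w * in_channels * nr_filters + nr_filters


size_l1 = conv_output_size(28, 5)        # 24
size_l2 = conv_output_size(size_l1, 5)   # 20
params_l1 = conv_params(5, 5, 1, 32)     # 832
params_l2 = conv_params(5, 5, 32, 32)    # 25632
flat_features = size_l2 * size_l2 * 32   # 12800
params_dense = flat_features * 10 + 10   # 128010

print("Feature map sizes: %dx%d -> %dx%d" % (size_l1, size_l1, size_l2, size_l2))
print("Total parameters: %d" % (params_l1 + params_l2 + params_dense))
```

Note that the final Dense layer holds most of the parameters, since it is connected to every position of every feature map.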

In [None]:
# Set a number of parameters for the training.
batch_size = 32
nb_epoch   = 16
model      = get_convnet([5,5], 32, 32, ctrainx.shape[1:])
# Train and test.
train_model(model, ctrainx, trainy, batch_size, nb_epoch)
score  = test_model(model, ctestx, testy)
# Show the results.
print "Test score: %.4f, test accuracy: %.4f" %(score[0], score[1])

In the basic convolutional network example, we have again left out potential improvements such as regularization and non-linear activations. We have furthermore left out max-pooling, which can be added as a layer using 'model.add(MaxPooling2D(pool_size=ps))', where ps is a pair of integers specifying the size of the pooling region along the two spatial dimensions. Like the other layers, MaxPooling2D has to be imported from keras.layers first.
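As an illustration of what max-pooling does, the numpy sketch below applies non-overlapping 2 by 2 pooling to a single feature map. The function max_pool_2x2 is a hypothetical helper, not the Keras layer, and it ignores the batch and channel dimensions for simplicity.

```python
import numpy as np


def max_pool_2x2(feature_map):
    """Non-overlapping 2x2 max-pooling of a single (rows, columns) feature map."""
    rows, cols = feature_map.shape
    # Drop a trailing row/column if the size is odd, then group pixels in 2x2 blocks.
    trimmed = feature_map[:rows // 2 * 2, :cols // 2 * 2]
    blocks = trimmed.reshape(rows // 2, 2, cols // 2, 2)
    return blocks.max(axis=(1, 3))


feature_map = np.arange(16).reshape(4, 4)
print(max_pool_2x2(feature_map))
```

Each output value is the maximum of one 2 by 2 block, so the feature map shrinks by a factor of two in both dimensions.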
<br>
<font color="red">
<h3>Assignment 2: Deep and small, or shallow and wide?</h3>
For the second assignment, your task is to extend the convnet with the improvements from assignment 1, as well as max-pooling.
<br><br>
After this, your goal is to generate three different networks: one using 1 convolutional layer, one using 4 convolutional layers, and one using 10 convolutional layers. You should design the networks such that they have roughly the same number of parameters (which can be inspected using model.summary()).
<br><br>
Using a sensible batch size and number of epochs, investigate whether we prefer a shallow and wide network, or a deep and thin network. Are the differences large? Are they in line with your expectations? Why?
</font>

<font color="red">
<i>Your text here.</i>
</font>