
Labyrinth Game 2015 - Convolutional Control

This page will house the information related to controlling the labyrinth game with a convolutional network.

Quick introduction to Convolutional Neural Networks

First, you should learn what exactly a convolutional network is. It's a very popular architecture, so you can find lots of online articles and YouTube videos from courses describing how it works. I wouldn't recommend reading the old papers - they are often inaccurate and do not reflect how the field currently thinks about convolutional networks. I haven't watched these in their entirety, but:

Okay, to test yourself to see if you have the basic idea:

  • Why are they called "convolutional" neural networks?
  • What parameters are learned during the training phase?
  • What are the fundamental layer types involved in a convolutional neural network?
  • What's the typical architecture? If I told you to make it deeper, what kind of layer would you add?
  • What's special about the output layer of a convolutional network? Why is the connectivity different there?
  • Can you name a task a convnet would be bad at? Why would it be bad?
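
As a concrete reference point for the terms in these questions, here is a minimal sketch of what one convolution layer followed by one subsampling layer computes on a single image. The toy image, the hand-picked kernel, the sigmoid nonlinearity, and the use of averaging for subsampling are all illustrative assumptions, not a description of any particular toolbox.

```matlab
% Toy 28x28 "image": a bright vertical bar on a dark background.
img = zeros(28, 28);
img(:, 14:15) = 1;

% One 5x5 kernel. In a trained network these 25 numbers (plus a bias)
% would be learned; here they are hand-picked (an assumption) so that
% the kernel responds to vertical edges.
kernel = repmat([-1 -1 0 1 1], 5, 1) / 10;
bias = 0;

% Convolution layer: slide the kernel over the image ('valid' keeps only
% positions where the kernel fits entirely), then apply a nonlinearity.
z = conv2(img, kernel, 'valid');            % 24x24, i.e. 28-5+1 per side
featureMap = 1 ./ (1 + exp(-(z + bias)));   % sigmoid (assumed choice)

% Subsampling layer with scale 2: average each non-overlapping 2x2 block.
scale = 2;
blockMean = conv2(featureMap, ones(scale) / scale^2, 'valid');
pooled = blockMean(1:scale:end, 1:scale:end);   % 12x12

disp(size(featureMap));
disp(size(pooled));
```

The same kernel is reused at every image position, which is why a convolution layer has far fewer parameters than a fully connected layer of the same output size.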

Once you feel comfortable with the basic idea (don't worry too much about the backpropagation algorithm - it's a bit involved to derive, but ultimately you won't need to change the learning, you just need a sense of how & why it works), you can push forward to learning about the code. Go here:

https://github.com/rasmusbergpalm/DeepLearnToolbox

Download it (there's a zipfile download on the side, if you don't want to git clone it), unzip it, set up the paths ("addpath(genpath('.'))") and run this code to train a network on MNIST (from their github page):

load mnist_uint8;

train_x = double(reshape(train_x',28,28,60000))/255;
test_x = double(reshape(test_x',28,28,10000))/255;
train_y = double(train_y');
test_y = double(test_y');

%% ex1 Train a 6c-2s-12c-2s Convolutional neural network 
%will run 1 epoch in about 200 seconds and get around 11% error.
%With 100 epochs you'll get around 1.2% error
rand('state',0)
cnn.layers = {
    struct('type', 'i') %input layer
    struct('type', 'c', 'outputmaps', 6, 'kernelsize', 5) %convolution layer
    struct('type', 's', 'scale', 2) %sub sampling layer
    struct('type', 'c', 'outputmaps', 12, 'kernelsize', 5) %convolution layer
    struct('type', 's', 'scale', 2) %subsampling layer
};
cnn = cnnsetup(cnn, train_x, train_y);

opts.alpha = 1;
opts.batchsize = 50;
opts.numepochs = 1;

cnn = cnntrain(cnn, train_x, train_y, opts);

[er, bad] = cnntest(cnn, test_x, test_y);

% Tell us how well we did!
fprintf('Accuracy: %2.2f%%\n', (1-er)*100);

%plot mean squared error
figure; plot(cnn.rL);

That trains a basic convolutional neural network to solve the MNIST handwritten digit identification task. What I like about this toolbox is that it is completely self-contained and relies on almost no outside code to run. If you press control+d (on Windows and Linux, not sure about OS X), Matlab will open the function under the cursor ("dive into it"). Doing this, you can step through all the code, line by line, and watch it build, train, and test a convolutional network. The "keyboard" command in Matlab is also useful, as are breakpoints - put breakpoints all over the place and examine the variables as the code runs (using the "whos", "imagesc", "reshape", "size", and "squeeze" commands, or printing to the console).
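
A minimal sketch of that kind of inspection, using a random array as a stand-in for an image stack laid out like train_x (rows x columns x number of images). The variable names here are made up for illustration; at a real breakpoint you would run the same commands on the toolbox's own variables.

```matlab
% Stand-in for a stack of 100 images of 28x28 pixels (random values
% replace real digits so the sketch runs without the MNIST file).
x = rand(28, 28, 100);

whos x              % prints name, size, bytes, and class
disp(size(x));      % the three dimensions of the stack

% squeeze removes singleton dimensions: row 3 of every image is
% 1x28x100 when indexed, and 28x100 after squeeze.
rowAcrossImages = squeeze(x(3, :, :));

% reshape changes the layout without changing the values: here each
% image becomes one 784-element column.
flat = reshape(x, 28 * 28, []);

% Reshaping one column back recovers the original image exactly.
back = reshape(flat(:, 7), 28, 28);

% To look at the image, uncomment the next line:
% figure; imagesc(back); colormap gray; axis image;

% To pause here and explore interactively, uncomment:
% keyboard
```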

Here are some example questions that will force you to dive into the code. Ask me if you have any questions.

  • What are the parameters that are learned during training? Show them before and after training. Did they change? Did they do what you expect?
  • What happens to an input image during the feed-forward pass? Choose one input digit, and show the activation of the network in response to that digit at all layers.
  • What is the final accuracy of the network on the test set? Show some of the digits that it got wrong.
  • (More general question) Why is the network tested on a different dataset than it was trained on?
  • What is the plot of the decreasing line that is shown when you run the example above? How do you interpret it?
  • Modify the code to show the accuracy of the network on the test set at every epoch. Compare with the cnn.rL line. How is it different? What does this mean for training?
  • Change the architecture in a couple of different ways - add more kernels, or change their size, or change the subsampling size. How does this affect the cnn.rL line?

Advanced Convolutional Networks

For the state of the art, see here for code, lectures, and homework: http://cs231n.github.io/

Data Description

The original DVS data can be loaded from the "table.mat" files - this contains:

  • X, Y: the x and y address of an event from the DAVIS camera.
  • ts: the time of each event (in microseconds)
  • pol: polarity of each event
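
One common way to look at event data like this is to accumulate the events from a short time window into an image. The sketch below does that on synthetic events so it runs without the real file. The 240x180 sensor size, the 0-based addresses, the 0/1 polarity encoding, and the 20 ms window are all assumptions that should be checked against the actual contents of table.mat.

```matlab
% Synthetic stand-ins for the fields in table.mat (assumed layout:
% equal-length column vectors). For real data, load('table.mat') instead.
sensorW = 240; sensorH = 180;                 % ASSUMED sensor resolution
nEvents = 5000;
X   = randi([0, sensorW - 1], nEvents, 1);    % ASSUMED 0-based x address
Y   = randi([0, sensorH - 1], nEvents, 1);    % ASSUMED 0-based y address
ts  = sort(randi([0, 100000], nEvents, 1));   % event times in microseconds
pol = randi([0, 1], nEvents, 1);              % ASSUMED 0 = off, 1 = on

% Select the events inside one time window (here: the first 20 ms).
windowUs = 20000;
inWin = ts >= ts(1) & ts < ts(1) + windowUs;

% Count events per pixel. accumarray sums the values that share the same
% (row, col) subscript; the +1 converts 0-based addresses to 1-based.
subs = [Y(inWin) + 1, X(inWin) + 1];
eventCount = accumarray(subs, 1, [sensorH, sensorW]);

% Signed version: on events add 1, off events subtract 1.
signedVal = 2 * pol(inWin) - 1;
signedFrame = accumarray(subs, signedVal, [sensorH, sensorW]);

% To view the result, uncomment:
% figure; imagesc(signedFrame); axis image; colorbar;
```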

The controller data in "controller.mat" contains:

  • t: time (unknown scale)
  • pan: pan of the servo
  • tilt: tilt of the servo
  • targetX: x position of the currently selected target
  • targetY: y position of the currently selected target
  • ballX: tracked x position of the ball
  • ballY: tracked y position of the ball
  • ballVelX: velocity of the tracked ball, x direction
  • ballVelY: velocity of the tracked ball, y direction
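
Because the scale of t is unknown, pairing controller samples with camera frames needs some guess at that scale. The sketch below shows one possible approach on synthetic data: it assumes (and this is only an assumption) that the controller log and a 60 fps frame sequence start and end at the same moments, derives a scale from that, and resamples pan and tilt at the frame times. The frame count and signal shapes are made up for illustration.

```matlab
% Synthetic stand-ins for fields of controller.mat (assumed to be column
% vectors of equal length). For real data, load('controller.mat') instead.
nCtrl = 500;
t    = cumsum(5 + rand(nCtrl, 1));   % strictly increasing, arbitrary units
pan  = sin(t / 200);
tilt = cos(t / 200);

% Frame times in seconds for a 60 fps sequence (frame count is made up).
nFrames = 600;
fps = 60;
frameTime = (0:nFrames - 1)' / fps;

% ASSUMPTION: the controller log and the frames start and end together.
% Under that assumption the unknown time scale is a simple ratio.
secondsPerUnit = frameTime(end) / (t(end) - t(1));
ctrlTime = (t - t(1)) * secondsPerUnit;

% Resample the servo values at each frame time. 'extrap' guards against
% tiny rounding differences at the last sample.
panAtFrame  = interp1(ctrlTime, pan,  frameTime, 'linear', 'extrap');
tiltAtFrame = interp1(ctrlTime, tilt, frameTime, 'linear', 'extrap');
```

If the two recordings do not actually share start and end times, the estimated scale will be wrong, so it is worth checking the result against a visible event such as the ball starting to move.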

Frames in "frames.mat":

  • frames: 51781 frames of size 44x33, taken at 60 fps.

You can use vis_labyrinth_data.m to visualize.

Simple NN Controller

This is a fully self-contained package that implements a very simple (inp-20-2) neural network that can control the ball, as a proof of concept. It loads, cleans, and tests the data, all in one self-contained zip file:

simple_nn.zip
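
For orientation, here is a rough sketch of the forward pass of a network with the inp-20-2 shape. It is not the code from the zip file. Only the 20 hidden units and 2 outputs come from the description above; the flattened 44x33 frame as input, the tanh hidden layer, the linear output read as [pan; tilt], and the random untrained weights are all assumptions made to keep the sketch runnable.

```matlab
% Layer sizes. Only nHidden and nOut follow from "inp-20-2".
nIn = 44 * 33;     % ASSUMPTION: one flattened 44x33 frame as input
nHidden = 20;
nOut = 2;          % ASSUMPTION: outputs are read as [pan; tilt]

% Untrained random weights, just to make the sketch runnable.
W1 = 0.01 * randn(nHidden, nIn);   b1 = zeros(nHidden, 1);
W2 = 0.01 * randn(nOut, nHidden);  b2 = zeros(nOut, 1);

% A batch of 5 fake frames, flattened to one column per frame.
fakeFrames = rand(44, 33, 5);
inp = reshape(fakeFrames, nIn, []);
nBatch = size(inp, 2);

% Forward pass: affine -> tanh -> affine (linear output for regression).
hidden = tanh(W1 * inp + repmat(b1, 1, nBatch));
out = W2 * hidden + repmat(b2, 1, nBatch);

disp(size(out));   % one [pan; tilt] column per input frame
```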

Last modified on 05/06/15 16:58:46
