wiki:2015/noise15

Ensemble methods for classification: making machine learning more neuromorphic

Ensemble methods are used very effectively in machine learning. The main idea is to combine the decisions of multiple learned models to obtain a combined model with better performance. The specific implementation details seem to matter little as long as the individual classifiers perform reasonably well and, most importantly, their responses are poorly correlated. This concept is at the basis of random forests, introduced by Leo Breiman in 2001 and used extensively since then.

There are different ways of decorrelating the responses of the individual classifiers; random initialization of the parameters of the single models is one. In random forests, each tree is grown on a random sample of the data and considers a randomly selected subset of features at each split. Sources of noise are ubiquitous in analog neuromorphic engineering: the strengths of synapses implemented in analog CMOS circuits vary because of the low precision of VLSI fabrication and the resulting transistor mismatch, and in memristive crossbars the properties of synaptic elements vary over large scales. The idea behind this workgroup is to exploit these sources of noise to effectively build ensembles of simple classifiers. We will model in software simple, realistic classifiers, i.e., classifiers that we already know have a natural mapping onto existing neuromorphic hardware, and test their abilities on traditional machine learning datasets.
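
As a toy illustration of the idea (not workgroup code), the sketch below trains one linear classifier on synthetic data and then makes several "device" copies of it, each corrupted by multiplicative weight noise standing in for transistor mismatch. Because the copies err in poorly correlated ways, a majority vote recovers most of the accuracy lost by any single noisy copy. All names and parameters here are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy two-class problem: the label depends on the first two coordinates.
n, d = 400, 10
X = rng.normal(size=(n, d))
y = np.where(X[:, 0] + X[:, 1] > 0.0, 1.0, -1.0)

# One "ideal" linear classifier, fit by least squares.
w_clean, *_ = np.linalg.lstsq(X, y, rcond=None)

def noisy_copy(w, mismatch=0.3):
    """Simulate device mismatch: each synapse realises the trained
    weight with an independent multiplicative gain error."""
    return w * rng.normal(1.0, mismatch, size=w.shape)

# 25 mismatched copies of the same classifier; their errors are
# poorly correlated, so a majority vote averages the mismatch out.
members = [noisy_copy(w_clean) for _ in range(25)]
vote = np.sign(np.sum([np.sign(X @ w) for w in members], axis=0))

single_acc = np.mean(np.sign(X @ members[0]) == y)
vote_acc = np.mean(vote == y)
print(f"single noisy copy: {single_acc:.3f}  majority vote: {vote_acc:.3f}")
```

Here the decorrelation comes entirely from the hardware-like noise, not from retraining each member, which is exactly the effect the workgroup proposes to exploit.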

Ideas

Possible activities

  • 1) introduce new models of synapses, e.g., memristor-like or stochastic: Fabien, Roy, ...
  • 2) compete on classification task on MNIST, UCI dataset: Qian, Jordi
  • 3) map existing DNN models into simpler models that can be deployed in hardware: Danny, Fabio
  • 4) implement tracking using classifiers, see TLD by Zdenek Kalal (uses randomly connected neurons as feature vectors)
  • 5) introduce unsupervised classifiers and adaptive classifiers in the software framework
  • 6) develop spiking model: Amir
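
Activity 4 above relies on randomly connected neurons as feature vectors, as in Kalal's TLD tracker. A minimal sketch of that ingredient, under the assumption of threshold neurons with fixed random input weights (the data and all parameters are illustrative), looks like this:

```python
import numpy as np

rng = np.random.default_rng(1)

def random_features(X, n_neurons=200, seed=1):
    """'Randomly connected neurons': each neuron thresholds a fixed
    random projection of the input, giving a binary feature vector."""
    r = np.random.default_rng(seed)
    W = r.normal(size=(X.shape[1], n_neurons))  # fixed random weights
    b = r.normal(size=n_neurons)                # fixed random thresholds
    return (X @ W + b > 0).astype(float)

# Toy demo on a nonlinear rule that a linear readout on raw inputs
# cannot capture well.
X = rng.normal(size=(300, 5))
y = np.where(np.sin(X[:, 0]) + X[:, 1] ** 2 > 1.0, 1.0, -1.0)

# Linear baseline on raw inputs.
w_lin, *_ = np.linalg.lstsq(X, y, rcond=None)
lin_acc = np.mean(np.sign(X @ w_lin) == y)

# Same readout on the random binary features.
F = random_features(X)
w, *_ = np.linalg.lstsq(F, y, rcond=None)
acc = np.mean(np.sign(F @ w) == y)
print(f"linear on raw inputs: {lin_acc:.3f}  on random features: {acc:.3f}")
```

The random projections need no training, which is what makes them attractive both for fast trackers and for noisy analog hardware.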

Notes

  • In the first meeting we introduced the main concepts behind ensemble methods and random forests, presented the c/s2 measure, and delineated possible activities.
  • In the second meeting we looked at the software that we will use to create ensembles of classifiers. Ideas were also highlighted throughout the day, e.g., how to classify tissue samples based on sequenced mRNA filaments and how to simulate synapses on memristive device models.
  • In the 3rd meeting we finalized our proposals into the following activities:
    • Gianvito: the goal is to create a database of sequenced samples of biological tissue and train classifiers to identify which tissue the samples come from. Gianvito is working on mRNA sequencing and is interested in understanding whether quantifying the single mRNA variations within samples is helpful for diagnosis.
    • Jordi and Ole are interested in deploying a deep network into a shallow, hardware-ready network using the pynapse software. They will start from Caffe, a public library for deep learning, and a pre-trained deep network that has been used in recent machine learning competitions. Moreover, this activity will explore Andrew Saxe's idea of first searching the space of deep-network architectures with random weights to select the ones that perform best, and then improving their ability by unsupervised learning.
    • Roy is interested in understanding the role of diversity in the ensembles. Specifically, he's planning to use random networks as a tool to induce (static) diversity in the individual classifiers' responses.
    • Victor is interested in studying the performance of the classifiers when realistic models of synapses are used.
    • Sergio will try to use the classifier to classify the activity of the reservoir network developed in the wiki:memristors15 memristors workgroup.
    • Others will participate in the machine learning competition.
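
The architecture-screening step in Jordi and Ole's plan can be sketched in a few lines: push the data through networks with frozen random weights, fit only a cheap linear readout, and rank the architectures by readout accuracy. This is a toy stand-in (random ReLU layers, least-squares readout, synthetic data), not the actual Caffe-based workflow:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy data standing in for a real benchmark.
X = rng.normal(size=(400, 8))
y = np.where(X[:, 0] * X[:, 1] > 0, 1.0, -1.0)  # not linearly separable

def readout_accuracy(X, y, widths, seed=0):
    """Push data through a random-weight ReLU net with the given layer
    widths and fit only a least-squares readout on the last layer."""
    r = np.random.default_rng(seed)
    H = X
    for width in widths:
        W = r.normal(scale=1.0 / np.sqrt(H.shape[1]), size=(H.shape[1], width))
        H = np.maximum(H @ W, 0.0)  # ReLU, weights stay frozen
    w, *_ = np.linalg.lstsq(H, y, rcond=None)
    return np.mean(np.sign(H @ w) == y)

# Screen a few candidate architectures; the best-scoring ones would
# then be selected for further (unsupervised) training.
candidates = [(64,), (256,), (64, 64), (256, 256)]
scores = {a: readout_accuracy(X, y, a) for a in candidates}
best = max(scores, key=scores.get)
print(best, scores[best])
```

Because no backpropagation is involved, many architectures can be screened quickly before any expensive fine-tuning.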

Machine learning competition

Subscribe if you are not already in the list!

List:

  • Qian
  • Roy
  • Jordi
  • Ole
  • Lukas

Tools

We developed a minimalistic machine learning pipeline in Python called pynapse, to be released publicly here at the workshop. The idea behind this software is to let modelers introduce their own models of synaptic plasticity and test their properties easily with the existing pipeline. The framework is that of supervised learning and perceptron-like classifiers, i.e., a pattern and its label arrive at time t and $\delta w$ is computed. There will be tutorials on the software.
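
The pynapse interface is not reproduced here, but the contract it describes (pattern and label arrive at time t, the plasticity rule returns $\delta w$) can be sketched generically. Both function names below are illustrative, with the classic perceptron update as the example rule:

```python
import numpy as np

def perceptron_rule(w, x, label, lr=0.1):
    """Example plasticity rule: the classic perceptron update.
    Returns delta_w for one (pattern, label) presentation."""
    if label * (x @ w) <= 0:        # pattern misclassified
        return lr * label * x
    return np.zeros_like(w)         # correct: no weight change

def run_pipeline(patterns, labels, rule, epochs=20):
    """Minimal supervised loop: at each time step a pattern and its
    label arrive, the rule computes delta_w, and the weights move."""
    w = np.zeros(patterns.shape[1])
    for _ in range(epochs):
        for x, label in zip(patterns, labels):
            w += rule(w, x, label)
    return w

# Toy usage: a linearly separable problem defined by a teacher vector.
rng = np.random.default_rng(3)
X = rng.normal(size=(200, 6))
y = np.where(X @ np.array([1.0, -1.0, 0, 0, 0, 0]) > 0, 1.0, -1.0)
w = run_pipeline(X, y, perceptron_rule)
acc = np.mean(np.where(X @ w > 0, 1.0, -1.0) == y)
```

Swapping in a different `rule` function (e.g., a stochastic or memristor-like update) is all that is needed to test a new synapse model in this scheme.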


Last modified on 06/15/15 21:29:55
