2011/DailyLogs/FSM

Biological motivation

The Wisconsin card sorting task (WCST) is a neuropsychological test of whether the prefrontal cortex is working. Subjects group stimuli according to color or shape. (K Tanaka et al., 2006, 2007)

What they saw: selective sustained activity to a rule (context), and mixed selectivity (high selectivity to a combination of rule and stimulus).

Patterns of activity in the WCST are not linearly separable. This can be solved by having a population of neurons that are specifically selective to a combination of rule (e.g. color) and stimulus. However, there is a scalability issue here.

Number of neurons required = (number of stimuli) × (number of states)

Random connectivity is employed so that, statistically, we get a combinatorial mechanism in which mixed selectivity is implemented in a distributed manner. The network is trained using perceptron-like learning.

Problem with the approach: some neurons are never used. Salzman showed that neuronal selectivity in the monkey also looks randomly selective. Learning stops when the postsynaptic activity is strong enough to guarantee classification.

How to implement finite state machines vs. how to discover finite state machines, perhaps through reinforcement learning or partially observable Markov decision processes.

The difference from backpropagation is that the first layer is randomly connected, and only the second layer is learnt, using the perceptron rule.
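The random first layer plus perceptron readout described above can be sketched as follows. This is a minimal illustration, not the model from the talk: the one-hot encoding, the 400 hidden units, the zero threshold and the XOR-style target (rule combined with stimulus) are all assumptions chosen to make the non-separability visible. A perceptron on the raw inputs cannot classify all four patterns, while the same perceptron on the randomly expanded, mixed-selective layer can.

```python
import random

def step(v):
    """Heaviside threshold: 1 if v > 0, else 0."""
    return 1 if v > 0 else 0

# Hypothetical one-hot encoding: [rule=color, rule=shape, stimulus=A, stimulus=B].
# The target is an XOR of rule and stimulus, so it is not linearly separable.
PATTERNS = [
    ([1, 0, 1, 0], 1),
    ([1, 0, 0, 1], 0),
    ([0, 1, 1, 0], 0),
    ([0, 1, 0, 1], 1),
]

def random_layer(n_hidden, n_in, rng):
    """Fixed random first-layer weights; these are never trained."""
    return [[rng.gauss(0.0, 1.0) for _ in range(n_in)] for _ in range(n_hidden)]

def expand(x, weights):
    """Binary hidden activity: each unit thresholds a random mix of rule and stimulus."""
    return [step(sum(w * xi for w, xi in zip(row, x))) for row in weights]

def train_perceptron(samples, epochs=2000):
    """Classic perceptron rule; learning stops once every pattern is classified."""
    w = [0.0] * len(samples[0][0])
    b = 0.0
    for _ in range(epochs):
        errors = 0
        for x, target in samples:
            out = step(sum(wi * xi for wi, xi in zip(w, x)) + b)
            delta = target - out
            if delta:
                errors += 1
                w = [wi + delta * xi for wi, xi in zip(w, x)]
                b += delta
        if errors == 0:
            break
    return w, b

def accuracy(samples, w, b):
    hits = sum(step(sum(wi * xi for wi, xi in zip(w, x)) + b) == t for x, t in samples)
    return hits / len(samples)

rng = random.Random(0)
hidden = random_layer(400, 4, rng)
expanded = [(expand(x, hidden), t) for x, t in PATTERNS]

raw_acc = accuracy(PATTERNS, *train_perceptron(PATTERNS))
mixed_acc = accuracy(expanded, *train_perceptron(expanded))
print("raw inputs:", raw_acc, "| random mixed-selectivity layer:", mixed_acc)
```

Only the readout weights change during training, which is what separates this scheme from backpropagation through both layers.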

How do you pile up dynamical systems to get bigger dynamical systems? How can you guarantee stability and functionality preservation when you generate a bigger system??

Contraction: global exponential forgetting of the initial condition, which holds when the Jacobian of f with respect to x is uniformly negative definite in some metric. Then $\dot{x} = f(x, t)$ is contracting.

This provides a convenient tool to look at synchronized systems such as WTA or oscillators. It also has applications to local field potential effects. If two systems are contracting, then adding them together with specific combination rules (negative feedback) results in a larger system that is also contracting. With all-to-all coupling:

$$\dot{x}_i = f(x_i, t) + k \sum_j (x_j - x_i) = f(x_i, t) - N k x_i + k \sum_j x_j$$

The term $k \sum_j x_j$ is a common signal that is fed to every unit, so the coupling requires only 2N signals. Every $x_i$ is then a particular solution of the auxiliary system

$$\dot{y} = f(y, t) - N k y + k \sum_j x_j$$

The system y contracts when N is large enough, since its Jacobian is then negative definite. The complexity stays the same even if the coupling is made more general.
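A small numerical sketch of the all-to-all coupling above. The unit dynamics f(x) = x - x^3, the gain k = 1, the five initial conditions and the forward-Euler integration are assumptions for illustration only. On its own this f is bistable and not contracting (its slope at 0 is +1), but with N k = 5 larger than the maximum slope the coupled units forget their initial conditions and collapse onto one common trajectory.

```python
def f(x):
    """Hypothetical unit dynamics: bistable, so not contracting by itself."""
    return x - x ** 3

def simulate(x0, k, dt=0.01, steps=500):
    """Forward-Euler integration of dx_i/dt = f(x_i) - N*k*x_i + k*sum_j x_j."""
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        common = k * sum(x)  # the shared signal k * sum_j x_j, computed once per step
        x = [xi + dt * (f(xi) - n * k * xi + common) for xi in x]
    return x

def spread(xs):
    return max(xs) - min(xs)

x0 = [-2.0, -1.0, 0.1, 0.5, 2.0]
coupled = simulate(x0, k=1.0)    # N*k = 5 exceeds the largest slope of f, which is 1
uncoupled = simulate(x0, k=0.0)  # each unit falls into its own attractor

print("spread with coupling:   ", spread(coupled))
print("spread without coupling:", spread(uncoupled))
```

Each unit only needs its own state and the single shared sum, which is the reason the coupling cost grows linearly with N rather than quadratically.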

==== Memory: having an activity pattern that persists indefinitely. Without changing connections/synapses on the fly, you can do this using recurrence, i.e. positive feedback. There is a critical tradeoff in the strength of connection. One option is a large number of random weak connections fine-tuned by learning. Here: an alternative mechanistic process.

How to use the concept of contraction for that? Contraction means exponentially forgetting the initial condition. The network should contract to an initial state and, given the inputs, contract to another state, but implicitly not forget the initial state. Connect two recurrent networks using positive feedback. You can pile up systems this way and guarantee stability in the sense of contraction.

Alpha: E-E. Beta1: E-I. Beta2: I-E. Lambda (> 0 and < f(alpha, beta1, beta2)) is the positive feedback connection between 2 recurrent networks. Assuming one of the betas to be constant, there is only 1 alpha and 1 beta, and the learning rule can be implemented at the level of a single neuron.

What are the rules? State 1 and State 2.

State1 --A--> State2 --A--> State1
A+1 => 2
A+2 => 1

Memory is needed to remember which state I am in. The network connectivity is one where transition A is connected to 2 recurrent neurons, and from there to neurons indicating the activation of states S1 and S2. Winner-take-all is employed to deal with issues like multiple transition inputs. Once the rules are found, we can have arbitrarily large networks which will be stable and functional. Contraction imposes not only upper bounds but also lower bounds on the connection strengths.

What about mapping to biology?
Can we use this in spiking neurons? Yes.
Is this connectivity biologically realistic? Yes, as there are cortical and non-cortical areas with this kind of non-specific connectivity in layer 2/3 (dense inhibitory connectivity). Long-range connectivity which is strong and sparse is also realistic, as it could be inter-columnar connection.
Is the input data presentation realistic? Distributed representation, but it is not there in the PFC. Could use Fusi's model and then it would work. Neurons operate at low firing rates, but synapses saturate at low frequencies, like NMDA synapses.
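The two-state machine above (A+1 => 2, A+2 => 1) can be sketched in a few lines. This is an abstract discrete-time illustration under assumed values, not the contraction-derived network: `W_SELF` and `W_TRANS` are hypothetical gains, the 1.5 threshold on the transition neurons is an assumption, and the winner-take-all stage is reduced to picking the larger drive.

```python
def step(v):
    return 1 if v > 0 else 0

W_SELF = 1.0    # hypothetical self-excitation that holds a state (the memory)
W_TRANS = 2.0   # hypothetical drive from a transition neuron to its target state

def update(state, a):
    """state is a one-hot list [s1, s2]; a is 1 when transition input A is present."""
    s1, s2 = state
    # Transition neurons fire only for the conjunction (current state AND input A).
    t12 = step(s1 + a - 1.5)
    t21 = step(s2 + a - 1.5)
    drive = [W_SELF * s1 + W_TRANS * t21, W_SELF * s2 + W_TRANS * t12]
    # Winner-take-all: only the most strongly driven state population stays active.
    winner = 0 if drive[0] >= drive[1] else 1
    return [1 if i == winner else 0 for i in range(2)]

def run(inputs, state=(1, 0)):
    """Return the active state (1 or 2) after each input, starting from State 1."""
    trace = []
    state = list(state)
    for a in inputs:
        state = update(state, a)
        trace.append(state.index(1) + 1)
    return trace

trace = run([0, 1, 0, 0, 1, 1])
print(trace)  # [1, 2, 2, 2, 1, 2]
```

Without input A the self-excitation alone keeps the current state active; with A present the transition drive has to exceed the self-excitation, a toy version of the point that connection strengths need both lower and upper bounds.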

Why do we need to use neurons in finite state machines when such methods already exist? A systematic method to embed functionality. They are not based on symbols, and can generalize. Does that mean resilience to noise, or more states? It doesn't look like it.

Fusi: Huge basin of attraction. If you flip 10% of the neurons, you can still converge to the same attractor state.

Rodney: Pass an analog signal through the state, which is not possible in digital finite state machines. This is for routing pure analog signals or spike trains. You couldn't do this if the networks/synapses are in saturation, as it requires dynamic range.

Aurel: A programming model where you set up a data path and program it from outside in a digital system (though you can't pass floats). We can define a network to do a certain task. What sort of task can you solve by doing so? If the states are analog, then is it another conventional technology or is it?? How to pass distributed representations through analog signals?
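Fusi's remark about the basin of attraction can be illustrated with a standard Hopfield network. This is a swapped-in textbook model, not the network discussed in the session, and the sizes (100 neurons, 2 stored random patterns), the Hebbian weights and the synchronous updates are assumptions. A stored pattern with 10% of its neurons flipped is pulled back to the original attractor.

```python
import random

def sign(v):
    return 1 if v >= 0 else -1

def hebbian_weights(patterns):
    """Hebbian outer-product weights with no self-connections."""
    n = len(patterns[0])
    w = [[0.0] * n for _ in range(n)]
    for p in patterns:
        for i in range(n):
            for j in range(n):
                if i != j:
                    w[i][j] += p[i] * p[j] / n
    return w

def recall(w, state, sweeps=5):
    """Synchronous updates: every neuron takes the sign of its recurrent input."""
    state = list(state)
    for _ in range(sweeps):
        state = [sign(sum(wij * sj for wij, sj in zip(row, state))) for row in w]
    return state

rng = random.Random(1)
n = 100
patterns = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(2)]
w = hebbian_weights(patterns)

corrupted = list(patterns[0])
for i in rng.sample(range(n), n // 10):  # flip 10% of the neurons
    corrupted[i] = -corrupted[i]

recovered = recall(w, corrupted)
print("recovered stored pattern:", recovered == patterns[0])
```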

What are the advantages of a neural representation? There is no perfect universal solution. What can we do with imperfection to create a system which is more reliable, rather than trying to make things perfect at the device level?

We think that the elements have to be self-constructed. How can you build these networks? How does the cell expand into networks? A technology in the future of molecular technologies. Modelling development. By construction of a first cell, it casts itself into an environment where efficient instantiation of circuits is happening.

We can see what the rules are for creating systems that learn by combining parts together. What are potential ways to propose learning/self-construction during interaction with the environment?

Fusi: Potentiality of creating states. No general theory for doing this. Show that in a network like this, we can create states that can represent temporal context, which was not there in the system in the first place. We can use the system to generate a hierarchical temporal context (Fusi et al., Neuroimaging 2010)

Finite state machine converging to a different set of states. Is it possible for such systems to generate continuous states for motor control?? Attractors can switch almost instantaneously. You can have all particular time scales. But at some point you have to implement all kinds of timescales. Relational networks built with these kinds of structures??

State machines: moving from one state to another. Circuits with one-way connections to enforce this mode of operation. Possible with Hebbian learning: bidirectional relationships. (Matthew Cook) Use the relationships in an omnidirectional manner.
