Friday, July 8, 2011

Neural Distraction

The IPN distributor has been in production for a couple of days now, passing actual PayPal transactions to one of the small subscription systems. This is the "run-in" period (another machinist's term IT has appropriated), as distinct from the "smoke test" for hardware. (Literally, "plug it in and see if smoke comes out." A tense moment for anyone who has built an electronics kit.)

Basically, there's nothing to do at this point but sit back and wait for the Big Nasty Real World to break things. And I'm out of town for a couple of days, so I'm literally forced to let the run-in version sit there and not mess with it, which is exactly what I should do.

So, in the meantime, I'm happily distracted by Biologically-Inspired Massively-Parallel Architectures – computing beyond a million processors by Steve Furber and Andrew Brown. 

This project neatly involves quite a lot of things I care about: one of my majors was Artificial Intelligence, I've written code for massively parallel machines (the 4096-processor MasPar), and I built Artificial Neural Networks on my old ARM-based Archimedes A310, which was co-designed by... Steve Furber.

Steve Furber and Sophie Wilson are together responsible for ARM, perhaps the most successful processor of all time, if you count cores shipped / in use rather than dollars or profit. The Nintendo DS has two. Your smartphone may have up to five. There are more ARM processors on the planet than people. They made 2.4 billion CPUs in 2006 alone.

I doubt I'm uniquely qualified to critique this paper, but there probably aren't many of us.

First, the good. The paper contains the general outlines of a computer with 1 million processors, capable of simulating a neural network about 1% the complexity of the human brain (a billion neurons) in real-time. And this isn't just hand-waving. The design is based on some custom ASICs (standard cores, but with some specialised network interfaces), but then Steve does that kind of thing in his lunch break. ARM and Silistix are on board, ready to make silicon for him.

I'm quite sure he could build this machine, turn it on, and it would work pretty much as advertised. And a million processors might sound like a small building's worth of rack space, but your average ARM core probably costs two dollars, and will run off an AAA battery. This is how you can pack 50,000 card-sized nodes into a couple of cabinets without melting through the floor - or blowing the budget. This is a cheap machine, relatively speaking.

Steve makes the point that your average computer costs as much in electricity over its lifetime as the original hardware did. Since CPU power increases exponentially with every generation, but energy keeps getting more expensive, the power bill becomes the limiting factor.

However, the most impressive thing about this machine is not the processing power, but the networking. If you're simulating a billion neurons, you have the problem that each one connects to a thousand others. Every (easily simulated) neuron event generates a thousand messages that need to reach other nodes. It's not enough to have massively parallel processing - you need massively parallel connectivity too. Steve's self-healing toroidal mesh is just the ticket.
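A quick back-of-the-envelope check shows why (the 10 Hz mean firing rate is my own assumed ballpark, not a figure from the paper):

```python
# Rough fan-out arithmetic; the mean firing rate is an assumption
# for illustration, not something quoted in the paper.
neurons = 10**9        # ~1% of a human brain
fan_out = 1000         # connections per neuron
mean_rate_hz = 10      # assumed average firing rate

spikes_per_sec = neurons * mean_rate_hz       # 1e10 neuron events/s
messages_per_sec = spikes_per_sec * fan_out   # 1e13 messages/s
print(f"{messages_per_sec:.1e} messages per second")  # 1.0e+13
```

Ten trillion tiny messages a second is not a workload an off-the-shelf interconnect shrugs off.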

So, that's the hardware sorted. How about the software?

Well... ahem. I think the most surprising thing is how little the field of Artificial Neural Networks has progressed since I left University, nearly two decades ago. Either that, or our little group was more advanced than I realized. Reading the new paper, all the descriptions of neuronal behaviour are pretty much the same as in my day. This shouldn't be too surprising, I suppose. How neurons physically work was sorted out by medicine a while ago.

And this is probably why Steve Furber's "Spiking Neural Network architecture" is basically the same as the scheme I came up with in 1992. (Ah, if only I'd published...) We seem to have gone down the same path, for the same computational reasons, only I think he's missed some things about how the biology works.

Before I get into that, I should point out that even though I had a cool idea long ago, there was absolutely no possibility at the time of putting it into practice. Steve has done the hard part, designing (and funding) a machine on which to test these theories. Finally.

And when I say "he's missed something", he's done it for the best of reasons: by approaching the problem a little too logically. Being an outstanding engineer, he has sought to remove the inefficiencies of squishy biological neurons and replace them with clean and perfect silicon.

However, it's my belief (and any untested theory is just a belief) that two of the squishy inefficiencies he's trying to remove are actually vital to the entire process: the time it takes signals to propagate along the dendrites, and the recovery time of the neuron.

I could go into a little chaos theory here, but let's just stick with the concept that, after a neuron fires, it's unavailable for a short time. And since timing is everything, this is equivalent to inhibiting the neuron for the window of the next spike. Perhaps an important one. We don't have to turn a neuron off to inhibit it (that takes a lot of time and effort); we just have to fire it prematurely. We are deliberately creating high-speed 'disappointment'.
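A minimal sketch of that idea (the class, the tick units, and the refractory length are all mine, purely for illustration): a neuron still inside its refractory window simply ignores the next incoming spike, so firing it early 'inhibits' the spike that mattered.

```python
class Neuron:
    """Toy spiking neuron: fires on any input, then is deaf for
    `refractory` ticks afterwards. Numbers are illustrative only."""
    def __init__(self, refractory=3):
        self.refractory = refractory
        self.last_fired = None
        self.spike_times = []

    def receive(self, t):
        # Ignore input if still recovering from the previous spike.
        if self.last_fired is not None and t - self.last_fired < self.refractory:
            return False
        self.last_fired = t
        self.spike_times.append(t)
        return True

n = Neuron(refractory=3)
n.receive(10)         # fires
n.receive(11)         # arrives during recovery: ignored
print(n.spike_times)  # [10]
```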

Let's combine that with the dendrite propagation time, for some simple mind experiments.

Imagine a neuron that has its output connected to another neuron by two connections of different lengths. So when the first neuron fires, the second receives two signals with a delay in between. If the potential is large enough to fire (spike) the second neuron, it will probably fire when the first signal reaches it, but can't fire when the later signal arrives, because it will still be in recovery.

Let's say it's that second firing window which is actually the important one, needed to trigger a final neuron at the perfect time. So if the first neuron is prematurely firing the second neuron (which means it's still recovering when it gets the 'proper' signal later on), then it's effectively inhibiting it from doing its job.

To let the second (wanted) signal through, all we have to do is fire the poor second neuron even earlier. An earlier spike (from yet another neuron) would inhibit the inhibition spike from the first neuron. Doing it at just the right moment means the neuron has recovered in time for the final spike.

So we can alter which of two time windows the second neuron spikes in by sending a 'control' signal a little earlier. Not exactly boolean logic.
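The whole thought experiment fits in a few lines of toy code (the delays and refractory period are made-up numbers, not anything from the paper):

```python
def run(inputs, refractory=4):
    """Feed a list of input-spike times to a toy refractory neuron;
    return the times at which it actually fires. Units are arbitrary."""
    fired = []
    last = None
    for t in sorted(inputs):
        if last is None or t - last >= refractory:
            fired.append(t)
            last = t
    return fired

# First neuron fires at t=0; its two connections deliver at t=2 and t=5.
print(run([2, 5]))     # [2] -- early path fires it; the 'proper' t=5 signal is lost
# Add an even earlier 'control' spike at t=0 from a third neuron...
print(run([0, 2, 5]))  # [0, 5] -- t=2 is absorbed by recovery, so t=5 gets through
```

The control spike doesn't carry the information itself; it just shifts which time window the neuron is available in.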

Any percussionist knows all this. Hit the drum too soon (when it's still vibrating from the previous strike) and it sounds different.

This is of course completely inefficient, although elegant in its own way. Any decent engineer wants to simplify away all this inhibiting of inhibitions via multiple redundant paths. But I think it's an important part of how our brains process information, so removing that 'inefficiency' will prevent the network from behaving like its biological counterpart. All that chaotic delay might be the seat of consciousness. Who the hell knows.

But I don't see this level of temporal control in Steve Furber's paper. In fact, he seems to be actively working against it by desynchronizing the mesh, although that might just be the lowest hardware level, and perhaps 'global time' is accurately established by each node's clock. The spike packets apparently contain no timing information, and seem to be broadcast to all receiving neurons instantly. To me, that sounds like dendrite propagation is being ignored.

Without very tightly modelling propagation delays, how do we recreate all the subtle phase timing of a biological brain? This is my concern. From what I've seen, nature takes advantage of every trick it can use.

Fortunately the fix is pretty simple: just tag each spike broadcast packet with a global timestamp, and give each neuron's inputs a delay queue that processes them in time order. If the mesh communication is fast and reliable enough, perhaps the delay queue alone is enough.
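Something like this, sketched with invented field names (this is not SpiNNaker's actual packet format, just the shape of the idea):

```python
import heapq

class DelayQueue:
    """Per-neuron input queue: spikes tagged with a global send time
    are delivered in order at send_time + connection delay."""
    def __init__(self):
        self.heap = []

    def push(self, send_time, delay, source):
        # Arrival time models the dendrite/axon propagation delay.
        heapq.heappush(self.heap, (send_time + delay, source))

    def pop_due(self, now):
        """Return all spikes whose modelled arrival time has passed."""
        due = []
        while self.heap and self.heap[0][0] <= now:
            due.append(heapq.heappop(self.heap))
        return due

q = DelayQueue()
q.push(send_time=10, delay=5, source="A")  # modelled arrival at t=15
q.push(send_time=12, delay=1, source="B")  # arrival at t=13, overtakes A
print(q.pop_due(now=14))  # [(13, 'B')]
print(q.pop_due(now=20))  # [(15, 'A')]
```

The point is that delivery order follows modelled propagation time, not network arrival order, so the subtle phase relationships survive a jittery mesh.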

There you go, conscious computer, no problem. Next?
