Artificial neural networks (ANNs) form the architecture behind much of modern machine learning and artificial intelligence. These technologies are undergoing an explosion in use – from computer vision systems for self-driving cars and Snapchat filters, to voice recognition and big data processing for research. These algorithms are powerful because they effectively program themselves; they learn associations between input and output from large training datasets, and then apply this ‘knowledge’ to new situations.
Computer vision, for example, is a complex problem that is incredibly difficult to write software for. It is hard enough for a computer to understand the outline of a cat against a white background, but put a cat in front of a busy background and the task is nearly impossible. With machine learning, however, you can feed thousands of images of cats into a neural network, and the computer will learn what a cat looks like. It can then take a previously unseen image and calculate the probability of a cat being in the photo.
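To make this concrete, here is a minimal sketch (written in PyTorch, with layer sizes invented for brevity) of the kind of network that could take an image and output a ‘probability of cat’. It is illustrative only – untrained, it produces a meaningless number.

```python
# A tiny, illustrative "cat / no cat" classifier. All layer sizes are placeholders.
import torch
import torch.nn as nn

cat_classifier = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1),   # learn simple local features
    nn.ReLU(),
    nn.MaxPool2d(2),
    nn.Conv2d(16, 32, kernel_size=3, padding=1),  # combine them into larger ones
    nn.ReLU(),
    nn.AdaptiveAvgPool2d(1),
    nn.Flatten(),
    nn.Linear(32, 1),
    nn.Sigmoid(),                                 # output: probability of "cat"
)

image = torch.rand(1, 3, 64, 64)      # a stand-in for a previously unseen photo
p_cat = cat_classifier(image).item()  # untrained here, so the value is meaningless
print(f"P(cat) = {p_cat:.2f}")
```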
So, how does it do this?
The original ANNs were modelled on the brain. Nodes, which are analogous to neurons, are arranged in layers so that each node is connected by a ‘synapse’ to every node in the layer below. Like biological synapses, these connections change their strengths, or ‘weights’, so that information fed into the network is recombined as it filters through. The output – a number between 0 and 1 representing the probability of a cat being present – is compared to the true value, i.e., there is a cat (1) or there isn’t (0). The difference between the prediction and the true value is the ‘error’ of the network, which is used to change the weights so that the output moves closer to the true value, decreasing the error over many iterations.
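This learning loop can be sketched in a few lines of Python. The example below uses a single layer of weights and invented data, but the cycle is the one just described: predict, compare with the true value, and nudge the weights to shrink the error.

```python
# A toy sketch of the learning loop, using NumPy. One layer of weights only;
# real networks stack many layers, but the idea is the same.
import numpy as np

rng = np.random.default_rng(0)
x = rng.random((100, 5))                      # 100 made-up inputs, 5 features each
true_w = np.array([1.0, -2.0, 0.5, 0.0, 3.0])
y = (x @ true_w > 1.0).astype(float)          # made-up "cat (1) / no cat (0)" labels

w = rng.normal(size=5)                        # the 'synaptic weights', started at random
lr = 0.5                                      # how far to nudge the weights each step

for step in range(1000):
    pred = 1.0 / (1.0 + np.exp(-(x @ w)))     # output between 0 and 1
    error = pred - y                          # difference from the true value
    grad = x.T @ (error * pred * (1 - pred)) / len(y)
    w -= lr * grad                            # change the weights to shrink the error
```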
One of the most interesting things about image-based networks is that as they learn, each layer in the network starts to resemble areas in the visual pathways in the brain. The nodes in lower layers learn to identify basic features, such as edges, and higher layers steadily build up to more complex features, resulting in ‘cat’. However, a model is only as good as its assumptions, and several assumptions underpinning ANNs were known to be flawed from the start.
The name ‘neural network’ is itself a misnomer, conjuring images of artificial brains. Indeed, in recent years there has been a shift away from using the term ‘artificial neuron’ in favour of the more neutral ‘node’. This is because, although ANNs were loosely based on biological systems, they are not biologically plausible in many ways. The output of a node in a network – the activation, which is analogous to the firing rate – can be positive or negative, unlike its biological counterpart, which can never be negative. Furthermore, the error that the network calculates is ‘backpropagated’ (fed through the network in reverse) so that individual weights can be changed. For a brain to do this, it would require bidirectional neurons – which do not exist – or neurons with identical synaptic strengths running the opposite way to the neurons in the network – which, if they do exist, are the exception rather than the rule. In addition, the alternating forward propagation of activity and backward propagation of error would require the system to be ‘clocked’ so that all cells are in sync; otherwise the backwards-running error signal would interfere with the forward-running information flow. Although there is evidence of rhythmic synchronisation in some parts of the brain, regions synchronised in a way suitable for this kind of clocking are uncommon.
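The ‘identical synaptic strengths going the opposite way’ problem is visible in the maths itself: in standard backpropagation, the error signal travels back through the transpose of the very weight matrices used on the forward pass. The toy example below (with arbitrary sizes and values, purely for illustration) makes this explicit.

```python
# Why backpropagation is biologically awkward: the error is sent backwards
# through the *same* weight matrix (transposed) that carried the signal forwards.
import numpy as np

rng = np.random.default_rng(1)
W1 = rng.normal(size=(4, 3))   # forward 'synapses': layer 1 -> layer 2
W2 = rng.normal(size=(3, 1))   # forward 'synapses': layer 2 -> output

x = rng.normal(size=(1, 4))
h = np.tanh(x @ W1)            # forward pass, layer by layer
out = h @ W2

error = out - 1.0              # suppose the true value was 1

# Backward pass: the error travels back through W2.T, i.e. the brain would need
# a reverse connection with exactly the same strength as the forward one.
delta_h = (error @ W2.T) * (1 - h ** 2)
grad_W2 = h.T @ error
grad_W1 = x.T @ delta_h
```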
Nevertheless, although neural networks don’t mimic the brain’s processes exactly, they go about things in a similar fashion. They learn by trial and error, changing the strengths of connections between nodes, and each node performs simple computations which collectively result in a complex operation.
As mentioned previously, certain visual system models are known to have parallels with the brain’s visual cortex. My research project posed a simple question: if a biological system and an artificial system perform the same function, with the same input and the same output, do they do it in the same way?
For my conclusions to have any meaning, the inputs and outputs of the ANN and the biological system need to be as close as possible. The computation chosen for this project was the calculation of post-saccadic eye angle in larval zebrafish, Danio rerio. These fish have transparent young, which allows easy, non-invasive visual access to the brain. They can also be genetically engineered so that all neurons produce a calcium indicator, GCaMP5, which fluoresces when calcium is present in the activated presynaptic bulb. This fluorescence can then be recorded with a camera. Critically, zebrafish also have stereotyped hunting behaviours from an early age, which can be elicited by a dot moving on a screen. When they initiate a ‘hunting bout’, their eyes converge quickly on their target. The angle formed by the eyes upon convergence depends on visual stimuli, prior experience, and instinct. More specifically, it is decided by a computation on the neural representation of the fish’s surroundings – specific patterns of neurons firing which describe the visual input – combined with neural representations of memories, stored as synapses and synaptic strengths. The system uses these processes to arrive at an output: a neural representation of the angle to which the eyes move.
To replicate this computation, I trained a recurrent ANN (which can learn relationships between inputs over time) on neural activity data from 38,901 neurons in the zebrafish brain, and used it to predict the post-saccadic eye angle. I then looked at the activations of nodes in the network and compared them to the firing rates of ‘assemblies’ – clusters of neurons with similar activity. These activations represent outputs of computations, some of which are relevant to calculating eye angle.
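As a rough illustration of what such a model looks like, here is a hypothetical sketch in PyTorch. Every name, size and training choice below is an assumption made for the example – it is not the actual code or data from the project.

```python
# A hypothetical recurrent network that reads a time series of neural activity
# and predicts the post-saccadic eye angle. Sizes and data are placeholders.
import torch
import torch.nn as nn

class EyeAngleRNN(nn.Module):
    def __init__(self, n_inputs, n_hidden=64):
        super().__init__()
        self.rnn = nn.GRU(n_inputs, n_hidden, batch_first=True)
        self.readout = nn.Linear(n_hidden, 1)        # a single number: the eye angle

    def forward(self, activity):                     # activity: (batch, time, neurons)
        states, _ = self.rnn(activity)               # node activations over time
        return self.readout(states[:, -1]), states   # predict from the final time step

# Dummy data standing in for (dimensionality-reduced) recorded activity.
batch, time, n_inputs = 8, 50, 100
activity = torch.rand(batch, time, n_inputs)
true_angle = torch.rand(batch, 1) * 60               # arbitrary angles in degrees

model = EyeAngleRNN(n_inputs)
optim = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(200):
    pred, states = model(activity)
    loss = loss_fn(pred, true_angle)                  # error between prediction and truth
    optim.zero_grad()
    loss.backward()                                   # backpropagate the error
    optim.step()                                      # update the weights

# `states` holds the node activations that can be compared with assembly firing rates.
```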
Early results are promising – some nodes in my network appear to be selective for rightward or leftward saccades, while others appear selective for very high-angle saccades in either direction. Similar features are found in equivalent time periods of the assembly data. If this association holds under further analysis, it would suggest that the network is computing in a similar way to the zebrafish brain. Since a model is only as good as the assumptions it is based on, this would indicate that neural networks might not be so far from how the brain works after all.
By Adam Selway