Why Return to the Animal Kingdom?
This article accompanies a talk given at UKBUG in November 2003. It follows on from another talk given in September, entitled “Emergent Behaviour and Artificial Life”, material for which is also available on this site. If you haven’t done so already, it might be a good idea to read it before you go on to read this article.
The creatures in the original Animal Kingdom used simple rules in order to move, eat and reproduce within a simulated environment. Rabbits and hares used rules with different levels of complexity to find grass to eat within the square “field” in which they lived. Foxes hunted rabbits and hares to eat. A degree of emergent behaviour was exhibited by the three species, especially when they were allowed to interact.
This work follows directly on from where I left off in the last article and addresses some questions about the system that I had – and some others which were asked of me.
- What would happen if creatures’ behaviour was dictated by a neural network instead of a set of predefined rules?
- Could creatures learn to find food?
- Could evolutionary techniques be used to allow parents to pass on behavioural characteristics to their children?
The answers to these questions can be found in the remainder of this article. Other articles on this site cover the basics of neural networks in nature, learning in artificial neural networks and a Delphi implementation of a neural network. You may be interested in reading these too. You can download the code and executable from the bottom of the page and play with the sheep yourself!
Sheep are a new type of animal, which inhabit the two-dimensional world developed for the previous talk. Sheep live on grass and spend their time (hopefully) searching for more grass or bonking. The difference between sheep and the rabbits, hares and foxes in the previous simulation is that they have brains. Figure 1 shows the first appearance of sheep in the Animal Kingdom.
Figure 1: The first appearance of sheep in the Animal Kingdom
Each sheep has a small neural network which it uses to decide which direction to move in. The network has sixteen inputs and eight outputs. The inputs correspond to the food levels and the existence of other sheep on the tiles surrounding the sheep’s current location. The outputs correspond to the eight possible directions to move in.
Network outputs On, where n = 0 to 7. The index n of the output with the highest value determines the direction of travel:
- O0 = Travel North
- O1 = Travel North East
- O2 = Travel East
- O3 = Travel South East
- O4 = Travel South
- O5 = Travel South West
- O6 = Travel West
- O7 = Travel North West
The first eight inputs are associated with the food value of the surrounding eight tiles. The last eight inputs are used to show whether there is an animal on each adjacent square and whether that animal is male or female. Inputs 0 and 8 correspond to the northern tile, inputs 1 and 9 correspond to the north eastern tile and so on.
For each input In, where n = 8 to 15:
- In = 0 if associated square is empty
- In = 1 if the animal on associated square is the same sex
- In = -1 if the animal on associated square is not the same sex
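The input layout described above can be sketched as follows. This is only an illustration, not the code from the download: TOccupant, BuildInputs and the example values are hypothetical names and data, and the sketch assumes that raw food values are fed straight into the network and that directions are numbered clockwise from North.

```pascal
program SheepInputsSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  // Hypothetical description of what occupies a neighbouring tile
  TOccupant = (ocEmpty, ocMale, ocFemale);
  TFoodLevels = array[0..7] of Double;
  TOccupants = array[0..7] of TOccupant;
  TInputs = array[0..15] of Double;

// Fills the sixteen inputs for one sheep. Index 0 is North, 1 is North East
// and so on clockwise, matching the numbering of the outputs.
procedure BuildInputs(const Food: TFoodLevels; const Occupants: TOccupants;
  SelfIsMale: Boolean; out Inputs: TInputs);
var
  Dir: Integer;
begin
  for Dir := 0 to 7 do
  begin
    // Inputs 0..7: food value of the tile in this direction
    Inputs[Dir] := Food[Dir];
    // Inputs 8..15: 0 = empty, 1 = same sex, -1 = opposite sex
    if Occupants[Dir] = ocEmpty then
      Inputs[Dir + 8] := 0
    else if (Occupants[Dir] = ocMale) = SelfIsMale then
      Inputs[Dir + 8] := 1
    else
      Inputs[Dir + 8] := -1;
  end;
end;

const
  // Made-up surroundings for a male sheep
  ExampleFood: TFoodLevels = (5, 0, 0, 12, 0, 0, 0, 3);
  ExampleOccupants: TOccupants = (ocEmpty, ocFemale, ocEmpty, ocEmpty,
                                  ocMale, ocEmpty, ocEmpty, ocEmpty);
var
  Inputs: TInputs;
  I: Integer;
begin
  BuildInputs(ExampleFood, ExampleOccupants, True, Inputs);
  for I := 0 to 15 do
    WriteLn('I', I, ' = ', Inputs[I]:0:1);
end.
```

With these example surroundings, input 9 (the north eastern tile, holding a female) is -1 and input 12 (the southern tile, holding another male) is 1.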
Every turn the appropriate food level and neighbour values are presented to the inputs of the neural net and allowed to propagate through to the outputs. The output which has the highest value (i.e. the highest positive number) is chosen as the direction to move in.
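Choosing the direction amounts to finding the index of the largest output. A minimal sketch, assuming the outputs are available as a plain array; HighestOutput and the example values are hypothetical, and breaking ties in favour of the lowest index is an assumption rather than something the simulation is known to do:

```pascal
program HighestOutputSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TOutputs = array[0..7] of Double;

const
  DirectionNames: array[0..7] of string = (
    'North', 'North East', 'East', 'South East',
    'South', 'South West', 'West', 'North West');

// Returns the index of the largest output. On a tie the lowest index wins
// (an assumption made for this sketch).
function HighestOutput(const Outputs: TOutputs): Integer;
var
  I: Integer;
begin
  Result := 0;
  for I := 1 to 7 do
    if Outputs[I] > Outputs[Result] then
      Result := I;
end;

const
  // Made-up network outputs; index 2 (East) is the largest
  Example: TOutputs = (0.10, -0.40, 0.75, 0.20, 0.05, -0.10, 0.30, 0.60);

begin
  WriteLn('Move ', DirectionNames[HighestOutput(Example)]);
end.
```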
The random initialisation of the weights within the sheep’s neural networks generally causes them to be very bad at finding food. A randomly assigned high weight on one of a neuron’s inputs gives the sheep a bias towards travelling in a particular direction. Some form of very rapid learning is required to allow the sheep to make more sensible decisions about which direction to travel in.
There are two ways in which animals in the real world are able to adapt and become more successful within their environment: adaptation through an evolutionary process, and learning. We will examine both of these (in a very simple form) here.
Breeding Sheep with an Evolutionary Algorithm
A number of changes to the bonking system have been made to allow males to pass on genetic material to females. The genes of the two parents can then be combined to create a child. In addition to this, alterations have been made to allow animals to have more than one offspring. All of the animals in the first simulation (foxes, rabbits and hares) are still limited to one offspring, but the sheep are allowed more.
A sheep’s genetic material is simply a one-dimensional representation of its neural network. The weight value associated with each synapse in a sheep’s neural network is appended to an array. Because the behaviour of the network is defined entirely by the weight values, it is possible to represent a sheep’s brain in a lossless way using this simple approach.
A sheep’s gene is simply a list of the real numbers which correspond to the weights in its neural net.
[W0, W1, W2, W3 … Wn – 1, Wn]
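As an illustration of the flattening, here is a sketch that writes a weight table into a gene and rebuilds the table from it. It assumes a single layer of 16 x 8 weights with no hidden layer, which may not match the real network; WeightsToGene and GeneToWeights are hypothetical names.

```pascal
program GeneSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  NumInputs = 16;
  NumOutputs = 8;
  GeneLength = NumInputs * NumOutputs;

type
  // Assumed layout: one weight per (output, input) pair, no hidden layer
  TWeights = array[0..NumOutputs - 1, 0..NumInputs - 1] of Double;
  TGene = array[0..GeneLength - 1] of Double;

// Appends every weight to the gene, one output neuron after another
procedure WeightsToGene(const W: TWeights; out Gene: TGene);
var
  O, I: Integer;
begin
  for O := 0 to NumOutputs - 1 do
    for I := 0 to NumInputs - 1 do
      Gene[O * NumInputs + I] := W[O, I];
end;

// Rebuilds the weight table from a gene using the same ordering
procedure GeneToWeights(const Gene: TGene; out W: TWeights);
var
  O, I: Integer;
begin
  for O := 0 to NumOutputs - 1 do
    for I := 0 to NumInputs - 1 do
      W[O, I] := Gene[O * NumInputs + I];
end;

var
  Original, Rebuilt: TWeights;
  Gene: TGene;
  O, I: Integer;
  Lossless: Boolean;
begin
  Randomize;
  // Random weights in the range -1..1
  for O := 0 to NumOutputs - 1 do
    for I := 0 to NumInputs - 1 do
      Original[O, I] := Random * 2 - 1;

  WeightsToGene(Original, Gene);
  GeneToWeights(Gene, Rebuilt);

  // Every weight should survive the round trip unchanged
  Lossless := True;
  for O := 0 to NumOutputs - 1 do
    for I := 0 to NumInputs - 1 do
      if Rebuilt[O, I] <> Original[O, I] then
        Lossless := False;

  WriteLn('Gene length: ', GeneLength);
  WriteLn('Round trip lossless: ', Lossless);
end.
```

Because the weights are only copied, never rounded or rescaled, the rebuilt table is identical to the original, which is what makes the representation lossless.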
When two sheep bonk they combine their genetic material to create a child. The mother and father’s genes are copied and chopped into small chunks. These chunks are combined at random to create a new string of the same length. This new string of real numbers is the genetic representation of the child. The child’s neural network is constructed from its gene and, because it is inherited from its parents, its behaviour should be similar to that of its parents.
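One way to combine chunks is sketched below. It assumes that each chunk keeps its position in the gene and is taken whole from either the mother or the father with equal probability; the text does not say whether the real code keeps chunks in place, and Crossover and the chunk size of 3 are hypothetical.

```pascal
program CrossoverSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

type
  TGene = array of Double;

// Builds a child gene chunk by chunk. Both parents are assumed to have genes
// of the same length; each chunk is copied from one parent chosen at random.
function Crossover(const Mother, Father: TGene; ChunkSize: Integer): TGene;
var
  Start, Stop, I: Integer;
  FromMother: Boolean;
begin
  if ChunkSize < 1 then
    ChunkSize := 1;
  SetLength(Result, Length(Mother));
  Start := 0;
  while Start < Length(Mother) do
  begin
    FromMother := Random(2) = 0;
    // The last chunk may be shorter than the others
    Stop := Start + ChunkSize;
    if Stop > Length(Mother) then
      Stop := Length(Mother);
    for I := Start to Stop - 1 do
      if FromMother then
        Result[I] := Mother[I]
      else
        Result[I] := Father[I];
    Start := Stop;
  end;
end;

var
  Mother, Father, Child: TGene;
  I: Integer;
begin
  Randomize;
  SetLength(Mother, 10);
  SetLength(Father, 10);
  // Easily recognisable parents: all 1.0 for the mother, all -1.0 for the father
  for I := 0 to 9 do
  begin
    Mother[I] := 1.0;
    Father[I] := -1.0;
  end;

  Child := Crossover(Mother, Father, 3);

  // Show which parent each weight of the child came from
  for I := 0 to High(Child) do
    if Child[I] > 0 then
      Write('M')
    else
      Write('F');
  WriteLn;
end.
```

Each run prints a different string of ten letters in runs of three (plus a final single letter), showing how blocks of neighbouring weights are inherited together.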
Sheep which are better at finding food (due to the setup of their neural networks) will live longer. Sheep which live longer will have more opportunities for bonking, and will therefore pass on their genetic material to more offspring. Sheep which are not good at finding food will die earlier, so their genetic material (which makes them bad at finding food) will be removed from the population quite quickly. Offspring of better parents should behave in a similar way to their parents, so after several generations the behaviour of the sheep and their ability to find food should have become more efficient.
You may, at this point, want to glance at another article on genetic and evolutionary algorithms.
It seems, however, that the pressures of natural selection are not enough to cause sheep to learn to find food. In fact, the animals which do best are exactly the type of animals I wanted to get rid of: the ones which madly dash in one direction regardless of what is around them. The edges become very crowded areas and, thus, offer more bonking opportunities. Although the sheep live for only a very short time, it is clear that the ones which blindly head in one direction do better, because they meet all the other sheep with the same idea when they reach the coastline and therefore have more chances to reproduce.
This is a good illustration of a key point about natural selection: in order to cope with environmental pressures a species will generally develop the simplest solution, rather than the most elegant one (or the most exciting one to present to an audience). Evolutionary pressures will not favour the individual, but will instead favour the species as a whole. So, even though the average sheep lifespan is greatly decreased, the overall population of sheep is greatly increased. This is, interestingly, why humans have never evolved an immunity to diseases like arthritis and cancer, which generally strike long after we have already had children and passed on our genetic material.
Here’s a picture of one of the colonies of sheep which has formed on the edge of the map:
Figure 2: A colony of sheep at the northern edge of the map. All these sheep have travelled North, without making any really intelligent choices along the way!
Reinforcement Learning for Sheep
Because the evolutionary approach didn’t really help me in my quest to make the sheep search for food, the next area for investigation is reinforcement learning. A very simple neural network learning algorithm is used to allow the sheep to assess the outcome of their actions and alter their synaptic weights based on whether they were good or bad.
After each sheep moves in a particular direction it will assess whether moving that way was beneficial to it, i.e. whether it found some food or a bonking partner. If the move was beneficial then the sheep will reinforce the synaptic connections which led it to make the decision and weaken the others. If the move was detrimental the creature will weaken the synapses associated with the offending output neuron and strengthen the others.
If a creature is unable to move in the direction it has chosen then the move will be seen as detrimental.
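The exact update rule is not spelled out here, so the following is only one plausible reading of it: the weights feeding the chosen output are nudged in the direction of the current inputs in proportion to the reward, and the weights feeding the other outputs are nudged the opposite way. The single-layer network, the learning rate and this particular signature for Learn are all assumptions made for the sketch.

```pascal
program LearnSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  NumInputs = 16;
  NumOutputs = 8;
  LearningRate = 0.1; // hypothetical value

type
  TInputs = array[0..NumInputs - 1] of Double;
  TWeights = array[0..NumOutputs - 1, 0..NumInputs - 1] of Double;

// Weighted sum for one output neuron (no activation function, for brevity)
function OutputOf(const W: TWeights; const Inputs: TInputs; O: Integer): Double;
var
  I: Integer;
begin
  Result := 0;
  for I := 0 to NumInputs - 1 do
    Result := Result + W[O, I] * Inputs[I];
end;

// Positive reward: raise the chosen output for these inputs, lower the rest.
// Negative reward: the signs flip, so the chosen output is lowered instead.
procedure Learn(var W: TWeights; const Inputs: TInputs; Chosen: Integer;
  Reward: Double);
var
  O, I: Integer;
  Delta: Double;
begin
  for O := 0 to NumOutputs - 1 do
    for I := 0 to NumInputs - 1 do
    begin
      Delta := LearningRate * Reward * Inputs[I];
      if O = Chosen then
        W[O, I] := W[O, I] + Delta
      else
        W[O, I] := W[O, I] - Delta;
    end;
end;

var
  W: TWeights;
  Inputs: TInputs;
  O, I: Integer;
begin
  // Start from all-zero weights and inputs
  for O := 0 to NumOutputs - 1 do
    for I := 0 to NumInputs - 1 do
      W[O, I] := 0;
  for I := 0 to NumInputs - 1 do
    Inputs[I] := 0;

  // Food to the north (input 0) and an opposite-sex sheep to the east (input 10)
  Inputs[0] := 1.0;
  Inputs[10] := -1.0;

  // The sheep chose North (output 0) and the move turned out well
  Learn(W, Inputs, 0, 0.5);

  WriteLn('Chosen output (North): ', OutputOf(W, Inputs, 0):0:3);
  WriteLn('Another output (East): ', OutputOf(W, Inputs, 2):0:3);
end.
```

After the rewarded move the chosen output is positive for the same inputs and the other outputs are negative, so the sheep would pick North again in the same situation. This also hints at how a single direction can become self-reinforcing.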
Here is a screen shot of the “Agent Details” tab of the demo application. Details of the agent’s health, age, sex and foodstore levels can be seen, along with its current neural network outputs and a historical view of the directions in which it has moved over the past few turns (the list view with red arrows).
Figure 3: The Agent Details Page
When learning is switched on, the sheep very quickly find out that constant movement in a straight line is the simplest way to find food. They will start to move in a particular direction and then find some food. Finding food strengthens the synaptic connections associated with that direction, so they move the same way next time – regardless of the food distribution around them.
They finally have to stop when they hit the coast. Obviously they can’t move any further, so they begin to change their minds about the direction they used to be so sure of. After a while they will choose another direction and, if unperturbed, head that way until they bounce off the other side of the island. The edges of the map continue to get quite congested and, therefore, provide great breeding grounds.
Figure 4: A Flock of sheep at the corner of the map
The screenshot in Figure 4 shows a corner of the map. Corners are one of the main areas where sheep seem to congregate. This is because only a small fraction of the eight possible directions are open to a sheep in a corner (3/8), compared with a sheep at an edge (5/8) or in open space (1), so the probability of it choosing a new direction which actually moves it away is low. Moving around in crowded areas is made even harder by collisions with other sheep.
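The three fractions can be checked by counting how many of the eight neighbouring tiles lie inside the map. A small sketch, assuming a square map and ignoring other sheep; MapSize and OpenDirections are hypothetical:

```pascal
program OpenDirectionsSketch;
{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

const
  MapSize = 10; // hypothetical map width and height
  // Offsets for the eight directions, clockwise from North
  DX: array[0..7] of Integer = (0, 1, 1, 1, 0, -1, -1, -1);
  DY: array[0..7] of Integer = (-1, -1, 0, 1, 1, 1, 0, -1);

// Counts the neighbouring tiles of (X, Y) that lie inside the map
function OpenDirections(X, Y: Integer): Integer;
var
  D, NX, NY: Integer;
begin
  Result := 0;
  for D := 0 to 7 do
  begin
    NX := X + DX[D];
    NY := Y + DY[D];
    if (NX >= 0) and (NX < MapSize) and (NY >= 0) and (NY < MapSize) then
      Inc(Result);
  end;
end;

begin
  WriteLn('Corner: ', OpenDirections(0, 0), '/8');
  WriteLn('Edge:   ', OpenDirections(5, 0), '/8');
  WriteLn('Open:   ', OpenDirections(5, 5), '/8');
end.
```

This prints 3/8 for the corner, 5/8 for the edge and 8/8 for open space.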
The “flocks” which form at edges and corners do, however, provide excellent opportunities for breeding, due to the higher likelihood of meeting a mate. The sheep, therefore, breed very quickly. So quickly in fact, that I was forced to lower the number of children born at a time back down to one to avoid population explosion.
Figure 5: The Mini Map, showing the distribution of sheep around their environment
So the sheep don’t do what I thought they’d do, but they are a highly successful species! Yet again this has shown that artificial life (like real life) will generally find the simplest way to do things. Why learn the fine art of food seeking when you could just head in one direction and receive a reasonable reward for doing so?
This last section details some of the final changes I made to try to make the sheep a more successful species.
A common behaviour of sheep is to start heading in a particular direction and to have that direction further and further reinforced in their minds as the best way to go. However small the quantity of food they found, it still causes a strengthening of the weights associated with that direction. Sheep very quickly learn to be, for example, Southwest Lawnmowers.
In order to combat this I linked the reward function not to the amount of food eaten, but to the gain in health after the move. This introduced a delay into the training, as it takes more than one turn for a sheep to digest food from its food store into health. After this change was made the sheep became much more varied in their movements. The new system rewards sheep for adopting a general behaviour, rather than for any single action.
Here’s the code from the sheep’s ProcessTurn procedure:
procedure TSheep.ProcessTurn;
var
  Index, Reward, StartHealth : Integer;
  Candidate : IAnimal;
begin
  // Make a note of the current health
  StartHealth := Health;

  // Update health, give birth, digest food etc
  inherited ProcessTurn;

  // Initialise the reward variable
  Reward := 0;

  // Males get the opportunity to initiate bonking
  if IsMale and (Age > 16) then
  begin
    // Pick a candidate first, then check that one was actually found
    Candidate := RandomBonkCandidate;
    if Assigned(Candidate) and Bonk(Candidate) then
      Exit;
  end;

  // Set up the NN inputs (food on surrounding squares)
  SetupNNInputs;

  // Choose the direction to head in
  Index := HighestNetworkOutput;

  // Attempt to move - negative reward if we can't move
  if CanMoveTo(Tile.Neighbours[Index]) then
    Move(Tile.Neighbours[Index])
  else
    Reward := Reward - 10;

  // Eat some food
  if (Tile.Food > 0) then
    Eat(20);

  // Calculate the reward factor this turn: the gain in health
  Reward := Reward + (Health - StartHealth);

  // Do the reinforcement here
  Learn(Index, Reward / 100);
end;
Listing 1: The method which defines a sheep’s behaviour each turn
Figure 6 shows the sheep population over time.
Figure 6: The population graph shows the rise and fall of the sheep population. Blue = Male, Pink = Female, Orange = Lambs
Figure 7: The mini map after the sheep population has reached a stable level. Note that the sheep are more evenly spread around the map than in the previous example
Feel free to download and play with the application and source code. Note that you use it at your own risk! If you do discover something exciting, or write a new TAnimal descendant then please drop me a line and let me know how it works.
AnimalKingdom2.zip Zipped application (should run on any PC) (211Kb)
AnimalKingdom2Code.zip Zipped project code (tested in Delphi 6 and 7) including DUnit tests and Model Maker project for neural network stuff (228Kb)