This blog was originally posted on 2nd January 2016.
This is a simulation of just over 1 million entities distributed dynamically over 40 – 50 separate physics engines, each represented by a different colour. At any one time, between 20k – 60k people are walking between buildings, using the Unity game engine for path-finding while avoiding collisions with each other. Buildings can also be destroyed, and the rubble dynamically alters the navigation mesh, so people route around it. There are no server boundaries: each person can walk from one edge of the world to the other with no discontinuities.
A single instance of Unity can only simulate up to 2,000 dynamic path-finding entities at one time, perhaps more if you have a powerful machine. To allow all 60,000 to path-find smoothly, we need many instances of Unity to work together.
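As a rough back-of-the-envelope check on those numbers, the sketch below computes the minimum number of Unity instances needed if the load could be spread perfectly evenly. The function name `min_workers` is ours, and the 2,000-per-instance figure is the approximate limit quoted above, not a hard constant.

```python
import math

def min_workers(active_entities: int, per_worker_capacity: int) -> int:
    """Lower bound on worker count, assuming load is spread perfectly evenly."""
    if per_worker_capacity <= 0:
        raise ValueError("per_worker_capacity must be positive")
    return math.ceil(active_entities / per_worker_capacity)

print(min_workers(60_000, 2_000))  # 30 at peak load
print(min_workers(20_000, 2_000))  # 10 at the quietest
```

The simulation runs 40 to 50 workers rather than the 30 this lower bound suggests. Our guess is that this is because people are never spread perfectly evenly and workers need some headroom, but that is an assumption on our part.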
The simulation above is using between 40 and 50 separate instances of Unity at a time. In SpatialOS, we call these “workers”. Each worker is allocated a small region of space that may change as the simulation workload evolves. Some of our bigger simulations have used over 600 physics workers to simulate huge areas.
A worker is simply a program responsible for simulating certain properties of a set of entities. Here we are using the Unity game engine as a physics worker. This means it is responsible for the physical properties of the entities it simulates, such as position and rotation. It also handles the path-finding behaviour of the people moving between buildings.
There is another kind of worker involved: the client. The client doesn’t simulate any physics; it is simply responsible for moving the camera and rendering the scene.
A single worker doesn’t have to know about the rest of the simulation; its job is simply to simulate a specific area and to do it well. While it can access information about the entities around it if necessary, it otherwise has no knowledge of how big the rest of the simulation is. Physics workers will actually end up co-simulating small regions at their boundaries to allow for completely seamless handover between them (see this article for more details).
SpatialOS manages the distribution of work so that every entity is simulated, no worker is given more work than it can handle, and only those areas that need simulating are given to workers.
All the distribution of work is done dynamically at runtime. Workers do not simulate a fixed region of space but are allocated different areas of work by SpatialOS as the workload of the simulation changes over time. In fact, the region of space a worker simulates doesn’t even need to be contiguous. Workers will be created or destroyed to optimise the use of resources as more load is placed on the system.
As more people move into one region, more workers are brought up to handle the load. You can also see workers moving to follow groups as they move around. None of the work is statically allocated – all work allocation is dynamically managed by SpatialOS at runtime.
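To make the idea of dynamic allocation concrete, here is a minimal sketch that packs grid cells into workers according to how many people each cell currently holds. This is not how SpatialOS actually balances load; it is a simple greedy (first-fit decreasing) heuristic, and the grid cells, the `assign_cells` function and the capacity of 2,000 are all assumptions made for illustration.

```python
def assign_cells(cell_load, capacity):
    """Pack cells into workers, opening a new worker only when none has room.

    cell_load maps a grid cell (x, y) to the number of active people in it.
    A cell busier than `capacity` still gets a worker of its own here; a real
    system would subdivide such a cell instead.
    """
    workers = []
    busiest_first = sorted(cell_load.items(), key=lambda item: -item[1])
    for cell, load in busiest_first:
        if load == 0:
            continue  # empty areas are not given to any worker
        for worker in workers:
            if worker["load"] + load <= capacity:
                worker["cells"].append(cell)
                worker["load"] += load
                break
        else:
            # No existing worker has room, so bring up another one.
            workers.append({"cells": [cell], "load": load})
    return workers

quiet = {(0, 0): 1500, (0, 1): 0, (1, 0): 900, (1, 1): 700, (2, 2): 400}
print(len(assign_cells(quiet, capacity=2000)))  # 2

busy = dict(quiet)
busy[(2, 2)] = 1800  # a crowd gathers in one cell
print(len(assign_cells(busy, capacity=2000)))  # 3
```

In the quiet case one worker ends up with cells (0, 0) and (2, 2), which are not adjacent: a worker's region does not have to be contiguous. Re-running the assignment as the loads change is what makes the allocation dynamic in this toy version.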
Another benefit of this approach is a high degree of fault tolerance to workers failing. If a worker fails, the work is simply reallocated to other nearby workers within milliseconds. If those workers end up overloaded, more are brought up to handle the extra load. All of this happens automatically and with no visible effect on the simulation.
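A toy version of that reallocation might look like the following. It hands each cell owned by a failed worker to whichever surviving worker owns the nearest cell. The cell-to-worker mapping, the worker names and the `reassign_on_failure` function are hypothetical, and the sketch ignores capacity; in the real system an overloaded survivor would trigger new workers being brought up.

```python
def reassign_on_failure(owner, failed):
    """Return a new cell -> worker mapping with `failed`'s cells given to neighbours.

    owner maps a grid cell (x, y) to the id of the worker simulating it.
    """
    survivors = {cell: w for cell, w in owner.items() if w != failed}
    if not survivors:
        raise RuntimeError("no surviving workers to take over")
    result = dict(survivors)
    for cell, worker in owner.items():
        if worker == failed:
            # Pick the surviving cell closest in Manhattan distance.
            nearest = min(
                survivors,
                key=lambda c: abs(c[0] - cell[0]) + abs(c[1] - cell[1]),
            )
            result[cell] = survivors[nearest]
    return result

owner = {(0, 0): "w1", (1, 0): "w2", (2, 0): "w2", (3, 0): "w3"}
after = reassign_on_failure(owner, failed="w2")
print(after[(1, 0)], after[(2, 0)])  # w1 w3
```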
Workers don’t have to simulate just physics; they can simulate anything you want. From traffic flow to disease propagation to training neural networks, workers can be used to simulate any component of an entity, or simply as a view onto the simulation. You can use off-the-shelf simulation software, existing game engines, or integrate your own specialised code. For this simulation, we used the popular game engine Unity, with an integration wrapper that allows it to work with SpatialOS.
In the video above, people disappear when they enter a building, but they aren’t removed from the simulation; they simply become invisible. This means they are no longer physically simulated on a Unity worker, and no longer rendered on the client. But they still exist in the simulation, retaining whatever properties they had before. In fact, other workers could still be simulating other components of that entity; for example, a person can still age or become hungry while in a building, but these things don’t need a physics engine to simulate, so the entities are not present on any of the physics workers. If you wanted to have people move around rooms inside a building and interact with things inside, then you could have another type of worker dedicated to that task, which would take over when people move inside.
This process is called component delegation. SpatialOS holds the canonical form of an entity as a set of components each with various properties. For example, the physical component has properties for position, rotation, mass, etc. If a worker needs to know a particular property of an entity, SpatialOS sends it the relevant component of that entity and ensures it stays synchronised with any changes made in the simulation.
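The data model described here can be sketched in a few lines. The `CanonicalStore` class, its method names and the "needs" component are invented for illustration and are not the SpatialOS API; the point is only that a worker receives the components it asks for and sees later changes to them.

```python
from collections import defaultdict

class CanonicalStore:
    """Holds the canonical state: entity id -> component name -> properties."""

    def __init__(self):
        self.entities = defaultdict(dict)
        self.subscribers = defaultdict(list)  # (entity id, component) -> worker views

    def subscribe(self, view, entity_id, component):
        """Send a worker the current component and keep its copy in sync."""
        self.subscribers[(entity_id, component)].append(view)
        current = self.entities[entity_id].get(component, {})
        view[(entity_id, component)] = dict(current)

    def update(self, entity_id, component, **props):
        """Change properties and push the new state to every subscribed view."""
        self.entities[entity_id].setdefault(component, {}).update(props)
        for view in self.subscribers[(entity_id, component)]:
            view[(entity_id, component)] = dict(self.entities[entity_id][component])

store = CanonicalStore()
store.update(1, "physical", position=(0.0, 0.0), rotation=0.0, mass=70.0)
store.update(1, "needs", hunger=0.1)  # hypothetical non-physical component

physics_view = {}  # what one physics worker knows about
store.subscribe(physics_view, 1, "physical")
store.update(1, "physical", position=(1.0, 0.0))

print(physics_view[(1, "physical")]["position"])  # (1.0, 0.0)
print((1, "needs") in physics_view)               # False
```

The physics worker never sees the "needs" component because it never asked for it, which mirrors the way a person inside a building can keep getting hungrier without being present on any physics worker.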
When you want a worker to simulate a particular property, that worker must have write authority over the component containing it. We call Unity a physics worker, but really it’s just a worker to which we give authority over the physical component. If you are making a game and want to change an entity from being server-side authoritative to client-side authoritative, it’s as simple as re-delegating the physical component. This can even happen at runtime, on a per-entity basis, based on the state of that entity.
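Write authority and re-delegation can be illustrated with a small sketch. Again, the `Entity` class, its methods and the worker ids are hypothetical stand-ins rather than the real SpatialOS interface; the sketch only shows that exactly one worker at a time may write a given component, and that authority can be moved at runtime.

```python
class Entity:
    """An entity whose components can each be delegated to a single worker."""

    def __init__(self, components):
        self.components = components  # component name -> dict of properties
        self.authority = {}           # component name -> worker id allowed to write

    def delegate(self, component, worker_id):
        """Give (or move) write authority over one component to a worker."""
        if component not in self.components:
            raise KeyError(component)
        self.authority[component] = worker_id

    def write(self, worker_id, component, **props):
        """Apply an update only if the worker holds write authority."""
        if self.authority.get(component) != worker_id:
            raise PermissionError(f"{worker_id} has no authority over {component}")
        self.components[component].update(props)

person = Entity({"physical": {"position": (0.0, 0.0)}})

# Server-side authoritative: a Unity physics worker owns the physical component.
person.delegate("physical", "unity-worker-7")
person.write("unity-worker-7", "physical", position=(1.0, 0.0))

try:
    person.write("client-42", "physical", position=(9.0, 9.0))
    client_rejected = False
except PermissionError:
    client_rejected = True

# Re-delegate at runtime: the client becomes authoritative for this entity.
person.delegate("physical", "client-42")
person.write("client-42", "physical", position=(2.0, 0.0))
print(client_rejected, person.components["physical"]["position"])  # True (2.0, 0.0)
```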