Gabriel Gambetta is Tech Lead/Manager of the SpatialOS Community team at Improbable.
This article was first published on Improbable.io/news on 21 April 2016.
The current generation of games is pushing the boundaries of what’s possible with today’s technology, both in terms of art and gameplay. Game engines such as Unity and Unreal do a fantastic job of rendering game worlds for the player, yet they are also doing all the work under the hood to make the game itself happen — running physics, AI, pathfinding, and game logic.
For a single-player game, the engine will be doing all this work on the player’s device. The concept of the game loop hasn’t changed much since its invention decades ago.
Multiplayer games aren’t very different. The game loop is split between client and server, with the client dealing with inputs and rendering, and the server performing the world simulation work.
But there is a limit to what can be achieved building games this way. You can only fit so much physics and AI on a single server, and this inevitably leads to compromises in terms of the size of the world, the number of players, or the complexity of the physics or the AI.
This, in turn, imposes limits on the creative vision. When game logic, physics and AI are already competing for a share of the available resources, there’s not much room left to experiment with truly innovative ideas.
Most games follow the Entity-Component-System (ECS) pattern to some degree. Every “thing” in the game world is represented by an entity. Entities can be characters, trees, parts of cars or entire spaceships, depending on your game.
Entities are defined by their components, which group the properties that describe their state. Components can be shared between different entities, which leads to a modular design. For example, both characters and trees can share a Physical component that defines their position and rotation, whereas an Inventory component will be used only by characters.
These entities and components are brought to life by systems, typically threads that deal with the different aspects of the game. The physics system updates the Physical component, the AI system will observe the world and make decisions, the game logic system will make the game world behave in the way its designers intended, and so on.
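To make the pattern concrete, here is a minimal sketch of ECS in Python. It is not taken from any particular engine: the `World` class, the `physics_system` function and the velocity fields on `Physical` are assumptions added for illustration, while the `Physical` and `Inventory` component names come from the example above.

```python
from dataclasses import dataclass, field


@dataclass
class Physical:
    """Position and rotation, shared by characters and trees alike.
    The velocity fields are an assumption added so there is something to simulate."""
    x: float = 0.0
    y: float = 0.0
    rotation: float = 0.0
    vx: float = 0.0
    vy: float = 0.0


@dataclass
class Inventory:
    """Only characters carry this component."""
    items: list = field(default_factory=list)


class World:
    """Entities are just ids; their state lives in per-type component stores."""

    def __init__(self):
        self.next_id = 0
        self.components = {}  # component type -> {entity id -> component}

    def create_entity(self, *components):
        entity_id = self.next_id
        self.next_id += 1
        for component in components:
            self.components.setdefault(type(component), {})[entity_id] = component
        return entity_id

    def get(self, component_type):
        return self.components.get(component_type, {})


def physics_system(world, dt):
    """A system touches only the component it is responsible for."""
    for body in world.get(Physical).values():
        body.x += body.vx * dt
        body.y += body.vy * dt


world = World()
tree = world.create_entity(Physical(x=5.0, y=5.0))
hero = world.create_entity(Physical(vx=2.0), Inventory(items=["sword"]))

physics_system(world, dt=0.5)  # the hero moves, the tree stays put
```

The tree and the hero share the `Physical` component, so the same system updates both, while only the hero has an `Inventory`.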
The limitation of this approach comes from the fact that all these systems run on a single server with limited capacity; each system is competing for cycles with every other system.
But what if each system could be a distributed system?
This is the same epiphany Google had in the early 2000s: it is cheaper, simpler and far more scalable and reliable to develop applications that run on clusters of commodity hardware than on monolithic mainframes.
Of course, building distributed systems is hard, insanely hard. You need to debug race conditions, write netcode, think about concurrency and consistency, try to understand Paxos, give up and implement Raft, build fault tolerance into your design, have a strategy to deal with network latency and packet loss, put metrics and monitoring systems in place, and handle a myriad of other issues, none of which are remotely related to the fun and joy of making games!
It seems like a hopeless situation. The solution exists in theory, but in practice it seems impossibly complex for game developers, who are already stretched to their limits by everything involved in building a game!
Fortunately, it is possible to formulate this problem in an elegant way, which preserves the simplicity of the ECS pattern but makes it possible for a platform to abstract away all of the unpleasantness of building distributed systems.
The key innovation is introducing the concept of workers. We take each system in ECS and replace it with a distributed system; each distributed system is built out of many workers, each of them being a program that can simulate some components of a subset of the entities in the world.
We call this the Entity-Component-Worker (ECW) architecture:
The most immediate examples of workers are general-purpose game logic workers and physics engines. Perhaps surprisingly, a game client is also a worker: it runs on the player’s machine, but is otherwise not treated in any special way.
It’s also possible to use workers for specialised simulation that doesn’t fit the game logic workers. These could include anything from accurate weather simulation to flocking behaviours.
SpatialOS is an implementation of the Entity-Component-Worker architecture on top of a distributed compute platform that runs on public cloud providers. This means the size of the game world is no longer limited by, for example, how much the physics engine on a single server can handle, because you can run hundreds of instances of the physics engine to simulate different parts of the world.
Workers are managed by SpatialOS and run in the cloud. The number of workers is dynamic; workers are brought up and down to reflect the current workload of the simulation. The set of entities assigned to each worker to simulate also changes continuously, to ensure the best use of the available resources.
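The article does not describe how SpatialOS decides which worker simulates which entities, so the following is only a sketch of one possible approach, assuming a fixed grid over the world and a per-worker entity budget. The function names `partition_by_cell` and `assign_workers`, the grid and the `capacity` parameter are all hypothetical.

```python
import math
from collections import defaultdict


def partition_by_cell(positions, cell_size):
    """Group entity ids by the grid cell that contains them."""
    cells = defaultdict(list)
    for entity_id, (x, y) in positions.items():
        cell = (math.floor(x / cell_size), math.floor(y / cell_size))
        cells[cell].append(entity_id)
    return dict(cells)


def assign_workers(cells, capacity):
    """Pack neighbouring cells onto workers greedily.

    A new worker is started whenever the next cell would not fit on the
    current one. A single cell larger than `capacity` still gets one worker;
    a real system would have to split it further.
    """
    workers = []  # each worker is the list of entity ids it simulates
    for cell in sorted(cells):
        entities = cells[cell]
        if workers and len(workers[-1]) + len(entities) <= capacity:
            workers[-1].extend(entities)
        else:
            workers.append(list(entities))
    return workers


positions = {
    1: (0.5, 0.5), 2: (0.7, 0.2),    # two entities close together
    3: (10.1, 0.3), 4: (10.5, 0.9),  # two more in the next cell
    5: (20.0, 20.0),                 # one far away
}
cells = partition_by_cell(positions, cell_size=10.0)

busy = assign_workers(cells, capacity=2)   # small budget: more workers
quiet = assign_workers(cells, capacity=4)  # larger budget: fewer workers
```

Re-running the assignment as entities move or as the budget changes gives a different number of workers and a different set of entities per worker, which is the behaviour described above.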
All of this is invisible to the developer. In the ECW architecture, a developer implements their game much as they would write a single-player game with the ECS pattern, and specifies the kind of worker responsible for updating each component (for example, “Physical should be simulated by a physics worker”). With that in place, the game world can scale up to millions of entities across thousands of workers running on hundreds of servers.
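As an illustration of what “specifying the kind of worker responsible for each component” could look like, here is a sketch that uses a plain dictionary as the delegation table. This is not the SpatialOS configuration format: the `DELEGATION` table, the `PlayerInput` component and the `work_by_worker_type` helper are hypothetical.

```python
# Hypothetical delegation table: component name -> kind of worker that simulates it.
DELEGATION = {
    "Physical": "physics",
    "Health": "logic",
    "Inventory": "logic",
    "PlayerInput": "client",
}


def work_by_worker_type(entities, delegation):
    """Group (entity, component) pairs by the kind of worker that should simulate them."""
    work = {}
    for entity_id, component_names in entities.items():
        for name in component_names:
            work.setdefault(delegation[name], []).append((entity_id, name))
    return work


entities = {
    "tree-1": ["Physical"],
    "hero-1": ["Physical", "Health", "Inventory", "PlayerInput"],
}
work = work_by_worker_type(entities, DELEGATION)
# work["physics"] now lists the Physical component of both the tree and the hero
```

The game code itself never mentions servers or instances. The table only says which kind of worker owns each component, and the platform is free to decide how many of each to run.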
Most games built on SpatialOS use at least three kinds of workers: logic workers, physics workers, and game clients.
You can use logic workers to write any kind of game logic in Scala. You write behaviours associated with the different components, using the actor model. For example, you can write a behaviour that decreases the currentHealth of the Health component whenever it receives a TakeDamage message.
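Logic workers are written in Scala, but the shape of such a behaviour can be sketched in a few lines of Python. The `Health` component, its `currentHealth` property and the `TakeDamage` message come from the example above. The `HealthBehaviour` class, its mailbox and the clamping at zero are assumptions made for this sketch, and none of this is the SpatialOS API.

```python
from dataclasses import dataclass
from queue import SimpleQueue


@dataclass
class Health:
    currentHealth: int  # property name kept as it appears in the article


@dataclass
class TakeDamage:
    amount: int


class HealthBehaviour:
    """Actor-style behaviour: owns a mailbox and handles one message at a time."""

    def __init__(self, health):
        self.health = health
        self.mailbox = SimpleQueue()

    def send(self, message):
        """Other code never touches the component directly; it only sends messages."""
        self.mailbox.put(message)

    def process_mailbox(self):
        while not self.mailbox.empty():
            message = self.mailbox.get()
            if isinstance(message, TakeDamage):
                # Assumption: health never drops below zero.
                remaining = self.health.currentHealth - message.amount
                self.health.currentHealth = max(0, remaining)


behaviour = HealthBehaviour(Health(currentHealth=100))
behaviour.send(TakeDamage(30))
behaviour.send(TakeDamage(90))
behaviour.process_mailbox()
```

Because the component is only ever changed in response to messages, the behaviour does not care whether the sender is on the same machine or on another worker.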
The logic worker is also used to determine what kind of worker should simulate each component, since this can change at runtime. For example, an item lying on the ground doesn’t need to be physically simulated once it’s picked up by a player and stored in their inventory.
In principle, you could implement anything in the logic worker, but that may not be the best approach in every situation. Take physics, for example. There are several industry-standard physics engines that are the product of years of development. It may not be a great idea to implement your own physics engine, let alone write it using an asynchronous, actor-based model.
Since physics is a central feature of immersive game worlds, SpatialOS includes built-in integration with Unity’s physics engine. You can just delegate your behaviours to Physics, and SpatialOS will run as many instances of Unity in the cloud as necessary to run the game at whatever scale you want.
The third kind of worker most games will have is the game client. In ECW, a game client is just another worker, one responsible for rendering the game world for the player and for sending the player’s inputs to the simulation. SpatialOS includes built-in integration with Unity, letting you build game clients using the tools you’re already familiar with.
Interestingly, because the game world runs independently of the players connected to it, you can build multiple clients that interact with the same game world: players on PC, console, VR, and mobile can play together.
The argument for having specialised physics engines holds for many other kinds of computation your game may need. For example, you may want very accurate weather effects, or use an off-the-shelf pathfinding package, or use a different game engine for rendering.
The SpatialOS SDK includes a C++ API that lets you write a custom worker from scratch, or integrate a third-party package with your game world. A worker is simply an application that connects to the simulation using parameters supplied via the command line, and communicates with the simulation through a set of simple APIs.
In summary, SpatialOS gives each worker instance authority to simulate a component for a set of entities, and sends that worker periodic updates reflecting the state of those entities. The worker then performs the simulation it is designed to do, and sends updates back to SpatialOS.
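The receive, simulate and send cycle can be sketched as a small loop. The real SDK exposes a C++ API that is not shown in this article, so the sketch uses an in-memory stand-in instead: `FakeConnection`, `receive_updates`, `send_updates` and the one-dimensional motion model are all hypothetical.

```python
class FakeConnection:
    """In-memory stand-in for the platform side of the link; not the SpatialOS API."""

    def __init__(self, authoritative_state):
        # entity id -> (position, velocity) for entities this worker has authority over
        self.state = dict(authoritative_state)
        self.sent = []

    def receive_updates(self):
        """Return the latest known state of the entities assigned to this worker."""
        return dict(self.state)

    def send_updates(self, updates):
        """Record what the worker sent back and apply it to the shared state."""
        self.sent.append(updates)
        self.state.update(updates)


def simulate(state, dt):
    """The worker's speciality; here, trivial motion along one axis."""
    return {eid: (pos + vel * dt, vel) for eid, (pos, vel) in state.items()}


def run_worker(connection, ticks, dt):
    for _ in range(ticks):
        state = connection.receive_updates()          # 1. receive entity state
        connection.send_updates(simulate(state, dt))  # 2. simulate, 3. send back


connection = FakeConnection({"bird-1": (0.0, 2.0), "bird-2": (10.0, -1.0)})
run_worker(connection, ticks=3, dt=0.5)
```

Replacing `simulate` with a physics step, a pathfinding query or a flocking rule changes what the worker does without changing how it talks to the rest of the simulation.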
As a reference example, we have built a custom worker that implements flocking behaviours for a bird simulation. You can read a detailed explanation here.
We envision that the next generation of games will be built on a set of workers that cooperate to simulate a game world far bigger and far more detailed than is possible with a traditional architecture on a single server. This frees designers from many technical constraints, enabling the creation of truly engaging game worlds, brought to life by millions of rich entities that thousands of players can interact with in meaningful ways.
The ability to integrate custom workers in your simulation is one of the core features of SpatialOS. You can integrate off-the-shelf simulation packages, such as physics engines or road traffic simulators, and use them at a larger scale than they were designed for; or you can write your own specialised simulation code, as in the example above, to provide a unique experience for your users.