Launching any game can be stressful because there’s always going to be a degree of unpredictability that no one can control. You step off the cliff and hope your wings will be fully formed before you hit the ground. With multiplayer games, that worry is amplified as the need to scale effectively becomes fundamental to the game running smoothly.
When the observability of your multiplayer game’s infrastructure is limited, launch fear can be very real. But, with the right tech and people behind you, there’s room for this apprehension to be managed by increasing transparency over the state of the backend systems as you launch and scale.
A studio can gain a level of comfort by logging in to a server dashboard and getting metrics on the state of its multiplayer game launch. But it isn’t always as easy as that.
In this blog, Joshua Harris, Principal Software Engineer, and Lucas Valtl, Technical Product Manager, look at the specific challenges of scaling metrics around backend infrastructure – and how conquering these can increase observability, and ultimately contribute to an even better gaming experience for your players. They outline four key areas where many studios come up against obstacles to observability.
1. The balancing act of measuring and scaling metrics
Simply put, observability infrastructure is about having a pipeline to get information from A to B, and then being able to query and understand that information. Having the right technology and information at hand to keep up with load and player demands for your game is complex. There are so many possible permutations that you’ll never get observability 100% right out of the gate. You’ll always have to tweak and change things.
To look closer at this, Joshua explains that there are three data characteristics that games typically trip up on with metrics when they’re launching:
- Cardinality of data
- Volume of data
- Signal to noise ratio
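Cardinality is the uniqueness of the data you’re indexing. The more unique values you try to index, the more resources you need to handle the indexes and data, up to the point where things tip over. A metric labelled by region has only a handful of distinct values, for example, while one labelled by player ID has as many values as you have players.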
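To get a rough feel for how quickly cardinality grows, here is a minimal sketch that multiplies the number of distinct values of each label on a metric. The label names and counts are made up for illustration, and the product is only a worst-case upper bound, since not every combination will occur in practice.

```python
from math import prod


def worst_case_series(label_values: dict[str, int]) -> int:
    """Upper bound on distinct series: product of distinct values per label."""
    return prod(label_values.values())


# Hypothetical labels with a small, bounded set of values.
bounded = {"region": 5, "game_mode": 4, "build": 3}

# The same metric with a per-player label added (assumed 100,000 players).
unbounded = {**bounded, "player_id": 100_000}

print(worst_case_series(bounded))    # 60
print(worst_case_series(unbounded))  # 6000000
```

One extra high-uniqueness label turns 60 possible series into six million, which is the kind of jump that tips an index over.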
Volume of data grows with the number of things you’re trying to capture multiplied by the number of systems or users you’re capturing them from. Poorly designed schemas can cause significant performance issues or failures when trying to use your data, insufficient scaling can result in systems and networks tipping over, and insufficient planning can make your data difficult to use for gaining the insights you desire.
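The multiplication behind data volume can be put into a back-of-the-envelope estimate. This is a sketch under assumed numbers: the metric count, server count and reporting interval below are hypothetical, not figures from a real launch.

```python
SECONDS_PER_DAY = 86_400


def samples_per_day(metrics_per_source: int, sources: int, interval_seconds: int) -> int:
    """Things captured x sources captured from x reports per day."""
    reports_per_day = SECONDS_PER_DAY // interval_seconds
    return metrics_per_source * sources * reports_per_day


# Assumed: 50 metrics per game server, 2,000 servers, one report every 10 seconds.
launch_day = samples_per_day(metrics_per_source=50, sources=2_000, interval_seconds=10)
print(launch_day)  # 864000000

# Doubling the fleet during a launch spike doubles the volume.
spike = samples_per_day(metrics_per_source=50, sources=4_000, interval_seconds=10)
print(spike)  # 1728000000
```

Running an estimate like this before launch makes it easier to see whether the pipeline has been sized for the spike rather than for the average day.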
And finally, until you actually operate the game, it's really difficult for you to know what's important to you, and what's important to your game. That’s where the signal to noise ratio comes in. Capture too little data and you’re missing the critical pieces of information you need. Log too much data and you end up with an immense amount of information that is costly or impossible to sift through and can hide critical signals you need to be paying attention to.
With metrics, it's a balancing and fine-tuning game to get things right.
2. Choosing build vs partner
When kicking off a project, a studio will often opt to build observability pipelines itself. Open source components are fantastic for getting you off the ground. Once you want deeper insights or get into a more operational phase, there's a lot more involved in trying to scale this up, particularly for something like a game launch. There’s disaster recovery, fault tolerance, replication, scalability and the operations team. These are all hidden costs of observability that studios often don't see upfront when they first try to stand up their game.
It is easy to stand up an observability stack that meets the needs of basic development, and insights into player behavior and deeper systems tend to come late in the development process. Together, these feed a trend where observability is trivialized until much later than it should be. When specialist third parties offer expensive observability-specific offerings, it’s easy to see why many studios opt to make things work as best they can. However, when done right, observability is much more than spitting out a log to the log file so it can be searched for easily.
Regardless of maturity level, a game can and should benefit from high-observability infrastructure. The key challenge is capturing the right things and holding the data in the right way, so you can work out which questions to ask of it. A fundamental starting point when choosing whether the build or third-party partner route is right for your studio is: are we in a position to turn the volume of data we have into something useful?
3. Getting storage just right
When dealing with your game data, you typically have different aspects of the business you’re trying to serve that have different requirements on the data being captured. The mechanisms for handling these different requirements and surfacing the right data at the right time are often referred to as hot, warm, and cold data pipelines.
The hot pipeline is intended to capture your live operational data: the data you need at hand right now to see how your systems are performing or to respond to incidents. Because this data has such a high degree of granularity, it tends to be kept for short periods of time, often around 1-7 days.
Next up you've got your warm pipeline. This is often used for week-over-week business insights, tracking player behaviors, observing store trends, and monitoring short-term retention and engagement characteristics. This likely has roll-up information or lower granularity copies of subsets of the information from your hot pipeline.
Then you've got your cold pipeline or data lake. This is your long-term storage with the lowest granularity that often takes the longest to access, due to the way it’s stored and the type of systems it’s contained in. Typically you have to do more complex batch operations to get usable information out of your cold pipeline. However, it's well worth the investment of time and effort. This is where you start looking at your long-term business trends, understanding how you want to shape and evolve your game and business over time to get the most added value from it.
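As a rough illustration of how lower-granularity copies are produced for the warmer and colder tiers, the sketch below rolls fine-grained samples up into hourly averages. The function name, the sample data and the choice of averaging are assumptions for the example; a real pipeline might keep min, max or percentiles as well.

```python
from collections import defaultdict


def roll_up(samples, bucket_seconds):
    """Average (timestamp, value) samples into fixed-width time buckets."""
    buckets = defaultdict(list)
    for timestamp, value in samples:
        bucket_start = timestamp - timestamp % bucket_seconds
        buckets[bucket_start].append(value)
    return {start: sum(vals) / len(vals) for start, vals in sorted(buckets.items())}


# Hypothetical hot-pipeline samples: (seconds since start, concurrent players).
hot = [(0, 100), (60, 120), (120, 110), (3600, 200), (3660, 220)]

# One averaged point per hour is far cheaper to keep for weeks or months.
warm = roll_up(hot, bucket_seconds=3600)
print(warm)  # {0: 110.0, 3600: 210.0}
```

The trade-off is the one described above: five detailed points become two summary points, so the data is cheaper to store for longer but no longer shows minute-by-minute behavior.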
Out-of-the-box observability gives you the ability to know what’s going on and, more importantly, what is going wrong. It then gives you the tools to drill into the issue, find the root cause and fix it faster.
4. Charting a course for success
Charting is all about displaying information in a way that’s usable, and allows the user to get meaningful insights as an output.
However, according to Lucas, a common mistake that studios can make with charting is that they’ll put every metric they have on a dashboard, and before they know it, they’re overwhelmed with graphs, and it's really difficult to figure out what's going on. Whenever you’re making graphs, you want the metrics to be as specific as possible so they can become actionable.
So, what would be an example of a useful metric? The number of concurrent players in the game is a good thing to track because it gives you a high-level view of your game: when players are playing, most things are usually working. However, a sudden dip in this number could point to an issue that needs fixing.
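Spotting a sudden dip in concurrent players can be sketched as a comparison against a trailing average. This is a minimal example, not a production alerting rule: the window size, the 30% drop threshold and the sample values are all assumptions chosen for illustration.

```python
def find_dips(ccu, window=3, drop_ratio=0.3):
    """Return indices where the value falls more than drop_ratio below the trailing mean."""
    dips = []
    for i in range(window, len(ccu)):
        baseline = sum(ccu[i - window:i]) / window
        if baseline > 0 and ccu[i] < baseline * (1 - drop_ratio):
            dips.append(i)
    return dips


# Hypothetical concurrent-player counts sampled at regular intervals.
ccu = [1000, 1020, 1010, 1005, 600, 590, 980]
print(find_dips(ccu))  # [4, 5]
```

A chart of this one metric with its dips flagged tells you where to start looking, and more specific metrics can then be layered on to narrow down the cause.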
When it comes to charting, it's not about generating as many numbers as possible with all your metrics; it's about layering specific metrics one by one to build a clear picture of the game and player status.
Although launch fear can never be fully eradicated, knowledge is power. When you have the correct observability of your game, you can make the most of the product opportunities that arise during the product lifecycle, putting you back in the driver’s seat.
Trusting others with something so vital to your business can be nerve-racking, but having access to experts as an extension of your team can help you to make the most of your data and your product opportunities.