How Google Cloud helped Phoenix Labs meet demand spikes with ease for its hit multiplayer game Dauntless

In the role-playing video game Dauntless, players work in groups to battle monsters and protect the city-island of Ramsgate. Commitment reaps big rewards: with every beast slain, you earn new weapons and armor made of the same materials as the Behemoth you took down, strengthening your arsenal for the next battle.

And when creating Dauntless, game studio Phoenix Labs channeled these same values of resourcefulness, teamwork, and persistence. But instead of using war pikes and swords, it wielded the power of the cloud to achieve its goals.  

Preparing for unknown battles with containers and the cloud

For the gaming industry, launches bring unique technological challenges. It’s impossible to predict if a game will go viral, and developers like Phoenix Labs need to plan for a number of scenarios without knowing exactly how many players will show up and how much server capacity will ultimately be needed. In addition, since Dauntless was the first game in the industry to launch cross-platform—available on PlayStation 4, Xbox One, and PCs—it would be critical for all the underlying cloud-based services to work together flawlessly and provide an uninterrupted, real-time and consistent experience for players around the globe.

As part of staying agile to meet player needs, Phoenix Labs runs all its game servers in containers on Google Cloud Platform (GCP). The studio has a custom Google Kubernetes Engine (GKE) cluster in each region where Dauntless is available, across five continents (including North America, Australia, Europe and Asia). When a player loads the game, Dauntless matches them with up to three other players, forming a virtual team that is taken to a neighboring island to hunt a Behemoth monster together. Each “group hunt” runs on an ephemeral pod on GKE, lasting for about 15 minutes before the players complete their assignment and return to Ramsgate to polish their weapons and prepare for the next battle.

“Containerizing servers isn’t very common in the gaming industry, especially for larger games,” said Simon Beaumont, VP Technology at Phoenix Labs. “Google Cloud spearheaded this effort with their leadership and unique technology expertise, and their platform gave us the flexibility to use Kubernetes-as-a-service in production.”


Addressing player and customer needs at launch and beyond

When Dauntless launched out of beta earlier this year, it turned out to need a great deal of server capacity. Within the first week, the player count climbed to 4 million, rapid growth that was no small feat to accommodate.

Continuously addressing Reddit and Twitter feedback from players, Phoenix Labs’ lean team worked side by side with Google Cloud Professional Services to execute over 1,700 deployments to its production platform during the week of the launch alone. 

“Google Cloud’s laser focus on customers reaches a level I’ve never seen before,” said Jesse Houston, CEO and co-founder at Phoenix Labs. “They care just as much about our experience as a GCP customer as they do about our players. Without their ‘let’s go’ attitude, Dauntless would have been a giant game over.”


“Behemoth” growth, one platform at a time 

Now that Dauntless has surpassed 16 million unique players and launched on Nintendo Switch, Phoenix Labs is preparing to expand to new regions such as Russia and Poland (it recently launched in Japan) and take advantage of other capabilities across Google. For example, by using Google Ads and YouTube as part of its digital strategy for Dauntless, the studio onboarded 5 million new gamers in the first week of launch; YouTube Masthead ads also increased exposure to its audience. Phoenix Labs has migrated to Google Cloud’s data warehouse BigQuery for its ease of use and speed, with queries over trillions of rows of data returning in seconds. The team is even beginning to use the Google Sheets data connector for BigQuery to simplify reporting and ensure every decision is data-informed.

At Google Cloud, we’re undaunted by behemoth monsters—and the task of making our platform a great place to launch and run your multiplayer game. Learn more about how game developers of all sizes work with Google Cloud to take their games to the next level here.

GKE Sandbox: Bring defense in depth to your pods

Editor’s note: This is one of several posts in a series on the unique capabilities you can find in Google Kubernetes Engine (GKE) Advanced.

There’s a saying among security experts: containers do not contain. Security researchers have demonstrated vulnerabilities that allow an attacker to compromise a container and gain access to the shared host operating system (OS), also known as “container escape.” For applications that use untrusted code, container escape is a critical part of the threat profile.

At Google Cloud Next ’19 we announced GKE Sandbox in beta, a new feature in Google Kubernetes Engine (GKE) that increases the security and isolation of your containers by adding an extra layer between your containers and host OS. At general availability, GKE Sandbox will be available as part of the upcoming GKE Advanced, which offers enhanced features to help you build demanding production applications on top of our managed Kubernetes service.

Let’s look at an example of what could happen with a container escape. Say you have a software as a service (SaaS) application that runs machine learning (ML) workloads for users. Imagine that an attacker uploads malicious code that escalates privileges to the host OS and, from there, accesses the models and data of other users’ ML workloads.

GKE Sandbox is based on gVisor, the open-source container sandbox runtime that we released last year. We originally created gVisor to defend against a host compromise when running arbitrary, untrusted code, while still integrating with our container-based infrastructure. And because we use gVisor to increase the security of Google’s own internal workloads, it continuously benefits from our expertise and experience running containers at scale in a security-first environment. We also use gVisor in Google Cloud Platform (GCP) services like the App Engine standard environment, Cloud Functions, Cloud ML Engine, and most recently Cloud Run.

gVisor works by providing an independent operating system kernel to each container. Applications then interact with the virtualized environment provided by gVisor’s kernel rather than the host kernel. gVisor also manages and places restrictions on file and network operations, ensuring that there are two isolation layers between the containerized application and the host OS. Because the application’s interaction with the host kernel is reduced and restricted, attackers have a smaller attack surface with which to circumvent the container’s isolation mechanism.

GKE Sandbox takes gVisor, abstracts the internals, and presents it as an easy-to-use service. When you create a pod, simply choose GKE Sandbox and continue to interact with your containers as you normally would—no need to learn a new set of controls or a new mental model.

In addition to limiting potential attacks, GKE Sandbox helps teams running multi-tenant clusters, such as SaaS providers, who often execute unknown or untrusted code. There are many components to multi-tenancy, and technologies like GKE Sandbox take the first step toward delivering more secure multi-tenancy in GKE.

How users are hardening containers with GKE Sandbox
Data refinery creator Descartes Labs applies machine intelligence to massive data sets. “At Descartes Labs, we have a wide range of remote sensing data measuring the Earth and we wanted to enable our users to build unique custom models that deliver value to their organizations,” said Tim Kelton, Co-Founder and Head of SRE, Security, and Cloud Operations at Descartes Labs. “As a multi-tenant SaaS provider, we still wanted to leverage Kubernetes scheduling to achieve cost optimizations, but build additional security layers on top of users’ individual workloads. GKE Sandbox provides an additional layer of isolation that is quick to deploy, scales, and performs well on the ML workloads we execute for our users.”

We also heard from early customer Shopify about how they’re using GKE Sandbox. “Shopify is always looking for more secure ways of running our merchants’ stores,” said Catherine Jones, Infrastructure Security Engineer at Shopify. “Hosting over 800,000 stores and running customer code (such as custom templates and third-party applications) requires substantial work to ensure that a vulnerability in an application cannot be exploited to affect other services running in the same cluster.”

Jones and her team developed proof-of-concept trials to use GKE Sandbox and now plan on upgrading existing clusters and enabling it for all new clusters for developers. “GKE Sandbox’s userland kernel acts as a firewall between applications and the cluster node’s kernel, preventing a compromised application from exploiting other applications through it,” said Jones. “This will allow us to provide more security to our 600+ applications without impacting developers’ workflows or requiring our security team to maintain custom seccomp and apparmor profiles for each individual application. In addition, because GKE Sandbox is based on the open-source gVisor project, we can troubleshoot it more effectively and contribute code to support our use cases as need be.”

Getting started with GKE Sandbox
When we say that running a cluster with GKE Sandbox is easy, we really mean it. The following command creates a node pool with GKE Sandbox enabled, which you can attach to your existing cluster.
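The command itself did not survive in this copy of the post, so what follows is a minimal sketch rather than the original text. It assumes the gcloud flags --image-type=cos_containerd and --sandbox type=gvisor as described in the GKE Sandbox documentation; during the beta the command lived under gcloud beta. The cluster and pool names are placeholders, and the function wrapper is only there so you can review the command before running it.

```shell
# Placeholder names (assumptions for illustration); replace with your own.
CLUSTER_NAME="${CLUSTER_NAME:-my-cluster}"
POOL_NAME="${POOL_NAME:-sandbox-pool}"

# Create a node pool with GKE Sandbox enabled and attach it to an
# existing cluster. GKE Sandbox requires a containerd-based node image;
# --sandbox type=gvisor enables the gVisor runtime on the pool's nodes.
create_sandbox_pool() {
  gcloud container node-pools create "$POOL_NAME" \
    --cluster="$CLUSTER_NAME" \
    --image-type=cos_containerd \
    --sandbox type=gvisor
}

# Once gcloud is authenticated against your project, run:
#   create_sandbox_pool
```

Depending on your gcloud configuration you may also need to pass a zone or region for the cluster.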

To run your application in GKE Sandbox, you just need to set runtimeClassName: gvisor in your Kubernetes pod spec. The following example creates a Kubernetes deployment to run on a node with GKE Sandbox enabled.
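The example manifest is also missing from this copy, so here is a hedged sketch of what such a deployment could look like. The only setting taken from the text is runtimeClassName: gvisor; the deployment name, labels, and the httpd image are illustrative assumptions, not the original example.

```shell
# Write a Deployment manifest whose pods request the gVisor runtime.
# The name "httpd" and the httpd image are illustrative placeholders.
cat > sandboxed-deployment.yaml <<'EOF'
apiVersion: apps/v1
kind: Deployment
metadata:
  name: httpd
  labels:
    app: httpd
spec:
  replicas: 1
  selector:
    matchLabels:
      app: httpd
  template:
    metadata:
      labels:
        app: httpd
    spec:
      # Schedules the pod onto a sandbox-enabled node and runs it under gVisor.
      runtimeClassName: gvisor
      containers:
      - name: httpd
        image: httpd
EOF

# Apply it once your cluster has a sandbox-enabled node pool:
#   kubectl apply -f sandboxed-deployment.yaml
```

Pods without runtimeClassName: gvisor in their spec keep using the default runtime, so sandboxed and regular workloads can share a cluster.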

For a more detailed explanation of GKE Sandbox, check out the documentation.

Applications that are a great fit for GKE Sandbox
GKE Sandbox uses gVisor efficiently, but running in a sandbox can still have additional costs. Memory overhead is typically on the order of tens of megabytes, while CPU overhead depends more on the workload. GKE Sandbox is therefore well suited to compute- and memory-bound applications, such as:

  • Microservices and functions: Microservices and functions built with third-party and open-source components often have varying levels of trust. GKE Sandbox enables additional defense in depth while preserving low spin-up times and high service density. gVisor itself can launch in less than 150 ms and its memory footprint can be as low as 15 MB.
  • Data processing: Processing untrusted sensor inputs, complex media, or data formats may require using potentially vulnerable tools or parsers. Isolating these activities in sandboxed services can help to reduce the risk of exploitation. The CPU overhead of sandboxing data processing depends on how I/O intensive the service is, but is less than 5 percent for streaming disk I/O and compute-bound applications like FFmpeg. Other examples are MapReduce, ETL (Extract, Transform, Load), and media processing.
  • CPU-based machine learning: Training and executing machine learning models frequently involves large quantities of data and complex workflows. Often the data or the model itself is from a third party. Typically, the CPU overhead of sandboxing compute-bound machine learning tasks is less than 10 percent.

The above list is not exhaustive, and GKE Sandbox works with a wide variety of applications. Keep in mind that the extra validation for file system and network operations can increase your overhead. We recommend that you always test your specific use case and application with GKE Sandbox.

Try GKE Sandbox today
To get started using GKE Sandbox today, visit our feature page. To learn more, check out our GKE Sandbox and gVisor sessions.

As GKE Sandbox gets closer to general availability, look for a free trial of GKE Advanced coming soon.