Release with confidence: How testing and CI/CD can keep bugs out of production
With today’s dueling demands to iterate faster while keeping quality standards high, minimizing both the frequency and severity of bugs in code is no easy task. This is doubly true in serverless environments, where lightweight codebases and fully managed architectures enable developers to iterate more rapidly than ever before. Thorough testing is an effective method of finding potential bugs and protecting against errors in production that can have real business impact.
Testing can be something of a double-edged sword, however: it’s a critical part of a successful launch, but it can easily take developers away from other tasks. Therefore, it’s important to know the types of testing available, and which ones to use for your specific needs.
While there is no shortage of ways to test your serverless applications, all of them come with trade-offs around speed, cost, accuracy, and scope. Which combination works best for you will depend on variables like how critical, long-lasting, and well-maintained your code is. Code that is critical and reused often requires in-depth testing with a wide range of scopes, while less important, non-reused code can often get by with fewer, higher-level tests.
In the next couple of posts, we’ll look at testing and other important strategies that help minimize the frequency of bugs in production serverless deployments and reduce the severity of those that inevitably sneak past your test suite. We’ll also take a look at some example code that was developed for Cloud Functions, Google Cloud’s function-based serverless compute platform. This first post discusses two important strategies for minimizing the frequency of bugs in production: testing and CI/CD (continuous integration and continuous deployment). It covers general testing techniques first, followed by examples of how to apply them with Cloud Functions.
Keeping tests real
Testing your functions locally and on a CI/CD machine is a good defense against most bugs, but it won’t catch everything. For example, it won’t identify issues with environment configuration or external dependencies that could impact your production deployment.
To get over this hurdle, we need an environment to test in that has all the functionality of a production Cloud Functions environment, but none of the associated risk should the environment get corrupted. To do this, we can set up a test (or canary) environment that sits somewhere between a local machine and production and replicates the production environment. One common approach is to use a separate Google Cloud project as a canary environment.
Once our canary Cloud Functions environment is set up, we can start to talk about the three primary testing types that we’ll use: unit tests, integration tests, and system tests. Let’s look at each type individually, stepping up from the easiest to the most involved.
Lightweight: unit tests
Perhaps the easiest, quickest tests you can run are unit tests. Unit tests focus on a single feature and confirm that things work as expected. They have a few great things going for them, but are generally limited in their scope and the types of issues they identify.
Unit tests use mocking frameworks to fake external dependencies. For example, let’s say you have a feature that calls an API, the API returns a certain response, and then the feature does something based on that response. Unit testing takes that API out of the equation. The mocking framework returns a pre-defined response—what you would expect the API to return if it were working properly, for example—and simply makes sure that the feature itself behaves how we think it should whenever it gets that response.
Unit tests at a glance:
- Are fast and cheap to run since they rarely require billed cloud resources
- Confirm that the details of your code work as expected. For example, they’re great for edge case checking and other similar tests.
- Are useful for investigating known issues, but not great at identifying new ones
- Have no reliance on external dependencies (like libraries, APIs, etc.). Of course, this means they also can’t be used to verify these things.
Let’s take a quick look at an example. First, here is a sample HTTP function that creates a Cloud Storage bucket based on the name parameter in the request body:
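As a rough sketch, such a function might look like the following. The factory name makeNewBucket and the injected storage parameter are assumptions made for illustration; passing the client in, rather than creating it inside the function, is one way to keep the function easy to test.

```javascript
// Sketch only: makeNewBucket and the injected `storage` client are
// illustrative assumptions. `storage` is expected to expose
// createBucket(name), as the @google-cloud/storage client does.
function makeNewBucket(storage) {
  // Returns an Express-style (req, res) handler, the shape used by
  // HTTP-triggered Node.js functions.
  return async function newBucket(req, res) {
    const name = req.body && req.body.name;
    if (!name) {
      res.status(400).send('Missing "name" in request body.');
      return;
    }
    try {
      await storage.createBucket(name);
      res.status(200).send(`Bucket ${name} created.`);
    } catch (err) {
      res.status(500).send(`Could not create bucket ${name}.`);
    }
  };
}

// In a real deployment the handler would be exported with a real client:
//   const {Storage} = require('@google-cloud/storage');
//   exports.newBucket = makeNewBucket(new Storage());
```

The error handling shown here (400 for a missing name, 500 for a failed create) is a design choice of this sketch rather than a requirement of Cloud Functions.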
And here is a very basic unit test for our function. This test creates a mock version of the @google-cloud/storage library using sinon. It then checks that the mock library’s createBucket function is being called with the correct arguments.
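The same check can be sketched with a hand-written fake in place of sinon, so that the example runs on Node.js alone. Everything named here (makeNewBucket, makeFakeStorage, makeFakeResponse) is a hypothetical helper, and the function under test is repeated in compact form to keep the example self-contained.

```javascript
const { deepStrictEqual, strictEqual } = require('assert');

// Function under test (hypothetical shape): the storage client is passed
// in so that a test can substitute a fake for @google-cloud/storage.
function makeNewBucket(storage) {
  return async function newBucket(req, res) {
    await storage.createBucket(req.body.name);
    res.status(200).send(`Bucket ${req.body.name} created.`);
  };
}

// Hand-written stand-in for a sinon stub: records each call's arguments.
function makeFakeStorage() {
  const calls = [];
  return { calls, createBucket: async (...args) => { calls.push(args); } };
}

// Minimal fake of the Express-style response object.
function makeFakeResponse() {
  return {
    statusCode: null,
    body: null,
    status(code) { this.statusCode = code; return this; },
    send(body) { this.body = body; return this; },
  };
}

async function testCreatesBucketWithRequestedName() {
  const storage = makeFakeStorage();
  const res = makeFakeResponse();

  await makeNewBucket(storage)({ body: { name: 'test-bucket' } }, res);

  // createBucket was called exactly once, with the name from the body.
  deepStrictEqual(storage.calls, [['test-bucket']]);
  strictEqual(res.statusCode, 200);
}

// A test runner would normally invoke the test; here it is called directly.
testCreatesBucketWithRequestedName()
  .then(() => console.log('unit test passed'))
  .catch((err) => { console.error(err); process.exitCode = 1; });
```

No network call or billed resource is involved, which is what keeps this kind of test fast and cheap.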
Middleweight: integration tests
Stepping up a bit from unit tests are integration tests. As the name suggests, integration tests verify that parts of your code fit together as you expect.
Integration tests can use a mocking framework, as we described in unit testing, or can rely on real external dependencies. Using a mocking framework is quicker and cheaper, while bringing in external dependencies provides a more robust test. As a rule of thumb, we recommend mocking any dependencies that are slow (more than one second) and/or expensive. This enables these tests to be run quickly and cheaply.
Integration tests at a glance:
- Balance problem detection and isolation. They are large enough in scope to detect some unanticipated bugs, but can still be run relatively quickly
- May require small amounts of billed resources, depending on how you run your tests. For example: if a test run depends on actual build resources, then those runs would cost money.
Here is an integration test for our sample function. This test sends an HTTP request to the function and checks that it actually creates a Cloud Storage bucket with the correct name. For integration tests, the value of BASE_URL should point to a version of the function running locally on a developer’s machine (such as
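The shape of such a test can be sketched as follows. To stay runnable without cloud credentials, this sketch replaces both the locally running function and Cloud Storage with in-process stand-ins (startLocalFunction and an in-memory Set), which is an assumption of the example; a real integration test would start the actual function and check for the bucket with the real storage client. It also assumes Node.js 18 or later for the global fetch.

```javascript
const http = require('http');
const { strictEqual } = require('assert');

// Stand-in for the function running locally. A real run would start the
// actual function instead and leave this server out.
function startLocalFunction(buckets) {
  const server = http.createServer((req, res) => {
    let raw = '';
    req.on('data', (chunk) => { raw += chunk; });
    req.on('end', () => {
      const { name } = JSON.parse(raw || '{}');
      if (!name) {
        res.writeHead(400, { Connection: 'close' }).end('Missing "name".');
        return;
      }
      buckets.add(name); // in-memory substitute for Cloud Storage
      res.writeHead(200, { Connection: 'close' }).end(`Bucket ${name} created.`);
    });
  });
  // Port 0 asks the operating system for any free port.
  return new Promise((resolve) => {
    server.listen(0, '127.0.0.1', () => resolve(server));
  });
}

// The test itself only knows a base URL and a way to check for a bucket.
// With the real client, bucketExists could wrap storage.bucket(name).exists().
async function testNewBucketOverHttp(baseUrl, bucketExists) {
  const name = `integration-test-${Date.now()}`;
  const response = await fetch(baseUrl, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ name }),
  });
  strictEqual(response.status, 200);
  strictEqual(await bucketExists(name), true);
  return name;
}

async function main() {
  const buckets = new Set();
  const server = await startLocalFunction(buckets);
  const BASE_URL = `http://127.0.0.1:${server.address().port}`;
  try {
    return await testNewBucketOverHttp(BASE_URL, async (n) => buckets.has(n));
  } finally {
    server.close();
  }
}

main()
  .then((name) => console.log(`integration test passed for ${name}`))
  .catch((err) => { console.error(err); process.exitCode = 1; });
```

Because testNewBucketOverHttp takes the base URL as a parameter, the same test body can later be pointed at a different target without modification.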
Heavyweight: system tests
System tests broaden the scope to verify that your code works as a system. To that end, system tests rely heavily on external dependencies—making these tests both slower and more expensive.
One important thing to keep in mind with system tests is that state matters, and it may introduce consistency or shared-resource issues. For example, if you run multiple tests at the same time and your test tries to create a resource that already exists (or delete a resource that doesn’t exist), your test results may become flaky.
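One common way to reduce this kind of flakiness is to give every test run its own uniquely named resources and to clean them up afterward. The sketch below illustrates the idea; uniqueBucketName and withTemporaryBucket are hypothetical helpers, and the create and delete callbacks stand in for whatever client your tests actually use.

```javascript
const { randomUUID } = require('crypto');

// Cloud Storage bucket names may be at most 63 characters long.
const MAX_BUCKET_NAME_LENGTH = 63;

// Builds a bucket name that is unique per call, so parallel test runs
// never create or delete each other's resources.
function uniqueBucketName(prefix) {
  const name = `${prefix}-${randomUUID()}`.toLowerCase();
  if (name.length > MAX_BUCKET_NAME_LENGTH) {
    throw new RangeError(
      `Bucket name too long (${name.length} characters): shorten the prefix.`
    );
  }
  return name;
}

// Creates a bucket, runs the test body against it, and always deletes the
// bucket afterward, even when the test body throws.
async function withTemporaryBucket(prefix, createBucket, deleteBucket, testBody) {
  const name = uniqueBucketName(prefix);
  await createBucket(name);
  try {
    return await testBody(name);
  } finally {
    await deleteBucket(name);
  }
}
```

Cleaning up in a finally block also keeps a failed run from leaving behind resources that could confuse the next one.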
System tests at a glance:
- Since you’re directing traffic at an actual cloud deployment, system tests can require moderate amounts of billed resources.
- System tests provide good bug detection. They can even catch unanticipated bugs and bugs outside your codebase, such as in your dependencies or cloud deployment configuration.
- Since the scope of system tests is so large, they aren’t as good at isolating problems and their root causes as the other types of tests we’ve discussed.
Here is a system test for our sample function. Like our integration test, it sends an HTTP request to the newBucket Cloud Function and checks to make sure the correct bucket was created.
If you look closely, you’ll notice that this test is exactly the same as the integration test. In fact, the only difference is that the BASE_URL variable is set so that the test points at a deployed Cloud Function instead of a locally hosted one.
Though this trick is often specific to HTTP-triggered functions, reusing integration test code in system tests (and vice versa) can help reduce the maintenance burden created by your tests.
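One way to share the test code is to read the target from the environment, as in the sketch below. The helper name resolveBaseUrl, the use of an environment variable called BASE_URL, and the local default address are all assumptions made for illustration.

```javascript
// Hypothetical helper: picks the target for the shared test code.
// Integration run: leave BASE_URL unset to hit a locally hosted function.
// System run: set BASE_URL to the deployed function's URL, which has the
// form https://REGION-PROJECT_ID.cloudfunctions.net/FUNCTION_NAME
function resolveBaseUrl(env) {
  // The local default address is an assumption of this sketch.
  const url = env.BASE_URL || 'http://localhost:8080';
  // Strip trailing slashes so `${baseUrl}/path` joins cleanly.
  return url.replace(/\/+$/, '');
}

const BASE_URL = resolveBaseUrl(process.env);
console.log(`Tests will target ${BASE_URL}`);
```

With this in place, the CI/CD configuration alone decides whether a given run is an integration test or a system test.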
Other testing options
Let’s take a quick look at some other common types of testing, and how you can best use them with Cloud Functions.
Static tests verify that your code follows language and style conventions and dependency best practices. While they are relatively simple to run, one major limitation you have to account for is their narrow focus.
Many static test options are free to install and easy to use. Linters (such as prettier for Node.js and pylint for Python) enforce style conventions, while dependency tools (such as Snyk for Node.js) check for dependency issues.
Load tests involve creating vast amounts of traffic and directing it at your app to make sure your app can handle real-world traffic spikes. They verify that the entire end-to-end system, including non-autoscaled components, is capable of handling a specified request load, which is usually a multiple of the peak number of simultaneous users you expect.
Load tests can be expensive, since they require lots of billed resources to run, and slow due to the external dependencies they rely on. On the plus side, many of the actual testing tools are free, including Apache Bench (“ab” on most Mac and Linux systems), Apache JMeter, and Nordstrom’s serverless-artillery project.
Security tests verify that code and dependencies can handle potentially malicious input, and can be part of your unit, integration, system, or static testing. Beware: security tests have the potential to damage their target app environment. For example, a testing tool may attempt to drop a database or otherwise compromise the resources in its environment. The lesson here is: make sure to use a test or canary environment unless you are 100% sure the tool in question won’t hurt your production environment.
There are many free security testing options out there, including Zed Attack Proxy, Snyk.io, the Big List of Naughty Strings, and oss-fuzz, just to name a few. However, no automated security testing tool is perfect. If you are serious about security, hire a security consultant.
At the beginning of this post, we mentioned two ways to minimize the frequency of bugs: testing and CI/CD. Now that we’ve covered testing, let’s take a look at how continuous integration and continuous deployment can provide an additional layer of defense against bugs in production.
The motivation for CI/CD is fairly straightforward. If you’re a developer, version control—whether it’s git branches or another system—is your source of truth. At the same time, code for Cloud Functions has to be tested and then redeployed manually. This presents no shortage of potential issues.
CI/CD systems automate this process, letting you automatically mirror any changes in version control to Cloud Functions deployments. CI/CD systems detect code changes using hooks in version control systems that are triggered whenever new code versions are received. These systems can also invoke language-specific command-line tools to run your tests, followed by a call to gcloud to automatically deploy any code changes to Cloud Functions.
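The test-then-deploy sequence can be sketched as a small script. This is an illustration of the ordering only, not a replacement for a CI/CD product: the test command, the function name newBucket, and the nodejs20 runtime are assumptions, and the runner is injectable so the logic can be exercised without invoking npm or gcloud.

```javascript
const { spawnSync } = require('child_process');

// Steps a CI/CD job might run after a version-control hook fires.
// The test command, function name, and runtime are assumptions.
const PIPELINE = [
  ['npm', ['test']],
  ['gcloud', ['functions', 'deploy', 'newBucket',
    '--runtime', 'nodejs20', '--trigger-http']],
];

// Default runner: executes the command and returns its exit code.
// A null status means the process could not start or was killed.
function runCommand(command, args) {
  const result = spawnSync(command, args, { stdio: 'inherit' });
  return result.status === null ? 1 : result.status;
}

// Runs each step in order and stops at the first failure, so code that
// fails its tests is never deployed.
function runPipeline(steps, run = runCommand) {
  for (const [command, args] of steps) {
    const exitCode = run(command, args);
    if (exitCode !== 0) {
      return { ok: false, failedStep: command };
    }
  }
  return { ok: true, failedStep: null };
}

// A CI job would end with something like:
//   process.exitCode = runPipeline(PIPELINE).ok ? 0 : 1;
```

The key property is the early return: the deploy step is reached only when every earlier step has exited with code 0.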
There are many different CI/CD options available, including Google’s own Cloud Build, which natively integrates with GCP and source repositories. A basic CI/CD pipeline for Cloud Functions is fairly simple to set up and deploy with Cloud Build; see this page for more details.
Writing a thorough and comprehensive test suite, running it in a realistic “canary” environment, and automating your deployment process using CI/CD tools are techniques that can help you reduce your production bug rate. When used together, they can significantly increase the reliability and availability of your services while decreasing the frequency of buggy code and its resulting negative business impacts.
However, as we cautioned at the beginning, testing simply can’t catch every bug before it hits production. In our next post, we’ll discuss how to minimize the business impact of bugs that do make their way into your Cloud Functions-based applications using monitoring and in-production debugging techniques.