An easier way to move your App Engine apps to Cloud Run

Posted by Wesley Chun (@wescpy), Developer Advocate, Google Cloud

An easier yet still optional migration

In the previous episode of the Serverless Migration Station video series, developers learned how to containerize their App Engine code for Cloud Run using Docker. While Docker has gained popularity over the past decade, not everyone has containers integrated into their daily development workflow; some know that containers can be beneficial but prefer a “containerless” solution. Today’s video is just for you, showing how you can still get your apps onto Cloud Run even if you don’t have much experience with Docker, containers, or Dockerfiles.

App Engine isn’t going away, as Google has expressed long-term support for legacy runtimes on the platform, so those who prefer source-based deployments can stay where they are; this is an optional migration. Moving to Cloud Run is for those who want to explicitly adopt containerization.

Migrating to Cloud Run with Cloud Buildpacks video

So how can apps be containerized without Docker? The answer is buildpacks, an open-source technology that makes it fast and easy for you to create secure, production-ready container images from source code, without a Dockerfile. Google Cloud Buildpacks adheres to the buildpacks open specification and allows users to create images that run on all GCP container platforms: Cloud Run (fully-managed), Anthos, and Google Kubernetes Engine (GKE). If you want to containerize your apps while staying focused on building your solutions rather than on creating or maintaining Dockerfiles, Cloud Buildpacks is for you.

In the last video, we showed developers how to containerize a Python 2 Cloud NDB app as well as a Python 3 Cloud Datastore app. We targeted those specific implementations because Python 2 users are more likely to be using App Engine’s ndb or Cloud NDB to connect with their app’s Datastore, while Python 3 developers are most likely using Cloud Datastore. Cloud Buildpacks do not support Python 2, so today we’re targeting a slightly different audience: developers who have migrated from App Engine ndb to Cloud NDB, have already ported their apps to modern Python 3, and now want to containerize them for Cloud Run.

Developers familiar with App Engine know that a default HTTP server is provided and started automatically; however, if special launch instructions are needed, users can add an entrypoint directive to their app.yaml files, as illustrated below. When those App Engine apps are containerized for Cloud Run, developers must bundle their own server and provide startup instructions, which is the purpose of the ENTRYPOINT directive in the Dockerfile, also shown below.

Starting your web server with App Engine (app.yaml) and Cloud Run with Docker (Dockerfile) or Buildpacks (Procfile)

In this migration, there is no Dockerfile. While Cloud Buildpacks does the heavy lifting of determining how to package your app into a container, it still needs to be told how to start your service. This is exactly what a Procfile is for, represented by the last file in the image above. As specified, your web server will be launched in the same way as in the app.yaml and Dockerfile above; these config files are deliberately juxtaposed to expose their similarities.
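Since the files in the image may be hard to make out, here is a rough sketch of what those three startup configurations could look like for a Flask app served by gunicorn; the module name main and the exact gunicorn flags are illustrative assumptions, not the sample app’s literal contents.

```
# app.yaml (App Engine): optional entrypoint directive for custom startup
runtime: python39
entrypoint: gunicorn -b :$PORT main:app

# Dockerfile (Cloud Run via Docker): the server is started by the final directive
#   ENTRYPOINT ["gunicorn", "-b", ":8080", "main:app"]

# Procfile (Cloud Run via Cloud Buildpacks): a single line naming the web process
web: gunicorn -b :$PORT main:app
```

All three launch the same server; only the file carrying the instruction changes.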

Other than this swapping of configuration files and the expected lack of a .dockerignore file, the Python 3 Cloud NDB app containerized for Cloud Run is nearly identical to the Python 3 Cloud NDB App Engine app we started with. Cloud Run’s build-and-deploy command (gcloud run deploy) will use a Dockerfile if present but otherwise selects Cloud Buildpacks to build and deploy the container image. The user experience is the same, only without the time and challenges required to maintain and debug a Dockerfile.
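For reference, the build-and-deploy flow is a single command; a minimal sketch, assuming a hypothetical service name and region:

```
# Builds and deploys from the current directory: a Dockerfile is used if present,
# otherwise Cloud Buildpacks builds the container image automatically.
gcloud run deploy my-service --source . --region us-central1
```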

Get started now

If you’re considering containerizing your App Engine apps without having to know much about containers or Docker, we recommend you try this migration on a sample app like ours before attempting it on your own. In addition to the video, a corresponding codelab is provided to lead you step-by-step through this exercise.

All migration modules, their videos (when available), codelab tutorials, and source code can be found in the migration repo. While our content initially focuses on Python users, we hope to one day also cover other legacy runtimes, so stay tuned. Containerization may seem daunting, but the goal is for Cloud Buildpacks and migration resources like this to aid you in your quest to modernize your serverless apps!

Containerizing Google App Engine apps for Cloud Run

Posted by Wesley Chun (@wescpy), Developer Advocate, Google Cloud

An optional migration

Serverless Migration Station is a video mini-series from Serverless Expeditions focused on helping developers modernize their applications running on a serverless compute platform from Google Cloud. Previous episodes demonstrated how to migrate away from the older, legacy App Engine (standard environment) services to newer Google Cloud standalone equivalents like Cloud Datastore. Today’s product crossover episode differs slightly from that by migrating away from App Engine altogether, containerizing those apps for Cloud Run.

There’s little question the industry has been moving towards containerization as an application deployment mechanism over the past decade. However, Docker and containers weren’t available to early App Engine developers until the platform’s flexible environment arrived years later. Fast forward to today, when developers have many more options to choose from in an increasingly open Google Cloud. Google has expressed long-term support for App Engine, and users do not need to containerize their apps, so this is an optional migration. It is primarily for those who have decided to add containerization to their application deployment strategy and want to explicitly migrate to Cloud Run.

If you’re thinking about app containerization, the video covers some of the key reasons why you would consider it: you’re not subject to traditional serverless restrictions like development language or use of binaries (flexibility); if your code, dependencies, and container build & deploy steps haven’t changed, you can recreate the same image with confidence (reproducibility); your application can be deployed elsewhere or rolled back to a previous working image if necessary (reusability); and you have many more options on where to host your app (portability).

Migration and containerization

Legacy App Engine services are available through a set of proprietary, bundled APIs. As you can surmise, those services are not available on Cloud Run. So if you want to containerize your app for Cloud Run, it must be “ready to go,” meaning it has migrated to either Google Cloud standalone equivalents or other third-party alternatives. For example, in a recent episode, we demonstrated how to migrate from App Engine ndb to Cloud NDB for Datastore access.

While we’ve only recently begun producing videos for such migrations, developers can already access code samples and codelab tutorials leading them through a variety of migrations. In today’s video, we have both Python 2 and 3 sample apps that have divested from legacy services and are thus ready to containerize for Cloud Run. Python 2 App Engine apps accessing Datastore are most likely to be using Cloud NDB, whereas Python 3 users are most likely using Cloud Datastore, so those are the starting points for this migration.

Because we’re “only” switching execution platforms, there are no changes at all to the application code itself. This entire migration is based on changing the apps’ configurations from App Engine to Cloud Run. In particular, App Engine artifacts such as app.yaml, appengine_config.py, and the lib folder are not used in Cloud Run and will be removed, and a Dockerfile will be added to build your container. Apps with more complex configurations in their app.yaml files will likely need an equivalent service.yaml file for Cloud Run; if so, you’ll find this app.yaml to service.yaml conversion tool handy. Following best practices means there’ll also be a .dockerignore file.
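As a point of reference, a minimal .dockerignore might look something like the sketch below; the exact entries depend on your project, so treat these as assumptions rather than the sample’s actual file.

```
# .dockerignore (sketch): keep local clutter out of the container image
*.pyc
__pycache__/
.git/
.gitignore
README.md
```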

App Engine and Cloud Functions are source-based, where Google Cloud automatically provides a default HTTP server like gunicorn. Cloud Run is a bit more “DIY” because users have to provide a container image, meaning we must bundle our own server. In this case, we’ll pick gunicorn explicitly, adding it to the top of the existing requirements.txt packages file(s), as you can see in the screenshot below. Also illustrated is the Dockerfile, where gunicorn is started to serve your app as the final step. The only differences for the Python 2 equivalent Dockerfile are: a) it requires the Cloud NDB package (google-cloud-ndb) instead of Cloud Datastore, and b) it starts with a Python 2 base image.

The Python 3 requirements.txt and Dockerfile
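If the screenshot doesn’t come through, here is an approximate sketch of that file pair for the Python 3 app; the package list, the main module name, and the exact gunicorn invocation are assumptions, not the sample’s literal contents.

```
# requirements.txt: gunicorn added at the top of the existing dependencies
gunicorn
flask
google-cloud-datastore

# Dockerfile: start from a Python 3 base image, install dependencies,
# then launch gunicorn as the final step.
FROM python:3-slim
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
CMD exec gunicorn -b :$PORT main:app
```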

Next steps

To walk developers through migrations, we always “START” with a working app, then make the necessary updates that culminate in a working “FINISH” app. For this migration, the Python 2 sample app STARTs with the Module 2a code and FINISHes with the Module 4a code. Similarly, the Python 3 app STARTs with the Module 3b code and FINISHes with the Module 4b code. This way, if something goes wrong during your migration, you can always roll back to START or compare your solution with our FINISH. If you are considering this migration for your own applications, we recommend you try it on a sample app like ours first. In addition to the video, a corresponding codelab is provided to lead you step-by-step through this exercise.

All migration modules, their videos (when published), codelab tutorials, START and FINISH code, etc., can be found in the migration repo. We also hope to one day cover other legacy runtimes like Java 8, so stay tuned. We’ll continue our journey from App Engine to Cloud Run in Module 5, but will do so without requiring explicit knowledge of containers, Docker, or Dockerfiles. Modernizing your development workflow to use containers and best practices like crafting a CI/CD pipeline isn’t always straightforward; we hope content like this helps you progress in that direction!

Cloud-based CI/CD on GCP

Learning how to build a serverless deployment pipeline on Google Cloud Platform

Several years ago I worked with a team to build a production CI/CD pipeline for an AWS project. At that time, none of the major cloud vendors offered their own CI/CD services, so we used the open source GoCD server.

Although we were successful, the project was tedious and took several months. For this reason, I’ve been interested in trying out integrated alternatives as they become available.

Lately, much of my production work has involved constructing and testing cloud-based data pipelines at scale for bioinformatics research. I have been thinking for a while now about how best to use new cloud-based CI/CD services to automate these complex pipeline deployments.

Picturing Automated Deployment

As I worked through a well-written tutorial on GCP CI/CD today, I was surprised to be unable to find any sort of diagram of the process. Although I was able to complete the tutorial successfully, I was still left feeling that I had not fully comprehended the solution architecture.

GCP cloud-native CI/CD Tutorial Steps

Next, I attempted to draw my own version of what I completed in the tutorial. Quickly sketching considerations from my production experience and trying to combine them with what I had learned from the tutorial was unsatisfying. The result is shown below.

My initial thoughts around cloud-native CI/CD

A clear path through the complexity of the steps involved wasn’t yet apparent to me. After sketching, I tried thinking about each service or set of services that would be involved in building this type of pipeline.

Considering GCP Services

I started by trying to understand the GCP Tools service offerings and how each service would relate to building a CI/CD solution for the type of data pipelines my teams have been building.

There are a number of tools. Below is my summarized understanding of each of the GCP services listed on the Tools menu. Note that all of these services are serverless.

GCP Tools

Learning More

Then I watched a few presentations from the recent GCP:Next conference. The most useful to me were the following:

  1. “CI CD Across Multiple Environments” — interesting, but it showed combining non-GCP tools (Jenkins, HashiCorp Vault…) with GCP tools, which added even more complexity to potential scenarios.
  2. “Develop Faster on K8 with Cloud Build” — a compelling demo of the Skaffold library for K8, but no end-to-end scenario was shown.
  3. “Develop, Deploy & Debug with GCP Tools” — most useful were the demos showing aspects of working with various services, such as the very intelligent search in Source Repositories.

Although these presentations and the tutorial helped me understand how I might use these services for my customers, I still couldn’t visualize what I would actually build. To that end, I made an attempt to diagram one possible CI/CD architecture for a use case that I’ve been working on for a while now.

Visualizing CI/CD for GCP Data Pipelines

Here’s my attempt to design a solution for one of my customers. I’ll describe the application architecture first. The use case is bioinformatics research via a specialized machine learning library for genomics, VariantSpark.

Below is a diagram of the solution architecture on GCP K8:

CSIRO’s VariantSpark on GCP K8

You’ll note that this is a Data Lake, with the K8 cluster processing data from GCS. Managed Jupyter Notebooks are part of the solution, as this is the preferred environment for research work.

Shown below is my first draft of how I imagine my team could use GCP CI/CD services to build a pipeline to deploy a VariantSpark K8 cluster on GCP.

GCP CI/CD Pipeline for VariantSpark K8 Deployment

Because VariantSpark is an open source library, there are continual improvements to it. However, because VariantSpark is a research tool, it uses major and minor version numbers in its releases so that it can be cited in reproducible bioinformatics research papers.

So the process to build a cluster first requires determining which version of the VariantSpark JAR file is desired for the base container image. The pipeline may need to perform a Maven build on the source GitHub repo for VariantSpark.

Cloud Build artifacts include Docker Images, Kubernetes Clusters and more

GCP Source Repositories can mirror GitHub repos. Cloud Build can use triggers on Source Repositories to kick off builds. Cloud Build includes builders (tasks) for various types of images, including Maven. Cloud Build builders also include docker and kubectl, so the VariantSpark JAR file could be built as part of the Docker image, and that image could be used to spawn a GKE cluster.
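To make that flow concrete, here is a rough sketch of what such a cloudbuild.yaml might look like; the image name, manifest path, zone, and cluster name are hypothetical placeholders, not a tested configuration.

```
# cloudbuild.yaml (sketch): package the VariantSpark JAR, bake it into a Docker
# image, then roll the image out to an existing GKE cluster.
steps:
  # Maven builder: compile and package the JAR from the mirrored repo
  - name: 'gcr.io/cloud-builders/mvn'
    args: ['package', '-DskipTests']
  # Docker builder: build an image that bundles the freshly built JAR
  - name: 'gcr.io/cloud-builders/docker'
    args: ['build', '-t', 'gcr.io/$PROJECT_ID/variantspark:$SHORT_SHA', '.']
  # kubectl builder: apply Kubernetes manifests against the target GKE cluster
  - name: 'gcr.io/cloud-builders/kubectl'
    args: ['apply', '-f', 'k8s/']
    env:
      - 'CLOUDSDK_COMPUTE_ZONE=us-central1-a'
      - 'CLOUDSDK_CONTAINER_CLUSTER=variantspark-cluster'
images:
  - 'gcr.io/$PROJECT_ID/variantspark:$SHORT_SHA'
```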

Additionally, GCP Deployment Manager could serve a YAML deployment that details the other GCP service configurations needed for a GKE cluster. It also seems possible to use GCP Private Catalog to offer a VariantSpark GKE deployment template to research groups that have GCP Organizational Accounts.
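As one illustration of that idea, a Deployment Manager config declaring a small GKE cluster might look something like this sketch; the resource name, zone, node count, and machine type are assumptions.

```
# Deployment Manager config (sketch): declare the GKE cluster the pipeline targets
resources:
  - name: variantspark-cluster
    type: container.v1.cluster
    properties:
      zone: us-central1-a
      cluster:
        initialNodeCount: 3
        nodeConfig:
          machineType: n1-standard-4
```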

Next Steps

Now that I’ve learned the fundamentals of these services, I am looking forward to working with research and DevOps teams worldwide to build serverless CI/CD pipelines for bioinformatics.

Lynn Langit is a big data and cloud architect, a Google Cloud Developer Expert, an AWS Community Hero, and a Microsoft MVP for Data Platform.

