Editor’s note: Aerial data mapping company DroneDeploy wanted to migrate its on-premises Kubernetes environment to Google Kubernetes Engine—but only if it would pass muster with auditors. Read on to learn how the firm leveraged GKE’s native security capabilities to smooth the path to ISO-27001 certification.
At DroneDeploy, we put a lot of effort into securing our customers’ data. We’ve always been proud of our internal security efforts, and receiving compliance certifications validates these efforts, helping us formalize our information security program, and keeping us accountable to a high standard. Recently, we achieved ISO-27001 certification, all by taking advantage of the existing security practices in Google Cloud and Google Kubernetes Engine (GKE). Here’s how we did it.
As a fast-paced, quickly growing B2B SaaS startup in San Francisco, our mission is to make aerial data accessible and productive for everyone. We do so by providing our users with image processing, automated mapping, 3D modeling, data sharing, and flight controls through iOS and Android applications. Our Enterprise Platform provides an admin console for role-based access and monitoring of flights, mapped routes, image capture, and sharing. We serve more than 4,000 customers across 180 countries in the construction, energy, insurance, and mining industries, and ingest more than 50 terabytes of image data from over 30,000 individual flights every month.
Many of our customers and prospects are large enterprises that have strict security expectations of their third-party service providers. In an era of increased regulation (such as Europe’s GDPR law) and data security concerns, the scrutiny on information security management has never been higher. Compliance initiatives are one piece of the overall security strategy that help us communicate our commitment to securing customer data. At DroneDeploy, we chose to start our compliance story with ISO-27001, an international information security standard that is recognized across a variety of industries.
DroneDeploy’s Architecture: Google Kubernetes Engine (GKE)
DroneDeploy was an early adopter of Kubernetes, and we have long since migrated all our workloads from virtual machines to containers orchestrated by Kubernetes. We currently run more than 150,000 Kubernetes jobs each month with run times ranging from a few minutes to a few days. Our tooling for managing clusters evolved over time, starting with hand-crafted bash and Ansible scripts, to the now ubiquitous (and fantastic) kops. About 18 months ago, we decided to re-evaluate our hosting strategy given the decreased costs of compute in the cloud. We knew that managing our own Kubernetes clusters was not a competitive advantage for our business and that we would rather spend our energy elsewhere if we could.
We investigated the managed Kubernetes offerings of the top cloud providers and did some technical due diligence before making our selection—comparing not only what was available at the time but also future roadmaps. We found that GKE had several key features that were missing in other providers such as robust Kubernetes-native autoscaling, a mature control plane, multi-availability zone masters, and extensive documentation. GKE’s ability to run on pre-emptible node pools for ephemeral workloads was also a huge plus.
Proving our commitment to security hardening
But if we were going to make the move, we needed to document our information security management policies and process and prove that we were following best practices for security hardening.
Specifically, when it comes to ISO-27001 certification, we needed to follow the general process:
- Document the processes you perform to achieve compliance
- Prove that the processes convincingly address the compliance objectives
- Provide evidence that you are following the process
- Document any deviations or exceptions
While Google Cloud offers hardening guidance for GKE and several GCP blogs to guide our approach, we still needed to prove that we had security best practices in place for our critical systems. With newer technologies, though, it can be difficult to provide clear evidence to an auditor that those best practices are in place; they often live in the form of blog posts by core contributors and community leaders versus official, documented best practices. Fortunately, standards have begun to emerge for Kubernetes. The Center for Internet Security (CIS) recently published an updated compliance benchmark for Kubernetes 1.11 that is quite comprehensive. You can even run automated checks against the CIS benchmark using the excellent open source project kube-bench. Ultimately though, it was the fact that Google manages the underlying GKE infrastructure that really helped speed up the certification process.
Compliance with less pain thanks to GKE
As mentioned, one of the main reasons we switched from running Kubernetes in-house to GKE was to reduce our investment in manually maintaining and upgrading our Kubernetes clusters, including our compliance initiatives. GKE reduces the overall footprint that our team has to manage since Google itself manages and documents much of the underlying infrastructure. We’re now able to focus on improving and documenting the parts of our security procedures that are unique to our company and industry, rather than having to meticulously document the foundational technologies of our infrastructure.
For Kubernetes, here’s a snippet of how we documented our infrastructure using the four steps described above:
- We implemented security best practices within our Kubernetes clusters by ensuring all of them are benchmarked using the Kubernetes CIS guide. We use kube-bench for this process, which we run on our clusters once every quarter.
- A well respected third-party authority publishes this benchmark, which confirms that our process addresses best practices for using Kubernetes securely.
- We provided documentation that we assessed our Kubernetes clusters against the benchmark, including the tickets to track the tasks.
- We provided the results of our assessment and documented any policy exceptions and proof that we evaluated those exceptions against our risk management methodology.
Similarly to the physical security sections of the ISO-27001 standard, the CIS benchmark has large sections dedicated to security settings for Kubernetes masters and nodes. Because we run on GKE, Google handled 95 of the 104 line items in the benchmark applicable to our infrastructure. For those items that could not be assessed against the benchmark (because GKE does not expose the masters), we provided links to Google’s security documentation on those features (see Cluster Trust and Control Plane Security). Some examples include:
- Connecting kubelets to the masters
- Handling of config files on the masters (e.g. scheduler, controller manager, API server, etc.)
- Hardening the etcd database
Beyond GKE, we were also able to take advantage of many other Google Cloud services that made it easier for us to secure our cloud footprint (although the shared responsibility model for security means we can’t rely on Google Cloud alone):
- For OS-level security, we were able to document strong practices because we use Google’s Container-Optimized OS (COS), which provides many security best practices by default, such as a read-only file system. All that was left for us to do was follow best practices to help secure our workloads.
- We use node auto-upgrade on our GKE nodes to handle patch management at the OS layer. For the level of effort, we found that node auto-upgrade provides a good middle ground between patching and stability. To date, we have not had any issues with our software as a result of node auto-upgrade.
- We use Container Analysis (which is built into Google Container Registry) to scan for known vulnerabilities in our Docker images.
- ISO-27001 requires that you demonstrate the physical security of your network infrastructure. Because we run our entire infrastructure in the cloud, we were able to directly rely on Google Cloud’s physical and network security for portions of the certification (Google Cloud is ISO-27001 certified amongst other certifications).
DroneDeploy is dedicated to giving our customers access to aerial imaging and mapping technologies quickly and easily. We handle vast amounts of sensitive information on behalf of our customers, and we want them to know that we are following best security practices even when the underlying technology gets complicated, as in the case of Kubernetes. For DroneDeploy, switching to GKE and Google Cloud has helped us reduce our operational overhead and increase the velocity with which we achieve key compliance certifications. To learn more about DroneDeploy, and our experience using Google Cloud and GKE, feel free to reach out to us.
Despite advances in scientific research and medical technology, the process of drug discovery has become increasingly slower and more expensive over the last decade. While the pharmaceutical industry has spent more money on research and development each year, this has not resulted in an increase in the number of FDA-approved new medicines. Recursion, headquartered in Salt Lake City, is looking to address this declining productivity by combining rich biological datasets with the latest in machine learning to reinvent the drug discovery and development process.
Today, Recursion has selected Google Cloud as their primary public cloud provider as they build a drug discovery platform that combines chemistry, automated biology, and cloud computing to reveal new therapeutic candidates, potentially cutting the time to discover and develop a new medicine by a factor of 10.
In order to fulfill their mission, Recursion developed a data pipeline that incorporates image processing, inference engines and deep learning modules, supporting bursts of computational power that weigh in at trillions of calculations per second. In just under two years, Recursion has created hundreds of disease models, generated a shortlist of drug candidates across several diseases, and advanced drug candidates into the human testing phase for two diseases.
Starting with wet biology (plates of glass-bottom wells containing thousands of healthy and diseased human cells), biologists run experiments on the cells, applying stains that help characterize and quantify the features of the cellular samples: their roundness, the thickness of their membrane, the shape of their mitochondria, and other characteristics. Automated microscopes capture this data by snapping high-resolution photos of the cells at several different light wavelengths. The data pipeline, which sits on top of Google Kubernetes Engine (GKE) and Confluent Kafka, all running on GCP, extracts and analyzes cellular features from the images. Then, the data are processed by deep neural networks to find patterns, including those humans might not recognize. The neural nets are trained to compare healthy and diseased cell signatures with those of cells before and after a variety of drug treatments. This process yields promising new potential therapeutics.
To train its deep learning models, Recursion uses on-premises GPUs, then they use GCP CPUs to perform inference on new images in the pipeline using these models. Recursion is currently evaluating cloud-based alternatives including using Cloud TPU technology to accelerate and automate image processing. Since Recursion is already using TensorFlow to train its neural networks in its proprietary biological domains, Cloud TPUs are a natural fit. Additionally, Recursion is exploring using GKE On-Prem, the foundation of Cloud Services Platform, to manage all of their Kubernetes clusters from a single, easy-to-use console.
We’re thrilled to collaborate with Recursion in their quest to more rapidly and inexpensively discover new medicines for dozens of diseases, both rare and common. Learn more about how Recursion is using Google Cloud solutions to better execute its mission of “decoding biology to radically improve lives” here. You can also learn more about solutions for life sciences organizations and our Google Cloud for Startups Program.
Written communication is at the heart of what drives businesses. Proposals, presentations, emails to colleagues—this all keeps work moving forward. This is why we’ve built features into G Suite to help you communicate effectively, like Smart Compose and Smart Reply, which use machine learning smarts to help you draft and respond to messages quickly. More recently, we’ve introduced machine translation techniques into Google Docs to flag grammatical errors within your documents as you draft them.
If you’ve ever questioned whether to use “a” versus “an” in a sentence, or if you’re using the correct verb tense or preposition, you’re not alone. Grammar is nuanced and tricky, which makes it a great problem to solve with the help of artificial intelligence. Here’s a look at how we built grammar suggestions in Docs.
The gray areas of grammar
Although we generally think of grammar as a set of rules, these rules are often complex and subjective. In spelling, you can reference a resource that tells you whether a word exists or how it’s spelled: dictionaries (Remember those?).
Grammar is different. It’s a harder problem to tackle because its rules aren’t fixed. It varies based on language and context, and may change over time, too. To make things more complicated, there are many different style books—whether it be MLA, AP or some other style—which makes consistency a challenge.
Given these nuances, even the experts don’t always agree on what’s correct. For our grammar suggestions, we worked with professional linguists to proofread sample sentences to get a sense of the true subjectivity of grammar. During that process, we found that linguists disagreed on grammar about 25 percent of the time. This raised the obvious question: how do we automate something that doesn’t run on definitive rules?
Where machine translation makes a mark
Much like having someone red-line your document with suggestions on how to replace “incorrect” grammar with “correct” grammar, we can use machine translation technology to help automate that process. At a basic level, machine translation performs substitution and reorders words from a source language to a target language, for example, substituting a “source” word in English (“hello!”) for a “target” word in Spanish (¡hola!). Machine translation techniques have been developed and refined over the last two decades throughout the industry, in academia and at Google, and have even helped power Google Translate.
Along similar lines, we use machine translation techniques to flag “incorrect” grammar within Docs using blue underlines, but instead of translating from one language to another like with Google Translate, we treat text with incorrect grammar as the “source” language and correct grammar as the “target.”
Working with the experts
Before we could train models, we needed to define “correct” and “incorrect” grammar. What better way to do so than to consult the experts? Our engineers worked with a collection of computational and analytical linguists, with specialties ranging from sociology to machine learning. This group supports a host of linguistic projects at Google and helps bridge the gap between how humans and machines process language (and not just in English—they support over 40 languages and counting).
For several months, these linguists reviewed thousands of grammar samples to help us refine machine translation models, from classic cases like “there” versus “their” versus “they’re,” to more complex rules involving prepositions and verb tenses. Each sample received close attention—three linguists reviewed each case to identify common patterns and make corrections. The third linguist served as the “tie breaker” in case of disagreement (which happened a quarter of the time).
Once we identified the samples, we then fed them into statistical learning algorithms—along with “correct” text gathered from high-quality web sources (billions of words!)—to help us predict outcomes using stats like the frequency at which we’ve seen a specific correction occur. This process helped us build a basic spelling and grammar correction model.
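As a rough illustration of the "frequency of a specific correction" idea, the toy sketch below counts how often reviewers mapped a phrase to each correction and suggests the most common one only when it clearly dominates. This is not the model used in Docs; the function names `train` and `suggest`, the `min_share` threshold, and the sample pairs are all made up for illustration.

```python
from collections import Counter, defaultdict


def train(pairs):
    """Count how often each source phrase was mapped to each target phrase."""
    counts = defaultdict(Counter)
    for source, target in pairs:
        counts[source][target] += 1
    return counts


def suggest(counts, phrase, min_share=0.6):
    """Return the most frequent correction if it dominates, else None."""
    seen = counts.get(phrase)
    if not seen:
        return None  # phrase never appeared in the reviewed samples
    target, n = seen.most_common(1)[0]
    share = n / sum(seen.values())
    if target != phrase and share >= min_share:
        return target
    return None


# Hypothetical reviewed samples: (text as written, text after review).
pairs = [
    ("a apple", "an apple"),
    ("a apple", "an apple"),
    ("a apple", "an apple"),
    ("a apple", "a apple"),        # one reviewer left it unchanged
    ("data is", "data are"),
    ("data is", "data is"),        # reviewers split evenly
]
counts = train(pairs)
print(suggest(counts, "a apple"))
print(suggest(counts, "data is"))
```

The threshold reflects the point made earlier that reviewers do not always agree: when the samples are split, the sketch stays silent rather than flagging the text.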
We iterated over these models by rolling them out to a small portion of people who use Docs, and then refined them based on user feedback and interactions. For example, in earlier models of grammar suggestions, we received feedback that suggestions for verb tenses and the correct singular or plural form of a noun or verb were inaccurate. We’ve since adjusted the model to solve for these specific issues, resulting in more precise suggestions. Although it’s impossible to catch 100 percent of issues, we’re constantly evaluating our models at Google to ensure bias does not surface in results such as these.
Better grammar. No ifs, ands or buts.
So if you’ve ever asked yourself “how does it know what to suggest when I write in Google Docs,” these grammar suggestion models are the answer. They’re working in the background to analyze your sentence structure, and the semantics of your sentence, to help you find mistakes or inconsistencies. With the help of machine translation, here are some mistakes that Docs can help you catch:
Evolving grammar suggestions, just like language
When it comes to grammar, we’re constantly improving the quality of each suggestion to make corrections as useful and relevant as possible. With our AI-first approach, G Suite is in the best position to help you communicate smarter and faster, without sweating the small stuff. Learn more.
Today, we are very excited to announce the general availability of Azure Lab Services – your computer labs in the cloud.
With Azure Lab Services, you can easily set up and provide on-demand access to preconfigured virtual machines (VMs) to teach a class, train professionals, run hackathons or hands-on labs, and more. Simply input what you need in a lab and let the service roll it out to your audience. Your users go to a single place to access all their VMs across multiple labs, and connect from there to learn, explore, and innovate.
Since our preview announcement, we have had many customers use the service to conduct classes, training sessions, boot camps, hands-on labs, and more! For classroom or professional training, you can provide students with a lab of virtual machines configured with exactly what you need for class and give each student a specified number of hours to use the VMs for homework or personal projects. You can run a hackathon or a hands-on lab at conferences or events and scale up to hundreds of virtual machines for your attendees. You can also create an invite-only private lab of virtual machines installed with your prerelease software to give preview customers access to early trials or set up interactive sales demos.
Top three reasons customers use Azure Lab Services
Automatic management of Azure infrastructure and scale
Azure Lab Services is a managed service, which means that provisioning and management of a lab’s underlying infrastructure is handled automatically by the service. You can just focus on preparing the right lab experience for your users. Let the service handle the rest and roll out your lab’s virtual machines to your audience. Scale your lab to hundreds of virtual machines with a single click.
Simple experience for your lab users
Users who are invited to your lab get immediate access to the resources you give them inside your labs. They just need to sign in to see the full list of virtual machines they have access to across multiple labs. They can click on a single button to connect to the virtual machines and start working. Users don’t need Azure subscriptions to use the service.
Cost optimization and tracking
Keep your budget in check by controlling exactly how many hours your lab users can use the virtual machines. Set up schedules in the lab to allow users to use the virtual machines only during designated time slots, or set up recurring auto-shutdown and start times. Keep track of individual users’ usage and set limits.
Get started now
Try Azure Lab Services today! Get started by creating a lab account for your organization or team. All labs are managed under a lab account. You can give permissions to people in your organization to create labs in your lab account.
To learn more, visit the Azure Lab Services documentation. Ask any questions you have on Stack Overflow. Last of all, don’t forget to subscribe to our Service Updates and view other Azure Lab Services posts on the Azure blog to get the latest news.
General availability pricing
Azure Lab Services GA pricing goes into effect on May 1, 2019. Until then, you will continue to be billed based on the preview pricing. Please see the Azure Lab Services pricing page for complete details.
We continue to listen to our customers to prioritize and ship new features and updates. Several key features will be enabled in the coming months:
- Ability to reuse and share custom virtual machine images across labs
- Feature to enable connections between a lab and on-premises resources
- Ability to create GPU virtual machines inside the labs
We always welcome any feedback and suggestions. You can make suggestions or vote on priorities on our UserVoice feedback forum.
This blog was co-authored by Lei Zhang, Principal Research Manager, Computer Vision
You can now extract more insights and unlock new workflows from your images with the latest enhancements to Cognitive Services’ Computer Vision service.
1. Enrich insights with expanded tagging vocabulary
Computer Vision has more than doubled the types of objects, situations, and actions it can recognize per image.
2. Automate cropping with new object detection feature
Easily automate cropping and conduct basic counting of what you need from an image with the new object detection feature. Detect thousands of real life or man-made objects in images. Each object is now highlighted by a bounding box denoting its location in the image.
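A minimal sketch of how bounding boxes can drive counting and cropping is shown below. The shape of the `detections` list (an `object` name, a `confidence` score, and a `rectangle` with `x`, `y`, `w`, `h`) is an assumption made for illustration, as are the helper names `count_objects` and `crop_box`; check the service's response format before relying on these field names.

```python
from collections import Counter

# Hypothetical detection results for a 640x480 image (field names assumed).
detections = [
    {"object": "dog", "confidence": 0.92,
     "rectangle": {"x": 40, "y": 60, "w": 200, "h": 150}},
    {"object": "dog", "confidence": 0.81,
     "rectangle": {"x": 300, "y": 80, "w": 180, "h": 140}},
    {"object": "bicycle", "confidence": 0.35,
     "rectangle": {"x": 500, "y": 200, "w": 120, "h": 100}},
]


def count_objects(detections, min_confidence=0.5):
    """Count detected objects by name, ignoring low-confidence hits."""
    return Counter(
        d["object"] for d in detections if d["confidence"] >= min_confidence
    )


def crop_box(rect, image_width, image_height, margin=0):
    """Turn a bounding box into (left, top, right, bottom), clamped to the image."""
    left = max(rect["x"] - margin, 0)
    top = max(rect["y"] - margin, 0)
    right = min(rect["x"] + rect["w"] + margin, image_width)
    bottom = min(rect["y"] + rect["h"] + margin, image_height)
    return left, top, right, bottom


print(count_objects(detections))
print(crop_box(detections[0]["rectangle"], 640, 480, margin=10))
```

The resulting tuple uses the (left, top, right, bottom) order that most image libraries expect for a crop rectangle.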
3. Monitor brand presence with new brand detection feature
You can now track logo placement of thousands of global brands from the consumer electronics, retail, manufacturing, and entertainment industries.
With these enhancements, you can:
- Do at-scale image and video-frame indexing, making your media content searchable. If you’re in media, entertainment, advertising, or stock photography, rich image and video metadata can unlock productivity for your business.
- Derive insights from social media and advertising campaigns by understanding the content of images and videos and detecting logos of interest at scale. Businesses like digital agencies have found this capability useful for tracking the effectiveness of advertising campaigns. For example, if your business launches an influencer campaign, you can apply Custom Vision to automatically generate brand inclusion metrics pulling from influencer-generated images and videos.
In some cases, you may need to further customize the image recognition capabilities beyond what the enhanced Computer Vision service now provides by adding specific tagging vocabulary or object types that are relevant to your use case. Custom Vision service allows you to easily customize and deploy your model without requiring machine-learning expertise.
Redis is one of the most popular open source in-memory data stores, used as a database, cache and message broker. This post covers the major deployment scenarios for Redis on Google Cloud Platform (GCP). In the following post, we’ll go through the pros and cons of these deployment scenarios and the step-by-step approach, limitations and caveats for each.
Deployment options for running Redis on GCP
There are four typical deployment scenarios we see for running Redis on GCP: Cloud Memorystore for Redis, Redis Labs Cloud and VPC, Redis on Google Kubernetes Engine (GKE), and Redis on Google Compute Engine. We’ll go through the considerations for each of them. It’s also important to have backup for production databases, so we’ll discuss backup and restore considerations for each deployment type.
Cloud Memorystore for Redis
Cloud Memorystore for Redis, part of GCP, is a way to use Redis and get all its benefits without the cost of managing Redis. If you need data sharding, you can deploy open source Redis proxies such as Twemproxy and Codis with multiple Cloud Memorystore for Redis instances for scale until Redis Cluster becomes ready in GCP.
Twemproxy, also known as the nutcracker, is an open source (under the Apache License) fast and lightweight Redis proxy developed by Twitter. The purpose of Twemproxy is to provide a proxy and data sharding solution for Redis and to reduce the number of client connections to the back-end Redis instances. You can set up multiple Redis instances behind Twemproxy. Clients only talk to the proxy and don’t need to know the details of back-end Redis instances, which simplifies management. You can also run multiple Twemproxy instances for the same group of back-end Redis servers to prevent having a single point of failure, as shown here:
Note that Twemproxy does not support all Redis commands, such as pub/sub and transaction commands. In addition, it’s not convenient to add or remove back-end Redis nodes for Twemproxy. It requires you to restart Twemproxy for configurations to be effective, and data isn’t rebalanced automatically after adding or removing Redis nodes.
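To make the sharding idea concrete, here is a toy consistent-hash ring in Python. It is only a sketch of the general technique a sharding proxy can use to pick a back end for each key, not Twemproxy's actual implementation; the `HashRing` class, the number of points per back end, and the back-end addresses are hypothetical.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring that maps keys to back-end names."""

    def __init__(self, backends, points_per_backend=100):
        ring = []
        for backend in backends:
            # Place several points per back end to spread keys more evenly.
            for i in range(points_per_backend):
                ring.append((self._hash(f"{backend}#{i}"), backend))
        ring.sort()
        self._hashes = [h for h, _ in ring]
        self._owners = [b for _, b in ring]

    @staticmethod
    def _hash(value):
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def backend_for(self, key):
        # The first ring point clockwise from the key's hash owns the key.
        idx = bisect.bisect(self._hashes, self._hash(key)) % len(self._hashes)
        return self._owners[idx]


backends = ["redis-1:6379", "redis-2:6379", "redis-3:6379"]
ring = HashRing(backends)
for key in ("user:1", "user:2", "session:abc"):
    print(key, "->", ring.backend_for(key))

# Adding a node changes the owner of some keys, but the data already
# stored under those keys stays on the old node unless you move it.
bigger = HashRing(backends + ["redis-4:6379"])
keys = [f"key:{i}" for i in range(1000)]
moved = sum(ring.backend_for(k) != bigger.backend_for(k) for k in keys)
print("keys whose owner changed:", moved)
```

The last few lines show why the lack of automatic rebalancing matters: after a node is added, some keys map to a back end that does not yet hold their data.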
Codis is an open source (under the MIT License) proxy-based high-performance Redis cluster tool developed by CodisLabs. Codis offers another Redis data sharding proxy option to solve the horizontal scalability limitation and lack of administration dashboard. It’s fully compatible with Twemproxy and has a handy tool called redis-port that handles the migration from Redis Twemproxy to Codis.
Pros of Cloud Memorystore for Redis
- It’s fully managed. Google fully manages administrative tasks for Redis instances such as hardware provisioning, setup and configuration management, software patching, failover, monitoring and other nuances that require considerable effort for service owners who just want to use Redis as a memory store or a cache.
- It’s highly available. We provide a standard Cloud Memorystore tier, in which we fully manage replication and failover to provide high availability. In addition, you can keep the replica in a different zone.
- It’s scalable and performs well. You can easily scale memory that’s provisioned for Redis instances. We also provide high network throughput per instance, which can be scaled on demand.
- It’s updated and secure. We provide network isolation so that access to Redis is restricted to within a network via a private IP. Also, OSS compatibility is Redis 3.2.11, as of late 2018.
Cons of Cloud Memorystore for Redis
- Some features are not yet available: Redis Cluster, backup and restore.
- It lacks replica options. Cloud Memorystore for Redis provides a master/replica configuration in the standard tier, and master and replica are spread across zones. There is only one replica per instance.
- There are some product constraints you should note.
You can deploy OSS proxies such as Twemproxy and Codis with multiple Cloud Memorystore for Redis instances for scalability until Redis Cluster is ready in GCP. And note the caveat that basic-tier Cloud Memorystore for Redis instances are subject to a cold restart and full data flush during routine maintenance, scaling, or an instance failure. Choose the standard tier to prevent data loss during those events.
How to get started
Check out our Cloud Memorystore for Redis guide for the basics. You can see here how to configure multiple Cloud Memorystore for Redis instances using Twemproxy and an internal load balancer in front of them.
1. Create nine new Cloud Memorystore for Redis instances in asia-northeast1 region
2. Prepare a Twemproxy container for deployment
3. Build a Twemproxy docker image
* Please replace
Note that a VM instance starts a container with the --network="host" flag of the docker run command by default.
4. Create an instance template based on the Docker image
* Please replace
5. Create a managed instance group using the template
6. Create a health check for the internal load balancer
7. Create a back-end service for the internal load balancer
8. Add instance groups to the back-end service
9. Create a forwarding rule for the internal load balancer
10. Configure firewall rules to allow the internal load balancer access to Twemproxy instances
Redis Labs Cloud and VPC
- Redis Enterprise Cloud is a fully managed and hosted Redis Cluster on GCP.
- Redis Enterprise VPC is a fully managed Redis Cluster in your virtual private cloud (VPC) on GCP.
Redis Labs Cloud and VPC protect your database by maintaining automated daily and on-demand backups to remote storage. You can back up your Redis Enterprise Cloud/VPC databases to Cloud Storage. Find instructions here.
You can also import a data set from an RDB file using Redis Labs Cloud with VPC. Check out the official public document on Redis Labs site for instructions.
Pros of Redis Labs Cloud and VPC
- It’s fully managed. Redis Labs manages all administrative tasks.
- It’s highly available. These Redis Labs products include an SLA with 99.99% availability.
- It scales and performs well. It will automatically add new instances to your cluster according to your actual data set size without any interruption to your applications.
- It’s fully supported. Redis Labs supports Redis itself.
Cons of Redis Labs Cloud and VPC
There’s a cost consideration. You’ll have to pay separately for Redis Labs’ service.
How to get started
Contact Redis Labs to discuss further steps.
Redis on GKE
If you want to use Redis Cluster, or want to read from replicas, Redis on GKE is an option. Here’s what you should know.
Pros of Redis on GKE
- You have full control of the Redis instances. You can configure, manage and operate as you like.
- You can use Redis Cluster.
- You can read from replicas.
Cons of Redis on GKE
- It’s not managed. You’ll need to manage administrative tasks such as hardware provisioning, setup and configuration management, software patching, failover, backup and restore, monitoring, etc.
- Availability, scalability and performance vary, depending on how you architect your deployment. Running a standalone instance of Redis on GKE is not ideal for production because it would be a single point of failure, so consider configuring master/slave replication to have redundant nodes with Sentinel, or set up a cluster.
- There’s a steeper learning curve. This option requires you to learn Redis itself in more detail. Kubernetes also requires some time to learn, and its deployment may introduce additional complexity to your design and operations.
- When using Redis on GKE, you’ll want to be aware of GKE cluster node maintenance; cluster nodes will need to be upgraded once every three months or so. To avoid unexpected disruption during the upgrade process, consider using PodDisruptionBudgets and configure parameters appropriately. And you’ll want to run containers in host networking mode to eliminate additional network overhead from Docker networking. Make sure that you run one Redis instance on each VM, otherwise it may cause port conflicts. This can be achieved with podAntiAffinity.
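The maintenance advice above can be sketched as Kubernetes manifests. The Python below builds a PodDisruptionBudget and a pod spec fragment with host networking and podAntiAffinity as plain dictionaries and prints them as JSON, which kubectl accepts in place of YAML. The `app` label value, the resource names, and the `maxUnavailable` setting are assumptions for illustration; `policy/v1` is the current PodDisruptionBudget API version, and older clusters use `policy/v1beta1`.

```python
import json


def redis_pdb(app_label, max_unavailable=1):
    """PodDisruptionBudget limiting how many Redis pods can be evicted at once."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": f"{app_label}-pdb"},
        "spec": {
            "maxUnavailable": max_unavailable,
            "selector": {"matchLabels": {"app": app_label}},
        },
    }


def redis_pod_spec_fragment(app_label):
    """Pod spec fields for host networking with one Redis pod per node."""
    return {
        "hostNetwork": True,
        "affinity": {
            "podAntiAffinity": {
                # Hard rule: never schedule two pods with this label
                # onto the same node (hostname).
                "requiredDuringSchedulingIgnoredDuringExecution": [
                    {
                        "labelSelector": {"matchLabels": {"app": app_label}},
                        "topologyKey": "kubernetes.io/hostname",
                    }
                ]
            }
        },
    }


# "redis-cluster" is a hypothetical label; use the one on your Redis pods.
pdb = redis_pdb("redis-cluster")
fragment = redis_pod_spec_fragment("redis-cluster")
print(json.dumps(pdb, indent=2))
print(json.dumps(fragment, indent=2))
```

The fragment is meant to be merged into the pod template of whatever workload runs Redis; it is not a complete manifest on its own.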
How to get started
Use Kubernetes to deploy a container to run Redis on GKE. The example below shows the steps to deploy Redis Cluster on GKE.
1. Provision a GKE cluster
* If prompted, specify your preferred GCP project ID or zone.
2. Clone an example git repository
3. Create config maps
4. Deploy Redis pods
* Wait until it is completed.
5. Prepare a list of Redis cache nodes
6. Submit a job to configure Redis Cluster
7. Confirm the job “redis-create-cluster-xxxxx” shows completed status
Limitations depend heavily on how you design the cluster.
Backing up and restoring manually built Redis
GKE and Compute Engine deployments use the same method to back up and restore your databases. Copying the RDB file is completely safe while the server is running, because the RDB file is never modified once produced.
To back up your data, copy the RDB file to somewhere safe, such as Cloud Storage.
- Create a cron job in your server to take hourly snapshots of the RDB files in one directory, and daily snapshots in a different directory.
- Every time the cron script runs, call the “find” command to delete old snapshots: for instance, you can keep hourly snapshots for the latest 48 hours, and daily snapshots for one or two months. Name the snapshots with date and time information.
- At least once a day, make sure to transfer an RDB snapshot outside your production environment. Cloud Storage is a good place to do so.
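The snapshot-and-prune routine described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `dump-<UTC timestamp>.rdb` naming scheme and the function names are invented for the example, pruning is done in Python instead of with “find”, and the transfer to Cloud Storage is left out.

```python
import shutil
from datetime import datetime, timedelta, timezone
from pathlib import Path
from typing import List, Optional

# Assumed naming scheme: dump-<UTC timestamp>.rdb
PREFIX = "dump-"
STAMP_FORMAT = "%Y%m%dT%H%M%SZ"


def take_snapshot(rdb_path: Path, snapshot_dir: Path,
                  now: Optional[datetime] = None) -> Path:
    """Copy the RDB file into snapshot_dir under a timestamped name."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    snapshot_dir.mkdir(parents=True, exist_ok=True)
    target = snapshot_dir / f"{PREFIX}{now.strftime(STAMP_FORMAT)}.rdb"
    shutil.copy2(rdb_path, target)
    return target


def prune_snapshots(snapshot_dir: Path, max_age: timedelta,
                    now: Optional[datetime] = None) -> List[Path]:
    """Delete snapshots whose file-name timestamp is older than max_age."""
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    removed = []
    for snap in sorted(snapshot_dir.glob(f"{PREFIX}*.rdb")):
        stamp = snap.stem[len(PREFIX):]
        taken = datetime.strptime(stamp, STAMP_FORMAT).replace(
            tzinfo=timezone.utc)
        if now - taken > max_age:
            snap.unlink()
            removed.append(snap)
    return removed
```

An hourly cron job could call `take_snapshot` and then `prune_snapshots` with `timedelta(hours=48)` on one directory, while a daily job does the same on a second directory with a retention of one or two months.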
To restore a data set from an RDB file, disable AOF and remove the existing AOF and RDB files before restoring data to Redis. Then copy the RDB file from your remote backup location and restart redis-server to restore your data.
- Redis will try to restore data from the AOF file if AOF is enabled. If the AOF file cannot be found, Redis will start with an empty data set.
- Once key changes trigger a new RDB snapshot, the original RDB file will be overwritten.
Redis on Compute Engine
You can also deploy your own open source Redis Cluster on Google Compute Engine if you want to use Redis Cluster, or want to read from replicas. The possible deployment options are:
- Run Redis on a Compute Engine instance—this is the simplest way to run the Redis service processes directly.
- Run Redis containers on Docker on a Compute Engine instance.
Pros of Redis on Compute Engine
- You’ll have full control of Redis. You can configure, manage, and operate it as you like.
Cons of Redis on Compute Engine
- It’s not managed. You have to handle administrative tasks such as hardware provisioning, setup and configuration management, software patching, failover, backup and restore, monitoring, etc.
- Availability, scalability and performance depend on how you architect. For example, a standalone setup is not ideal for production because it would be a single point of failure, so consider configuring master/slave replication to have redundant nodes with Sentinel, or set up a cluster.
- There’s a steeper learning curve: This option requires you to learn Redis itself in more detail.
For best results, run containers in host networking mode to eliminate the additional network overhead of Docker networking. Run only one Redis container on each VM; otherwise, you’ll run into port conflicts. Limitations depend heavily on how you design the cluster.
How to get started
Provision Compute Engine instances by deploying containers on VMs and managed instance groups. Alternatively, you can run your containers on Compute Engine instances using the container technologies and orchestration tools of your choice: create an instance from a public VM image and then install the container technologies that you want, such as Docker. Package service-specific components into separate containers and upload them to Cloud Repositories.
The steps to configure Redis on Compute Engine instances are pretty basic if you’re already using Compute Engine, so we don’t describe them here. Check out the Compute Engine docs and open source Redis docs for more details.
Redis performance testing
Always measure the performance of your system to identify any bottlenecks before you expose it in production. The key factors affecting the performance of Redis are CPU, network bandwidth and latency, the size of the data set, and the operations you perform. If the result of the benchmark test doesn’t meet your requirements, consider scaling your infrastructure up or out, or adjusting the way you use Redis. There are a few ways to do benchmark testing against multiple Cloud Memorystore for Redis instances deployed using Twemproxy with an internal load balancer in front.
Redis-benchmark is an open source command line benchmark tool for Redis, which is included with the open source Redis package.
Memtier_benchmark is an open source command line benchmark tool for NoSQL key-value stores, developed by Redis Labs. It supports both Redis and Memcache protocols, and can generate various traffic patterns against instances.
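As a minimal sketch of how a redis-benchmark run could be scripted, the helper below assembles a command line from a handful of parameters. The flags used (`-h`, `-p`, `-c`, `-n`, `-d`, `-t`, `-q`) are standard redis-benchmark options; the target address and the chosen values are placeholders for this example, not recommendations.

```python
import shlex
from typing import List, Sequence


def redis_benchmark_argv(host: str, port: int = 6379, clients: int = 50,
                         requests: int = 100_000, payload_bytes: int = 64,
                         tests: Sequence[str] = ("set", "get")) -> List[str]:
    """Build the argument list for one redis-benchmark run."""
    return [
        "redis-benchmark",
        "-h", host,                  # target host, e.g. the load balancer IP
        "-p", str(port),             # target port
        "-c", str(clients),          # number of parallel connections
        "-n", str(requests),         # total number of requests
        "-d", str(payload_bytes),    # size of SET/GET values in bytes
        "-t", ",".join(tests),       # only run the listed tests
        "-q",                        # quiet mode: one summary line per test
    ]


if __name__ == "__main__":
    # 10.0.0.3 is a placeholder for your internal load balancer address.
    print(shlex.join(redis_benchmark_argv("10.0.0.3", clients=100)))
```

The resulting list can be passed to `subprocess.run` on a VM that can reach the instances. Varying the number of clients and the payload size between runs helps show which of the factors above is the bottleneck.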
Migrating Redis to GCP
The most typical Redis customer journey to GCP we see is migration from other cloud providers. Here are a few options that can be used to perform data migration of Redis:
- Set up a master/slave relationship to replicate the data
- Load persistence data files, using an append-only file (AOF) or Redis database (RDB) file to restore the data
- Use the MIGRATE command
- Use the redis-port tool developed by CodisLabs
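To make the MIGRATE option more concrete, the sketch below assembles the command’s arguments in the order Redis expects: host, port, key (or an empty string when the KEYS clause is used), destination database, timeout in milliseconds, then the optional COPY, REPLACE and KEYS parts. The helper name, its defaults and the addresses are assumptions for illustration; authentication options are omitted.

```python
from typing import List, Sequence


def migrate_command(host: str, port: int, keys: Sequence[str],
                    destination_db: int = 0, timeout_ms: int = 5000,
                    copy: bool = True, replace: bool = False) -> List[str]:
    """Build the argument list for a Redis MIGRATE command."""
    if not keys:
        raise ValueError("at least one key is required")
    args = ["MIGRATE", host, str(port)]
    # A single key goes in the key position; for several keys that
    # position holds an empty string and the keys follow KEYS.
    args.append(keys[0] if len(keys) == 1 else "")
    args += [str(destination_db), str(timeout_ms)]
    if copy:
        args.append("COPY")      # keep the key on the source instance
    if replace:
        args.append("REPLACE")   # overwrite an existing key on the target
    if len(keys) > 1:
        args += ["KEYS", *keys]
    return args


if __name__ == "__main__":
    # 10.0.0.5 is a placeholder for the destination Redis instance.
    print(migrate_command("10.0.0.5", 6379, ["user:1", "user:2"]))
```

The list can be sent from the source instance through any client that accepts raw commands, for example redis-py’s `execute_command(*args)`. Without COPY, MIGRATE deletes each key from the source once the transfer succeeds, so the default here errs on the side of keeping the data in both places.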
If you would like to work with Google experts to migrate your Redis deployment onto GCP, get in touch and learn more here.
Editor’s note: Cloud Identity, Google Cloud’s identity as a service (IDaaS) platform, now offers secure LDAP functionality that enables authentication, authorization, and user/group lookups for LDAP-based apps and IT infrastructure. Today, we hear from OpenVPN, which has tested and integrated its OpenVPN Access Server with secure LDAP, enabling your employees and partners to use their Cloud Identity credentials to access applications through VPN. Read on to learn more.
As IT organizations adopt more cloud-based IaaS and SaaS apps, they need a way to let users access them securely, while still being able to use legacy LDAP-based apps and infrastructure. The new secure LDAP capabilities in Cloud Identity provide both legacy LDAP platforms and cloud-native applications with a single authentication source, for a simple, effective solution to this problem.
In fact, we here at OpenVPN have integrated our OpenVPN Access Server with Cloud Identity, allowing your remote users to connect to your corporate network and apps over VPN with their Cloud Identity (or G Suite) credentials. This helps keep your company secure, and ensures your entire team is following the protocol.
This illustration demonstrates how Cloud Identity makes security accessible and efficient for any level of enterprise. The top half of the illustration shows the deployment of OpenVPN Access Server in various cloud IaaS providers. As you can see, all instances of Access Server use Cloud Identity for authentication and authorization. The Access Servers are configured with a group called ‘IT Admin,’ which allows SSH access to all application servers on all the private networks. This allows any employee identity present in Cloud Identity that is a member of the ‘IT Admin’ group to access any of the private networks via VPN and use SSH.
Then, as you can see in the lower half of the illustration, remote employees use VPN to connect to your corporate network and apps with their Cloud Identity credentials.
Using Cloud Identity for authentication
OpenVPN Access Server v2.6.1 and later supports secure LDAP and has been tested to work with Cloud Identity. You can find specific configuration instructions on our website.
Using Cloud Identity groups for network access control
As shown in the illustration below, Access Server’s administrative controls make it easy to configure groups. Administrators can configure access controls for these groups with fine granularity down to an individual IP address and port number.
You can configure groups in Access Server that correspond to those stored in Cloud Identity and enforce access controls for each user based on that user’s group membership. You can do this kind of mapping by using a script on Access Server. Instructions to set up the script are available on our website, and our support staff is ready to help you.
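To illustrate the idea of group-based access control down to an individual IP address and port, here is a generic Python sketch. It is not the Access Server script and does not use any Access Server API; the rule table, the networks and the function name are hypothetical, with the ‘IT Admin’ group borrowed from the example above.

```python
import ipaddress
from typing import Dict, Iterable, List, Optional, Tuple

# (network in CIDR notation, port or None for any port)
Rule = Tuple[str, Optional[int]]

# Hypothetical rule table keyed by directory group name.
GROUP_RULES: Dict[str, List[Rule]] = {
    "IT Admin": [("10.0.0.0/8", 22)],          # SSH to all private networks
    "Engineering": [("10.20.30.40/32", 443)],  # one host, one port
}


def is_allowed(user_groups: Iterable[str], dest_ip: str, dest_port: int,
               rules: Dict[str, List[Rule]] = GROUP_RULES) -> bool:
    """Return True if any of the user's groups permits the destination."""
    addr = ipaddress.ip_address(dest_ip)
    for group in user_groups:
        for cidr, port in rules.get(group, []):
            if addr in ipaddress.ip_network(cidr) and port in (None, dest_port):
                return True
    return False


if __name__ == "__main__":
    print(is_allowed(["IT Admin"], "10.1.2.3", 22))
    print(is_allowed(["Engineering"], "10.1.2.3", 22))
```

In a real deployment the group list would come from the user’s Cloud Identity group membership, and enforcement would happen in Access Server itself.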
With OpenVPN Access Server, you can protect your cloud applications, connect your premises to the cloud, and provide simple and secure access for your remote employees in a way that scales with the tools you’re already using. Best of all, OpenVPN Access Server is available on GCP Marketplace. Try it out today!