Google named a Leader in the Gartner 2020 Magic Quadrant for Cloud AI Developer Services

The enterprise applications for artificial intelligence and machine learning seem to grow by the day. To take advantage of everything AI/ML technologies have to offer, it’s important to have a platform that supports your needs fully—whether you’re a developer, a data scientist, an analyst, or just interested in AI. But with so many features and services to consider, it can be difficult to sort through it all. This is where analyst reports can provide valuable research to help you get the answers you need.

Today, Gartner named Google a Leader in the Gartner 2020 Magic Quadrant for Cloud AI Developer Services report. This designation is based on Gartner’s evaluation of Google’s language, vision, conversation, and structured data products, including AutoML, all of which we deliver through Google Cloud. Let’s take a closer look at some of Gartner’s findings.

Vision AI for every enterprise use case

You don’t need to be an ML expert to reap the benefits that our AI portfolio offers. Our vision and video APIs, along with AutoML Vision and Video products, let developers of any experience level build perception AI into their applications. These products help you understand and derive insights from your images and videos with industry-leading prediction accuracy in the cloud or at the edge.

Our Computer Vision products provide many features to help you understand your visual content and create powerful custom machine learning models: 

  • Through REST and RPC APIs, the Vision API provides access to pretrained models that are ready to use to quickly classify images. 

  • AutoML Vision automates the training of your own custom machine learning models with an easy-to-use graphical interface. It lets you optimize your models for accuracy, latency, and size, and export them to your application in the cloud, or to an array of devices at the edge.

  • The Video Intelligence API has pre-trained machine learning models that automatically recognize a vast number of objects, places, and actions in stored and streaming video. 

  • AutoML Video Intelligence lets developers quickly and easily train custom models to classify and track objects within videos, regardless of their level of ML experience. 

  • The What-If Tool, an open-source visualization tool for inspecting any machine learning model, enhances your model’s interpretability, offering insights into how it’s making decisions for AutoML Vision and our data-labeling services.
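As a sketch of what calling the Vision API can look like from Python, the snippet below builds a REST-style `images:annotate` request body and, separately, shows a client-library call; the bucket path, label count, and helper names are illustrative, not from this post, and the client call assumes the `google-cloud-vision` package with application-default credentials.

```python
def build_annotate_request(image_uri, max_results=5):
    """Build a JSON-serializable body for the Vision API images:annotate REST endpoint."""
    return {
        "requests": [{
            "image": {"source": {"imageUri": image_uri}},
            "features": [{"type": "LABEL_DETECTION", "maxResults": max_results}],
        }]
    }

def classify_image(image_uri):
    # Requires google-cloud-vision and credentials; not invoked here.
    from google.cloud import vision
    client = vision.ImageAnnotatorClient()
    image = vision.Image(source=vision.ImageSource(image_uri=image_uri))
    return client.label_detection(image=image).label_annotations
```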

While powerful pre-trained APIs and custom model creation capabilities are part of meeting all of an enterprise’s ML needs, it’s equally important to be able to deploy these models wherever the business needs them. To that end, our AutoML Vision models can be deployed via container wherever it works best for you: in a virtual private cloud, on-premises, and in our public cloud. 

Easier and better custom ML models for your structured data 

AutoML Tables enables your entire team of data scientists, analysts, and developers to automatically build and deploy state-of-the-art machine learning models on structured data at a massively increased speed and scale. To create ML models, developers usually need training data that’s as complete and clean as possible. AutoML Tables provides information about and automatically handles missing data, high cardinality, and distribution for each feature in a dataset. Then, in training, it automates a range of feature engineering tasks, from normalization of numeric features and creation of one-hot encoding, to embeddings for categorical features.

In addition, AutoML Tables provides both a codeless GUI and a Python SDK, along with automated data preprocessing, feature engineering, hyperparameter and neural/tree architecture search, evaluation, model explainability, and deployment functionality. Together, these features can reduce the time it takes to bring a custom ML model to production from months to days.
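To make two of the transformations mentioned above concrete, here is a minimal standard-library sketch of z-score normalization for a numeric feature and one-hot encoding for a categorical one (AutoML Tables performs these automatically; the functions here are purely illustrative):

```python
import statistics

def normalize(values):
    """Z-score normalization: center on the mean, scale by the standard deviation."""
    mean, stdev = statistics.mean(values), statistics.pstdev(values)
    return [(v - mean) / stdev for v in values]

def one_hot(values):
    """One-hot encode a categorical feature over its sorted vocabulary."""
    vocab = sorted(set(values))
    return [[1 if v == category else 0 for category in vocab] for v in values]
```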

Ready for global scale 

As business becomes more and more global, being able to serve customers wherever they are or whatever language they speak is a key differentiator. To that end, many of our products support more languages than other providers. For example:

With such strong language support, Google Cloud makes it easier to grow your business globally.

As the uses for AI continue to expand, more organizations are turning to Google to help build out their AI capabilities. At Google Cloud, we’re passionate about helping developers in organizations of all sizes to build AI/ML into their workflows quickly and easily, wherever they may be on their AI journey. To learn more about how to make AI work for you, download a complimentary copy of the Gartner 2020 Magic Quadrant for Cloud AI Developer Services report.

Disclaimer: Gartner, Magic Quadrant for Cloud AI Developer Services, Van Baker, Bern Elliot, Svetlana Sicular, Anthony Mullen, Erick Brethenoux, 24 February 2020. Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings or other designation. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

Your ML workloads cheaper and faster with the latest GPUs

Running ML workloads more cost effectively

Google Cloud wants to help you run your ML workloads as efficiently as possible. To do this, we offer many options for accelerating ML training and prediction, including many types of NVIDIA GPUs. This flexibility is designed to let you get the right tradeoff between cost and throughput during training or cost and latency for prediction.

We recently reduced the price of NVIDIA T4 GPUs, making AI acceleration even more affordable. In this post, we’ll revisit some of the features of recent-generation GPUs, like the NVIDIA T4, V100, and P100. We’ll also touch on native 16-bit (half-precision) arithmetic and Tensor Cores, both of which provide significant performance boosts and cost savings. We’ll show you how to use these features, and how the performance benefit of using 16-bit and automatic mixed precision for training often outweighs the higher list price of NVIDIA’s newer GPUs.

Half-precision (16-bit float)

Half-precision floating point format (FP16) uses 16 bits, compared to 32 bits for single precision (FP32). Storing FP16 data reduces the neural network’s memory usage, which allows for training and deployment of larger networks, and faster data transfers than FP32 and FP64.

32-bit Float structure (Source: Wikipedia)
16-bit Float structure (Source: Wikipedia)

Execution time of ML workloads can be sensitive to memory and/or arithmetic bandwidth. Half-precision halves the number of bytes accessed, reducing the time spent in memory-limited layers. Lowering the required memory lets you train larger models or train with larger mini-batches.
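The storage savings are easy to see with Python’s standard library, which supports the IEEE half-precision (`'e'`) and single-precision (`'f'`) formats in the `struct` module; the snippet below is a toy illustration, not part of any training pipeline:

```python
import struct

fp32_bytes = struct.calcsize("f")   # 4 bytes per single-precision value
fp16_bytes = struct.calcsize("e")   # 2 bytes per half-precision value

# Round-tripping a value through FP16 also shows the reduced precision:
fp16_value = struct.unpack("e", struct.pack("e", 0.1))[0]  # ~0.0999756, not 0.1
```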

The FP16 format is not new to GPUs. In fact, it has been supported as a storage format for many years on NVIDIA GPUs, and high-performance FP16 runs at full speed on NVIDIA T4, V100, and P100 GPUs. 16-bit precision is a great option for running inference applications; however, if you’re training a neural network entirely at this precision, the network may not converge to the required accuracy levels without higher-precision accumulation of results.

Automatic mixed precision mode in TensorFlow

Mixed precision uses both FP16 and FP32 data types when training a model. Mixed-precision training offers significant computational speedup by performing operations in half-precision format whenever it’s safe to do so, while storing minimal information in single precision to retain as much information as possible in critical parts of the network. Mixed-precision training usually achieves the same accuracy as single-precision training using the same hyper-parameters.

NVIDIA T4 and NVIDIA V100 GPUs incorporate Tensor Cores, which accelerate certain types of FP16 matrix math, enabling faster and easier mixed-precision computation. NVIDIA has also added automatic mixed-precision capabilities to TensorFlow.

To use Tensor Cores, FP32 models need to be converted to use a mix of FP32 and FP16. Performing arithmetic operations in FP16 takes advantage of the performance gains of lower-precision hardware such as Tensor Cores. Due to the smaller representable range of FP16, though, performing the entire training with FP16 tensors can result in gradient underflow and overflow errors. Performing only certain arithmetic operations in FP16, however, yields performance gains on compatible hardware accelerators, decreasing training time and reducing memory usage, typically without sacrificing model performance.
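The underflow problem, and the loss-scaling trick that automatic mixed precision uses to work around it, can be illustrated with a toy standard-library example (the gradient value and scale factor below are invented for illustration):

```python
import struct

def to_fp16(x):
    """Round-trip a float through IEEE half precision."""
    return struct.unpack("e", struct.pack("e", x))[0]

grad = 1e-9                    # below FP16's smallest subnormal (~6e-8)
naive = to_fp16(grad)          # underflows to 0.0: the gradient is lost

scale = 65536.0                # loss scaling: multiply before the FP16 cast...
scaled = to_fp16(grad * scale) / scale  # ...divide back out in full precision
```

With scaling, the small gradient survives the FP16 round trip to within a fraction of a percent instead of vanishing.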

TensorFlow supports FP16 storage and Tensor Core math. Models that contain convolutions or matrix multiplication using the tf.float16 data type will automatically take advantage of Tensor Core hardware whenever possible.

This process can be configured automatically using automatic mixed precision (AMP). This feature is available on V100 and T4 GPUs, and TensorFlow 1.14 and newer supports AMP natively. Let’s see how to enable it.

Manually: Enable automatic mixed precision via TensorFlow API

Wrap your tf.train or tf.keras.optimizers Optimizer as follows:
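A minimal sketch of the wrapper call, assuming TensorFlow 1.14+ (the choice of optimizer and learning rate is a placeholder for your own training setup):

```python
def build_mixed_precision_optimizer():
    import tensorflow as tf
    opt = tf.train.AdamOptimizer(learning_rate=1e-3)
    # Rewrites the graph to run safe ops in FP16 and adds automatic loss scaling.
    return tf.train.experimental.enable_mixed_precision_graph_rewrite(opt)
```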

This change applies automatic loss scaling to your model and enables automatic casting to half precision.

(Note: To enable mixed precision in Keras with TensorFlow 2, you can use tf.keras.mixed_precision.Policy.)

Automatically: Enable automatic mixed precision via an environment variable

When using the NVIDIA NGC TensorFlow Docker image, simply set one environment variable:
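For example (a one-line opt-in, assuming a TF 1.14+ NGC image):

```shell
export TF_ENABLE_AUTO_MIXED_PRECISION=1
```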

As an alternative, the environment variable can be set inside the TensorFlow Python script:
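The equivalent opt-in from inside the script looks like the following; it must run before TensorFlow builds the training graph:

```python
import os

# Enables the automatic mixed-precision graph rewrite (TF 1.14+ on NGC images).
os.environ["TF_ENABLE_AUTO_MIXED_PRECISION"] = "1"
```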

(Note: For a complete AMP example showing the speed-up on training an image classification task on CIFAR10, check out this notebook.)

Take a look at the list of models that have been tested successfully with mixed precision.

Configure AI Platform to use accelerators

If you want to start taking advantage of newer NVIDIA GPUs like the T4, V100, or P100, you need to use AI Platform’s customization options: define a config.yaml file that describes the GPU options you want. The structure of the YAML file represents the Job resource.

The first example shows a configuration file for a training job that uses Compute Engine machine types with a T4 GPU.
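The file might look like the following sketch (the machine type and GPU count are illustrative choices, not requirements):

```yaml
trainingInput:
  scaleTier: CUSTOM
  masterType: n1-standard-8
  masterConfig:
    acceleratorConfig:
      count: 1
      type: NVIDIA_TESLA_T4
```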

(Note: For a P100 or V100 GPU, configuration is similar, just replace type with the correct GPU type—NVIDIA_TESLA_P100 or NVIDIA_TESLA_V100.)

Use the gcloud command to submit the job, including a --config argument pointing to your config.yaml file. This example assumes you’ve set up environment variables—indicated by a $ sign followed by capital letters—for the values of some arguments:
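A sketch of the submission command (the trainer module and package path are placeholders for your own training package):

```shell
gcloud ai-platform jobs submit training $JOB_NAME \
  --region $REGION \
  --module-name trainer.task \
  --package-path trainer/ \
  --job-dir $JOB_DIR \
  --config config.yaml
```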

The following example shows how to submit a job with a similar configuration (using Compute Engine machine types with GPUs attached), but without using a config.yaml file:
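Here the accelerator settings move from the YAML file into command-line flags (again, the trainer module and machine choices are illustrative):

```shell
gcloud ai-platform jobs submit training $JOB_NAME \
  --region $REGION \
  --module-name trainer.task \
  --package-path trainer/ \
  --job-dir $JOB_DIR \
  --master-machine-type n1-standard-8 \
  --master-accelerator count=1,type=nvidia-tesla-t4
```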

(Note: Please verify you are running the latest Google Cloud SDK to get access to the different machine types.)

Hidden cost of low-priced instances

The conventional practice most organizations follow is to select lower-priced cloud instances to save on per-hour compute cost. However, the performance improvements of newer GPUs can significantly reduce costs for running compute-intensive workloads like AI.

To validate the concept that modern GPUs reduce the total cost of some common training workloads, we trained Google’s Neural Machine Translation (GNMT) model—which is used for applications like real-time language translation—on several GPUs. In this particular example, we tested the GNMTv2 model on AI Platform Training using custom containers. Simply by using modern hardware like a T4, we were able to train the model at 7% of the cost while obtaining the result eight times faster, as shown in the table below. (For details about the setup, please take a look at the NVIDIA site.)

  • Each GPU model was tested over three runs, and the reported numbers are the averages per section.

  • Additional costs for storing data (GNMT input data was stored on GCS) are not included, since they are the same for all tests.

A quick note: When calculating the cost of a training job using Consumed ML units, use the following formula:
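In Python terms, the formula is simply consumed units times the per-unit price (the $0.49 rate below is the one used for the K80 run in the table; per-unit prices vary by region):

```python
def training_cost(consumed_ml_units, price_per_ml_unit):
    """Cost of a training job = consumed ML units * per-unit price for the region."""
    return consumed_ml_units * price_per_ml_unit

k80_cost = training_cost(465, 0.49)  # the K80 run from the table: $227.85
```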

In this case, the cost of running the job on the K80 is Consumed ML units * $0.49: 465 * $0.49 = $227.85.

The Consumed ML units can be found on your Job details page (see below), and are equivalent to training units with the duration of the job factored in.


Looking at the specific NVIDIA GPUs, we can get more granular on the performance-price proposition.

  • NVIDIA T4 is well known for its low power consumption and great inference performance for image/video recognition, natural language processing, and recommendation engines, to name a few use cases. It supports half-precision (16-bit float) and automatic mixed precision for model training, and gives an 8.1x speed boost over the K80 at only 7% of the original cost.

  • NVIDIA P100 introduced half-precision (16-bit float) arithmetic. Using it gives a 7.6x performance boost over the K80, at 27% of the original cost.

  • NVIDIA V100 introduced Tensor Cores, which accelerate half-precision and automatic mixed-precision computation. It provides an 18.7x speed boost over the K80 at only 15% of the original cost. In terms of time savings, the time to solution (TTS) was reduced from 244 hours (about 10 days) to just 13 hours (an overnight run).

What about model prediction?

GPUs can also drastically lower latency for online prediction (inference). However, the high-availability demands of online prediction often require keeping machines alive 24/7 and provisioning sufficient capacity in case of failures or traffic spikes. This can make low-latency online prediction expensive.

The latest price cuts to T4s, however, make low-latency, high-availability serving more affordable on the Google Cloud AI Platform. You can deploy your model on a T4 for about the same price as eight vCPUs, but with the low latency and high throughput of a GPU.

The following example shows how to deploy a TensorFlow model for online prediction using a single NVIDIA T4 GPU:
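At the time of writing, serving on Compute Engine machine types with GPUs is available through the beta gcloud surface; a sketch (the bucket path, model name, and machine type are illustrative):

```shell
gcloud beta ai-platform versions create $VERSION_NAME \
  --model $MODEL_NAME \
  --origin gs://your-bucket/path-to-savedmodel/ \
  --runtime-version 1.14 \
  --framework tensorflow \
  --machine-type n1-standard-4 \
  --accelerator count=1,type=nvidia-tesla-t4
```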


Model training and serving on GPUs has never been more affordable. Price reductions, mixed precision, and Tensor Cores accelerate AI performance for training and prediction when compared to older GPUs such as K80s. As a result, you can complete your workloads much faster, saving both time and money. To leverage these capabilities and reduce your costs, we recommend the following rules of thumb:

  • If your training job is short-lived (under 20 minutes), use a T4, since it is the cheapest per hour.

  • If your model is relatively simple (fewer layers, a smaller number of parameters, and so on), use a T4 for the same reason.

  • If you want the fastest possible runtime and have enough work to keep the GPU busy, use V100.

  • To take full advantage of the newer NVIDIA GPUs, use 16-bit precision on the P100, and enable mixed precision mode when using the T4 and V100.

If you haven’t explored GPUs for model prediction or inference, take a look at our GPUs on Compute Engine page for more details. For more information on getting started, check out our blog post on the topic.


Acknowledgements: Special thanks to the following people who contributed to this post: 
NVIDIA: Alexander Tsado, Cloud Product Marketing Manager
Google: Henry Tappen, Product Manager; Robbie Haertel, Software Engineer; Viesturs Zarins, Software Engineer

1. Price is calculated as described here: Consumed ML units * unit cost (which differs per region).

Explaining model predictions on structured data

Machine learning technology continues to improve at a rapid pace, with increasingly accurate models being used to solve more complex problems. However, with this increased accuracy comes greater complexity. This complexity makes debugging models more challenging. To help with this, last November Google Cloud introduced Explainable AI, a tool designed to help data scientists improve their models and provide insights to make them more accessible to end users.

We think that understanding how models work is crucial to both effective and responsible use of AI. With that in mind, over the next few months, we’ll share a series of blog posts that covers how to use AI Explanations with different data modalities, like tabular, image, and text data.

In today’s post, we’ll take a detailed look at how you can use Explainable AI with tabular data, both with AutoML Tables and on Cloud AI Platform.

What is Explainable AI?

Explainable AI is a set of techniques that provides insights into your model’s predictions. For model builders, this means Explainable AI can help you debug your model while also letting you provide more transparency to model stakeholders so they can better understand why they received a particular prediction from your model. 

AI Explanations works by returning feature attribution values for each test example you send to your model. These attribution values tell you how much a particular feature affected the prediction relative to the prediction for the model’s baseline example. A typical baseline is the average value of all the features in the training dataset, and the attributions then tell you how much each feature shifted a given prediction relative to that baseline prediction.
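As a toy illustration of this relationship (all numbers invented): the attributions account for how far a prediction moved away from the baseline prediction.

```python
baseline_prediction = 15.0   # e.g., predicted trip minutes for the baseline example
attributions = {"distance": -2.4, "day_of_week": 1.5, "max_temp": 0.5}

# The prediction for this example equals the baseline prediction plus the
# per-feature attributions: 15.0 - 2.4 + 1.5 + 0.5 = 14.6 minutes.
prediction = baseline_prediction + sum(attributions.values())
```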

AI Explanations offers two approximation methods: Integrated Gradients and Sampled Shapley. Both options are available in AI Platform, while AutoML Tables uses Sampled Shapley. Integrated Gradients, as the name suggests, uses gradients—which show how a prediction changes at each point—in its approximation. It requires a differentiable model implemented in TensorFlow, and is the natural choice for such models, for example neural networks. Sampled Shapley approximates the discrete Shapley value through sampling. While it doesn’t scale as well in the number of features, Sampled Shapley does work on non-differentiable models, like tree ensembles. Both methods assess how much each feature contributed to a model prediction by comparing predictions against a baseline. You can learn more about them in our whitepaper.

About our dataset and scenario

The Cloud Public Datasets Program makes available public datasets that are useful for experimenting with machine learning. For our examples, we’ll use data that is essentially a join of two public datasets stored in BigQuery: London bike rentals and NOAA weather data, with some additional processing to clean up outliers and derive additional GIS and day-of-week fields.

Using this dataset, we’ll build a regression model to predict the duration of a bike rental based on information about the start and end stations, the day of the week, the weather on that day, and other data. If we were running a bike rental company, we could use these predictions—and their explanations—to help us anticipate demand and even plan how to stock each location.

While we’re using bike and weather data here, you can use AI Explanations for a wide variety of tabular models, taking on tasks as varied as asset valuations, fraud detection, credit risk analysis, customer retention prediction, analyzing item layouts in stores, and many more.

AI Explanations for AutoML Tables

AutoML Tables lets you automatically build, analyze, and deploy state-of-the-art machine learning models using your own structured data. Once your custom model is trained, you can view its evaluation metrics, inspect its structure, deploy the model in the cloud, or export it so that you can serve it anywhere a container runs. 

Of course, AutoML Tables can also explain your custom model’s prediction results. This is what we’ll look at in our example below. To do this, we’ll use the “bikes and weather” dataset that we described above, which we’ll ingest directly from a BigQuery table. This post walks through the data ingestion—which is made easy by AutoML—and training process using that dataset in the Cloud Console UI.

Global feature importance

AutoML Tables automatically computes global feature importance for the trained model. This shows, across the evaluation set, the average absolute attribution each feature receives. Higher values mean the feature generally has greater influence on the model’s predictions.

This information is extremely useful for debugging and improving your model. If a feature’s contribution is negligible—if it has a low value—you can simplify the model by excluding it from future training. Based on the diagram below, for our example, we might try training a model without including bike_id.

Global feature importance results for a trained model.

Explanations for local feature importance

You can now also measure local feature importance: a score showing how much (and in which direction) each feature influenced the prediction for a single example.

It’s easy to explore local feature importance through Cloud Console’s Tables UI. After you deploy your model, go to the TEST & USE tab of the Tables panel, select ONLINE PREDICTION, enter the field values for the prediction, and then check the Generate feature importance box at the bottom of the page. The result will now include the prediction, the baseline prediction value, and the feature importance values.

Let’s look at a few examples. For these examples, in lieu of real-time data, we’re using instances from the test dataset that the model did not see during training. AutoML Tables lets you export the test dataset to BigQuery after training, including the target column, which makes it easy to explore.

One thing our bike rental business might want to investigate is why different trips between the same two stations are sometimes accurately predicted to have quite different durations. Let’s see if the prediction explanations give us any hints. The actual duration value (that we want our model to predict) is annotated in red in the screenshots below.


Both of these trips are to and from the same locations, but the one on the left was correctly predicted to take longer. It looks like the day of week (7 is a weekend; 4 is a weekday) was an important contributor. When we explore the test dataset in BigQuery, we can confirm that the average duration of weekend rides is indeed higher than for weekdays.

Let’s look at two more trips with the same qualities: to and from the same locations, yet the duration of one is accurately predicted to be longer.


In this case, it looks like the weather, specifically max temperature, might have been an important factor. When we look at the average ride durations in the BigQuery test dataset for temps at the high and low end of the scale, our theory is supported.

So these prediction explanations suggest that on the weekends, and in hot weather, bike trips will tend to take longer than they do otherwise. This is data our bike rental company can use to tweak bike stocking, or other processes, to improve business.  

What about inaccurate predictions? Knowing why a prediction was wrong can also be extremely valuable, so let’s look at one more example: where the predicted trip duration is much longer than the actual trip duration, as shown below.


Again, we can load an example with an incorrect prediction into Cloud Console. This time, the local feature importance values suggest that the starting station might have played a larger-than-usual role in the overly high prediction. Perhaps the trips from this station have more variability than the norm.

After querying the test dataset in BigQuery, we can see that this station is in the top three for standard deviation in prediction accuracy. This high variability in prediction results suggests that there might be some issues with the station or its rental setup that the rental company might want to look into.

Using the AutoML Tables client libraries to get local explanations

You can also use the AutoML Tables client libraries to programmatically interact with the Tables API. That is, from a script or notebook, you can create a dataset, train your model, get evaluation results, deploy the model for serving, and then request local explanations for prediction results given the input data. 

For example, with the following “bikes and weather” model input instance:

… you can request a prediction with local feature importance annotations like this:
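A sketch of what the instance and the request can look like with the Tables client library (this assumes the google-cloud-automl package; the project, region, model name, and field names below are illustrative placeholders, not the exact schema from this post):

```python
# A hypothetical "bikes and weather" input instance.
instance = {
    "start_station_name": "Park Lane, Hyde Park",
    "end_station_name": "Speakers' Corner 1, Hyde Park",
    "day_of_week": "4",
    "max_temp": 18.5,
}

def predict_with_explanations(inputs):
    # Requires google-cloud-automl and credentials; not invoked here.
    from google.cloud import automl_v1beta1
    client = automl_v1beta1.TablesClient(
        project="your-project", region="us-central1")
    # feature_importance=True requests local feature importance annotations.
    return client.predict(
        model_display_name="bikes_weather",
        inputs=inputs,
        feature_importance=True)
```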

The response will return not only the prediction itself and the 95% prediction interval—the bounds that the true value of the prediction is likely to fall between with 95% probability—but also the local feature importance values for each input field. The prediction response should look something like this.

This notebook walks through the steps in more detail, and shows how to parse and plot the prediction results.

Explanations for AI Platform

You can also get explanations for custom TensorFlow models deployed to AI Platform. Let’s show how using a model trained on a similar dataset to the one above. All of the code for deploying an AI Explanations model to AI Platform can be found in this notebook.

Preparing a model for deployment

When we deploy AI Explanations models to AI Platform, we need to choose a baseline input for our model. When you choose a baseline for tabular models, think of it as helping you identify outliers in your dataset. For this example we’ve set the baseline to the median across all of our input values, computed using Pandas.
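The median baseline computation can be sketched with the standard library alone (the post used Pandas; the feature values here are invented for illustration):

```python
import statistics

# Hypothetical training feature columns.
training_features = {
    "distance": [560.0, 1200.0, 2100.0, 3400.0],
    "max_temp": [9.0, 12.0, 18.5, 21.0],
}

# The baseline input: the median of each feature across the training data.
baseline = {name: statistics.median(values)
            for name, values in training_features.items()}
```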

Since we’re using a custom TensorFlow model with AI Platform, we also need to tell the explanations service which tensors we want to explain from our TensorFlow model’s graph. We provide both the baseline and this list of tensors to AI Explanations in an explanation_metadata.json file, uploaded to the same GCS bucket as our SavedModel.
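The metadata file might look something like the following sketch; the tensor names depend entirely on your SavedModel’s graph, and the baseline values here are placeholders:

```json
{
  "inputs": {
    "features": {
      "input_tensor_name": "dense_input:0",
      "input_baselines": [[1650.0, 15.25, 4.0]]
    }
  },
  "outputs": {
    "duration": {"output_tensor_name": "dense_3/BiasAdd:0"}
  },
  "framework": "tensorflow"
}
```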

Getting attribution values from AI Platform

Once our model is deployed with explanations, we can get predictions and attribution values with the AI Platform Prediction API or gcloud. Here’s what an API request to our model would look like:
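The same request can also be made from the command line; a sketch, assuming the model and version environment variables are set and predictions.json holds the input instances:

```shell
gcloud beta ai-platform explain \
  --model $MODEL_NAME \
  --version $VERSION_NAME \
  --json-instances predictions.json
```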

For the example below, our model returns the following attribution values, which are all relative to our model’s baseline value. Here we can see that distance was the most important feature, since it pushed our model’s prediction down from the baseline by 2.4 minutes. It also shows that the start time of the trip (18:00, or 6:00 pm) caused the model to shorten its predicted trip duration by 1.2 minutes:


Next, we’ll use the What-If Tool to see how our model is performing across a larger dataset of test examples and to visualize the attribution values.

Visualizing tabular attributions with the What-If Tool

The What-If Tool is an open-source visualization tool for inspecting any machine learning model, and the latest release includes features intended specifically for AI Explanations models deployed on AI Platform. You can find the code for connecting the What-If Tool to your AI Platform model in this demo notebook.

Here’s what you’ll see when you initialize the What-If Tool with a subset of our test dataset and model and click on a data point:


On the right, we see the distribution of all 500 test data points we’ve passed to the What-If Tool. The Y-axis indicates the model’s predicted trip duration for these values. When we click on an individual data point, we can see all of the feature values for that data point, along with each feature’s attribution value. This part of the tool also lets you change feature values and re-run the prediction to see how the updated value affects the model’s prediction.


One of our favorite What-If Tool features is the ability to create custom charts and scatter plots, and the attributions data returned from AI Platform makes this especially useful. For example, we created a custom plot where the X-axis measures the attribution value for trip distance and the Y-axis measures the attribution value for max temperature.


This can help us identify outliers. In this case, we show an example where the predicted trip duration was way off since the distance traveled was 0 but the bike was in use for 34 minutes.

There are many possible exploration ideas with the What-If Tool and AI Platform attribution values, like analyzing our model from a fairness perspective, ensuring our dataset is balanced, and more. 

Next steps

Ready to dive into the code? These resources will help you get started with AI Explanations on AutoML Tables and AI Platform:

If you’d like to use the same datasets we did, here is the London bikeshare data in BigQuery. We joined this with part of the NOAA weather dataset, which was recently updated to include even more data. 

We’d love to hear what you thought of this post. You can find us on Twitter at @amygdala and @SRobTweets. If you have specific questions about using Explainable AI in your models, you can reach us here. And stay tuned for the next post in this series, which will cover explainability on image models.

New Dialogflow Mega Agent for Contact Center AI increases intents by 10 times to 20,000

Contact centers are one of the most important ways that businesses interact with customers. But consistently providing great customer interactions across the range of potential conversations presents a number of complex challenges, and businesses are increasingly turning to artificial intelligence (AI) and the cloud to help solve them.

We recently announced the general availability of Contact Center AI, with Virtual Agent and Agent Assist, to help businesses consistently deliver great customer experiences. To help improve your customer interactions even further, today we’re announcing significant updates to Dialogflow, the core technology of Contact Center AI, including increasing the number of intents available to your virtual agents by 10 times, to 20,000.

Dialogflow is an industry-leading platform for building chatbots and interactive voice responses (IVR), and powering contact centers globally with natural and rich conversational experiences. Increasing the number of intents means more training phrases, actions, parameters, and responses to help your Virtual Agent interact with customers and get their issues resolved more efficiently. 

In addition to added intents, we’ve made some other updates to Dialogflow to help you deliver the experience that your customers expect, while making it easier than ever to scale your Contact Center implementation. Here’s an overview of the updates we’re sharing today: 

  • Dialogflow Mega Agent (Beta): Get better customer conversations with up to 20,000 intents  

  • Dialogflow Agent Validation (GA): Identify agent design errors in real time for faster deployment and higher quality agents 

  • Dialogflow Versions and Environments (GA): Create multiple versions of agents and publish them to different environments 

  • Dialogflow Webhook Management API (GA): Create and manage your queries more quickly and easily

Let’s take a closer look at each feature.

Mega Agent: Answer 10x more customer questions
When your customer says or writes something, Dialogflow captures their request and matches it to the best intent in the agent. More intents lead to better customer conversations. A regular Dialogflow agent comes with a limit of 2,000 intents—which is the most intents available in the market, based on public information. With Dialogflow Mega Agent, now in beta, you can combine multiple Dialogflow agents into a single agent, and expand your intent limit by 10 times to 20,000 intents. 

With increased intents, customers can have more natural, seamless conversations, pivot intents and questions when they want, and get their questions covered. This greatly increases scale and your ability to tackle more use cases, to better serve your customers’ needs and solve their problems. 

Dialogflow Mega Agent also makes it easier for developers to create and manage their Dialogflow experience. If you have multiple teams building an agent, each team can now be responsible for one sub-agent, simplifying change conflicts and creating better governance across teams.

Companies are already using Dialogflow Mega Agent to provide a more seamless and integrated customer experience: 

“At KLM we are building multiple (chat)bot services using Dialogflow,” said Joost Oremus, Head of Social Technology at KLM Royal Dutch Airlines. “As travel is a complex product, making sure that our customers are guided towards the right agent (both human agent and multiple automated agents) can be challenging. Our first trial experience with Mega Agent shows promising results in solving this challenge for us.”

Agent Validation: Better conversations lead to better customer experiences 
Frustrating interactions with your contact center are a sure way to lose customers. Yet an internal study showed that 80% of Dialogflow agents had easy-to-fix quality issues. Dialogflow’s Agent Validation helps eliminate these negative interactions by helping designers identify errors, create high-quality agents, and improve customer experiences.

It does this by highlighting quality issues in the Dialogflow agent design—such as overlapping training phrases, wrong entity annotations, and other issues—and giving developers real-time updates on issues that can be corrected. Reducing errors leads to faster bot deployment, and ultimately, higher quality Dialogflow agents in production. 

Contact Center AI is designed to make implementation as easy as possible. The following two features simplify the deployment stage even further, so your developers can spend their time testing and iterating on products. 

Versions & Environments: Create, test, and deploy your agent, all in one place
Versions and Environments, now GA, lets you create multiple versions of your agent and publish them to a variety of custom environments, including testing, development, staging, production, and so on. This means that developers can now test different agent versions, track changes, and manage the entire deployment process in the Dialogflow agent itself. 

Webhook Management API: Reduce webhook response time and save developer resources
With the Webhook Management API, you can now create and manage webhooks, making it easier for enterprises to programmatically fulfill their queries. As Dialogflow processes and fulfills millions of queries daily with webhooks, this new API—functionality that was previously limited to the Dialogflow console—will help enterprises speed up their agent design process. 

A great customer experience builds loyalty and leads to repeat business. With these updates to Dialogflow, we aim to make developing a great customer experience easier than ever before (Dialogflow pricing is available here). You can access all these features today through your Dialogflow console or API, which are all available for your Contact Center AI integrations.

Expanding our alliance with Cisco in hybrid cloud and the contact center

Over the past three years, we’ve worked closely with Cisco to deliver a number of customer-focused solutions in areas such as hybrid cloud, multi-cloud, work transformation, and contact center integrations. This week at Cisco Live in Barcelona, we’re sharing updates on our joint work in two key areas of customer demand—hybrid cloud solutions and the digital contact center.

Announcing the availability of Anthos 1.2 with Cisco HyperFlex

At Next ‘19, Cisco and Google Cloud announced a hybrid cloud partnership to bring Anthos and Cisco HyperFlex to our shared customers. After working closely across our engineering and business development teams, today we are excited to announce the general availability of Anthos 1.2 with Cisco HyperFlex, with a Cisco Validated Design (CVD) for the joint solution coming soon. 

Google Cloud’s Anthos deployed with Cisco HyperFlex enables you to modernize in-place with your existing resources. You can automate policy and security at scale, track configuration and policy changes to have an audit log of system configuration, and update configurations in seconds across all of your Anthos environments. It also provides consistency: the same experience on-prem and in the cloud.

“This is an important milestone in our hybrid cloud partnership with Google Cloud,” said Kaustubh Das, Vice President Product Management, Cisco. “With Anthos and the HyperFlex Data Platform, our customers now have a highly available and resilient on-prem data platform for running Kubernetes workloads at scale. We now have all the benefits of Anthos on a system that delivers predictable performance, enterprise-grade data services, storage optimization, security and zero downtime during upgrades.”

Cisco HyperFlex unifies compute, storage, and networking from your core to the edge. Anthos GKE on-prem deployed on Cisco HyperFlex provides a Container-as-a-Service environment based on our recently released Anthos 1.2. This solution provides end-to-end orchestration, management, and a scalable architecture to deploy Anthos on Cisco HyperFlex with HyperFlex CSI (Container Storage Interface) for persistent storage. Customers looking at hybrid cloud models will experience a consistent Kubernetes experience on-prem as well as in the cloud with:

  • A single control plane for the entire hardware lifecycle.

  • Scalable, highly available hyperconverged infrastructure to meet container applications’ compute, network, and storage needs.

  • Faster turnaround time, making it a good fit for DevOps and CI/CD use cases.

  • A single Anthos control plane for multi-cloud management, allowing you to deploy applications across hybrid and multi-cloud environments without changing the underlying code.

  • Automated policies and security at scale.

“Cisco and Google Cloud have combined Cisco’s leading hyperconverged technology with Google Cloud’s Anthos to make hybrid cloud containerization a reality for our customers,” said Dave Sellers, General Manager, MultiCloud at World Wide Technology. “Leveraging our Advanced Technology Center, WWT is providing our customers a unique educational and hands-on lab experience showcasing the value proposition offered by these cutting-edge technologies.”

Customers in our first Anthos lab day event last week with WWT are already sharing positive feedback with us—and we’ve been thrilled by the reception. Customer centricity is a joint value shared within this partnership, and it has helped inform the direction of our products and will continue to shape the future of Anthos. 

Expanding our partnership to modernize the Contact Center

We are also excited to expand our partnership with Cisco by offering Contact Center AI through Cisco’s platform. Cisco is now bringing in Google Cloud’s Natural Language Processing (NLP), AI, and ML capabilities to create a seamless end-to-end conversational experience for customers. With the release of our joint solution, Cisco is now introducing Google Cloud’s conversational IVR, Virtual Agent, Agent Assist, and Insights to their contact center offering. 

Powered by Google Cloud’s innovative conversational AI, our Contact Center AI offering helps businesses create richer, more natural-sounding, and more helpful conversational experiences within the contact center. Customers can use natural language to describe the reason for their text or call. The Virtual Agent can then either assist the customer directly or hand the conversation off; Cisco Contact Center’s industry-leading routing technology routes the customer to the appropriate agent based on the understood intent. AI and NLP continue to assist the conversation by surfacing knowledge articles, recommendations, and turn-by-turn guidance for the agent. The agent is then assisted with wrap-up, and business leaders can use the data with Insights for sentiment analysis and spotting trends.

Google Cloud Contact Center AI, in partnership with Cisco, improves the customer experience, increases agent satisfaction, and provides insights to business leaders—and it does this all while deflecting more calls, reducing average handle time, and lowering costs. And since Google Cloud and Cisco have done the hard work on the backend, the solution is easier to implement. No machine learning experts needed!

“We’re excited to launch this joint solution that infuses AI from Google Cloud into our Contact Center and transforms how our joint customers do business,” said Omar Tawakol, VP/GM at Cisco Contact Center. “With this integration, we’re combining Google Cloud’s Natural Language Processing and AI capabilities with our industry-leading contact center capabilities to empower agents to provide better customer service and vastly improve the experience for the end customer.”

You can learn more about Anthos and Contact Center AI on our website. And if you’re attending Cisco Live Barcelona, we invite you to stop by Booth 02 to learn more about our joint solutions—details are here.

Cheaper Cloud AI deployments with NVIDIA T4 GPU price cut

Google Cloud offers a wide range of GPUs to accelerate everything from AI deployment to 3D visualization. These use cases are now even more affordable with the price reduction of the NVIDIA T4 GPU. As of early January, we’ve reduced T4 prices by more than 60%, making it the lowest cost GPU instance on Google Cloud.

Hourly Pricing Per T4 GPU.png
Prices above are for us-central1 and vary by region. A full GPU pricing table is here.

Locations and configurations

Google Cloud was the first major cloud provider to launch the T4 GPU and offer it globally (in eight regions). This worldwide footprint, combined with the performance of the T4 Tensor Cores, opens up more possibilities to our customers. Since our global rollout, T4 performance has improved. The T4 and V100 GPUs now boast networking speeds of up to 100 Gbps, in beta, with additional regions coming online in the future. 

These GPU instances are also flexible to suit different workloads. The T4 GPUs can be attached to our n1 machine types that support custom VM shapes. This means you can create a VM tailored specifically to meet your needs, whether it’s a low-cost option with one vCPU, 1 GB of memory, and one T4 GPU, or a high-performance configuration with 96 vCPUs, 624 GB of memory, and four T4 GPUs—and almost anything in between. This is helpful for machine learning (ML), since you may want to adjust your vCPU count based on your pre-processing needs. For visualization, you can create VM shapes for lower end solutions all the way up to powerful, cloud-based professional workstations.

Machine Learning

With mixed precision support and 16 GB of memory, the T4 is also a great option for ML workloads. For example, Compute Engine preemptible VMs work well for batch ML inference workloads, offering lower cost compute in exchange for variable capacity availability. We previously shared sample T4 GPU performance numbers for ML inference of up to 4,267 images-per-second (ResNet 50, batch size 128, precision INT8). That means you can perform roughly 15 million image predictions in an hour for a $0.11 add-on cost for a single T4 GPU with your n1 VM. 
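The per-hour figure follows directly from the benchmark rate:

```python
# Back-of-the-envelope check of the throughput claim above.
images_per_second = 4267  # ResNet-50, batch size 128, INT8 precision
seconds_per_hour = 3600

predictions_per_hour = images_per_second * seconds_per_hour
# 4,267 * 3,600 = 15,361,200 -- roughly 15 million predictions per hour
```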

Google Cloud offers several options to access these GPUs. One of the simplest ways to get started is through Deep Learning VM Images for AI Platform and Compute Engine, and Deep Learning Containers for Google Kubernetes Engine (GKE). These are configured for software compatibility and performance, and come pre-packaged with your favorite ML frameworks, including PyTorch and TensorFlow Enterprise.

We’re committed to making GPU acceleration more accessible, whatever your budget and performance requirements may be. With the reduced cost of NVIDIA T4 instances, we now have a broad selection of accelerators for a multitude of workloads, performance levels, and price points. Check out the full pricing table and regional availability and try the NVIDIA T4 GPU for your workload today.

Want to use AutoML Tables from a Jupyter Notebook? Here’s how

While there’s no doubt that machine learning (ML) can be a great tool for businesses of all shapes and sizes, actually building ML models can seem daunting at first. Cloud AutoML—Google Cloud’s suite of machine learning products—provides tools and functionality to help you build ML models that are tailored to your specific needs, without needing deep ML expertise.

AutoML solutions provide a user interface that walks you through each step of model building, including importing data, training your model on the data, evaluating model performance, and predicting values with the model. But, what if you want to use AutoML products outside of the user interface? If you’re working with structured data, one way to do it is by using the AutoML Tables SDK, which lets you trigger—or even automate—each step of the process through code. 

There is a wide variety of ways that the SDK can help embed AutoML capabilities into applications. In this post, we’ll use an example to show how you can use the SDK from end-to-end within your Jupyter Notebook. Jupyter Notebooks are one of the most popular development tools for data scientists. They enable you to create interactive, shareable notebooks with code snippets and markdown for explanations. Without leaving Google Cloud’s hosted notebook environment, AI Platform Notebooks, you can leverage the power of AutoML technology.

There are several benefits of using AutoML technology from a notebook. Each step and setting can be codified so that it runs the same every time by everyone. Also, it’s common, even with AutoML, to need to manipulate the source data before training the model with it. By using a notebook, you can use common tools like pandas and numpy to preprocess the data in the same workflow. Finally, you have the option of creating a model with another framework and ensembling it with the AutoML model, for potentially better results. Let’s get started!

Understanding the data

The business problem we’ll investigate in this blog is how to identify fraudulent credit card transactions. The technical challenge we’ll face is how to deal with imbalanced datasets: only 0.17% of the transactions in the dataset we’re using are marked as fraud. More details on this problem are available in the research paper Calibrating Probability with Undersampling for Unbalanced Classification.

To get started, you’ll need a Google Cloud Platform project with billing enabled. To create a project, follow the instructions here. For a smooth experience, check that the necessary storage and ML APIs are enabled. Then, follow this link to access BigQuery public datasets in the Google Cloud console.

In the Resources tree in the bottom-left corner, navigate through the list of datasets until you find ml-datasets, and then select the ulb-fraud-detection table within it.


Click the Preview tab to preview sample records from the dataset. Each record has the following columns:

  • Time is the number of seconds between the first transaction in the dataset and the time of the selected transaction.
  • V1-V28 are columns that have been transformed via a dimensionality reduction technique called PCA that has anonymized the data.
  • Amount is the transaction amount.
  • Class is 1 if the transaction is fraudulent and 0 otherwise—this is the value we’ll predict.
ulb-fraud-detection 1.png

Set up your Notebook Environment

Now that we’ve looked at the data, let’s set up our development environment. The notebook we’ll use can be found in AI Hub. Select the “Open in GCP” button, then choose to either deploy the notebook in a new or existing notebook server.

set up Notebook Environment.png
ai hub.png

Configure the AutoML Tables SDK

Next, let’s highlight key sections of the notebook. Some details, such as setting the project ID, are omitted for brevity, but we highly recommend running the notebook end-to-end when you have an opportunity.

We’ve recently released a new and improved AutoML Tables client library. You will first need to install the library and initialize the Tables client.
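A minimal sketch of that setup step, assuming `pip install google-cloud-automl` and application-default credentials are in place (the project ID below is a placeholder, not a real project):

```python
# Placeholders for this walkthrough -- substitute your own project.
PROJECT_ID = "your-project-id"
REGION = "us-central1"  # AutoML Tables models run in us-central1


def make_tables_client(project_id=PROJECT_ID, region=REGION):
    # Imported inside the function so the sketch can be read (and the
    # file imported) without the library or credentials installed.
    from google.cloud import automl_v1beta1 as automl

    return automl.TablesClient(project=project_id, region=region)
```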

By the way, we recently announced that AutoML Tables can now be used in Kaggle kernels. You can learn more in this tutorial notebook, but the setup is similar to what you see here.

Import the Data 

The first step is to create a BigQuery dataset, which is essentially a container for the data. Next, import the data from the BigQuery fraud detection dataset. You can also import from a CSV file in Google Cloud Storage or directly from a pandas dataframe.
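These two steps might look roughly like the following with the Tables client (the dataset display name is an arbitrary choice for this example, and `client` is the Tables client initialized earlier):

```python
def import_fraud_data(client):
    # Create an empty Tables dataset, then load the BigQuery public
    # fraud-detection table into it.
    dataset = client.create_dataset(dataset_display_name="fraud_detection")
    op = client.import_data(
        dataset=dataset,
        bigquery_input_uri=(
            "bq://bigquery-public-data.ml_datasets.ulb_fraud_detection"
        ),
    )
    op.result()  # block until the import finishes
    return dataset
```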

Train the Model

First, we have to specify which column we would like to predict, or our target column, with set_target_column(). The target column for our example will be “Class”—1 if the transaction is fraudulent, 0 if not.

Then, we’ll specify which columns to exclude from the model. We’ll only exclude the target column, but you could also exclude IDs or other information you don’t want to include in the model.

There are a few other things you might want to do that aren’t necessary in this example:

  • Set weights on individual columns

  • Create your own custom test/train/validation split and specify the column to use for the split

  • Specify which timestamp column to use for time-series problems

  • Override the data types and nullable status that AutoML Tables inferred during data import

The one slightly unusual thing that we did in this example is override the default optimization objective. Since this is a very imbalanced dataset, it’s recommended that you optimize for AU-PRC, or the area under the Precision/Recall curve, rather than the default AU-ROC.
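Putting the training steps together, a sketch with the Tables client might look like this (the display names and the one node-hour budget are assumptions for this walkthrough, not required values):

```python
def train_fraud_model(client):
    # Predict the "Class" column, and exclude it from the input features.
    client.set_target_column(
        dataset_display_name="fraud_detection",
        column_spec_display_name="Class",
    )
    op = client.create_model(
        model_display_name="fraud_model",
        dataset_display_name="fraud_detection",
        train_budget_milli_node_hours=1000,        # one node-hour
        exclude_column_spec_names=["Class"],
        optimization_objective="MAXIMIZE_AU_PRC",  # better for imbalanced data
    )
    return op.result()  # training can take an hour or more
```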

Evaluate the Model

After training has been completed, you can review various performance statistics on the model, such as the accuracy, precision, recall, and so on. The metrics are returned in a nested data structure, and here we are pulling out the AU-PRC and AU-ROC from that data structure.
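Pulling those two numbers out is a small extraction over the nested structure; here it is shown on a plain dict that mirrors the shape of a classification evaluation (the values are made up):

```python
def extract_au_metrics(evaluation):
    # `evaluation` mirrors the nested shape of a classification
    # evaluation, represented here as a plain dict for illustration.
    metrics = evaluation["classification_evaluation_metrics"]
    return metrics["au_prc"], metrics["au_roc"]


sample = {"classification_evaluation_metrics": {"au_prc": 0.98, "au_roc": 0.99}}
```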

Deploy and Predict with the Model

To enable online predictions, the model must first be deployed. (You can perform batch predictions without deploying the model.)

We’ll create a hypothetical transaction record with similar characteristics and predict on it. After invoking the predict() API with this record, we receive a data structure with each class and its score. The code below finds the class with the maximum score.
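A minimal version of that lookup, using plain (class, score) pairs as a simplification of the SDK’s response objects:

```python
def best_class(scored_classes):
    # scored_classes: list of (class_name, score) pairs from a prediction.
    name, _score = max(scored_classes, key=lambda pair: pair[1])
    return name
```

For a non-fraudulent transaction, the “0” class would carry nearly all of the score mass, so it is the one returned.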


Now that we’ve seen how you can use AutoML Tables straight from your notebook to produce an accurate model of a complex problem, all with a minimal amount of code, what’s next?

To find out more, the AutoML Tables documentation is a great place to start. When you’re ready to use AutoML in a notebook, the SDK guide has detailed descriptions of each operation and parameter. You might also find our samples on Github helpful.

After you feel comfortable with AutoML Tables, you might want to look at other AutoML products. You can apply what you’ve learned to solve problems in the Vision, Video Intelligence, Natural Language, and Translation domains.

Find me on Twitter at @kweinmeister, and good luck with your next AutoML experiment!

Exploratory data analysis, feature selection for better ML models

When you’re getting started with a machine learning (ML) project, one critical principle to keep in mind is that data is everything. It is often said that if ML is the rocket engine, then the fuel is the (high-quality) data fed to ML algorithms. However, deriving truth and insight from a pile of data can be a complicated and error-prone job. To have a solid start for your ML project, it always helps to analyze the data up front, a practice that describes the data by means of statistical and visualization techniques to bring important aspects of that data into focus for further analysis. During that process, it’s important that you get a deep understanding of: 

  • The properties of the data, such as schema and statistical properties;

  • The quality of the data, like missing values and inconsistent data types;

  • The predictive power of the data, such as correlation of features against target.

This process lays the groundwork for the subsequent feature selection and engineering steps, and it provides a solid foundation for building good ML models. 

There are many different approaches to conducting exploratory data analysis (EDA) out there, so it can be hard to know what analysis to perform and how to do it properly. To consolidate the recommendations on conducting proper EDA, data cleaning, and feature selection in ML projects, we’ll summarize and provide concise guidance from both intuitive (visualization) and rigorous (statistical) perspectives. Based on the results of the analysis, you can then determine corresponding feature selection and engineering recommendations. You can also get more comprehensive instructions in this white paper.

You can also check out the Auto Data Exploration and Feature Recommendation Tool we developed to help you automate the recommended analysis, regardless of the scale of the data, then generate a well-organized report to present the findings. 

EDA, feature selection, and feature engineering are often tied together and are important steps in the ML journey. With the complexity of data and business problems that exist today (such as credit scoring in finance and demand forecasting in retail), how the results of proper EDA can influence your subsequent decisions is a big question. In this post, we will walk you through some of the decisions you’ll need to make about your data for a particular project, and choosing which type of analysis to use, along with visualizations, tools, and feature processing.

Let’s start exploring the types of analysis you can choose from. 

Statistical data analysis

With this type of analysis, data exploration can be conducted from three different angles: descriptive, correlation, and contextual. Each type introduces complementary information on the properties and predictive power of the data, helping you make an informed decision based on the outcome of the analysis.

1. Descriptive analysis (univariate analysis)

Descriptive analysis, or univariate analysis, provides an understanding of the characteristics of each attribute of the dataset. It also offers important evidence for feature preprocessing and selection in a later stage. The following table lists the suggested analyses for common attribute types: numerical, categorical, and textual.

2. Correlation analysis (bivariate analysis)

Correlation analysis (or bivariate analysis) examines the relationship between two attributes, say X and Y, and determines whether the two are correlated. This analysis can be done from two complementary perspectives:

  • Qualitative analysis. This performs computation of the descriptive statistics of dependent numerical/categorical attributes against each unique value of the independent categorical attribute. This perspective helps intuitively understand the relationship between X and Y. Visualizations are often used together with qualitative analysis as a more intuitive way of presenting the result.
  • Quantitative analysis. This is a quantitative test of the relationship between X and Y, based on hypothesis testing framework. This perspective provides a formal and mathematical methodology to quantitatively determine the existence and/or strength of the relationship.
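As a small quantitative example, the Pearson correlation coefficient measures the strength of a linear relationship between two numerical attributes. Here we construct a synthetic pair where y depends on x, so the coefficient should come out close to 1:

```python
import numpy as np

# Synthetic data: y is a noisy linear function of x.
rng = np.random.default_rng(seed=0)
x = rng.normal(size=200)
y = 2.0 * x + rng.normal(scale=0.5, size=200)

# Pearson correlation coefficient between x and y.
r = np.corrcoef(x, y)[0, 1]
# r is close to 1, confirming the strong linear relationship
```

In a real project you would compute this between each candidate feature and the target, and between pairs of features, and pair it with a hypothesis test when you need a formal judgment.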

3. Contextual analysis

Descriptive analysis and correlation analysis are both generic enough to be performed on any structured dataset; neither requires contextual information. To further profile the given dataset and gain more domain-specific insights, you can use one of two common context-based analyses: 

  • Time-based analysis: In many real-world datasets, the timestamp (or a similar time-related attribute) is one of the key pieces of contextual information. Observing and/or understanding the characteristics of the data along the time dimension, with various granularities, is essential to understanding the data generation process and ensuring data quality.

  • Agent-based analysis: Besides time, the other common contextual attribute is the unique identifier (ID, such as a user ID) of each record. Analyzing the dataset by aggregating along the agent dimension—e.g., a histogram of the number of records per agent—can further improve your understanding of the dataset. 

Example of time-based analysis:

The following figure displays the average number of train trips per hour originating from and ending at one particular location based on a simulated dataset.

Example of time-based analysis.png

From this, we can conclude that peak times are around 8:30am and 5:30pm, which is consistent with the intuition that these are the times when people would typically leave home in the morning and return after a day of work.

Feature selection and engineering

The ultimate goal of EDA (whether rigorous or through visualization) is to provide insights on the dataset you’re studying. This can inspire your subsequent feature selection, engineering, and model-building process. 

Descriptive analysis provides the basic statistics of each attribute of the dataset. Those statistics can help you identify the following issues: 

  • High percentage of missing values

  • Low variance of numeric attributes

  • Low entropy of categorical attributes

  • Imbalance of categorical target (class imbalance)

  • Skewed distribution of numeric attributes

  • High cardinality of categorical attributes
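Several of the checks above reduce to a few lines of code. As a sketch (the sample values are made up for illustration), missing-value percentage and categorical cardinality can be computed like this:

```python
def missing_fraction(values):
    # Share of records where the attribute is absent.
    return sum(v is None for v in values) / len(values)


def cardinality(values):
    # Number of distinct non-missing values of a categorical attribute.
    return len({v for v in values if v is not None})


# Hypothetical attribute samples.
ages = [34, None, 28, None, None, 41]
cities = ["Paris", "Paris", None, "Lyon"]
```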

The correlation analysis examines the relationship between two attributes. There are two typical action points triggered by the correlation analysis in the context of feature selection or feature engineering:

  • Low correlation between feature and target

  • High correlation between features

Once you’ve identified issues, the next task is to make a sound decision on how to properly mitigate these issues. One such example is for “High percentage of missing values.” The identified problem is that the attribute is missing in a significant proportion of the data points. The threshold or definition of “significant” can be set based on domain knowledge. There are two options to handle this, depending on the business scenario:

  1. Assign a unique value to the missing value records, if the missing value, in certain contexts, is actually meaningful. For example, a missing value could indicate that a monitored, underlying process was not functioning properly. 

  2. Discard the feature if the values are missing due to misconfiguration, issues with data collection or untraceable random reasons, and the historic data can’t be reconstituted. 
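The two options can be sketched as a single decision rule (the sentinel value and the 50% threshold below are illustrative assumptions; in practice both come from domain knowledge):

```python
MISSING = "__MISSING__"  # sentinel chosen to lie outside the real value range


def handle_missing(values, drop_threshold=0.5):
    """Option 1: replace missing entries with a unique sentinel value.
    Option 2: drop the feature entirely (signalled by returning None)
    when too large a share is missing for untraceable reasons."""
    frac = sum(v is None for v in values) / len(values)
    if frac > drop_threshold:
        return None  # discard the feature
    return [MISSING if v is None else v for v in values]
```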

You can check out the white paper to learn more about the proper ways of addressing the above issues, the recommended visualizations for each analysis, and a survey of the most suitable existing tools.

A tool that helps you automate

To further help you speed up the process of preparing data for machine learning, you can use our Auto Data Exploration and Feature Recommendation Tool to automate the recommended analysis regardless of the scale of the data, and generate a well-organized report to present the findings and recommendations. 

The tool’s automated EDA includes:

  • Descriptive analysis of each numerical and categorical attribute in a dataset; 

  • Correlation analysis of two attributes (numerical vs. numerical, numerical vs. categorical, and categorical vs. categorical) through qualitative and/or quantitative analysis.

Based on the EDA performed, the tool makes feature recommendations and generates a summary report, which looks something like this:

exploratory data analysis report.png

We look forward to your feedback as we continue adding features to the tool.

Thanks to additional contributors to this work: Dan Anghel and Barbara Fusinska, cloud machine learning engineers.

Discover insights from text with AutoML Natural Language, now generally available

Organizations are managing and processing greater volumes of text-heavy, unstructured data than ever before. To manage this information more efficiently, organizations are looking to machine learning to help with the complex sorting, processing, and analysis this content needs. In particular, natural language processing is a valuable tool used to reveal the structure and meaning of text, and today we’re excited to announce that AutoML Natural Language is generally available. 

AutoML Natural Language has many features that make it a great match for these data processing challenges. It includes common machine learning tasks like classification, sentiment analysis, and entity extraction, which have a wide variety of applications, such as: 

  • Categorizing digital content, including news, blogs, and tweets, in real time to allow content creators to see patterns and insights—a great example is Meredith, which is categorizing text content across its entire portfolio of media properties in months instead of years

  • Identifying sentiment in customer feedback

  • Turning dark, unstructured scanned data into classified and searchable content 

We’re also introducing support for PDFs, including native PDFs and PDFs of scanned images. To further unlock the most complex and challenging use cases—such as understanding legal documents or document classification for organizations with large and complex content taxonomies—AutoML Natural Language now supports 5,000 classification labels, training up to 1 million documents, and document size up to 10 MB. 

One customer using this new functionality is Chicory, which develops custom digital shopping and marketing solutions for the grocery industry. 

“AutoML Natural Language allows us to solve complex classification problems at scale. We are using AutoML to classify and translate recipe ingredient data across a network of 1,300 recipe websites into actual grocery products that consumers can purchase seamlessly through our partnerships with dozens of leading grocery retailers like Kroger, Amazon, and Instacart,” Asaf Klibansky, Director of Engineering at Chicory explains. “With the expansion of the max classification label size to the thousands, we can expand our label/ingredient taxonomy to be more detailed than ever, providing our shoppers with better matches during their grocery shopping experience—a business challenge we have been trying to perfect since Chicory began. 

“Also, we see better model performance than we were able to achieve using open source libraries, and we have increased visibility into the individual label performance that we did not have before,” Klibansky continues. “This has allowed us to identify insufficient or poor quality training data per label quickly and reduce the time and cost between model iterations.” 

We’re continuously improving the quality of our models in partnership with Google AI research through better fine-tuning techniques and larger model search spaces. We’re also introducing more advanced features to help AutoML Natural Language understand documents better. 

For example, AutoML Text & Document Entity Extraction will now look at more than just text to incorporate the spatial structure and layout information of a document for model training and prediction. This spatial awareness leads to better understanding of the entire document, and is especially valuable in cases where both the text and its location on the “page” are important, such as invoices, receipts, resumes, and contracts.

GCP AutoML.png
Identifying applicant skills by location on the document.

We also launched preferences for enterprise data residency for AutoML Natural Language customers in Europe and across the globe to better serve organizations in regulated industries. Many customers are already taking advantage of this functionality, which allows you to create a dataset, train a model, and make predictions while keeping your data and related machine learning processing within the EU or any other applicable region. Finally, AutoML Natural Language is FedRAMP-authorized at the Moderate level, making it easier for federal agencies to benefit from Google AI technology.

To learn more about AutoML Natural Language and the Natural Language API, check out our website. We can’t wait to hear what you discover with your data.

Better bandit building: Advanced personalization the easy way with AutoML Tables

As demand grows for features like personalization systems, efficient information retrieval, and anomaly detection, the need for a solution to optimize these features has grown as well. Contextual bandit is a machine learning framework designed to tackle these—and other—complex situations.

With contextual bandit, a learning algorithm can test out different actions and automatically learn which one has the most rewarding outcome for a given situation. It’s a powerful, generalizable approach for solving key business needs in industries from healthcare to finance, and almost everything in between.

While many businesses may want to use bandits, applying it to your data can be challenging, especially without a dedicated ML team. It requires model building, feature engineering, and creating a pipeline to conduct this approach.

Using Google Cloud AutoML Tables, however, we were able to create a contextual bandit model pipeline that performs as well as or better than other models, without needing a specialist for tuning or feature engineering.

A better bandit building solution: AutoML Tables

Before we get too deep into what contextual bandits are and how they work, let’s briefly look at why AutoML Tables is such a powerful tool for training them. Our contextual bandits model pipeline takes in structured data in the form of a simple database table, uses the contextual bandit and meta-learning theories to perform automated machine learning, and creates a model that can be used to suggest optimal future actions related to the problem. 

In our research paper, “AutoML for Contextual Bandits”—which we presented at the ACM RecSys Conference REVEAL workshop—we illustrated how to set this up using the standard, commercially available Google Cloud product.

As we describe in the paper, AutoML Tables enables users with little machine learning expertise to easily train a model using a contextual bandit approach. It does this with:

  • Automated Feature Engineering, which is applied to the raw input data

  • Architecture Search to compute the best architecture(s) for our bandits formulation task—e.g. to find the best predictor model for the expected reward of each episode

  • Hyper-parameter Tuning through search

  • Model Selection where models that have achieved promising results are passed onto the next stage

  • Model Tuning and Ensembling

This solution could be a game-changer for businesses that want to perform bandit machine learning but don’t have the resources to implement it from scratch. 
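As a rough sketch of how the pieces fit together (a toy stand-in, not the AutoML Tables API): log each play as a (context, action, reward) row, train a model that predicts expected reward, and choose actions epsilon-greedily against those predictions. The contexts, actions, and reward values below are invented for illustration.

```python
import random
from collections import defaultdict

def train_reward_model(log):
    """Stand-in for the AutoML Tables training step: estimate the
    expected reward of each (context, action) pair from a logged
    table of past plays."""
    totals, counts = defaultdict(float), defaultdict(int)
    for context, action, reward in log:
        totals[(context, action)] += reward
        counts[(context, action)] += 1
    return lambda c, a: totals[(c, a)] / counts[(c, a)] if counts[(c, a)] else 0.0

def choose_action(model, context, actions, epsilon=0.1):
    """Epsilon-greedy policy over the model's reward predictions:
    mostly pick the best predicted action, occasionally explore."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: model(context, a))

# Logged plays as (context, action, reward) rows, like a database table.
log = [
    ("work", "wallpaper", 30.0), ("work", "yarn", 0.0),
    ("home", "yarn", 12.0), ("home", "wallpaper", 0.0),
]
model = train_reward_model(log)
print(choose_action(model, "work", ["wallpaper", "yarn"], epsilon=0.0))  # wallpaper
```

In the real pipeline, the trained AutoML Tables model replaces `train_reward_model`, and because the log grows as new plays are recorded, the model is retrained periodically on the updated table.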

Bandits, explained

Now that we’ve seen how AutoML Tables handles bandits, we can learn more about what, exactly, they are. As with many topics, bandits are best illustrated with the help of an example. Let’s say you are an online retailer that wants to show personalized product suggestions on your homepage.

You can only show a limited number of products to a specific customer, and you don’t know which ones will have the best reward. In this case, let’s make the reward $0 if the customer doesn’t buy the product, and the item price if they do.

To try to maximize your reward, you could utilize a multi-armed bandit (MAB) algorithm, where each product is a bandit—a choice available for the algorithm to try. As we can see below, the multi-armed bandit agent must choose to show the user item 1 or item 2 during each play. Each play is independent of the others—sometimes the user will buy item 2 for $22, and sometimes the user will buy item 2 twice, earning a reward of $44.
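A minimal epsilon-greedy simulation of this two-item setup shows the agent converging on the item with the higher expected reward. The purchase probabilities and prices here are made up for illustration.

```python
import random

random.seed(0)  # make the simulation repeatable

# Hypothetical purchase probability and price for each product.
ITEMS = {"item1": (0.30, 15.0), "item2": (0.10, 22.0)}

def play(item):
    """Simulate showing an item: reward is the price if bought, else $0."""
    buy_prob, price = ITEMS[item]
    return price if random.random() < buy_prob else 0.0

def epsilon_greedy(n_plays=5000, epsilon=0.1):
    """Track average observed reward per item; usually show the best
    item so far (exploit), sometimes show a random one (explore)."""
    totals = {i: 0.0 for i in ITEMS}
    counts = {i: 0 for i in ITEMS}
    for _ in range(n_plays):
        if random.random() < epsilon:
            item = random.choice(list(ITEMS))
        else:
            item = max(ITEMS, key=lambda i: totals[i] / counts[i] if counts[i] else 0.0)
        counts[item] += 1
        totals[item] += play(item)
    return counts

counts = epsilon_greedy()
# item1 (expected reward $4.50 per play) ends up shown far more often
# than item2 (expected reward $2.20 per play).
```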

[Image: Multi-armed bandit agent]

The multi-armed bandit approach balances exploration and exploitation of bandits.

[Image: Exploration and exploitation]

To continue our example, you probably want to show a camera enthusiast products related to cameras (exploitation), but you also want to see what other products they may be interested in, like gaming gadgets or wearables (exploration). A good practice is to explore more at the beginning, when the agent’s information about the environment is less accurate, and to shift gradually toward exploitation as more knowledge is gained.
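One simple way to implement that practice is an annealed exploration rate. The linear schedule below is a hypothetical example (the constants are arbitrary): the agent explores almost every play at first, then settles into mostly exploiting once its reward estimates are trustworthy.

```python
def epsilon_schedule(play, start=1.0, end=0.05, decay_plays=1000):
    """Linearly anneal the exploration rate from `start` down to `end`
    over the first `decay_plays` plays, then hold it at `end`."""
    if play >= decay_plays:
        return end
    return start + (end - start) * play / decay_plays

# Play 0 explores with probability 1.0; by play 1000 exploration
# has dropped to 0.05 and stays there.
```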

Now let’s say we have a customer that’s a professional interior designer and an avid knitting hobbyist. They may be ordering wallpaper and mirrors during working hours and browsing different yarns when they’re home. Depending on what time of day they access our website, we may want to show them different products.

The contextual bandit algorithm is an extension of the multi-armed bandit approach where we factor in the customer’s environment, or context, when choosing a bandit. The context affects how a reward is associated with each bandit, so as contexts change, the model should learn to adapt its bandit choice, as shown below.

[Image: Contextual bandits]

Not only do you want your contextual bandit approach to find the maximum reward, you also want to reduce the reward loss when you’re exploring different bandits. When judging the performance of a model, the metric that measures reward loss is regret—the difference between the cumulative reward from the optimal policy and the model’s cumulative sum of rewards over time. The lower the regret, the better the model.
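Regret is straightforward to compute in a simulation, where the optimal policy's rewards are known. A small sketch, with made-up reward values:

```python
def cumulative_regret(rewards, optimal_rewards):
    """Regret after each play: the optimal policy's cumulative reward
    minus the cumulative reward the evaluated policy collected."""
    total, optimal_total, curve = 0.0, 0.0, []
    for r, opt in zip(rewards, optimal_rewards):
        total += r
        optimal_total += opt
        curve.append(optimal_total - total)
    return curve

# The policy missed a $22 sale on the second play, so regret jumps
# there and stays flat while the policy matches the optimum.
print(cumulative_regret([15, 0, 22], [15, 22, 22]))  # [0.0, 22.0, 22.0]
```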

How contextual bandits on AutoML Tables measure up

In “AutoML for Contextual Bandits,” we used different datasets to compare our bandit model powered by AutoML Tables to previous work. Namely, we compared our model to the online cover algorithm implementation for Contextual Bandit in the Vowpal Wabbit library, which is considered one of the most sophisticated options available for contextual bandit learning.

Using synthetic data we generated, we found that our AutoML Tables model reduced the regret metric as the number of data blocks increased, and outperformed the Vowpal Wabbit offering.

[Image: Regret curve on synthetic data]

We also compared our model’s performance with other models on some other well-known datasets that the contextual bandit approach has been tried on. These datasets have been used in other popular work in the field, and aim to test contextual bandit models on applications as diverse as chess and telescope data. 

We consistently found that our AutoML model performed well against other approaches, and was exceptionally better than the Vowpal Wabbit solution on some datasets.

[Images: Regret curves for the gamma telescope, chess, covertype, and dou shou datasets]

Contextual bandits is an exciting method for solving the complex problems businesses face today, and AutoML Tables makes it accessible for a wide range of organizations—and performs extremely well, to boot. To learn more about our solution, check out “AutoML for Contextual Bandits.” Then, if you have more direct questions or just want more information, reach out to us at [email protected]

The Google Cloud Bandits Solutions Team contributed to this report: Joe Cheuk, Cloud Application Engineer; Praneet Dutta, Cloud Machine Learning Engineer; Jonathan S Kim, Customer Engineer; Massimo Mascaro, Technical Director, Office of the CTO, Applied AI

Advancing the medical imaging field with cloud-based solutions at RSNA

The healthcare industry is increasingly embracing the cloud, and to help, we’ve developed healthcare and life sciences solutions that make it easier for organizations to transition to cloud technologies. Today, at the annual meeting of the Radiological Society of North America (RSNA), we’re excited to share the ways we’re enabling our customers and partners, through managed DICOM services, analytics, and AI, to make advances toward their clinical and operational goals in medical imaging.

At RSNA, we’ll be showcasing a number of end-to-end solutions and partner offerings. Specifically, we’ll be demonstrating solutions that enable de-identification of data in DICOM images and HIPAA-supported deployments so that our customers and partners can focus on their core business—not on managing and implementing infrastructure. 

Sharing the work of our customers and partners

More than a dozen customers and partners will be joining us this week to give live demos, host lightning talks, and share their innovations at RSNA. Some of the topics include:

  • Disaster recovery and vendor neutral archiving solutions running on Google Cloud.
  • Google Cloud as an enabler for next-generation PACS solutions.
  • A real-world evidence platform on Google Cloud.
  • A zero-footprint teleradiology solution. 
  • Machine learning to optimize workflow solutions and reduce annual costs.

You can find a full agenda below. Stop by booth #11318 in the North Hall Level 2 in the AI showcase to see these solutions in action.

Advancing research and AI in radiology

The importance and impact of AI in radiology has been rapidly expanding over the past few years—as can be seen with the growing size of the AI Showcase at RSNA. As an “AI first” company, we are committed to growing the ecosystem of AI developers, fostering new talent, and advancing research. Through Kaggle, and together with RSNA, we have hosted a number of medical imaging AI-based competitions to help encourage AI-based innovation in areas of medical need.

Last year, we hosted an AI competition where over 1,400 teams participated in building algorithms to detect a visual signal for pneumonia, one of the top 15 leading causes of death in the United States. Earlier this year, we launched another healthcare AI competition in collaboration with RSNA. For this challenge, Kaggle participants built algorithms to detect acute intracranial hemorrhage and its subtypes. This year’s competition drew 1,345 teams, 1,787 individuals across those teams, and over 22,000 submissions. By supporting these competitions, we hope to inspire more AI researchers to build algorithms and models that positively impact the healthcare community.

Visit us at RSNA

If you’re planning to attend RSNA, we’d love to connect! Stop by booth #11318 in the North Hall Level 2 in the AI Showcase to say hello and learn more about how we’re working with customers, partners and patients to engineer a healthier world together. 

You’re invited to join our corporate symposium “Journey to the Cloud.” A number of our customers and partners will be on hand to share their experiences using Google Cloud to drive innovation in the PACS industry, enable real-world evidence, and accelerate the development of new imaging solutions. The session is scheduled for Dec 3 at 9am CT (room S102AB, South Building, Level 1).

For a full list of Google Cloud activities, partners, demos, and presentations at RSNA, please review the Google Cloud guide to RSNA 2019.

We look forward to seeing you in Chicago!

How Cloud AI is shaping the future of retail—online and in-store

Technology has played a key role in retail for decades, from early innovations like barcode scanning and digital point of sale devices, to the global frontier of modern logistics. Through it all, however, the fundamentals remain the same: retailers generate huge quantities of data, face unpredictable environments, and need to continually adapt to the ever-evolving needs of the customer. Throw in the chaos of Black Friday and Cyber Monday, and you’ve got one of the most complex enterprise challenges in the world.

It’s also a challenge tailor-made for AI: a technology that thrives on big data, adapts to change fluidly, and can deliver personalized experiences at scale. With the holiday rush upon us, let’s take a look at how two Cloud AI customers—3PM for online shoppers and Tulip for in-store—are helping make retail more efficient, more personal, and more trustworthy.

Tulip is helping brands across the world bring the flexibility and personalization of e-commerce to their in-store experiences. Online, 3PM continuously tracks millions of sellers across a range of e-commerce marketplaces, helping to turn the tide against predatory practices like counterfeit products and trademark infringement.

3PM: Safeguarding online marketplaces at a global scale

Trust is the foundation of every retail experience, and that’s especially true online. With the proliferation of online marketplaces like Amazon and eBay, however, trademarks, copyrighted content, and other brand assets are often spread across too many places to be effectively monitored.

Particularly disconcerting is the fast-growing world of counterfeit products. It’s not just knock-off sneakers and handbags, either. Fraudulent supplements, prescription drugs, and even baby food are readily available online, presented in convincing detail intended to fool customers, and they can pose a danger to consumer health. Small merchants and global brands alike have found it difficult to contain counterfeiting, largely due to its decentralized nature. This calls for a solution that lies outside the marketplaces. 

3PM Solutions saw an opportunity to help. By combining the power of advanced analytics with data at a global scale, 3PM’s suite of tools can detect counterfeit goods automatically, monitor a brand’s reputation over time, and help the brand understand its customers more deeply.

But getting such an ambitious vision off the ground presented some significant technical challenges for 3PM. Online marketplaces routinely change the format and structure of their listings, quickly confounding hand-written rules and filters. To make matters worse, the content within those listings is notoriously unreliable. For example, counterfeiters often intentionally misspell brand and product names to keep their goods under the radar. It’s a level of complexity that calls for a particularly flexible solution that’s capable of ingesting massive quantities of data, while also evolving as the nature of that data changes.
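As a toy illustration of one such signal (not 3PM's actual method), an edit-distance check can catch intentional near-miss spellings of brand names that exact-match filters ignore. The brand list and distance threshold here are illustrative.

```python
def edit_distance(a, b):
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                  # deletion
                            curr[j - 1] + 1,              # insertion
                            prev[j - 1] + (ca != cb)))    # substitution
        prev = curr
    return prev[-1]

def flag_suspicious(listing_title, brands, max_distance=2):
    """Flag brands whose names appear slightly misspelled in a listing
    title, a common trick for slipping counterfeits past exact-match
    filters. Exact matches (distance 0) are treated as legitimate."""
    words = listing_title.lower().split()
    return [b for b in brands
            if any(0 < edit_distance(w, b.lower()) <= max_distance
                   for w in words)]

print(flag_suspicious("Genuine Nlke running shoes", ["Nike", "Adidas"]))  # ['Nike']
```

A rule like this is brittle on its own, which is exactly why 3PM's learned models, which weigh many such signals together, outperform hand-written filters.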

These challenges prompted 3PM to migrate to Google Cloud Platform, bringing the company’s data and infrastructure—and, more importantly, a state-of-the-art AI toolkit—into a single environment.

Google Cloud’s flexibility helped 3PM implement a creative, agile development process. The company’s developers designed a TensorFlow-based image classifier and trained it on billions of examples, forming the basis of a self-serve tool that lets brands accurately detect improper use of product photography, logos, and other trademarks. They built custom machine-learning models to intelligently analyze product listings. These models can look past the basics like image and title to incorporate a wide range of data points, detecting subtle features correlated with fraud that rule-based systems—not to mention humans—would miss. 3PM even used the Cloud Translation API to transcend language barriers automatically. 

Tulip: Bringing digital personalization to the in-store experience

Of course, brick-and-mortar remains fundamental to the identity of countless brands, with 80% of all sales still taking place in physical stores. Nevertheless, the speed, flexibility, and extreme personalization of e-commerce is influencing customer expectations everywhere—even when shopping in person—and retailers are scrambling to keep up.

Tulip helps retailers keep up with these demands with a suite of powerful mobile apps that gives retail workers the power of the digital world anywhere in their store, whether they’re looking up products, managing customer information, checking out shoppers, or communicating with customers. Tulip helps physical stores establish deeper relationships with their patrons based on their preferences, behaviors, and purchases—just as they would online—and it’s changing the way global brands do business.

A major challenge in any retail application is forecasting. Whether it’s an unexpected fashion craze or an annual event like Black Friday, retail’s surges and lulls can make traditional allocation of compute resources extremely challenging. 

“Because we had to scale for peak demand, we had to buy capacity up front, which sat idle much of the time when sales demand was lower,” explains Jeff Woods, director of software for infrastructure at Tulip. “It became difficult and expensive. We were constantly asking the vendor to waive arbitrary limits. We had to use massive instances, and it was difficult to scale down.”

After migrating to Google Cloud, Tulip could deploy on an infrastructure capable of scaling to any size at a moment’s notice—and only pay for what they used. In the process, they also gained access to some of the world’s most advanced machine learning technologies. Now, with their data, infrastructure, and AI tools in one place, the stage was set for Tulip to build an entirely new level of intelligence into their solutions.

Tulip’s solutions use a set of custom TensorFlow models running on AI Platform to identify customer insights and sales opportunities based on data from a customer’s in-store mobile applications. This drives recommendations on when to connect with customers and how to engage them with highly personal and relevant communications. 

Tulip’s solution is a textbook example of what makes Deployed AI so powerful: using previously unseen patterns in large quantities of data to solve a clearly defined business challenge, all at the speed of retail. “Every day, Tulip collects millions of data points from customer interactions across its channels,” says Ali Asaria, Tulip’s founder and CEO. “By integrating Google machine learning and big data products into our core platform, we can now use that data to provide intelligent insights and recommendations to retail associates.”


Just a few years ago, AI seemed too expensive and complex for companies like 3PM and Tulip. In both cases, however, moving to Google Cloud has demonstrated this technology’s affordability, interoperability, and ease of use. And the results have been transformative.

Whether the crowds are in stores or online, companies like Tulip and 3PM are demonstrating the power—and sometimes, the necessity—of using AI to make every retail interaction safer and more engaging. It’s another example of Deployed AI in action: using state-of-the-art technology to overcome age-old business challenges.