Amazon AppStream 2.0 adds support for Windows Server 2016 and Windows Server 2019

Today, Amazon AppStream 2.0 adds support for Windows Server 2016 and Windows Server 2019 base images for the Standard, Compute Optimized, Memory Optimized, and Graphics Pro instance families. This launch allows you to bring applications that are supported only on Windows Server 2016/2019 to AppStream 2.0, or to update your existing AppStream 2.0 environments to use the most recent Windows Server operating systems.

AWS Elemental MediaConnect Now Supports SPEKE for Conditional Access

Today AWS Elemental MediaConnect added integration with SPEKE for key exchange with conditional access system (CAS) partners. SPEKE, which stands for Secure Packager and Encoder Key Exchange, is an open API specification that streamlines the way CAS systems integrate with MediaConnect. Using this feature, you can now encrypt live video shared using entitlements and fully control permissions for content sent to your distribution partners. This enables you to build complex distribution workflows with more granular and sophisticated conditional rights management, including time-based access, blackouts, and many other rules-based requirements. Visit the documentation pages to learn more.

How to implement document tagging with AutoML

Many businesses need to digitize photos, documents, memos, and other types of physical media to help with tasks like invoice processing, application review, and contract analysis. At Google Cloud, we provide a number of ways customers can do this, from using our pre-trained machine learning APIs, to building on our AutoML suite, to applying Document Understanding AI, our latest AI solution.

In this post, we’ll focus on one approach: using Cloud AutoML to perform document tagging for the purposes of document processing. Document tagging means identifying key-value pairs in a document, where the ‘tags’ are the fields you want to extract (such as customer, account number, or total) and the ‘values’ are the content extracted for each tag; for example, the tag ‘Account Number’ might carry the value ‘123-4567’. In this solution, we’ll also use AutoML to pick out important content in an image, such as signatures, stamps, and boxes, for processing.

Solutions of the past

A few years ago, digitizing a document meant simply scanning and storing it as an image in the cloud. Now, with better tools and techniques, and with the recent boom in ML-based solutions, it is possible to convert a physical document into structured data that can be automatically processed, and from which useful knowledge can be extracted.

Until recently, digitizing documents required rule-based methods, like using regular expressions to identify fields or extracting OCR text from fixed field positions. But these solutions don’t always generalize to new documents and can be brittle when paired with keyword matching or text-based NLP models. Object detection and entity recognition, which have gained a lot of traction in the last few years, have led to significant improvements in this area. Cloud AutoML, our suite of AI services that lets you create high-quality custom machine learning models with minimal ML expertise, is one example of that.

A GCP solution: AutoML at scale

A wide variety of AutoML services can be used as a foundation to create models that solve unique business problems. In the case of document digitization, one possible architecture looks like this:

Architecture diagram: document processing with the Cloud AI solutions suite

This type of architecture is not just simple to follow, but also easy to deploy. All components are based on existing GCP products that are highly scalable, serverless, and ready to be put directly into production.

  1. Tagged document—You can use the AI Platform Data Labeling Service if you don’t already have annotated data.

  2. OCR & object detection—This can be done by Vision API and AutoML Vision Object Detection, a recent addition to the AutoML suite of products.

  3. Merge and feature processing—There are several different ways this can be done, like using a simple Jupyter notebook or a Python-based containerized solution.

  4. Entity recognition—This can be done by using entity extraction, a new feature in AutoML Natural Language.

  5. Post processing—This can be done in a similar fashion to feature processing.

The whole pipeline can be orchestrated using Cloud Composer, or deployed on Google Kubernetes Engine (GKE). However, some business problems, such as building a customized data ingestion pipeline into GCP, extracting rules from legal documents, or redacting sensitive information from documents before parsing, require additional customization on top of the architecture described above. For such requirements, you can contact our sales team for more details and help.
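
To make steps 2 and 4 more concrete, here is a minimal sketch that calls the Vision API through gcloud and the AutoML prediction REST endpoint with curl. It assumes you have already trained an AutoML Vision object detection model and an AutoML Natural Language entity extraction model; the project ID, model IDs, and the scanned-invoice.png file are placeholders, and jq (preinstalled in Cloud Shell) handles the JSON plumbing.

# Minimal sketch of steps 2 and 4. PROJECT_ID, VISION_MODEL_ID, and NL_MODEL_ID
# are placeholders for your own project and trained AutoML models.
export PROJECT_ID="my-project"
export VISION_MODEL_ID="IOD1234567890123456789"   # AutoML Vision object detection model (hypothetical ID)
export NL_MODEL_ID="TEN1234567890123456789"       # AutoML Natural Language entity extraction model (hypothetical ID)
ACCESS_TOKEN=$(gcloud auth application-default print-access-token)

# Step 2a: OCR the scanned document with the Vision API.
gcloud ml vision detect-document scanned-invoice.png > ocr.json

# Step 2b: Detect signatures, stamps, and boxes with the AutoML Vision object detection model.
curl -s -X POST \
  -H "Authorization: Bearer ${ACCESS_TOKEN}" \
  -H "Content-Type: application/json" \
  -d "{\"payload\": {\"image\": {\"imageBytes\": \"$(base64 -w0 scanned-invoice.png)\"}}}" \
  "https://automl.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/models/${VISION_MODEL_ID}:predict" \
  > objects.json

# Step 4: Run the OCR text through the AutoML Natural Language entity extraction
# model to pull out tags and their values (account numbers, totals, and so on).
OCR_TEXT=$(jq -r '.responses[0].fullTextAnnotation.text' ocr.json)
jq -n --arg text "$OCR_TEXT" '{payload: {textSnippet: {content: $text, mimeType: "text/plain"}}}' |
curl -s -X POST \
  -H "Authorization: Bearer ${ACCESS_TOKEN}" \
  -H "Content-Type: application/json" \
  -d @- \
  "https://automl.googleapis.com/v1/projects/${PROJECT_ID}/locations/us-central1/models/${NL_MODEL_ID}:predict" \
  > entities.json

Steps 3 and 5 (merging the OCR output with the detected objects, and post-processing the extracted entities) would then operate on ocr.json, objects.json, and entities.json.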

Value generation

Different ML solutions have their own business or technical benefits—and many of our customers have used solutions like this one to meet their objectives, whether it’s enhancing the user experience, decreasing operational costs, or reducing overall errors. Solutions like the one described in this post can be used across industries such as healthcare, financial services, media, and more. Here are just a few examples:

  • Automatically extracting knowledge from Electronic Health Records (EHR).
  • Key value pair generation from invoices.
  • Field fetching from financial documents.
  • Text understanding of customer complaints.
  • Tagging of bank checks, tickets, and other data.

What’s next

In this age of deep learning, solutions that simplify the training process, like transfer learning, are increasingly needed. The architecture described in this post has been successfully tested and deployed to work at scale, and makes it possible to digitize documents without needing thousands of annotated images for model training. 

Data variability, however, is still an important factor in any machine learning-based solution. AutoML automatically solves a lot of basic problems for variance in data, making it possible for you to use as little as a few thousand images to train a custom model.

Helping customers process their documents fits perfectly with Google’s mission to organize the world’s information and make it universally accessible and useful. We hope that by sharing this post, we can inspire more organizations to look to the cloud. Tools like Cloud AutoML Vision, Cloud AutoML Natural Language, and Cloud Storage can help you build a rich data set and improve the end-user experience.

This is a simple and targeted solution for a specific problem. For broader and more powerful document process automation and insight extraction, please refer to Google’s Document Understanding AI solution. AutoML is a core component of the end-to-end Document Understanding AI solution, which is easy to deploy through our partners and requires no machine learning expertise. You can learn more on our website.

Introducing Equiano, a subsea cable from Portugal to South Africa

Today we are introducing Equiano, our new private subsea cable that will connect Africa with Europe. Once complete, Equiano will start in western Europe and run along the West Coast of Africa, between Portugal and South Africa, with branching units along the way that can be used to extend connectivity to additional African countries. The first branch is expected to land in Nigeria. This new cable is fully funded by Google, making it our third private international cable after Dunant and Curie, and our 14th subsea cable investment globally.

Equiano’s planned route and branching units, from which additional potential landings can be built.

Google’s private subsea cables all carry the names of historical luminaries, and Equiano is no different. Named for Olaudah Equiano, a Nigerian-born writer and abolitionist who was enslaved as a boy, the Equiano cable is state-of-the-art infrastructure based on space-division multiplexing (SDM) technology, with approximately 20 times more network capacity than the last cable built to serve this region. 

Equiano will be the first subsea cable to incorporate optical switching at the fiber-pair level, rather than the traditional approach of wavelength-level switching. This greatly simplifies the allocation of cable capacity, giving us the flexibility to add and reallocate it in different locations as needed. And because Equiano is fully funded by Google, we’re able to expedite our construction timeline and optimize the number of negotiating parties. A contract to build the cable with Alcatel Submarine Networks was signed in Q4 2018, and the first phase of the project, connecting South Africa with Portugal, is expected to be completed in 2021.

Map: Google’s subsea cable investments

Over the last three years, Google has invested US$47 billion to improve our global infrastructure, and Equiano will further enhance the world’s highest capacity and best connected international network. We’re excited to bring Equiano online, and look forward to working with licensed partners to bring Equiano’s capacity to even more countries across the African continent.

Leveraging complex data to build advanced search applications with Azure Search

Data is rarely simple. Not every piece of data we have fits nicely into a single Excel worksheet of rows and columns. Data has many diverse relationships, such as the multiple locations and phone numbers for a single customer, or the multiple authors and genres of a single book. And relationships are typically even more complex than that: as we start to leverage AI to understand our data, what we learn only adds to the complexity of those relationships. For that reason, expecting customers to flatten their data so it can be searched and explored is often unrealistic. We heard this frequently, and it quickly became the most requested Azure Search feature. That’s why we were excited to announce the general availability of complex types support in Azure Search. In this post, I want to take some time to explain what complex types add to Azure Search and the kinds of things you can build using this capability.

Azure Search is a platform as a service that helps developers create their own cloud search solutions.

What is complex data?

Complex data consists of data that includes hierarchical or nested substructures that do not break down neatly into a tabular rowset. For example, a book with multiple authors, where each author can have multiple attributes, can’t be represented as a single row of data unless there is a way to model the authors as a collection of objects. Complex types provide this capability, and they can be used when the data cannot be modeled with simple field types such as strings or integers.
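
As a rough illustration of how the book example above could be modeled (the service name, admin key, index name, and field names here are all hypothetical), an index definition created through the Azure Search REST API can declare the authors as a collection of complex objects; api-version 2019-05-06 is the version in which complex types became generally available.

# Hypothetical index with "Authors" modeled as a collection of complex objects.
curl -s -X PUT "https://<your-service>.search.windows.net/indexes/books?api-version=2019-05-06" \
  -H "Content-Type: application/json" \
  -H "api-key: <your-admin-key>" \
  -d '{
    "name": "books",
    "fields": [
      { "name": "id",      "type": "Edm.String", "key": true },
      { "name": "Title",   "type": "Edm.String", "searchable": true },
      { "name": "Authors", "type": "Collection(Edm.ComplexType)", "fields": [
        { "name": "Name",        "type": "Edm.String", "searchable": true },
        { "name": "Affiliation", "type": "Edm.String", "filterable": true }
      ]}
    ]
  }'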

Complex types applicability

At Microsoft Build 2019, we demonstrated how complex types could be leveraged to build out an effective search application. In the session we looked at the Travel Stack Exchange site, one of the many online communities supported by StackExchange.

The StackExchange data was modeled in a JSON structure to allow easy ingestion into Azure Search. If we look at the first post made to this site and focus on the first few fields, we see that all of them can be modeled using simple datatypes, including Tags, which can be modeled as a collection (array) of strings.

{
    "id": "1",
    "CreationDate": "2011-06-21T20:19:34.73",
    "Score": 8,
    "ViewCount": 462,
    "BodyHTML": "My fiancée and I are looking for a good Caribbean cruise in October and were wondering which ...",
    "Body": "my fiancée and i are looking for a good caribbean cruise in october and were wondering which islands ...",
    "OwnerUserId": 9,
    "LastEditorUserId": 101,
    "LastEditDate": "2011-12-28T21:36:43.91",
    "LastActivityDate": "2012-05-24T14:52:14.76",
    "Title": "What are some Caribbean cruises for October?",
    "Tags": [ "caribbean", "cruising", "vacations" ],
    "AnswerCount": 4,
    "CommentCount": 4,
    "CloseDate": "0001-01-01T00:00:00",

However, as we look further down this dataset, we see that the data quickly gets more complex and cannot be mapped into a flat structure. For example, there can be numerous comments and answers associated with a single document. Even Votes is defined here as a complex type (although technically it could have been flattened, that would add work to transform the data).

"CloseDate": "0001-01-01T00:00:00",
    "Comments": [
        {
            "Score": 0,
            "Text": "To help with the cruise line question: Where are you located? My wife and I live in New Orlea
            "CreationDate": "2011-06-21T20:25:14.257",
           "UserId": 12
        },
        {
            "Score": 0,
            "Text": "Toronto, Ontario. We can fly out of anywhere though.",
            "CreationDate": "2011-06-21T20:27:35.3",
            "UserId": 9
        },
        {
            "Score": 3,
            "Text": ""Best" for what?  Please read [this page](http://travel.stackexchange.com/questions/how-to
            "UserId": 20
        },
        {
            "Score": 2,
            "Text": "What do you want out of a cruise? To relax on a boat? To visit islands? Culture? Adventure?
            "CreationDate": "2011-06-24T05:07:16.643",
            "UserId": 65
        }
    ],
    "Votes": {
        "UpVotes": 10,
        "DownVotes": 2
    },
    "Answers": [
        {
            "IsAcceptedAnswer": "True",
            "Body": "This is less than an answer, but more than a comment…nnA large percentage of your travel b
            "Score": 7,
            "CreationDate": "2011-06-24T05:12:01.133",
            "OwnerUserId": 74

All of this data is important to the search experience. For example, you might want to search over the text of comments and answers, or filter and sort posts by fields nested inside them, such as vote counts.

In fact, we could even improve on the existing StackExchange search interface by leveraging Cognitive Search to extract key phrases from the answers to supply potential phrases for autocomplete as the user types in the search box.

All of this is now possible because not only can you map this data to a complex structure, but search queries can also target that structure, helping you build out a better search experience.
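
For example, with the Travel Stack Exchange posts indexed so that Comments is a collection of complex objects and Votes is a complex field (and assuming the relevant sub-fields were marked searchable or filterable in the index definition), a query can reach into those sub-fields directly. The service name, index name, and key below are hypothetical.

# Search the text of comments and the title for "cruise", but only return
# posts with more than 5 up-votes, selecting a nested sub-field in the results.
curl -s -G "https://<your-service>.search.windows.net/indexes/posts/docs" \
  -H "api-key: <your-query-key>" \
  --data-urlencode "api-version=2019-05-06" \
  --data-urlencode "search=cruise" \
  --data-urlencode "searchFields=Title,Comments/Text" \
  --data-urlencode '$filter=Votes/UpVotes gt 5' \
  --data-urlencode '$select=Title,Votes/UpVotes'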

Next Steps

If you would like to learn more about Azure Search complex types, please visit the documentation, or check out the video and associated code I made which digs into this Travel StackExchange data in more detail.

A solution to manage policy administration from end to end

Legacy systems can be a nightmare for any business to maintain. In the insurance industry, carriers struggle not only to maintain these systems but to modify and extend them to support new business initiatives. The insurance business is complex: every state and nation has its own unique set of rules, regulations, and demographics. Creating a new product such as an automobile policy has traditionally required the coordination of many different processes, systems, and people. The monolithic systems traditionally used to create new products are inflexible, and creating a new product can be an expensive proposition.

The Azure platform offers a wealth of services for partners to enhance, extend, and build industry solutions. Here we describe how one Microsoft partner, Sunlight Solutions, uses Azure to solve a unique problem.

Monolithic systems and their problems

Insurers have long been restricted by complex digital ecosystems created by single-service solutions. Those tasked with maintaining such legacy, monolithic systems struggle as the system ages and becomes more unwieldy. Upgrades and enhancements often require significant new development, large teams, and long-term planning which are expensive, unrealistic, and a drain on morale. Worse, they restrict businesses from pursuing new and exciting opportunities.

A flexible but dedicated solution

An alternative is a single solution provider that is well versed in the insurance business but able to create a dedicated and flexible solution, one that overcomes the problems of a monolith. Sunlight is such a provider. It allows insurance carriers to receive end-to-end insurance administration functionality from a single vendor. At the same time, its solution provides greater flexibility, faster speed to market, and fewer relationships to manage, with lower integration costs.

Sunlight’s solution is a single system that manages end-to-end functionality across policy, billing, claims, forms management, customer/producer CRM, reporting, and much more. According to Sunlight:

“We are highly flexible, managed through configuration rather than development. This allows for rapid speed to market for the initial deployment and complete flexibility when you need to make changes or support new business initiatives. Our efficient host and continuous delivery models address many of the industry’s largest challenges with respect to managing the cost and time associated with implementation, upgrades, and product maintenance.”

To achieve the goal of being both quick and pliable, the solution’s architecture is a mixture of static and dynamic components. Static components are fields that do not change; dynamic components, such as lists, populate at run time. As conveyed in the graphic below, the solution uses static elements but lets users configure dynamic parts as needed. The result is a faster cycle that maintains familiarity while supporting a variety of data types.

Diagram of Sunlight's solution using static elements while letting users configure dynamic parts

In the figure above, the data that appears depends on the product. When products are acquired, for example through mergers, the static data can be mapped. A tab appears only if it exists for the product; for example, “benefits” and “deductibles” are not part of every product.

Benefits

In brief, here are the key gains made by using Sunlight:

  • End-to-end functionality: Supports all products/coverages/lines of business
  • Cloud-based and accessible anywhere
  • Supports multiple languages and currencies
  • Globally configurable for international taxes and regional regulatory controls
  • Highly configurable by non-IT personnel
  • Reasonable price-point

Azure services

  • Azure Virtual Machines are used to implement the entire project life cycle quickly.
  • Azure Security Center continuously assesses the environment and helps strengthen its security posture.
  • Azure Site Recovery plans are simple to implement for our production layer.
  • Azure Functions is used to quickly replicate environments.
  • Azure Storage is used to keep the application light with a range of storage options for increased access time based on the storage type.

Next steps

To learn more about other industry solutions, go to the Azure for insurance page. To find more details about this solution, go to Sunlight Enterprise on the Azure Marketplace and select Contact me.

New Digital Training Course Now Available: Data Analytics Fundamentals

Data Analytics Fundamentals, which replaces the digital training course Big Data Technology Fundamentals, teaches you how to plan a data analysis solution using AWS services. In this 3.5-hour, self-paced digital course, you will learn the big picture of data analysis, how to plan data analysis solutions, and the data analytics processes involved in those solutions.

GCP DevOps tricks: Create a custom Cloud Shell image that includes Terraform and Helm

If you develop or manage apps on Google Cloud Platform (GCP), you’re probably familiar with Cloud Shell, which provides you with a secure CLI that you can use to manage your environment directly from the browser. But while Cloud Shell’s default image contains most of the tools you could wish for, in some cases you might need more—for example, Terraform for infrastructure provisioning, or Helm, the Kubernetes package manager.

In this blog post, you will learn how to create a custom Docker image for Cloud Shell that includes the Helm client and Terraform. At a high level, this is a two-step process:

  1. Create and publish a Docker image
  2. Configure your custom image to be used in Cloud Shell

Let’s take a closer look. 

1. Create and publish a custom Cloud Shell Docker image

First, you need to create a new Docker image that’s based on the default Cloud Shell image, and then publish the image you created to Container Registry.

1. Create a new repo and set the project ID where the Docker image should be published.

2. With your file editor of choice, create a file named Dockerfile (example content is sketched after the note below).

3. Build the Docker image.

4. Push the Docker image to Container Registry.

Note: You will need to configure Docker to authenticate with gcr by following the steps here.
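
The snippets for steps 1 through 4 might look like the following sketch. The directory name, image name, and tool versions are assumptions, and the FROM line uses the base image documented for custom Cloud Shell images; check the Cloud Shell documentation for the current image name and the latest Terraform and Helm releases.

# Step 1: create a working directory and set the project ID (names are examples).
mkdir custom-cloud-shell && cd custom-cloud-shell
export GCP_PROJECT_ID=$(gcloud config get-value project)

# Step 2: a Dockerfile that starts from the public Cloud Shell base image and
# adds Terraform and the Helm client (pin the versions you actually want).
cat > Dockerfile <<'EOF'
FROM gcr.io/cloudshell-images/cloudshell:latest
RUN apt-get update && apt-get install -y curl unzip && \
    curl -fsSL -o /tmp/terraform.zip \
      https://releases.hashicorp.com/terraform/0.12.29/terraform_0.12.29_linux_amd64.zip && \
    unzip /tmp/terraform.zip -d /usr/local/bin && rm /tmp/terraform.zip && \
    curl -fsSL https://raw.githubusercontent.com/helm/helm/master/scripts/get-helm-3 | bash
EOF

# Step 3: build the Docker image.
docker build -t gcr.io/$GCP_PROJECT_ID/cloud-shell-image:latest .

# Step 4: push the image to Container Registry (see the note above about Docker auth).
gcloud auth configure-docker
docker push gcr.io/$GCP_PROJECT_ID/cloud-shell-image:latest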

2. Configure Cloud Shell to use the published image

Now that you’ve created and published your image, you need to configure your Cloud Shell environment to use the image that was published to Container Registry. In the Cloud Console, follow these steps:

  1. Go to Cloud Shell Environment settings
  2. Click Edit
  3. Click “Select image from project”
  4. In the Image URL field enter: gcr.io/$GCP_PROJECT_ID/cloud-shell-image:latest
  5. Click “Save”

Now open a new Cloud Shell session, and you should see that the new custom image is used.

There you have it—a way to configure your Cloud Shell environment with all your favorite tools. To learn more about Cloud Shell, check out the documentation.