Amazon ECS supports automated draining for Spot Instances running ECS services

Amazon Elastic Container Service (ECS) now supports automated Spot Instance draining, a new capability that reduces service interruptions caused by Spot terminations for ECS workloads. This feature lets ECS customers safely handle interruptions to ECS tasks running on Spot Instances when the underlying EC2 Spot Instance is terminated.
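The automated behavior is handled by the ECS container agent itself once the feature is enabled, so no application code is required. As a rough illustration of what "draining" means, here is a hedged boto3 sketch of the manual equivalent, placing a container instance into the DRAINING state; the cluster name and instance ARN are placeholders, not values from the announcement:

```python
# Minimal sketch (not the automated feature itself): manually placing a
# container instance into DRAINING, which is the same state the ECS agent
# now enters automatically when EC2 issues a Spot interruption notice.
import boto3

ecs = boto3.client("ecs")

response = ecs.update_container_instances_state(
    cluster="my-spot-cluster",  # placeholder cluster name
    containerInstances=["arn:aws:ecs:us-east-1:123456789012:container-instance/EXAMPLE"],
    status="DRAINING",          # ECS stops placing new tasks here and
)                               # reschedules running service tasks elsewhere

print(response["containerInstances"][0]["status"])
```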

Serverless Mullet Architectures

Business in the front, party in the back. Bring on the mullets!

A 1930s bungalow in Sydney that preserved its historical front facade while radically updating the yard-facing rear of the house. Credit: Dwell.

In residential construction, a mullet architecture is a house with a traditional front but with a radically different — often much more modern — backside where it faces the private yard.

Like the mullet haircut after which the architecture is named, it’s conventional business in the front — but a creative party in the back.

I find the mullet architecture metaphor useful in describing software designs that have a similar dichotomy. Amazon API Gateway launched support for serverless web sockets at the end of 2018, and using them with AWS Lambda functions is a great example of a software mullet architecture.

In this case, the “front” is a classic websocket — a long-lived, duplex TCP/IP socket between two systems established via HTTP.

Classic uses for websockets include enabling mobile devices and web browsers to communicate with backend systems and services in real time, and to enable those services to notify clients proactively — without requiring the CPU and network overhead of repeated polling by the client.

In the classic approach, the “server side” of the websocket is indeed a conventional server, such as an EC2 instance in the AWS cloud.

The serverless version of this websocket looks and works the same at the front — to the mobile device or web browser, nothing changes. But the “party in the back” of the mullet is no longer a server — now it’s a Lambda function.

To make this work, API Gateway both hosts the websocket protocol (just as it hosts the HTTP protocol for a REST API) and performs the data framing and dispatch. In a REST API call, the relationship between the call to the API and API Gateway’s call to Lambda (or other backend services) is synchronous and one-to-one.

Both of these assumptions get relaxed in a web socket, which offers independent, asynchronous communication in both directions. API Gateway handles this “impedance mismatch” — providing the long-lived endpoint to the websocket for its client, while handling Lambda invocations (and response callbacks — more on those later) on the backend.
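To make the dispatch concrete, here is a minimal sketch of a Python Lambda handler wired to an API Gateway WebSocket API. The event shape (requestContext with routeKey and connectionId, plus a body for custom routes) is what API Gateway supplies; the route handling itself is illustrative and not the article’s actual code:

```python
import json

def handler(event, context):
    # API Gateway WebSocket integrations pass connection metadata in
    # requestContext; the body holds the data frame sent by the client.
    ctx = event["requestContext"]
    route = ctx["routeKey"]          # "$connect", "$disconnect", or a custom route
    connection_id = ctx["connectionId"]

    if route == "$connect":
        # Returning 200 completes the WebSocket handshake.
        return {"statusCode": 200}

    if route == "$disconnect":
        # Clean up any state keyed by connection_id here.
        return {"statusCode": 200}

    # Custom route (e.g. a "pair" action): parse the client's message.
    message = json.loads(event.get("body", "{}"))
    # ... application logic goes here ...

    # A synchronous return value can be relayed back over the socket
    # when the route's response integration is configured.
    return {"statusCode": 200, "body": json.dumps({"ack": True})}
```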

Here’s a conceptual diagram of these relationships and their communication patterns:

A Serverless Websocket Architecture on AWS

When is a serverless mullet a good idea?

When (and why) is a serverless mullet architecture helpful? One simple answer: Anywhere you use a websocket today, you can now consider replacing it with a serverless backend.

Amazon’s documentation uses a chat relay server between mobile and/or web clients to illustrate one possible case where a serverless approach can replace a scenario that historically could only be accomplished with servers.

However, there are also interesting “server-to-server” (if you’ll forgive the expression) applications of this architectural pattern beyond long-lived client connections. I recently found myself needing to build a NAT puncher rendezvous service — essentially a simplified version of a STUN server.

You can read more about NAT punching here, but for the purposes of our discussion here, what matters is that I had the following requirements:

  1. I needed a small amount of configuration information from each of two different Lambda functions. Let’s call this info a “pairing key” — it can be represented by a short string. For discussion purposes, we’ll refer to the two callers as “left” and “right”. Note that the service is multi-tenanted, so there are potentially a lot of left/right pairs constantly coming and going, each using different pairing keys.
  2. I also needed a small amount of metadata that I can get from API Gateway about the connection itself (basically the source IP as it appears to API Gateway, after any NATting has taken place).
  3. I have to exchange the data from (2) between clients who provide the same pairing key in (1); that is, left gets right’s metadata and right gets left’s metadata. There’s a lightweight barrier synchronization here: (3) can’t happen until both left and right have shown up…but once they have shown up, the service has to perform (3) as quickly as possible.

The final requirement above is the reason a simple REST API backed by Lambda isn’t a great solution: It would require the first arriver to sit in a busy loop, continuously polling the database (Amazon DynamoDB in my case) waiting for the other side to show up.

Repeatedly querying DynamoDB would drive up costs, and we’d be subject to API Gateway’s 30-second maximum integration duration for an API call. Using DynamoDB Streams doesn’t work here, either, as the Lambda they would invoke can’t “talk” to the Lambda instance created by invoking the API. It’s also tricky to use Step Functions — “left” and “right” are symmetric peers here, so neither one knows who should kick off a workflow.

Enter…The Mullet

So what can we do that’s better? Well, left and right aren’t mobile or web clients, they’re Lambdas — but they have a very “websockety” problem. They need to coordinate some data and event timing through an intermediary that can “see” both conversations and they benefit from a communication channel that can implicitly convey the state of the barrier synchronization required.

The protocol is simple and looks like this (shown with left as the first arrival):

Here we take full advantage of the mullet architecture:

  • Clients arrive (and communicate) asynchronously with respect to one another, but we can also track the progression of the workflow and coordinate them from the “server” — here, a Lambda/Dynamo combo — that tracks the state of each pairing.
  • API Gateway does most of the heavy lifting, including detecting the data frames in the websocket communication and turning them into Lambda invocations.
  • API Gateway model validation verifies the syntax of incoming messages, so the Lambda code can assume they’re well formed, making the code even simpler.

The architecture is essentially the equivalent of a classic serverless “CRUD over API Gateway / Lambda / Dynamo” but with the added benefits of asynchronous, bidirectional communication and lightweight cross-call coordination.
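The article doesn’t include the handler code, but a minimal sketch of the pairing step might look like the following, assuming a hypothetical DynamoDB table keyed by the pairing key that stores each side’s connection ID and source IP (table and attribute names are invented for illustration):

```python
import boto3

dynamodb = boto3.resource("dynamodb")
table = dynamodb.Table("PairingTable")  # hypothetical table, keyed by "pairingKey"

def register_peer(pairing_key, side, connection_id, source_ip):
    """Record one side of the pair and return the peer's record if it
    has already arrived (the lightweight barrier described above)."""
    # Store this side's metadata under the shared pairing key.
    table.update_item(
        Key={"pairingKey": pairing_key},
        UpdateExpression="SET #side = :info",
        ExpressionAttributeNames={"#side": side},  # "left" or "right"
        ExpressionAttributeValues={
            ":info": {"connectionId": connection_id, "sourceIp": source_ip}
        },
    )
    # Read the item back and check whether the other side is present.
    item = table.get_item(Key={"pairingKey": pairing_key},
                          ConsistentRead=True).get("Item", {})
    other = "right" if side == "left" else "left"
    return item.get(other)  # None until the peer shows up
```

If `register_peer` returns the peer’s record, the handler can complete the exchange immediately; otherwise the caller simply waits on its open websocket for the asynchronous callback described below.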

One important piece of the puzzle is the async callback pathway. There’s an inherent communication asymmetry when we hook up a websocket to a Lambda.

Messages that flow from client to Lambda are easy to model — API Gateway turns them into the arguments to a Lambda invocation. If that Lambda wants to synchronously respond, that’s also easy — API Gateway turns its result into a websocket message and sends it back to the client after the Lambda completes.

But what about our barrier synchronization? In the sequence chart above, it has to happen asynchronously with respect to left’s conversation. To handle this, API Gateway creates a special HTTPS endpoint for each websocket. Calls to this URL get turned into websocket messages that are sent (asynchronously) back to the client.

In our example, the Lambda handling the conversation with right uses this special endpoint to unblock left when the pairing is complete. This represents more “expressive power” than normally exists when a client invokes a Lambda function.
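Concretely, that callback path is exposed through the API Gateway Management API. A hedged sketch of pushing a message to a stored connection ID from Python might look like this; the endpoint URL is the per-API callback URL that API Gateway provides, and the values shown are placeholders:

```python
import json
import boto3

# The per-API callback endpoint has the form
# https://{api-id}.execute-api.{region}.amazonaws.com/{stage}
management = boto3.client(
    "apigatewaymanagementapi",
    endpoint_url="https://abc123.execute-api.us-east-1.amazonaws.com/prod",  # placeholder
)

def notify_peer(connection_id, payload):
    # Pushes a frame down the websocket identified by connection_id,
    # independently of any in-flight Lambda invocation for that client.
    management.post_to_connection(
        ConnectionId=connection_id,
        Data=json.dumps(payload).encode("utf-8"),
    )
```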

Serverless Benefits

The serverless mullet architecture offers all the usual serverless advantages. In contrast to a serverful approach, such as running a (fleet of) STUN server(s), there are no EC2 instances to deploy, scale, log, manage, or monitor, and fault tolerance and scalability come built in.

Also unlike a server-based approach that would need a front end fleet to handle websocket communication, the code required to implement this approach is tiny — only a few hundred lines, most of which is boilerplate exception handling and error checking. Even the JSON syntax checking of the messages is handled automatically.

One caveat to this “all in on managed services” approach is that the configuration has a complexity of its own — unsurprisingly, as we’re asking services like API Gateway, Lambda, and Dynamo to do a lot of the heavy lifting for us.

For this project, my AWS CloudFormation template is over 500 lines (including comments), while the code, including all its error checking, is only 383 lines. A single data point, but illustrative of the fact that configuring the managed services to handle things like data frame syntax checking via an embedded JSON Schema makes for some nontrivial CloudFormation.

However, a little extra complexity in the config is well worth it to gain the operational benefits of letting AWS maintain and scale all that functionality!

Mullets all Around

Serverless continues to expand its “addressable market” as new capabilities and services join the party. Fully managed websockets backed by Lambda is a great step forward, but it’s far from the only example of mullet architectures.

Amazon AppSync, a managed GraphQL service, is another example. It offers a blend of synchronous and asynchronous JSON-based communication channels — and when backed by a Lambda instead of a SQL or NoSQL database, it offers another fantastic mullet architecture that makes it easy to build powerful APIs with built-in query capabilities, all without the need for servers.

AWS and other cloud vendors continue to look for ways to make development easier, and hooking up serverless capabilities to conventional developer experiences continues to be a rich area for new innovation.

Business in the front, party in the back …

bring on the mullets!



Introducing AWS IQ

AWS IQ is a new service that enables customers to quickly find, engage, and pay AWS Certified third-party experts for on-demand project work. AWS IQ offers video-conferencing, contract management, secure collaboration, and integrated billing.  

Now use PrivateLink Endpoint Policies to better control Amazon ECR access

Amazon Elastic Container Registry (ECR) now supports PrivateLink endpoint policies, a capability that enables customers to better control access to Amazon ECR repositories and images using private endpoints. Previously, customers were not able to explicitly define policies to deny or allow access based on IAM resource policies; now they can define granular, API-level access to container image repositories.
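As a hedged illustration, an endpoint policy can be attached to an existing ECR interface endpoint with boto3’s `modify_vpc_endpoint`; the endpoint ID and the specific set of allowed actions below are assumptions for the example, not values from the announcement:

```python
import json
import boto3

ec2 = boto3.client("ec2")

# Example policy: only allow image pulls (no push or repository management)
# through this private endpoint. Adjust actions and resources to your needs.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": "*",
            "Action": [
                "ecr:GetAuthorizationToken",
                "ecr:BatchGetImage",
                "ecr:GetDownloadUrlForLayer",
            ],
            "Resource": "*",
        }
    ],
}

ec2.modify_vpc_endpoint(
    VpcEndpointId="vpce-0123456789abcdef0",  # placeholder endpoint ID
    PolicyDocument=json.dumps(policy),
)
```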

Detect and respond to high-risk threats in your logs with Google Cloud

Editor’s Note: This is the fourth blog and video in our six-part series on how to use Cloud Security Command Center. There are links to the three previous blogs and videos at the end of this post.

Data breaches aren’t only getting more frequent, they’re getting more expensive. With regulatory and compliance fines, and business resources being allocated to remediation, the costs from a data breach can quickly add up. In fact, the average total cost of a data breach in the U.S. has risen to $3.92 million, 1.5% more expensive than in 2018, and 12% more expensive than five years ago, according to IBM.

Today, we’re going to look at how Event Threat Detection can notify you of high-risk and costly threats in your logs and help you respond. Here’s a video—that’s also embedded at the end of this post—that will help you learn more about how it works.

Enabling Event Threat Detection
Once you’re onboard, Event Threat Detection will appear as a card on the Cloud Security Command Center (Cloud SCC) dashboard. 

Event Threat Detection works by consuming Cloud Audit logs, VPC Flow Logs, Cloud DNS logs, and syslog (via fluentd), and analyzing them with our threat detection logic and Google’s threat intelligence. When it detects a threat, Event Threat Detection writes findings (results) to Cloud SCC and to a logging project. For this blog and video, we’ll focus on the ETD findings available in Cloud SCC.

[Image: Cloud SCC dashboard]

Detecting threats with Event Threat Detection
Here are the threats ETD can detect in your logs, and how they work:

  • Brute force SSH: ETD detects brute-force SSH attempts by examining Linux auth logs for repeated failures followed by a success. 
  • Cryptomining: ETD detects coin mining malware by examining VPC logs for connections to known bad domains for mining pools and other log data.
  • Cloud IAM abuse (malicious grants): ETD detects the addition of accounts from outside of your organization’s domain that are given Owner or Editor permission at the organization or project level.
  • Malware: ETD detects Malware in a similar fashion to crypto mining, as it examines VPC logs for connections to known bad domains and other log data.
  • Phishing: ETD detects Phishing by examining VPC logs for connections and other log data.
  • Outgoing DDoS, port-scanning: ETD detects DDoS attacks originating inside your organization by looking at the sizes, types, and numbers of VPC flow logs. Outgoing DDoS is a common use of compromised instances and projects by attackers. Port scanning is a common indication of an attacker getting ready for lateral movement in a project. 

Responding to threats with Event Threat Detection
When a threat is detected, you can see when it happened—either in the last 24 hours or last 7 days—and how many times it was detected, via the count.

[Image: Event Threat Detection findings card]

When you click on a finding, you can see what the event was, when it occurred, and what source the data came from. This information saves time and lets you focus on remediation.
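For teams that prefer to pull findings programmatically rather than browse the dashboard, the Cloud SCC client libraries expose the same data. A rough Python sketch is below; the organization ID is a placeholder, the filter is only an example, and the exact client surface can vary by library version:

```python
from google.cloud import securitycenter

client = securitycenter.SecurityCenterClient()

# Placeholder organization ID; "sources/-" lists findings across all sources,
# including those written by Event Threat Detection.
parent = "organizations/123456789/sources/-"

# Only look at findings that are still active.
findings = client.list_findings(
    request={"parent": parent, "filter": 'state="ACTIVE"'}
)

for result in findings:
    finding = result.finding
    print(finding.category, finding.event_time, finding.resource_name)
```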

[Image: Finding details]

To further investigate a threat detected by Event Threat Detection, you can send your logs to a SIEM. Because Event Threat Detection has already processed your logs, you can send only high value incidents to your SIEM, saving time and money. 

You can use a Splunk connector to export these logs. Splunk automatically sorts your key issues—you can see events and categories—so you can investigate further and follow the prescribed steps. 

To learn more about how Event Threat Detection can help you detect threats in your logs, watch our video.

In this video, learn about Event Threat Detection, a service within Cloud Security Command Center that can alert you when a threat is detected in the logs of your workloads running in GCP.

9 things to know about Google’s maps data: Beyond the Map

With more than a billion people using Google Maps every day and more than 5 million active apps and websites using Google Maps Platform core products every week, we get questions about where our maps data come from, how we keep it accurate, and more. So before we get to our third installment of the Beyond the Map series, we sat down with product director Ethan Russell to get answers to a few frequently asked questions about our maps data and how you can help us keep it up to date for your very own applications and experiences. 

How do you make sure Google’s maps data is accurate? 
The world is a vast and constantly changing place. Think about how frequently restaurants in your neighborhood come and go, and then consider all the businesses, buildings, homes and roads that are built–and then scale that up to more than 220 countries and regions that are home to more than 7 billion people in the world. We want everyone on the planet to have an accurate, up-to-date map, but there’s a lot going on! So our work is never done and we have a variety of different efforts and technologies helping us keep our maps data as up to date as possible. If you haven’t read the first two installments of the Beyond the Map series, they’re a good start in learning more about how we map the world and keep our data up to date. The first post gives you an overview of our mapping efforts and the second post explains how imagery is the foundation of our mapping techniques. But something we haven’t highlighted in the series yet is how we empower our customers, businesses, and users to contribute what they know about the world and keep our data up to date for themselves and each other. 

How can I submit updated information? 
There are a few different channels for people, businesses, and customers to help update our maps data when something’s not right. Anyone who uses Google Maps can let us know about data issues via the Send Feedback (desktop Maps) and Suggest an Edit (place profiles on Maps and Search) tools. For Google Maps Platform customers using one of our industry solutions (like gaming), the product includes an API for reporting bad points, enabling our game studio partners to report issues to us so we can take action accordingly. And of course if a customer is working closely with our customer engineering teams or an account manager, then they can always work directly with them or the support team to get the information updated. Businesses and agencies that manage business info can also update their business information via Google My Business.

Are there any other ways that Google finds updated information beyond user contributions?
Within Google, we have a dedicated team working on keeping our data up to date day in and day out. This covers things like incorporating data from third party resources, developing algorithms to automatically update data and identify spam or fraud, and reaching out directly to businesses and organizations to get accurate info.

How often is your maps data updated?
The map is updated constantly–literally, every second of every day! We’re constantly collecting new information about the world, whether from satellite imagery and Street View cars, or Google Maps users and local business owners, and using that information to update the map. Google Maps users contribute more than 20 million pieces of information every day–that’s more than 200 contributions every second. In addition to the updates we make from what people tell us, we’re making countless updates uncovered through other means like the imagery and machine learning efforts we’ve shared with you in the recent Beyond the Map blog posts. 

If a business or organization has a lot of data to contribute, how can they do that? 
For organizations like governments, non-profits, and educational institutions that have large amounts of data about things like new roads or addresses of new buildings, they can use the new Geo Data Upload tool. When submitting via the tool, it’s important that you send data in the right format, so we can ingest the files more easily–shapefiles (.shp) or .csv files with spatial attributes are the preferred file types. If you’re ready to submit your data, it’s helpful for you and your team to review our upload content requirements (which you can do at this support page).  

Agencies that manage online marketing for a variety of businesses can use Google My Business to add and update business information. Not only does it get business info into our Places APIs, but it offers a wide range of tools to help businesses better connect with consumers through features like messaging, product inventory, and more on Google Maps and Search. 

How do you manage the vast amounts of data it takes to keep up with the changing world? 
Given that we’re building maps at a truly global scale, you can imagine we process a lot of information. We have many different types of data–roads, buildings, addresses, businesses, and all their various attributes–and imagery from different viewpoints at high resolution. Luckily, we’re not starting from scratch here. From processing and storage systems like Dataflow and Cloud Spanner to machine learning libraries and frameworks like TensorFlow, we’re able to make sense of a river of incoming data.

Why are there differences in data quality in various parts of the world? And how do you address these differences to make sure businesses everywhere can use Google Maps Platform?
Part of what’s fun and challenging about mapping the entire planet is dealing with all the regional differences. This starts with different political constructs, like how granular the postal codes are, or whether addresses for buildings run linearly from one end of a street to the other or are distributed around a block. Then there are physical differences, like with buildings being attached to each other in a city, and with multiple businesses–and private residences!–on different floors. Or when an area has lots of tree cover that makes it hard to see roads underneath, or no tree cover but dry riverbeds that look like dirt roads. And then there are economic differences like how quickly new roads and buildings are constructed, and how quickly new businesses open up. Add in the fact of different languages and different scripts that our algorithms, machine learning and human operators need to understand, and you have a lot of complicating factors leading to different kinds of problems in different parts of the world.

To address these differences we take new and different mapping approaches to these areas. For an area with few authoritative data sources to reference, we use satellite and street-level imagery and machine learning to identify roads or businesses and add the information to our maps data. Or for an area with roads too narrow to map we created a “Street View 3-wheeler” to capture imagery to help us add those roads. As we uncover new mapping challenges, we’re always eager to develop a new solution. 

What’s the most interesting way that Google or another organization has contributed maps data? 
Sheep View is my personal favorite. Solar-powered cameras were strapped to sheep’s woolly backs to collect imagery of the Faroe Islands for Street View. The 18 Faroe Islands are home to just 50,000 people, but—fittingly for a country whose name means “Sheep Island”—there are 70,000 sheep roaming the green hills and volcanic cliffs of the archipelago. So sheep were a brilliant way to capture imagery of the area–and definitely the most creative I’ve seen. 

With Halloween around the corner, what’s up with all those spooky Street View images of people with three legs or a plane submerged in a lake? 
The imagery you see on Google Maps and that’s available via our Maps and Street View APIs is a compilation of billions of photos combined together. Sometimes when we stitch together photos of the same scene, things don’t line up exactly right. This happens especially with things that are moving, like a person walking or an airplane flying. We’re always tweaking our systems and algorithms to handle these situations better. Last Halloween, we actually explained some of the photography challenges behind the most common types of “spooky” imagery in this blog post.

Now that we’ve answered some of the most common questions about our maps data, stay tuned for upcoming Beyond the Maps posts for more in-depth looks at how we’re mapping the world and how that helps businesses build location-based experiences worldwide. 

For more information on Google Maps Platform, visit our website.

Amazon SageMaker Neo is now available in 12 new regions

Amazon SageMaker Neo is now available in 12 new regions: Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Seoul), Asia Pacific (Mumbai), Asia Pacific (Hong Kong), Canada (Central), EU (Frankfurt), EU (London), EU (Paris), EU (Stockholm), South America (São Paulo), and US West (N. California). Amazon SageMaker Neo lets developers train machine learning models once and run them anywhere, in the cloud and at the edge. Amazon SageMaker Neo optimizes models to run up to twice as fast, with less than one-tenth of the memory footprint, without losing accuracy.
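For context, a Neo compilation is started with the CreateCompilationJob API. Here is a hedged boto3 sketch; the job name, role, bucket paths, framework, input shape, and target device are all placeholders chosen for illustration:

```python
import boto3

# Example region: one of the newly supported ones (South America, São Paulo).
sm = boto3.client("sagemaker", region_name="sa-east-1")

sm.create_compilation_job(
    CompilationJobName="my-neo-job",                                  # placeholder name
    RoleArn="arn:aws:iam::123456789012:role/SageMakerNeoRole",        # placeholder role
    InputConfig={
        "S3Uri": "s3://my-bucket/model/model.tar.gz",   # trained model artifact
        "DataInputConfig": '{"data": [1, 3, 224, 224]}',  # input shape for the framework
        "Framework": "MXNET",
    },
    OutputConfig={
        "S3OutputLocation": "s3://my-bucket/compiled/",
        "TargetDevice": "ml_c5",                        # compile for C5 instances
    },
    StoppingCondition={"MaxRuntimeInSeconds": 900},
)
```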