Where to find the gurus in Las Vegas!
Here’s where you can find A Cloud Guru at AWS re:Invent 2019!
We’re looking forward to meeting you, hearing your feedback, handing out some awesome swag, and sharing our latest content and features.
Monday, Dec 2
10:00 AM — 8:00 PM: Hackathon with Ryan Kroonenburg
To get re:Invent started, join the hackathon, with Ryan judging and winning teams getting a full-year membership to A Cloud Guru!
The Non-Profit Hackathon for Good provides a hands-on and team-oriented experience while supporting non-profit organizations. It is open to all skill levels. Be sure to attend the mixer on Sunday from 6–9pm at the Level Up inside the MGM Grand to build your team! More info here.
Non-Profit Hackathon for Good
10:00 AM — 8:00 PM
Venue: MGM Grand
10:00AM — Machine Learning with Kesha Williams
In this session, learn how to level-up your skills and career through the journey of Kesha Williams, an AWS Machine Learning Hero.
CMY201— Java developer to machine-learning practitioner
10:00 AM — 11:00 AM
Venetian, Level 4, Delfino 4005
1:45PM — Getting Started with Machine Learning
In this chalk talk with Kesha Williams, learn how to get started building, training, and deploying your first machine learning model.
AIM226 — How to successfully become a machine learning developer
1:45 PM — 2:45 PM
Venetian, Level 3, Murano 3201A
Tuesday, Dec 3
All Day — A Cloud Guru at Booth 727!
When the exhibition hall opens on Tuesday, head over to booth #727 to say hello to Ryan and the crew from A Cloud Guru — see you there!
Wednesday, Dec 4
All Day — A Cloud Guru Booth 727!
After the keynote, A Cloud Guru will be heading back to Expo Hall in the Venetian. Stop by and say hello!
6:00PM — AWS Certification Reception
Are you AWS Certified? Register for the AWS Certification Reception and celebrate alongside our A Cloud Guru instructors! Space is limited, so be sure to register early for this event. Hope to see you there!
AWS Certification Reception
6:00 PM — 8:00 PM
Brooklyn Bowl | The LINQ
Thursday, Dec 5
10:30 AM — AWS DeepRacer with Scott Pletcher
Scott Pletcher will share how to host your own AWS DeepRacer event with everything from building a track, logistics, getting support from AWS, planning, leaderboards and more.
How to Roll Your Own DeepRacer Event
10:30 AM –11:00 AM
Venetian, Level 2, Hall C, Expo, Developer Lounge
1:00pm — AWS Security with Faye Ellis
AWS has launched a security certification for specialists to demonstrate their skills, which are in high demand. Learn about the major areas of security and AWS services you’ll need to know to become a security specialist and obtain the certification.
DVC07 — Preparing for the AWS Certified Security Specialty exam
1:00 PM — 1:30 PM
Venetian, Level 2, Hall C, Expo, Developer Lounge
All Week — Info Sessions
A Cloud Guru will be available every day for info sessions to share our latest content and features for business memberships. Be sure to schedule an appointment today — sessions are limited!
A Cloud Guru on Social Media
Follow us on Twitter, Facebook, and LinkedIn for updates! Be sure to subscribe to A Cloud Guru’s AWS This Week — and stay tuned for Ryan’s video summary of all the major re:Invent announcements!
Keep being awesome cloud gurus!
The focus has expanded to the entire application lifecycle
Over the last 4 years of developing the Serverless Framework, and working closely with many of the thousands of organizations that use it to build their applications, I’ve had the good fortune of watching the movement evolve considerably.
In the early days I met countless pioneering developers who were utilizing this new architecture to build incredible things, despite the considerable challenges and relatively crude tooling that existed.
I also worked with many of these early pioneers to convince their organizations to go all-in on Serverless, despite the lack of successful case studies and tried and true best practices — often based simply on an internal POC that promised a shorter development cycle and lower total cost of ownership.
As the tooling has evolved, and the case studies have piled up, I’ve noticed that these early Serverless pioneers have forged a new title that is gaining prominence within organizations — that of Serverless Architect.
What is a Serverless Architect?
Early on in the Serverless journey, when we were initially developing the Serverless Framework (in those days known as JAWS), all of the focus was on development and deployment.
It was clear that this new piece of infrastructure called Lambda had some amazing qualities, but how could we as developers actually build something meaningful with it? And seeing as how Lambda is a cloud native service, the question that followed shortly after was: how can we actually deploy these apps in a sane way?
As various solutions to these problems were developed and improved upon, the focus of developers building Serverless applications expanded to the entire application lifecycle, including testing, monitoring and securing their Serverless apps.
A Serverless Architect is a developer who takes this lifecycle focused view and often personally owns at least part of every stage of the Serverless Application Lifecycle. They don’t simply write functions — they implement business results while thinking through how the code that delivers those results will be developed, deployed, tested, monitored, and secured.
Why is the Serverless Architect essential?
Serverless architectures are essentially collections of managed services connected by functions. Because of this unique and novel model it’s important that the architect has a deep understanding of the event-driven, cloud native paradigm of the architecture.
The demand for the Serverless Architect is a direct result of the unique nature of this architecture and the Serverless Application Lifecycle that accompanies it. Unlike legacy architectures, these various lifecycle stages are no longer separate concerns handled by separate teams at separate times — but rather a single integrated lifecycle that needs to be addressed in a unified way.
There are a few specific reasons this is the case with Serverless:
- Due to the reduced complexity and self-serve nature of the Serverless architecture, developers are more likely to be responsible for the monitoring and security of their applications.
- Due to the cloud native nature of the services that make up a Serverless Architecture, develop, deploy, and test stages are naturally more integrated.
- Due to the focus on simplicity with Serverless architecture, there’s a stronger desire for fewer tools and more streamlined experiences.
As organizations mature in their Serverless adoption, the demand for these Serverless Architects grows quickly. While one person thinking this way in the early days is often all that is needed to get adoption off the ground, it often takes teams of Serverless Architects to scale to a ‘serverless first’ mindset.
What types of tooling does the Serverless Architect need?
As Serverless continues to grow in adoption and the number of Serverless Architects continues to increase, it’s becoming clear that unified tooling that addresses the entire Serverless Application Lifecycle is going to be increasingly valuable.
Cobbling together multiple complex solutions is antithetical to the whole Serverless mindset, and if that’s what’s required to be successful with Serverless then something’s gone wrong.
At Serverless Inc. we’re evolving the Serverless Framework to address the complete application lifecycle while maintaining the streamlined developer workflow that our community has grown to love. We’re working hard to ensure that Serverless Architects have the tools they need to flourish and we’re always excited to hear feedback.
Sign up free and let us know what you think.
AWS CloudFormation is an infrastructure graph management service — and needs to act more like it
CloudFormation should represent our desired infrastructure graphs in the way we want to build them
What’s AWS CloudFormation?
As Richard Boyd says, CloudFormation is not a cloud-side version of the AWS SDK. Rather, CloudFormation is an infrastructure-graph management service.
But it’s not clear to me that CloudFormation fully understands this, and I think it should more deeply align with the needs that result from that definition.
Chief among these needs is that CloudFormation resources should be formed around the lifecycle of the right concepts in each AWS service — rather than simply mapping to the API calls provided by those services.
What’s the Issue?
For an example, let’s talk about S3 bucket notifications. If there’s a standard “serverless 101”, it’s image thumbnailing. Basic stuff, right? You have an S3 bucket, and you use bucket notifications to trigger a Lambda that will create the thumbnails and write them back to the bucket.
Any intro-to-serverless demo should show best practices, so you’ll put this in CloudFormation. The best practice for CloudFormation is to never explicitly name your resources unless you absolutely have to — so you never have to worry about name conflicts.
But surprise! You simply can’t trigger a Lambda from an S3 bucket that has a CloudFormation-assigned name. The crux of it is this:
- Bucket notification configuration is only settable through the AWS::S3::Bucket resource, and bucket notifications check for permissions at creation time. If the bucket doesn’t have permission to invoke the Lambda, creation of that notification config will fail.
- The AWS::Lambda::Permission resource that creates that permission requires the name of the bucket.
- If CloudFormation is assigning the bucket name, it’s not available in the stack until the bucket (and its notification configuration) are created.
Thus, you end up with a circular dependency. The AWS-blessed solution, described in several different places, is to hard-code an explicit bucket name on both the Bucket and Permission resources.
This isn’t necessary. If we look at the lifecycle of the pieces involved, we can see that the existence of the bucket should be decoupled from the settings of that bucket.
If we had an AWS::S3::BucketNotification resource that took the bucket name as a parameter, we could create the AWS::S3::Bucket first, and provide its name to both the BucketNotification and the Permission.
Despite this option, we’re still years into AWS explicitly punting on this issue and telling customers, in official communications, to just work around it.
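To make the circular dependency concrete, here is a minimal Python sketch that models the resources as a dependency graph and asks for a valid creation order. It is only an illustration: the resource labels are shorthand, `BucketNotification` is the hypothetical resource proposed above, and CloudFormation's own dependency resolution is not shown here.

```python
from graphlib import TopologicalSorter, CycleError


def creation_order(dependencies):
    """Return a valid creation order, or None if the graph has a cycle.

    `dependencies` maps each resource to the resources it must wait for.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError:
        return None


# Today: the notification config lives on the bucket, so the bucket needs the
# permission to exist first, while the permission needs the bucket's name.
current = {
    "Bucket": {"Permission"},
    "Permission": {"Bucket", "Function"},
    "Function": set(),
}

# Proposed: notification settings split into a separate (hypothetical)
# BucketNotification resource, so the bucket no longer waits on anything.
proposed = {
    "Bucket": set(),
    "Function": set(),
    "Permission": {"Bucket", "Function"},
    "BucketNotification": {"Bucket", "Permission"},
}

print(creation_order(current))   # no valid order exists for this graph
print(creation_order(proposed))  # bucket first, notification last
```

With the settings decoupled from the bucket's existence, the graph becomes acyclic and the bucket name can stay CloudFormation-assigned.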
What about Lambda?
Going back to infrastructure graph representation, let’s talk about Lambda. CloudFormation has traditionally managed the infrastructure onto which applications were deployed. But in a serverless world, the infrastructure is the application.
When I want to do a phased rollout of a new version of a Lambda function, I’m supposed to have a CodeDeploy resource in the same template as my function. I update the AWS::Lambda::Function resource, and CodeDeploy takes care of the phased rollout using a weighted alias—all while my stack is in the UPDATING state.
The infrastructure graph during the rollout, when two versions of the code are deployed at the same time, has no representation within CloudFormation — and that’s a problem.
What if I want this rollout to happen over an extended period of time? What if I want to deploy two versions of a Lambda function to exist alongside each other indefinitely?
The latter is literally impossible to achieve with a single CloudFormation template today. The AWS::Lambda::Version resource publishes what’s in $LATEST, which is what is set by AWS::Lambda::Function.
Instead, when we have phased rollouts, we should be speaking of deployments, decoupled from the existence of the function itself.
Imagine a resource like AWS::Lambda::Deployment that took parameters for the function name, the code, and the configuration, published them as a version, and made the version number available as an attribute.
Multiple of these resources could be included in the same template without conflicting, and your two deployments could then be wired to a weighted alias for phased rollout. Note: To do this properly, we’d need an atomic UpdateFunctionCodeAndConfiguration API call from the Lambda service.
In this way, CloudFormation could represent the state of the graph during a rollout, not just on either side of it.
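As a rough illustration of what a weighted alias does while two deployments coexist, the Python sketch below routes a request to one of two versions according to traffic shares. The function name, version labels, and the 90/10 split are assumptions for the example; it models the idea only, not how the Lambda service implements routing.

```python
import random


def pick_version(weights, roll=None):
    """Pick a version according to alias-style routing weights.

    `weights` maps a version label to its share of traffic; shares must sum to 1.
    `roll` is a number in [0, 1); a random one is drawn when omitted.
    """
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    if roll is None:
        roll = random.random()
    cumulative = 0.0
    for version, share in weights.items():
        cumulative += share
        if roll < cumulative:
            return version
    return version  # guard against floating-point rounding at the top end


# Two deployments living side by side behind one alias (hypothetical split).
rollout = {"1": 0.9, "2": 0.1}
print(pick_version(rollout))
```

Shifting the shares over time (or leaving them fixed indefinitely) is exactly the intermediate graph state that has no representation in a template today.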
What’s the So What?
The important notion here is that a resource’s create/update/delete lifecycle doesn’t need to be mapped directly to create/update/delete API calls. Instead, the resources for a service need to match the concepts that allow coherent temporal evolution of an infrastructure that uses the service.
When this is achieved, CloudFormation can adequately represent our desired infrastructure graphs in the way we want to build them, which will only become more critical as serverless/service-full architecture grows in importance.
Epilogue: New tools like the CDK look to build client-side abstractions on CloudFormation. In general, I’m not a fan of those approaches, for reasons that I won’t detail here. In any case, they will never be fully successful if CloudFormation doesn’t support the infrastructure graph lifecycles that those abstractions need to build upon.
CloudFormation is an infrastructure graph management service — and needs to act more like it was originally published in A Cloud Guru on Medium, where people are continuing the conversation by highlighting and responding to this story.
Accelerate transformation at supersonic speed with a well designed cloud program enabled by committed change agents
Werner Vogels launched the 2017 AWS Summits in Sydney with his typical non-conformist approach that transcends both keynotes and strategy. The Amazon CTO took center stage wearing a custom T-shirt emblazoned with “Werner Against the Machine” — the chest insignia of a modern superhero.
As the ultimate committed change agent disguised as a CTO, Werner is using his AWS superpowers to save businesses worldwide — using speed, analytics, flexibility, the skill to adapt, and the power to take flight from ‘hostile’ database vendors.
The Supersonic Speed of an Enterprise
TL;DR — speed matters
For most other enterprises, supersonic speed is the most elusive superpower for a heroic cloud journey — and the most necessary to harness.
During my 20-year career at Capital One, speed at scale was the modus operandi of the lean enterprise. Accelerated by an agile mindset and DevOps culture, the rapid rate of adoption enabled a cloud journey that transformed Capital One into what is now essentially a large FinTech startup.
Capital One’s accelerated technology transition is powered by the API-driven cloud computing services from AWS, enabled by real-time access to big data, and fueled by a commitment to the open source community. While technology plays a leading role, the hardest part of that transition and ultimate superpower is really a talent transformation.
The Gravitational Pull of Legacy
Gaining the supersonic speed to achieve escape velocity from on-premise data centers requires bold leadership with a long-term dedication to innovation. There is no magic pill for enterprises — although going all-in with AWS is a leap in the right direction.
Cloud computing has posed a disruptive threat for years, but short-term incentive structures provide little reason for large corporations to invest in the disruptive technology. Many enterprises continue to be anchored by the weight of their own internal processes and platforms — and paralyzed by the FUD of vendors in survival mode contributing noise to their echo chambers.
“It’s not the big that eat the small … it’s the fast that eat the slow.“
— Laurence Haughton
The gravitational pull is preventing many enterprises from gaining momentum and advantage in the cloud — at least not at the speed required to avoid death by the fast and hungry. The only way out of the dilemma is to adopt a new approach — where continuous innovation is the new bottom line and speed is the name of the game.
Achieve Escape Velocity with a CCoE
For enterprises that are serious about cloud adoption and aspire to achieve supersonic speed, the executive leadership team must invest their time, resources and budget into the sponsorship of a Cloud Center of Excellence (CCoE).
starting your enterprise cloud adoption journey? please don't pass go and collect $200 without a cloud center of excellence #AWSSummit
The CCoE is an essential mechanism for large organizations planning to achieve the velocity required to escape the gravitational pull of their own death star — a private cloud or an on-premise data center.
Design the CCoE for Constant Evolution
TL;DR — transition matters
The Cloud Center of Excellence will continually evolve in order to keep pace with the rate of innovation associated with cloud adoption. Modern enterprises should be very purposeful with the organization design principles of their cloud program structure.
The leaders of cloud programs should observe the flow of their value within the organization, and design an adaptive structure that evolves alongside the internal needs of the enterprise. Simon Wardley infuses these concepts of flow and transition within his value chain mapping strategies — and his design principles apply as much to organizational design as they do towards product evolution.
All of this stuff - bimodal / two speed IT / dual operating - de-emphasises that important transition, the middle - https://t.co/hgpEvMurt5
Applying Simon’s design principles of Pioneers → Settlers → Town Planners (PST) toward an organization’s cloud adoption program offers an innovative approach for effectively navigating the evolving journey. Guided by astute situational awareness, organizations learn to continuously pivot through the cycle — from the exploration stage, through expansion, toward the enhancement of industrialized components and patterns.
Stage 1: The Cloud Center of Exploration
The Pioneers Explore
Embarking on a cloud journey requires a tremendous amount of iterative experimentation with the underlying AWS utility compute services. During this stage of early adoption, the core team is focused on pioneering engineers with plenty of aptitude and attitude.
The two-pizza team leverages agile techniques to break things daily — and collect the data which will determine the patterns of future success.
The core team is stacked with battle-tested engineers with deep experience in understanding how critical functions currently operate, and know how to translate existing data center platforms into cloud services. These engineers have plenty of scars and war stories from previous tours of duty with security, network, and access control.
ProTip: An executive sponsor is absolutely essential during the early phases of cloud adoption. The sponsor must be a strong advocate, provide plenty of air-cover, and actively engage with the core team for the purpose of removing impediments from the board.
Stage 2: Cloud Center of Expansion
The Settlers Expand
Once the pioneers solidify early successes into identified patterns, the focus turns toward scaling the prototypes into products and services that are consumable by the enterprise. By listening to a broad range of internal customers, the settlers refine the patterns and help the understanding grow.
When the cloud services start to scale across an enterprise, it’s natural to place a heavier emphasis on governance and controls. A key advantage of AWS is the ability to engineer your governance by leveraging the API-driven services to access real-time controls and compliance.
At this stage, it’s imperative to focus on scaling the early understanding of AWS to the enterprise — achieving critical mass of cloud fluency is the only way an organization can sustain a transition to the new operating model.
Social Consensus Through the Influence of Committed Minorities shows that when just 10% of a population are randomly distributed committed agents who hold an unshakable belief, the prevailing majority opinion in the population can be rapidly reversed.
Invest the time and money into a multi-dimensional cloud education program. An engaged workforce armed with compelling context and content will dramatically accelerate your organization through the ‘trough of despair’ and ensure more attraction versus attrition.
During the enablement phase, the core team should no longer be considered the most cloud savvy department in the organization. By unleashing cloud superpowers upon thousands of developers throughout the enterprise, the core team should pivot and begin harvesting new and improved patterns from other divisions.
Elevating other departments beyond your core team’s existing capabilities is a key early indicator of achieving escape velocity.
ProTip: Instead of outsourcing your cloud training to Human Resources, tightly integrate cloud education as a core function on your program team. Leverage the AWS certifications as a benchmark for cloud fluency and set a minimum goal of 10% enterprise-wide.
Stage 3: Cloud Center of Enhancement
The Town Planners Enhance
As the cloud journey matures, it should lead toward the commoditization of services that result in more cost efficient, faster, and industrialized platforms. In highly regulated industries like financial services, the organization depends on these town planners to ensure customers and regulators can trust what’s built.
Innovation isn’t limited to the early stages and pioneers; it’s also found in the operational stages of cloud adoption. For example, Capital One’s Terren Peterson is evolving the mindsets of their engineering teams by embracing the concepts of Site Reliability Engineering using innovative approaches to manage operations.
Their SRE teams leverage industrialized utility functions to manage compliance and controls with Cloud Custodian. The SREs also contribute new functions to the ever-expanding platform, completing the cycle as higher order services continuously evolve.
Cloud Custodian is a function-based policy rules engine. The origin of the service demonstrates the PST design principles at work. Originally developed internally by their pioneers, it was scaled across the entire enterprise by settlers, and finally driven toward open source commoditization by the town planners.
ProTip: Involve your operational teams starting on day one of the cloud journey. Since operations is 24x7, consider a lift-and-shift for a subset of workloads. Leverage a cloud-capable MSP for interim support to lighten the load during their talent transition.
Using these organizational design principles, a cloud program team can begin to continually cycle through the explore-expand-enhance stages. Over time, this approach will begin to harness Werner’s superpowers and accelerate cloud adoption at supersonic speeds.
Design your cloud center of excellence for constant evolution was originally published in A Cloud Guru on Medium, where people are continuing the conversation by highlighting and responding to this story.
The tips, tricks, and insider hints for the 2019 AWS re:Invent conference in Las Vegas from December 2–6, 2019
AWS re:Invent is taking place December 2–6, 2019 in Las Vegas. Yet again, this promises to be the biggest cloud event of the year. Last year’s event was amazing. “Jam packed” doesn’t even begin to describe it, and yet somehow this year is sure to be even better!
This guide will grow as we get closer to the show. Please check back regularly for updates! Ping me @marknca if you spot a problem or if something is missing.
In the meantime, you can sign up with AWS to receive conference updates — including a notification when registration opens.
Registration & Lodging
First up is registration. It opens 21-May-2019.
There is no official word on hotel rooms yet this year, but for the last couple of years the conference room rates have been tied to registration. This means if you want to get a well-positioned hotel room at the lowest rate possible, you want to register as quickly as possible.
These hotel room blocks disappear quickly. Making sure you have everything lined up for the 21st of May is a smart idea.
The official campus hasn’t been published yet but last year (2018) it was spread across seven properties with room blocks at an additional seven.
The challenge — as with last year — is that you will need to hop between venues throughout the week. The good news? These are all nice properties and it’s hard to go wrong.
Content in 2018 was divided among all of the seven main properties with breakouts from every track hosted in each location. The two big stand outs for events were the Quad—hosted at the Aria—and the Expo—hosted at The Venetian/Palazzo.
The 90% figure is plucked out of thin air — it’s just me trying to pick a really big number without accurately calculating a percentage
What is Red Nose Day?
Since its launch in 1988, Red Nose Day has become something of a British institution. It’s the day when people across the land can get together and raise money at home, school and work to support vulnerable people and communities in the UK and internationally.
Now that Red Nose Day 2019 is over, I want to give you a bit of a run down on my learnings, the tooling that we used and the path taken, on what turned out to be nearly a full cloud migration from a containerised/EC2 ecosystem and into the world of serverless.
Upon starting at Comic Relief in April 2017, I joined a well-developed team that were already working in the cloud-native microservice world, with some legacy products operating on a fleet of EC2 servers (such as the website) and the rest either running on Pivotal Web Services (Cloud Foundry) or outsourced (Donation Platform).
Comic Relief was generally using Concourse CI (awesome tool) to deploy to Cloud Foundry or was using Jenkins for its legacy infrastructure. We frequently used Travis CI to run unit tests and do some code style tests, but have now moved over to CircleCI — thanks to a thorough comparison by Carlos Jimenez.
Nearly all of the code was written in PHP, using either Symfony or Slim Framework. I had quite a bit of experience with NodeJS from my previous role at Zodiac Media, and was a big fan of running the same language on the front as on the back and not having to context switch when swapping between the two.
The project was my first introduction into Lambda and Serverless Framework and also my first lesson as to what works well in serverless. I created a serverless backend in NodeJS that accepted a POST request and then forwarded that on to RabbitMQ.
My colleague Andy Phipps (Senior Frontender) created a frontend in React, which we dropped into S3 with a CloudFront distribution in front of it. We then built a Concourse CI pipeline that ran the sls deploy command, ran some necessary mocha tests against the staging deploy, and then deployed to production. We did much the same for the frontend.
It was a liberating experience and became the foundation for everything that we would do after.
The next thing we wanted to do was to send an email from the contact service to our email service provider. I created another Serverless service that accepted messages either from RabbitMQ or via HTTP, formatted them for our email service provider, and forwarded them on.
We then realized that we were using the same mailer code in our Symfony fundraising paying-in application, so we stripped out the code from payin and pointed it to our new mailer service. At this point, I started to realize that the majority of web development was some mix of presenting data and handling forms.
We then halted active development and steamed into Sport Relief 2018; this allowed us to test our assumptions towards Serverless and gain some real-world experience under heavy load.
The revelations were as follows:
- CloudWatch is a pain to debug quickly, but a good final source of truth.
- Self-hosted RabbitMQ was OK but took a lot to manage for a small team. Why weren’t we using SQS?
- We were duplicating a lot of the boilerplate code across the two projects that we had created.
- Our APIs needed to be documented automatically as part of our pipelines and in code.
- Serverless Framework was the future.
We then went into a significant business restructure, which meant that a lot of our web-ops team ended up leaving. The result was a remit to simplify the infrastructure that we were using so that a smaller group could manage it, bringing ownership, responsibility and control to the development team. The approach was championed by our Engineering lead Peter Vanhee; without that, where we are now would never have happened.
The next obvious target was our Gift Aid application; the most basic description of it is a form through which users submit their details so that we can claim Gift Aid on SMS submissions. The traffic hitting this application is generally very spiky off the back of a call to action on a BBC broadcast channel, ramping up from 0 to tens of thousands of requests in a matter of seconds.
We traditionally had a vast fleet of application and varnish servers to back this (150+ servers on EC2). As one of the most significant revenue sources, this gets a lot of traffic in the 5 hours that we are on mainstream TV and also in the lead up to it, so there was very little wiggle room to get it wrong.
At this point, we diverged from RabbitMQ and started deploying SQS queues from serverless.yml. My colleague Heleen Mol built a React app using create-react-app and react-router, hosted again on S3 with CloudFront in front of it. This was and is the foundation of every public-facing application we make; it can handle copious levels of load and takes zero maintenance.
At this point, it was apparent that we needed a good way to document our APIs alongside the code. We had previously been using Swagger on our giving pages. However, it seemed a bit of a pain to set up, and I wanted something static that could be chucked into S3 and forgotten. We settled on apiDOC as it looked like it would be quick to integrate and was targeted at RESTful JSON APIs.
The primary donation system was previously outsourced to a company called Armakuni, who had built an ultra-resilient multi-cloud architecture across AWS and Google Cloud Platform.
It really seemed like the next logical step to bring the donation system in-house. This allowed us to share components and styling from our Storybook and Pattern Lab across our products, severely reducing the amount of duplication.
It should be noted that at this time we already had a payment service layer that had been built in previous years in Slim Framework which ran the Sport Relief app donation journey, our giving pages and shop.
As Peter (engineering lead) was heading away on paternity leave, the suggestion arose that if there was any time after moving over the Gift Aid backend to Serverless, I could create a proof of concept for the donation system in Serverless Framework. We agreed that as long as I had my main tasks covered, I would be able to give it a go with any time I had left. I then went about smashing out all of my tasks quicker than I was used to, to get onto the fun stuff ASAP!
After talking to the super knowledgeable guys at Armakuni after the wrap up from Sport Relief, it was clear that we needed to recreate the highly redundant and resilient architecture that Armakuni had created, but in a serverless world.
Users would trigger deltas as they passed through the donation steps on the platform. These would go into an SQS queue, and then an SQS fan-out on the backend would read the number of messages in the queue and trigger enough Lambdas to consume the messages but, most importantly, not overwhelm the backend services/database.
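The sizing decision at the heart of that fan-out can be sketched in a few lines of Python. This is an assumption-laden illustration rather than our production code: the function name, the batch size of 10, and the concurrency cap of 50 are all made up for the example.

```python
import math


def consumers_to_invoke(queue_depth, batch_size=10, max_concurrency=50):
    """Work out how many consumer invocations to start for the current backlog.

    `queue_depth` is the approximate number of messages waiting in the queue.
    `batch_size` and `max_concurrency` are hypothetical tuning values; the cap
    is what protects the backend services and database from being overwhelmed.
    """
    if queue_depth <= 0:
        return 0
    needed = math.ceil(queue_depth / batch_size)
    return min(needed, max_concurrency)


print(consumers_to_invoke(25))      # small backlog: a handful of consumers
print(consumers_to_invoke(10_000))  # large backlog: held at the cap
```

The cap matters more than the arithmetic: the queue absorbs the spike, and the consumers drain it at a pace the database can sustain.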
The API would load balance the payment service providers (Stripe, Worldpay, Braintree & Paypal), allowing us to gain redundancy and reach the required 150 donations per second that would safely get us through the night of TV (it can handle much more than this).
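One simple way to picture that balancing is a stateless rotation that skips any provider currently marked as unavailable. The Python sketch below is illustrative only: the function name, the `attempt` counter, and the way unavailability is tracked are assumptions, and the real selection logic may well differ.

```python
PROVIDERS = ["stripe", "worldpay", "braintree", "paypal"]


def choose_provider(providers, attempt, unavailable=frozenset()):
    """Rotate through the healthy providers based on a request counter.

    `attempt` is any increasing integer (hypothetical, e.g. a request sequence
    number); `unavailable` holds providers that should be skipped for now.
    """
    healthy = [name for name in providers if name not in unavailable]
    if not healthy:
        raise RuntimeError("no payment provider available")
    return healthy[attempt % len(healthy)]


# Normal operation spreads donations across all four providers.
print([choose_provider(PROVIDERS, n) for n in range(4)])
# If one provider has an outage, traffic rotates across the remaining three.
print([choose_provider(PROVIDERS, n, {"stripe"}) for n in range(3)])
```

Because no state is kept between calls, the same function behaves identically on every Lambda instance.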
I initially used AWS Parameter Store to hold the payment service provider configuration. This was free and therefore very attractive in a serverless world, but it proved woefully incapable under load and was swapped out for storing configuration in S3.
I then created a basic frontend that would serve up the payment service provider frontend based on which provider the backend selected. I imported all of the styles over from the Comic Relief Pattern Lab and was good to demo it to Peter and the team on his return.
Upon Peter returning, we went through the system, discussed its viability and did some necessary load tests using Serverless Artillery, concluding that we could do what we thought we couldn’t!
A business case was put together by Peter and our Product Lead Caroline Rennie, and away we went. At this point, Heleen Mol and Sidney Barrah came on board and added meat to the bones, getting the system ready to go live for the ever-impending night of TV.
Due to the nature of Red Nose Day, you don’t get many chances to test the system under peak load. We were struggling to get observability of what was going on in our functions using CloudWatch.
At this point, Peter recommended that we try a tool that he had come across, which was IOPipe. IOPipe gave us unbelievable observability over our functions and how a user is interacting with them; it changed how we used Serverless and increased our confidence levels substantially.
At this point we also integrated Sentry, which alongside IOPipe gave us the killer one-two punch of being able to get a 360 view of errors within our system, allowing us to quantify bugs for our QA team (led by Krupa Pammi) and trace the activity that caused them quickly and efficiently. I can’t think of a time when I have been able to have such an overview of everything going wrong. Pretty scary, but excellent.
The next big part of the puzzle was the realisation that we were copying way too much code between our Serverless projects. I had a look at Middy based on a recommendation from Peter, but at the time there weren’t many plugins for it, so I decided to spin out our own Lambda wrapper rather than having to learn and make plugins for a new framework and possibly run into Middy’s limitations (probably none).
I am still not sure yet how bad of an idea this was, however, it seems to work at scale, is easy to develop with and simple to onboard new developers, which is enough for it to stay for the time being.
Lambda wrapper encompasses all of the code to handle API Gateway requests, connect and send messages to SQS queues and a load of other common functionality. Lambda wrapper resulted in a massive code reduction across all of our Serverless projects. It also meant that the integration of Sentry & IOPipe was common and simple across all of our projects.
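To give a feel for what such a wrapper does, here is a minimal Python sketch of the idea. It is not our actual Lambda wrapper: the decorator name api_handler and the error mapping are assumptions for illustration. It parses the API Gateway proxy event body, calls the business logic and turns the result, or any failure, into a proxy-style response.

```python
import json
from functools import wraps


def api_handler(func):
    """Wrap business logic so it receives a parsed body and returns plain data.

    Hypothetical helper: the real wrapper also covers SQS, Sentry, IOPipe, etc.
    """
    @wraps(func)
    def wrapper(event, context):
        headers = {"Content-Type": "application/json"}
        try:
            # API Gateway proxy events carry the request body as a string.
            body = json.loads(event.get("body") or "{}")
            result = func(body, event, context)
            return {"statusCode": 200, "headers": headers,
                    "body": json.dumps(result)}
        except ValueError as err:
            # Bad JSON (JSONDecodeError) or a validation error raised by func.
            return {"statusCode": 400, "headers": headers,
                    "body": json.dumps({"error": str(err)})}
        except Exception:
            # Anything else is reported without leaking internals.
            return {"statusCode": 500, "headers": headers,
                    "body": json.dumps({"error": "internal error"})}
    return wrapper
```

With something like this in place, each function shrinks to the part that is actually specific to it, which is where most of the code reduction comes from.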
To add extra redundancy to the project, we introduced an additional region and created a traffic routing policy based on a health check from a status endpoint. We figured the chance of losing two geographically separate AWS regions was very low.
We also backed up all deltas to S3 on a retention policy of 5 days, to ensure that we could replay all deltas in the event of an SQS or RDS failure. We added timing code to all outbound dependencies using IOPipe and also created a dependency management system so that we could quickly pull out dependencies (such as Google Tag Manager or reCAPTCHA) from external providers at speed.
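The timing of outbound dependencies can be pictured with a small Python sketch. We did this through IOPipe; the version below swaps that for a plain list acting as the metrics sink, and the name timed is hypothetical.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(name, sink):
    """Record how long the wrapped block took, in milliseconds.

    `sink` is any list-like collector (an assumption standing in for a
    real metrics client); an entry is recorded even if the block raises.
    """
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        sink.append((name, elapsed_ms))
```

Wrapping each external call, for example `with timed("recaptcha", metrics): ...`, makes it obvious which dependency is slowing a request down and therefore which one to pull out first.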
Based on a suggestion from AWS, we also added a regional AWS Web Application Firewall (WAF) to all of our endpoints. This introduced some basic protections, including stuff we already had covered, but higher up the chain, before API Gateway was even touched.
Another piece of the puzzle was to get decent insights into our delta publishes and processing, which gives us another way to get a good overview of what is happening with our system. We used InfluxDB to do this and consider it an optional dependency of our system.
It was important for us to understand what our application’s critical dependencies were, as these determine our application health check status and whether we would fail over to our backup region. InfluxDB is fantastic; however, it is self-hosted. When AWS Timestream comes along, this will be out the door.
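The split between critical and optional dependencies can be sketched as follows. This is a minimal Python illustration rather than our real status endpoint: the Dependency type, the function name and the 200/503 choice are assumptions. Only a failing critical dependency marks the region unhealthy, so an optional one such as InfluxDB can be down without triggering a failover.

```python
from typing import Callable, Dict, NamedTuple


class Dependency(NamedTuple):
    check: Callable[[], bool]  # returns True when the dependency is reachable
    critical: bool             # does a failure make the region unhealthy?


def health_status(deps: Dict[str, Dependency]) -> dict:
    """Aggregate dependency checks into a status-endpoint style result."""
    results = {}
    healthy = True
    for name, dep in deps.items():
        try:
            ok = bool(dep.check())
        except Exception:
            # A check that blows up counts as a failed dependency.
            ok = False
        results[name] = ok
        if dep.critical and not ok:
            healthy = False
    return {"statusCode": 200 if healthy else 503, "dependencies": results}
```

A routing health check pointed at an endpoint like this would only shift traffic to the backup region when something the donation journey truly needs is broken.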
So the night of TV came and went on the 15th of March and the system performed nearly exactly as expected. The one unexpected, but now apparent, weak point was the amount of reporting that we were trying to pull from the RDS read replica using Grafana and our live income reporting. We lowered our reporting requirements and were back on track in no time.
We originally used RDS so that we could achieve compatibility with our legacy payment service layer. In the future we will probably replace this with something more Serverless, relying on AWS Timestream for more real-time analysis (when it arrives).
So to sum up this epic and overly long rundown of the journey to 90% Serverless:
- Try to get everything Serverless if you can; our highest monetary cost is RDS. It’s still nice to be able to run the SQL queries that we know and love, but Athena and S3 are probably a solid replacement.
- Try to ingest data and work on it away from user interaction. You can provide the user with an endpoint where they can check on the status of processing. Manage as much state as you can with your frontend. This will hopefully give you redundancy and protection as a default.
- Lambda allows you to load test at a significant scale, do it often, make it part of your deployment/feature release strategy. Serverless pushes the load down the line and has a habit of finding weak points in your chain, so make sure you know where your weak points are going to be. Serverless Artillery is the way forward, do better than us and do it as part of your pipeline to production for the win!
- Continuously deploy, deploy on a Friday at 5 pm, don’t let fear stop you, create the tooling and automation tests to allow you not to worry. We use Nightwatch, Cypress and Mocha to significant effect. It should be noted that you need decent logging and a fast way to roll back code to be able to do this in a manageable way (we use Concourse CI).
- Serverless infrastructure cost is dependent on usage, so why not deploy your entire infrastructure on a pull request level and run tests in the PR against it? We do this, and it means that developers can be sure that before their code is merged, it works in real life and on our real-world infrastructure.
- Don’t host anything if you don’t have to, everything as a service. I am physically averse to calls about infrastructure outages at any time of the day. Also, go multi-region if you can, serverless makes this a doddle, and it reduces the voice of the crowd who will remind you that S3 took out US-EAST-1 in September 2017.
- Pick a piece of your architecture, migrate it to Serverless, get comfortable with it, rinse and repeat.
- The best system is the one that allows me to be in the pub after 17:30 or be at home with my family not checking my laptop, Serverless for the backend and a React application stored in S3 for the frontend gives you this.
- Concourse CI is probably one of the most expensive pieces of infrastructure that we are running, and it doesn’t fit in with our fully Serverless headspace. Replacing it would be great. However, the power and flexibility it gives us to deploy reliably and continuously are unmatched. Sometimes in life, you can’t be all one thing, in this case, Serverless. We use Concourse UP to simplify its deployment and management, meaning that we don’t have to mess around with BOSH.
- Don’t try to optimize/abstract your services too early when it comes to Serverless. I remember at my first job where all the servers had names, they were cared for and loved and were then quickly replaced with EC2 when AWS entered the fray. Serverless brings the same down to your code and services; they should perform a function, be replaced with ease and doted over just the right amount. Compose small but relevant services and be ready for the day where you type sls remove on that much-loved service!
The biggest lesson for me is that in my day job, I exist to solve business issues. I think sometimes as technologists we forget this. Serverless is the fastest way to decouple oneself from rubbish problems, move up the stack and move on to the next issue.
Be sure to watch this presentation by our Engineering Lead Peter Vanhee talking through the current architecture at Serverless Computing London, as well as this presentation featuring our Product Lead Caroline Rennie around the previous donations platform and the problem space.
The point is focus — that is the why of serverless
Functions are not the point
If you go serverless because you love Lambda, you’re doing it for the wrong reason. If you go serverless because you love FaaS in general, you’re doing it for the wrong reason. Functions are not the point.
Sure, I love Lambda — but that’s not why I advocate for serverless.
Don’t get me wrong, functions are great. They let you scale transparently, you don’t have to manage the runtime, and they fit naturally with event-driven architectures. These are all fantastic, useful properties.
But functions should end up being a small part of your overall solution. You should use functions as the glue, containing your business logic, between managed services that are providing the heavy lifting that forms the majority of your application.
Managed services are not the point
We are fortunate to have such a wide range of managed services for so many different parts of our applications. Databases, identity and access management (so glad I don’t have to own that myself!), analytics, machine learning, content delivery, message queues for all sorts of different patterns.
Managed services provide the functionality you need with less hassle. You’re not patching the servers they run on. You’re not making sure the autoscaling is correctly providing the required throughput without a lot of idle capacity. Managed services lower your operational burden significantly.
Managed services are great — but … they aren’t the point.
Ops is not the point
It’s great to know that you can apply fewer operations resources to keep your applications healthy. It is especially great that the resources you need scale mostly with the number of features you ship, not with traffic volume.
Reduced operations is more efficient — but … it’s not the point.
Cost is not the point
Ok, sometimes all the business wants you to do is reduce cost — and that’s all you care about. And serverless will help you do that. But in general, your cloud bill is not the point.
Your cloud bill is only one component of the total cost of your cloud applications. First of all, there are the operations salaries, and that cost is lower if you have fewer ops resources. There are also your development costs.
There are a lot of cost advantages — but … none of these are the point.
Code is not the point
Not only is code not the point, code is a liability. Code can at best do exactly what you intend it to. Bugs detract from this. You can only lose points through more coding. The more code you own, the more opportunities exist to depart from your intended value. Understanding this is a cultural shift.
Technology has been hard for a long time. It’s taken clever people to create value through technology. So developers started to believe that cleverness was inherent and good. We’ve spent so long crafting Swiss watches that we’ve failed to recognize the advent of the quartz Casio — and impugn the evolution as lacking in elegance.
Instead of applying our cleverness to solving technology problems, we really need to be understanding and solving business problems. And when you have to code — you are solving technology problems.
Technology is not the point
The reason that we’re doing this, any of this, is in service of some business goal. The business value that your organization is trying to create is the point.
Now, sometimes, what you’re selling is literally technology. But even if your product is technology, that may not be the value of what you’re selling.
There’s an old adage that people don’t buy drills, they buy holes. When you need a hole in your wall, you don’t care how fancy the drill is — you care how well it creates that hole you need.
At iRobot, we don’t sell robots. We don’t even sell vacuums. We sell clean homes. Roomba gives you time back in your day to focus on the things that matter to you. So if technology isn’t the point, what are we here for?
The point is focus
Serverless is a way to focus on business value.
How do functions help you deliver value? They let you focus on writing business logic, not coding supporting infrastructure for your business logic.
Managed services let you focus on writing your functions. Having fewer operations resources frees up people and money to be applied to creating new value for your customers.
Observability gives you tools to address MTBF and MTTR, both of which are a measure of how often your customers aren’t getting value. Spending less on the cloud means you can spend that money more directly in support of creating value.
Focus is the Why of Serverless
You should go serverless because you want to focus on creating value — and at your company you endeavor to apply technology toward the creation of business value.
Going back to cost, Lyft’s AWS bill, $100 million per year, has been in the news recently. Many people chimed in to say they could do it cheaper — they couldn’t, but that’s beside the point.
Would Lyft’s bill be lower if they switched to Lambda and managed services for everything they possibly could? Probably. But what would that do as they spent time rearchitecting? They would lose focus.
The company is at a stage in its journey where growth is more important than cost control. Eventually, that might change. Public companies are responsible to their shareholders, and so cost reduction can deliver value to them. But for Lyft right now, delivering value to their customers means executing with their current applications and processes. They are making the serverless choice.
What I’m telling you is that serverless has never been about the technology we call serverless. So what does the technology that we call serverless have to do with it?
Serverless is a consequence of a focus on business value
Technology is a consequence of how you’re trying to deliver value. Dev and ops teams have traditionally been separated with the notion that they have different focuses. But we’re seeing that trend changing.
The traditional model put the focus on technology — dev tech vs ops tech. But we’re seeing people realize that the focus should be on the value — the feature being delivered, including both how it’s built and how it’s run.
When we take this notion of focusing on business value, and run it to its logical conclusion, we get serverless.
When you want to focus on delivering value, you want to write functions. When your function needs state, you want a database. To get it from someone else, you use DBaaS — and you choose between your options based on how well it lets you focus.
And when you’re choosing managed services, some of them may even be user-facing. If you can use social login instead of owning your own accounts, that’s one less thing you have to manage, and one less piece of the user experience table stakes you need to own.
Now, for everything you are outsourcing, you are still accountable. Your users don’t care if their bad experience is caused by a third party you’re using, it’s still your problem. You need to own outages to your users while accepting that you don’t fully control your destiny there. This is an uncomfortable place to be — but it’s worthwhile.
You can’t win points on these things — but you can lose points. This means that you need to know what “bad” looks like. That requires having enough knowledge about the outsourced pieces of your product and your technology to know that you’re delivering enough quality to your users.
Note that deep expertise in a focus area, and broad but thin knowledge of adjacent areas is exactly analogous to the T-shaped skills concept — applied to organizations and teams.
Serverless is a trait
Serverless is a trait of companies. A company is serverless if it decides that it shouldn’t own technology that isn’t core to delivering its business value. Few companies are really totally serverless. But within a company, you can still have parts that are serverless.
If your team decides to focus only on the value it’s delivering, and delegate anything outside that either to another team, or ideally outside — then your team is going serverless. And you can’t always choose to use an outside technology — that’s fine, you can still make the best choice given the constraints.
And with a big enough organization, it can cease to matter. When Amazon.com uses Lambda, that’s fully serverless, even though it’s on-prem in some sense. But what if it’s just you?
What if you’re excited about serverless, but you feel completely alone at your company? What if you’re far removed from actual business value — if you’re patching servers for a team that serves a team that serves a team that creates user-facing content? I want to convince you that you can go serverless today, yourself, in any situation.
Serverless is a direction, not a destination
I used to talk about serverless as a spectrum, because I knew there wasn’t a bright line separating serverless technology from non-serverless technology. I mean, there almost never is a bright line separating any given grouping of anything, so I was pretty safe in that assumption.
I talked about how something like Kinesis, where you need to manage shards, is serverless, but less serverless than SQS, where you don’t. How you don’t have to patch instances with RDS, but you do need to choose instance types and number. These technologies are all various shades of serverless.
But recently I’ve come to realize a problem with portraying serverless as a spectrum is that it doesn’t imply movement. Just because you’re using something designated serverless of a sort doesn’t mean you should feel comfortable that you’ve attained serverless — that it’s acceptable to keep using that and think you’ve checked the serverless box.
Climb the serverless ladder
I’ve started to think of serverless as a ladder. You’re climbing to some nirvana where you get to deliver pure business value with no overhead. But every rung on the ladder is a valid serverless step.
If you move from on-prem to a public cloud, that’s a rung on the ladder. If you move from VMs to containers, that’s a rung on the ladder. If you move from no container orchestration, or custom orchestration, to Kubernetes, that’s a rung on the ladder. If you move from long-lived servers to self-hosted FaaS, that’s a rung on the ladder. But there’s always a rung above you, and you should always keep climbing.
One thing the “ladder” doesn’t convey is that it’s not linear. Moving from VMs to containers to Kubernetes while staying on-prem are rungs on the ladder, but so is moving your VMs from on-prem to the cloud. There’s often not a definitive “better” in these cases.
I thought of the metaphor of many paths leading up a mountain, but one thing I like about the ladder is that it can be infinite. There isn’t an end state. I love Lambda, but I am always looking for better ways of delivering code that keep me more focused on value.
Serverless is a State of Mind
Serverless is about how you make decisions — not about your choices. Every decision is made with constraints. But if you know the right direction, even when you can’t move directly that way, you can take the choice that’s most closely aligned, and then you’re moving up another rung. So, how do you adopt this mindset? How do you make serverless choices?
Configuration is your friend
I think many developers look down on configuration as “not real programming”. There’s an idolatry of coding today. We’ve been told that “software is eating the world”, and we’ve inaccurately translated that to “coding is eating the world”.
We’ve come to believe that developers are the only important people in an organization, and that our sense of productivity is the only thing that matters. We want to feel in the zone, and that’s what coding provides. Any obstacle to this must be bad for the business. We’ve lost any sense of whether being in the zone is actually producing value faster and better than an alternative route.
Remember: Days of programming can save hours of configuration
Constraints are good. Removing options can help you focus. Obviously, not all constraints are good — but in general, the ability to do anything general comes at the cost of it taking longer to do one particular thing. Guard rails may chafe, but you’ll be faster than if you have to constantly watch the edge.
In this way, serverless is about minimalism. Removing distractions. Marie Kondo is big now, and the same advice applies. Find the components of your stack that don’t spark value.
Be afraid of the enormity of the possible
Possibilities carry with them hidden complexity. For any technology, one of my primary evaluation metrics is how much capability it has beyond the task at hand. When there’s a lot of extra space, there’s unnecessary complexity to both deal with and learn.
People tout Kubernetes as a single tool to accomplish every cloud need, and it can! But if everything is possible, anything is possible. For a given task, Kubernetes can go wrong because you haven’t accounted for the ways it acts in situations unrelated to that task.
On the other hand, when you look at serverless services, you may have to choose between an 80% solution from your main provider and a third-party provider with a service that better fits your needs. But what are the operations needs for that new provider? What’s the auth like? Those are hidden complexities that you’ll pull in, and you’ll need to trade them off against the feature differences.
Accept the discomfort of not owning your own destiny
When you’re using a managed service, provider outages are stressful. There’s nothing you can do to fix their problem. There is no getting around it — this will always feel awful.
You’ll think, “if I was running my own Kafka cluster instead of using Kinesis, I could find the issue and fix it”. And that may be true, but you should remember two things:
- That would be a distraction from creating business value.
- You would almost certainly be worse at running it. You’d have more and worse incidents. It’s a service provider’s purpose in life to be good at it — and they have economies of scale you don’t.
Moving past the “I could always build it myself” attitude can be hard. Jared Short recently provided a brilliant set of guidelines for choosing technology.
My thinking on serverless these days in order of consideration.
- If the platform has it, use it
- If the market has it, buy it
- If you can reconsider requirements, do it
- If you have to build it, own it
In order, if you’re on a cloud platform, stay within the ecosystem when possible. You’re removing so many possibilities from the equation that way. But if you can’t get what you need on the platform, buy it from somewhere else.
If you can’t buy exactly what you need, can you rethink what you’re doing to fit what you can buy? This one is really important. It gets to the heart of time-to-market.
If you have something you think is valuable, you’ll want to ship it as soon as possible. But it’s better to ship something near to that faster than to build the exact thing. You don’t know that it’s the right thing yet.
Waiting to build the exact right thing will not only take longer to ship, but your subsequent iterations will be slower — and maintenance of it will take resources that you could apply to shipping more things in the future. This applies even when the technology isn’t serverless: always ask if a tweak to your requirements would enable faster, better, or more focused delivery of value.
Finally, though, if you have to build it, own it. Find a way for it to be a differentiator. Now, this doesn’t mean everything you’ve built already you should turn into a differentiator. Look at only the things you can’t have bought as a service in a perfect world. Imagine what a completely serverless, greenfield implementation would look like, and find what needs to be built there.
Find your part of the business value
So fundamentally, you want to find your part of the business value. What is your technology work in service of? Maybe you’re far removed from user-facing product. You may only be contributing a small slice. But it’s there, and you can find it — and focus on that value.
Start with the immediate value you’re providing to others in the organization, and focus on that. And then start to trace the value chain. Make sure all your decisions are oriented around the value you’re creating. Make serverless choices.
Hire the people who will automate themselves out of a job, then just keep giving them jobs.
I love this quote from Jessie Frazelle. You can turn it around; automate yourself out of a job, and keep demanding jobs.
Remember that you are not the tool. For any value that you’re creating — automate that creation. If you manage build servers, find ways to make them self-service, so what you’re delivering is not the builds per se, but the build tooling so teams can deliver the builds themselves.
TL;DR Serverless is a State of Mind
The point is not functions, managed services, operations, cost, code, or technology. The point is focus — that is the why of serverless.
Serverless is a consequence of a focus on business value. It is a trait. It is a direction, not a destination. Climb the never-ending serverless ladder.
Configuration is your friend. Days of programming can save hours of configuration. Be afraid of the enormity of the possible. Accept the discomfort of not owning your own destiny.
Find your part of the business value, and achieve a serverless state of mind.
Why serverless architecture?
3 reasons serverless might be a good idea for you
Innovation in the tech industry paved the way for infrastructure as a service, shifting the paradigm from infrastructure and hardware ownership to a subscription and capacity-on-demand model. In minutes, you can now spin up several instances of virtual servers across multiple regions and dramatically reduce your time to market.
While this is still happening right in front of our eyes, a new era of serverless architecture is rapidly unfolding in parallel, taking away the worry of having to manage servers at all. With serverless, you no longer even have to know the underlying operating system that is running your application.
Serverless offers developers the ability to quickly experiment with new ideas and fail fast while moving forward — allowing teams to focus on what matters (code) while the undifferentiated heavy lifting is managed by others.
If you are not satisfied with your current architecture or thinking of launching a new idea, here are three reasons serverless might be a good idea for you.
#1 Serverless reduces your time to market
Hiring developers is hard — and finding engineers with the right set of skills for your product needs is even harder. Not only do you have to find developers who can transform your product ideas into code, but also those who can manage the underlying infrastructure that supports your code.
With serverless architecture, your development team can concentrate on writing functions that are tied to your business value, and deploy them in minutes with security, logging and scalability.
#2 Serverless is cheaper
Yes, you read that right! As ridiculous as it may sound, running apps in a serverless environment is way cheaper than you might think.
The startup phase is characterized by lots of experimentation, followed by continuous iterations designed to test and validate ideas. During these early stages, keeping your costs as low as possible helps to extend your runway.
There are lots of examples of teams migrating to serverless and significantly reducing their costs. No longer do teams have to bear the pain of managing and monitoring servers in development, test, and production environments — or paying for idle EC2 instances.
#3 Serverless scales with your idea
Startups can grow at a rapid pace — and sooner or later scaling becomes a pain point. The last thing a startup wants to get in the way of their growth is infrastructure constraints.
While some startups address the problem by throwing more resources at the underlying infrastructure, it usually requires some reengineering of the existing architecture to effectively scale — with significant costs.
The ability to scale is an area where serverless shines. With the proper architecture, a startup can rapidly scale as its business grows without massive investments in additional infrastructure.
TL;DR Serverless is not just another buzzword
Like any new technology, serverless is surrounded by a cloud of skepticism, and you often hear phrases like “serverless is not ready yet” or “it’s just another buzzword”.
Are you ready for serverless?