AWS Elemental MediaConvert now offers expanded support for audio-only workflows with the ability to create MP3 audio files. With MediaConvert, you can easily produce MP3 audio for devices and services that use this format with a simple selection from the console. You can create MP3 files in standalone audio-conversion jobs, extract them from video files, or produce them as supplementary outputs during video transcoding. To learn more about supported audio formats, please see the documentation pages.
Amazon Elastic Container Service (ECS) now supports Amazon Elastic File System (EFS) file systems in ECS task definitions (in preview). When using ECS task definitions compatible with the EC2 launch type, customers can add EFS file systems to their task definitions. This enables persistent, shared storage to be defined and used at the task and container level in ECS.
Amazon ElastiCache now offers R5 nodes, the latest generation of performance- and memory-optimized nodes, to maximize network throughput and CPU utilization, in the AWS South America (São Paulo) Region. R5 nodes are built on the AWS Nitro System, a combination of dedicated hardware and a lightweight hypervisor that delivers practically all of the host's compute and memory resources to the guest virtual machines.
You can now use Amazon Lex chatbots in the AWS Asia Pacific (Sydney) Region. Amazon Lex enables you to build intelligent conversational chatbots that turn Amazon Connect contact flows into natural conversations. These can be used to automate high-volume interactions without compromising the customer experience. Customers calling an Amazon Connect contact center can interact with an Amazon Lex chatbot to perform tasks such as changing a password, requesting an account balance, or scheduling an appointment, all in natural conversational language. Callers can say things like "I need help with my device" instead of listening to and remembering a list of options, such as press 1 for sales or press 2 to schedule an appointment.
This Quick Start automatically deploys a production-ready instance of IBM FileNet Content Manager version 5.5.3 on the AWS Cloud, into a Virtual Private Cloud (VPC) that spans multiple Availability Zones.
AWS Direct Connect support for AWS Transit Gateway is now available in the AWS Middle East (Bahrain) Region. With this feature, customers can connect thousands of Amazon Virtual Private Clouds (Amazon VPCs) in multiple AWS Regions to their on-premises networks using 1/2/5/10 Gbps AWS Direct Connect connections.
You can now centrally aggregate your AWS Health events from all accounts in your organization. AWS Organizations enables you to centrally govern and manage multiple AWS accounts. The new AWS Health Organizational View provides centralized and real-time access to all AWS Health events posted to individual accounts in your organization, including operational issues, scheduled maintenance, and account notifications. You can start using Organizational View today via the AWS Health API.
While there’s no doubt that machine learning (ML) can be a great tool for businesses of all shapes and sizes, actually building ML models can seem daunting at first. Cloud AutoML—Google Cloud’s suite of products—provides tools and functionality to help you build ML models that are tailored to your specific needs, without needing deep ML expertise.
AutoML solutions provide a user interface that walks you through each step of model building, including importing data, training your model on the data, evaluating model performance, and predicting values with the model. But what if you want to use AutoML products outside of the user interface? If you’re working with structured data, one way to do it is by using the AutoML Tables SDK, which lets you trigger—or even automate—each step of the process through code.
The SDK can help embed AutoML capabilities into applications in a wide variety of ways. In this post, we’ll use an example to show how you can use the SDK end-to-end within your Jupyter Notebook. Jupyter Notebooks are one of the most popular development tools for data scientists. They enable you to create interactive, shareable notebooks with code snippets and markdown for explanations. Without leaving Google Cloud’s hosted notebook environment, AI Platform Notebooks, you can leverage the power of AutoML technology.
There are several benefits to using AutoML technology from a notebook. Each step and setting can be codified so that it runs the same every time by everyone. Also, it’s common, even with AutoML, to need to manipulate the source data before training the model with it. By using a notebook, you can use common tools like pandas and numpy to preprocess the data in the same workflow. Finally, you have the option of creating a model with another framework and ensembling it with the AutoML model, for potentially better results. Let’s get started!
Understanding the data
The business problem we’ll investigate in this blog is how to identify fraudulent credit card transactions. The technical challenge we’ll face is how to deal with imbalanced datasets: only 0.17% of the transactions in the dataset we’re using are marked as fraud. More details on this problem are available in the research paper Calibrating Probability with Undersampling for Unbalanced Classification.
To get started, you’ll need a Google Cloud Platform project with billing enabled. To create a project, follow the instructions here. For a smooth experience, check that the necessary storage and ML APIs are enabled. Then, follow this link to access BigQuery public datasets in the Google Cloud console.
In the Resources tree in the bottom-left corner, navigate through the list of datasets until you find ml-datasets, and then select the ulb-fraud-detection table within it.
Click the Preview tab to preview sample records from the dataset. Each record has the following columns:
- Time is the number of seconds between the first transaction in the dataset and the time of the selected transaction.
- V1-V28 are columns that have been transformed via a dimensionality reduction technique called PCA that has anonymized the data.
- Amount is the transaction amount.
- Class is the target we want to predict: 1 if the transaction is fraudulent, 0 otherwise.
Set up your Notebook Environment
Now that we’ve looked at the data, let’s set up our development environment. The notebook we’ll use can be found in AI Hub. Select the “Open in GCP” button, then choose to deploy the notebook on either a new or an existing notebook server.
Configure the AutoML Tables SDK
Next, let’s highlight key sections of the notebook. Some details, such as setting the project ID, are omitted for brevity, but we highly recommend running the notebook end-to-end when you have an opportunity.
We’ve recently released a new and improved AutoML Tables client library. You will first need to install the library and initialize the Tables client.
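As a rough sketch (assuming the google-cloud-automl package; the project ID below is a placeholder for your own), the setup looks like this:

```python
# Install the client library first (run once in the notebook):
#   !pip install --upgrade google-cloud-automl

from google.cloud import automl_v1beta1 as automl

PROJECT_ID = "your-project-id"   # placeholder: your own GCP project
REGION = "us-central1"           # AutoML Tables runs in us-central1

# The Tables client wraps the lower-level AutoML API with
# Tables-specific convenience methods.
client = automl.TablesClient(project=PROJECT_ID, region=REGION)
```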
Import the Data
The first step is to create a BigQuery dataset, which is essentially a container for the data. Next, import the data from the BigQuery fraud detection dataset. You can also import from a CSV file in Google Cloud Storage or directly from a pandas dataframe.
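A minimal sketch, reusing the client from above (the dataset display name is arbitrary, and the bq:// URI assumes the public bigquery-public-data.ml_datasets.ulb_fraud_detection table):

```python
# Create an empty Tables dataset, then import the public fraud data.
dataset = client.create_dataset(dataset_display_name="fraud_detection")

import_op = client.import_data(
    dataset=dataset,
    bigquery_input_uri="bq://bigquery-public-data.ml_datasets.ulb_fraud_detection",
)
import_op.result()  # Block until the import completes (several minutes).
```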
Train the Model
First, we have to specify which column we would like to predict, or our target column, with set_target_column(). The target column for our example will be “Class”, which is 1 if the transaction is fraudulent and 0 if it is not.
Then, we’ll specify which columns to exclude from the model. We’ll only exclude the target column, but you could also exclude IDs or other information you don’t want to include in the model.
There are a few other things you might want to do that aren’t necessary in this example:
- Set weights on individual columns
- Create your own custom test/train/validation split and specify the column to use for the split
- Specify which timestamp column to use for time-series problems
- Override the data types and nullable status that AutoML Tables inferred during data import
The one slightly unusual thing that we did in this example is override the default optimization objective. Since this is a very imbalanced dataset, it’s recommended that you optimize for AU-PRC, or the area under the Precision/Recall curve, rather than the default AU-ROC.
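Putting those pieces together, training might look roughly like this (the model display name and the one-node-hour budget are illustrative values, not requirements):

```python
# "Class" is what we want to predict, so mark it as the target...
client.set_target_column(dataset=dataset, column_spec_display_name="Class")

# ...and keep it out of the input features. MAXIMIZE_AU_PRC overrides
# the default objective, which suits this highly imbalanced dataset.
create_op = client.create_model(
    model_display_name="fraud_model",
    dataset=dataset,
    train_budget_milli_node_hours=1000,   # 1 node hour
    exclude_column_spec_names=["Class"],
    optimization_objective="MAXIMIZE_AU_PRC",
)
model = create_op.result()  # Training may take an hour or more.
```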
Evaluate the Model
After training has been completed, you can review various performance statistics on the model, such as the accuracy, precision, recall, and so on. The metrics are returned in a nested data structure, and here we are pulling out the AU-PRC and AU-ROC from that data structure.
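A sketch of pulling those two metrics out (the field names follow the v1beta1 ModelEvaluation message; treat them as an assumption to verify against the SDK reference):

```python
# Each evaluation entry carries classification metrics; skip the
# per-label entries that don't populate au_prc.
for evaluation in client.list_model_evaluations(model=model):
    metrics = evaluation.classification_evaluation_metrics
    if metrics.au_prc:
        print("AU-PRC:", metrics.au_prc)
        print("AU-ROC:", metrics.au_roc)
```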
Deploy and Predict with the Model
To enable online predictions, the model must first be deployed. (You can perform batch predictions without deploying the model.)
We’ll create a hypothetical transaction record with similar characteristics and predict on it. After invoking the predict() API with this record, we receive a data structure with each class and its score. The code below finds the class with the maximum score.
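A sketch of that flow (the feature values in the record are made up purely for illustration):

```python
# Online predictions require a deployed model.
client.deploy_model(model=model).result()

# A hypothetical transaction: one value per input column.
record = {f"V{i}": 0.0 for i in range(1, 29)}
record.update({"Time": 80422.0, "Amount": 17.99})

# Each payload entry is one class with its score; keep the best one.
response = client.predict(model=model, inputs=record)
best = max(response.payload, key=lambda p: p.tables.score)
print("Predicted class:", best.tables.value.string_value,
      "with score", best.tables.score)
```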
Now that we’ve seen how you can use AutoML Tables straight from your notebook to produce an accurate model of a complex problem, all with a minimal amount of code, what’s next?
To find out more, the AutoML Tables documentation is a great place to start. When you’re ready to use AutoML in a notebook, the SDK guide has detailed descriptions of each operation and parameter. You might also find our samples on Github helpful.
After you feel comfortable with AutoML Tables, you might want to look at other AutoML products. You can apply what you’ve learned to solve problems in the Natural Language, Translation, Video Intelligence, and Vision domains.
Find me on Twitter at @kweinmeister, and good luck with your next AutoML experiment!
Anomaly detection plays a vital role in many industries across the globe, such as fraud detection in the financial industry, health monitoring in hospitals, and fault detection and operating-environment monitoring in the manufacturing, oil and gas, utility, transportation, aviation, and automotive industries.
Anomaly detection is about finding patterns in data that do not conform to expected behavior. It is important for decision-makers to be able to detect them and take proactive action if needed. Using the oil and gas industry as one example, deep-water rigs with various equipment are intensively monitored by hundreds of sensors that send measurements in various frequencies and formats. Analysis or visualization is hard using traditional software platforms, and any non-productive time on deep-water oil rig platforms caused by a failure to detect an anomaly could mean large financial losses each day.
Companies need new technologies like Azure IoT, Azure Stream Analytics, Azure Data Explorer, and machine learning to ingest, process, and transform data into strategic business intelligence to enhance exploration and production, improve manufacturing efficiency, and ensure safety and environmental protection. These managed services also help customers dramatically reduce software development time, accelerate time to market, provide cost-effectiveness, and achieve high availability and scalability.
While the Azure platform provides many options for anomaly detection, and customers can choose the technology that best suits their needs, customers have also brought questions to field-facing architects about which use cases are most suitable for each solution. We’ll examine the answers to these questions below, but first, you’ll need to know a couple of definitions:
What is a time series? A time series is a series of data points indexed in time order. In the oil and gas industry, most equipment or sensor readings are sequences taken at successive points in time or depth.
What is decomposition of an additive time series? Decomposition is the task of separating a time series into components, as shown in the graph below.
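In the standard additive model, the observed series y(t) is simply the sum of those components: y(t) = S(t) + T(t) + R(t), where S(t) is the seasonal component, T(t) the trend, and R(t) the residual (noise).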
Time-series forecasting and anomaly detection
Anomaly detection is the process of identifying observations that differ significantly from the majority of the data.
Here is an anomaly detection example with Azure Data Explorer:
- The red line is the original time series.
- The blue line is the baseline (seasonal + trend) component.
- The purple points are anomalous points on top of the original time series.
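As a minimal sketch of how you might run such a detection from Python, here is the relevant KQL function, series_decompose_anomalies(), submitted via the azure-kusto-data client (the cluster URL, database, table, and column names are all hypothetical):

```python
from azure.kusto.data import KustoClient, KustoConnectionStringBuilder

CLUSTER = "https://mycluster.westus.kusto.windows.net"  # hypothetical
DATABASE = "SensorDb"                                   # hypothetical

kcsb = KustoConnectionStringBuilder.with_aad_device_authentication(CLUSTER)
client = KustoClient(kcsb)

# Bucket readings into an hourly series per sensor, then flag points
# that deviate from the seasonal + trend baseline.
query = """
Telemetry
| make-series Value=avg(Reading) on Timestamp step 1h by SensorId
| extend (anomalies, score, baseline) = series_decompose_anomalies(Value, 1.5)
"""

response = client.execute(DATABASE, query)
for row in response.primary_results[0]:
    print(row["SensorId"], row["anomalies"])
```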
To detect anomalies, either Azure Stream Analytics or Azure Data Explorer can be used for real-time analytics and detection as illustrated in the diagram below.
Azure Stream Analytics is an easy-to-use, real-time analytics service that is designed for mission-critical workloads. You can build an end-to-end serverless streaming pipeline with just a few clicks, go from zero to production in minutes using SQL, or extend it with custom code and built-in machine learning capabilities for more advanced scenarios.
Azure Data Explorer is a fast, fully managed data analytics service for near real-time analysis on large volumes of data streaming from applications, websites, IoT devices, and more. You can ask questions and iteratively explore data on the fly to improve products, enhance customer experiences, monitor devices, boost operations, and quickly identify patterns, anomalies, and trends in your data.
Azure Stream Analytics or Azure Data Explorer?
Data Explorer is for on-demand or interactive near real-time analytics, data exploration on large volumes of data streams, seasonality decomposition, ad hoc work, dashboards, and root cause analyses on data ranging from near real-time to historical. It will not suit your use case if you need to deploy analytics onto the edge.
You can set up a Stream Analytics job that integrates with Azure Machine Learning Studio.
Data Explorer provides a native function for forecasting time series, based on the same decomposition model. Forecasting is useful for many scenarios like preventive maintenance, resource planning, and more.
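Reusing the client and database names from the sketch above, a forecasting query might look like this; series_decompose_forecast() predicts the trailing (future) buckets that make-series leaves empty:

```python
# Extend the series 24 hours into the future, then forecast those points.
query = """
Telemetry
| make-series Value=avg(Reading) default=real(null)
    on Timestamp from ago(7d) to now()+24h step 1h by SensorId
| extend forecast = series_decompose_forecast(Value, 24)
"""
response = client.execute(DATABASE, query)
```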
Stream Analytics does not provide seasonality support and is limited by its sliding window size.
Data Explorer provides functions that automatically detect the periods in a time series, or that let you verify that a metric should have specific distinct period(s) if you know them.
Stream Analytics does not support decomposition.
Data Explorer provides a function that takes a set of time series and automatically decomposes each one into its seasonal, trend, residual, and baseline components.
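That function is series_decompose(); again reusing the client from the anomaly-detection sketch, a minimal example:

```python
# Decompose each sensor's series into its components in one call.
query = """
Telemetry
| make-series Value=avg(Reading) on Timestamp step 1h by SensorId
| extend (baseline, seasonal, trend, residual) = series_decompose(Value)
"""
response = client.execute(DATABASE, query)
```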
Filtering and Analysis
Stream Analytics provides functions to detect spikes and dips or change points.
Data Explorer provides analysis that finds anomalous points on a set of time series, and a root cause analysis (RCA) function to run after an anomaly is detected.
Stream Analytics lets you filter with reference data, whether slow-moving or static.
Data Explorer provides two generic functions:
• Finite impulse response (FIR) filtering, which can be used for moving averages, differentiation, and shape matching (see the sketch after this list)
• Infinite impulse response (IIR) filtering, for exponential smoothing and cumulative sums
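For example (coefficients are illustrative, and the client is the one from the anomaly-detection sketch above): a 5-point moving average via series_fir() and exponential smoothing via series_iir():

```python
query = """
Telemetry
| make-series Value=avg(Reading) on Timestamp step 1h by SensorId
| extend ma5 = series_fir(Value, dynamic([1,1,1,1,1]), true, true)
| extend smooth = series_iir(Value, dynamic([0.2]), dynamic([1, -0.8]))
"""
response = client.execute(DATABASE, query)
```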
Stream Analytics provides detections for:
• Spikes and dips (temporary anomalies)
• Change points (persistent anomalies such as level or trend change)
Data Explorer provides detections for:
• Spikes & dips, based on enhanced seasonal decomposition model (supporting automatic seasonality detection, robustness to anomalies in the training data)
• Changepoint (level shift, trend change) by segmented linear regression
• KQL Inline Python/R plugins enable extensibility with other models implemented in Python or R
Azure Data Analytics, in general, brings you best-of-breed technologies for each workload. The new Real-Time Analytics architecture (shown above) allows you to leverage the best technology for each type of workload for stream and time-series analytics, including anomaly detection. The following is a list of resources that may help you get started quickly:
If you haven’t already, check out this GitHub repository for Anomaly detection in Azure Stream Analytics
Check out this GitHub repository for Anomaly detection and forecasting in Azure Data Explorer, and Time series analysis in Azure Data Explorer.
Documentation on Kusto query language and Time Series Analysis
Microsoft Sustainability Calculator helps enterprises analyze the carbon emissions of their IT infrastructure
For more than a decade, Microsoft has been investing to reduce environmental impact while supporting the digital transformation of organizations around the world through cloud services. We strive to be transparent with our commitments, evidenced by our announcement that Microsoft’s cloud datacenters will be powered by 100 percent renewable energy sources by 2025. The commitments and investments we make as a company are important steps in reducing our own environmental impact, but we recognize that the opportunity for positive change is greatest by empowering customers and partners to achieve their own sustainability goals.
An industry first—the Microsoft Sustainability Calculator
Today we’re announcing the availability of the Microsoft Sustainability Calculator, a Power BI application for Azure enterprise customers that provides new insight into the carbon emissions data associated with their Azure services. Migrating from traditional datacenters to cloud services significantly improves efficiency; however, enterprises are now looking for additional insight into the carbon impact of their cloud workloads to help them make more sustainable computing decisions. For the first time, those responsible for reporting on and driving sustainability within their organizations will have the ability to quantify the carbon impact of each Azure subscription over a period of time and by datacenter region, as well as see estimated carbon savings from running those workloads on Azure versus on-premises datacenters. This data is crucial for reporting existing emissions and is the first step in establishing a foundation to drive further decarbonization efforts.
Providing transparency with rigorous methodology
The tool’s calculations are based on a customer’s Azure consumption, informed by the research in the 2018 whitepaper, “The Carbon Benefits of Cloud Computing: a Study of the Microsoft Cloud”, and have been independently verified by Apex, a leading environmental verification body. The calculator factors in inputs such as the energy requirements of the Azure service, the energy mix of the electric grid serving the hosting datacenters, Microsoft’s procurement of renewable energy in those datacenters, and the emissions associated with the transfer of data over the internet. The result is an estimate of the greenhouse gas (GHG) emissions, measured in total metric tons of carbon dioxide equivalent (MTCO2e), related to a customer’s consumption of Azure.
The calculator gives a granular view of the estimated emissions savings from running workloads on Azure by accounting for Microsoft’s IT operational efficiency, IT equipment efficiency, and datacenter infrastructure efficiency compared to that of a typical on-premises deployment. It also estimates the emissions savings attributable to a customer from Microsoft’s purchase of renewable energy.
We also understand customers want transparency into the specific commitments we are making to build a more sustainable cloud. To make that information easily accessible, we’ve built a view within the tool of the renewable energy projects that Microsoft has invested in as part of its carbon neutral and renewable energy commitments. Each year Microsoft purchases renewable energy to cover its annual cloud consumption. Customers can use the world map to learn about projects in regions where they consume Azure services or have a regional presence. The projects are examples of the investments that Microsoft has made since 2012.
A path to actionable insight
Azure enterprise customers can get started by downloading the Microsoft Sustainability Calculator from AppSource now and following the included setup instructions. We’re excited by the opportunity this new tool provides for our customers to gain a deeper understanding of their current infrastructure and drive meaningful sustainability conversations within their organizations. We see this as a first step and plan to deepen and expand the tool’s capabilities in the future. We know our customers would like an even more comprehensive view of the sustainability benefits of our cloud services and look forward to supporting and enabling them in their journey.
Amazon Aurora with MySQL compatibility now supports the ANSI READ COMMITTED isolation level on read replicas. This isolation level enables long-running queries on an Aurora read replica to execute without impacting the throughput of writes on the writer node.
You’re moving faster than ever to build new applications, innovate, and bring value to your customers. Anthos, Google Cloud’s open application modernization platform, can help you modernize your existing applications, making them more portable, maintainable, scalable, and secure. And now, our newest learning specialization, Architecting Hybrid Cloud Infrastructure with Anthos, is live, showing how you can use its technologies to transform your IT environments.
Designed for infrastructure operators, architects, and DevOps professionals, Architecting Hybrid Cloud Infrastructure with Anthos teaches you how to modernize, observe, secure, and manage your applications using Istio-powered service mesh and Kubernetes, whether you’re on-premises, on Google Cloud, or distributed across both. With a mix of lectures and hands-on labs, you’ll learn about compute, networking, service mesh, config management, and their underlying control-planes, so you can begin to understand the full scope of the platform’s capabilities. The training also unpacks the complexities of modern environments, and equips you with the foundational knowledge needed to address challenges such as migrating and scaling among environments hosted in multiple regions and by multiple providers.
This specialization builds on the Architecting with Google Kubernetes Engine (GKE) learning specialization, and assumes that students have extensive hands-on experience with Kubernetes. Architecting Hybrid Cloud Infrastructure with Anthos is delivered as three courses, which are available on demand and in a classroom setting:
Hybrid Cloud Infrastructure Foundations with Anthos – This course lays the groundwork for assembling hybrid infrastructure by presenting the Anthos platform architecture including Anthos GKE and Anthos Service Mesh.
Hybrid Cloud Service Mesh with Anthos – Gain the practical skills you need to deploy a service mesh to overcome challenges in multi-service application management, operation, and delivery.
Hybrid Cloud Multi-Cluster with Anthos – The final course will help you to understand configuration and get hands-on practice to manage a multi-cluster Anthos GKE deployment, including on-premises and in-cloud clusters.
Interested in hearing more? Register today for our webinar, Architecting Hybrid Cloud Infrastructure with Anthos, on Jan 31 at 9:00 am PST to get hands-on Anthos experience and receive a special discount on additional Anthos training.