Developer updates from Coral

Posted by The Coral Team

We’re always excited to share updates to our Coral platform for building edge ML applications. In this post, we have some interesting demos, interfaces, and tutorials to share, and we’ll start by pointing you to an important software update for the Coral Dev Board.

Important update for the Dev Board / SoM

If you have a Coral Dev Board or Coral SoM, please install our latest Mendel update as soon as possible to receive a critical fix to part of the SoC power configuration. To get it, just log onto your board and run a full system update with the package manager.

This installs a patch from NXP for the Dev Board / SoM's SoC; without it, the SoC may be overstressed and the lifetime of the device could be reduced. If you recently flashed your board with the latest system image, you might already have this fix (we also updated the flashable image today), but it never hurts to fetch all updates.

Note: This update does not apply to the Dev Board Mini.

Manufacturing demo

We recently published the Coral Manufacturing Demo, which demonstrates how to use a single Coral Edge TPU to simultaneously address two common manufacturing use cases: worker safety and visual inspection.

The demo ships with two specific videos and tasks (worker keepout detection and apple quality grading), but it is designed to be easily customized with different inputs and tasks. The demo, written in C++, requires OpenGL and primarily targets the x86 systems that are prevalent in manufacturing gateways, although ARM Cortex-A systems, like the Coral Dev Board, are also supported.
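
The demo itself is written in C++, but the core pattern it illustrates (interleaving two models on one Edge TPU) can be sketched in Python with the TensorFlow Lite runtime. The sketch below is ours, not the demo's code, and the model file names are placeholders:

  # Minimal sketch (not the demo's C++ code): two TFLite models sharing one Edge TPU.
  import numpy as np
  from tflite_runtime.interpreter import Interpreter, load_delegate

  def make_interpreter(model_path):
      # Both interpreters attach to the same Edge TPU through the libedgetpu delegate;
      # their inferences are simply interleaved frame by frame.
      return Interpreter(model_path=model_path,
                         experimental_delegates=[load_delegate('libedgetpu.so.1')])

  keepout = make_interpreter('worker_keepout_detector_edgetpu.tflite')   # placeholder name
  grading = make_interpreter('apple_quality_classifier_edgetpu.tflite')  # placeholder name
  for interp in (keepout, grading):
      interp.allocate_tensors()

  def run(interp, frame_uint8):
      # frame_uint8 is assumed to be already resized to the model's input shape.
      inp = interp.get_input_details()[0]
      interp.set_tensor(inp['index'], frame_uint8[np.newaxis, ...])
      interp.invoke()
      return interp.get_tensor(interp.get_output_details()[0]['index'])

  # For each camera frame, run both use cases back to back on the shared accelerator:
  # detections = run(keepout, safety_frame); scores = run(grading, inspection_frame)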


WebCoral

We’ve been working hard to make ML acceleration with the Coral Edge TPU available on the most popular systems. So we’re proud to announce support for WebUSB, allowing you to use the Coral USB Accelerator directly from Chrome. To get started, check out our WebCoral demo, which builds a webpage where you can select a model and run an inference accelerated by the Edge TPU.


New models repository

We recently released a new models repository that makes it easier to explore the various trained models available for the Coral platform, including image classification, object detection, semantic segmentation, pose estimation, and speech recognition. Each family page lists the various models, including details about training dataset, input size, latency, accuracy, model size, and other parameters, making it easier to select the best fit for the application at hand. Lastly, each family page includes links to training scripts and example code to help you get started. Or for an overview of all our models, you can see them all on one page.

Trained TensorFlow models for the Edge TPU

Transfer learning tutorials

Even with our collection of pre-trained models, it can sometimes be tricky to create a task-specific model that’s compatible with our Edge TPU accelerator. To make this easier, we’ve released some new Google Colab tutorials that allow you to perform transfer learning for object detection, using MobileDet and EfficientDet-Lite models. You can find these and other Colabs in our GitHub Tutorials repo.
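
For a flavor of what those Colabs walk you through, here is a rough sketch of transfer learning an EfficientDet-Lite detector with the TensorFlow Lite Model Maker library. The dataset layout and labels below are hypothetical, and the exact steps in the tutorials may differ:

  # Rough sketch of EfficientDet-Lite transfer learning for the Edge TPU, assuming
  # the tflite-model-maker library; dataset paths and labels are hypothetical.
  from tflite_model_maker import object_detector

  spec = object_detector.EfficientDetLite0Spec()   # smallest EfficientDet-Lite variant
  train_data = object_detector.DataLoader.from_pascal_voc(
      images_dir='images/train', annotations_dir='annotations/train',
      label_map={1: 'apple', 2: 'orange'})         # hypothetical classes

  model = object_detector.create(train_data, model_spec=spec,
                                 epochs=30, batch_size=8, train_whole_model=False)
  model.export(export_dir='.', tflite_filename='detector.tflite')
  # The exported detector.tflite is then compiled for the Edge TPU with the
  # edgetpu_compiler tool before deployment.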

We are excited to share all that Coral has to offer as we continue to evolve our platform. Keep an eye out for more software and platform related news coming this summer. To discover more about our edge ML platform, please visit Coral.ai and share your feedback at [email protected].

Machine Learning GDEs: Q1 2021 highlights, projects and achievements

Posted by HyeJung Lee and MJ You, Google ML Ecosystem Community Managers. Reviewed by Soonson Kwon, Developer Relations Program Manager.

Google Developer Experts is a community of passionate developers who love to share their knowledge with others. Many of them specialize in Machine Learning (ML). Despite many unexpected changes over the last few months and reduced opportunities for in-person activities during the ongoing pandemic, their enthusiasm has not waned.

Here are some highlights of the ML GDEs' hard work during Q1 2021, which contributed to the global ML ecosystem.

ML GDE YouTube channel

ML GDE YouTube page

With the initiative and leadership of US-based GDE Margaret Maynard-Reid, we launched the ML GDEs YouTube channel. It is a great way for GDEs to reach global audiences, collaborate as a community, create unique content, and promote each other’s work. It will cover all kinds of ML-related topics: talks on technical topics, tutorials, and interviews with other ML GDEs, Googlers, or anyone in the ML community. Many videos have already been uploaded, including intros from ML GDEs all over the world, tips for TensorFlow & GCP certification, and how to use Google Cloud Platform. Subscribe to the channel now!

TensorFlow Everywhere

TensorFlow Everywhere logo

17 ML GDEs presented at TensorFlow Everywhere (a global community-led event series for TensorFlow and Machine Learning enthusiasts and developers around the world) hosted by local TensorFlow user groups. You can watch the recorded sessions in the TensorFlow Everywhere playlist on the ML GDE YouTube channel. Most of the sessions cover new features in TensorFlow.

International Women’s Day

Many ML GDEs participated in activities to celebrate International Women’s Day (March 8th). GDE Ruqiya Bin Safi (based in Saudi Arabia) cooperated with WTM Saudi Arabia to organize “Socialthon” – social development hackathons – and gave a talk, “Successful Experiences in Social Development”, which reached 77K live viewers and hit 10K replays. India-based GDE Charmi Chokshi participated in GirlScript’s International Women’s Day event and gave a talk: “Women In Tech and How we can help the underrepresented in the challenging world”. If you’re looking for more inspiring materials, check out the “Women in AI” playlist on our ML GDE YouTube channel!

Mentoring

ML GDEs are also very active in mentoring community developers, students in the Google Developer Student Clubs and startups in the Google for Startups Accelerator program. Among many, GDE Arnaldo Gualberto (Brazil) conducted mentorship sessions for startups in the Google Fast Track program, discussing how to solve challenges using Machine Learning/Deep Learning with TensorFlow.

TensorFlow

Practical Adversarial Robustness in Deep Learning: Problems and Solutions
ML using TF cookbook and ML for Dummies book

Meanwhile in Europe, GDEs Alexia Audevart (based in France) and Luca Massaron (based in Italy) released the “Machine Learning using TensorFlow Cookbook”. It provides simple and effective ideas to successfully use TensorFlow 2.x in computer vision, NLP and tabular data projects. Additionally, Luca published the second edition of the Machine Learning For Dummies book, first published in 2015. The latest edition is enhanced with product updates, the principal one being a larger share of pages devoted to the discussion of Deep Learning and TensorFlow / Keras usage.


On top of her women-in-tech related activities, Ruqiya Bin Safi is also running a “Welcome to Deep Learning Course and Orientation” monthly workshop throughout 2021. The course aims to help participants gain foundational knowledge of deep learning algorithms and get practical experience in building neural networks in TensorFlow.

TensorFlow Project showcase

Nepal-based GDE Kshitiz Rimal gave a talk “TensorFlow Project Showcase: Cash Recognition for Visually Impaired” on his project which uses TensorFlow, Google Cloud AutoML and edge computing technologies to create a solution for the visually impaired community in Nepal.

Screenshot of TF Everywhere NA talk

On the other side of the world, in Canada, GDE Tanmay Bakshi presented a talk, “Machine Learning-powered Pipelines to Augment Human Specialists”, during TensorFlow Everywhere NA. It covered the world of NLP through Deep Learning, how it’s historically been done, the Transformer revolution, and how to use TensorFlow & Keras to implement use cases ranging from small-scale name generation to large-scale Amazon review quality ranking.

Google Cloud Platform

Google Cloud Platform YouTube playlist screenshot

We have been equally busy on the GCP side as well. In the US, GDE Srivatsan Srinivasan created a series of videos called “Artificial Intelligence on Google Cloud Platform”, with one of the episodes, “Google Cloud Products and Professional Machine Learning Engineer Certification Deep Dive“, getting over 3,000 views.

ML Analysis Pipeline

Korean GDE Chansung Park contributed to TensorFlow User Group Korea with his “Machine Learning Pipeline (CI/CD for ML Products in GCP)” analysis, focused on machine learning pipelines in Google Cloud Platform.

Analytics dashboard

Last but not least, GDE Gad Benram, based in Israel, wrote an article on “Seven Tips for Forecasting Cloud Costs”, where he explains how to build and deploy ML models for time series forecasting with Google Cloud Run. It ties in with his solution for building a cloud-spend control system that helps users more easily analyze their cloud costs.

If you want to know more about the Google Developer Experts community and all their global open-source ML contributions, visit the GDE Directory and connect with GDEs on Twitter and LinkedIn. You can also meet them virtually on the ML GDE YouTube channel!

Irem from Turkey shares her groundbreaking work in TensorFlow and advice for the community


Posted by Jennifer Kohl, Global Program Manager, Google Developer Groups

Irem presenting at a Google Developer Group event

We recently caught up with Irem Komurcu, a TensorFlow developer and researcher at Istanbul Technical University in Turkey. Irem has been a long-serving member of Google Developer Groups (GDG) Düzce and also serves as a Women Techmakers (WTM) ambassador. Her work with TensorFlow has received several accolades, including being named a Hamdi Ulukaya Girişimi fellow. As one of twenty-four young entrepreneurs selected, she was flown to New York City last year to learn more about business and receive professional development.

With all this experience to share, we wanted you to hear how she approaches pursuing a career in tech, hones her TensorFlow skills with the GDG community, and thinks about how upcoming programmers can best position themselves for success. Check out the full interview below for more.

What inspired you to pursue a career in technology?

I first became interested in tech when I was in high school and went on to study computer engineering. At university, I had an eye-opening experience when I traveled from Turkey to the Google Developer Day event in India. It was here where I observed various code languages, products, and projects that were new to me.

In particular, I saw TensorFlow in action for the first time. Watching the powerful machine learning tool truly sparked my interest in deep learning and project development.

Can you describe your work with TensorFlow and Machine Learning?

I have studied many different aspects of TensorFlow and ML. My first work was on voice recognition and deep learning. However, I am now working as a computer vision researcher conducting various segmentation, object detection, and classification processes with TensorFlow. In my free time, I write various articles about best practices and strategies to leverage TensorFlow in ML.

What has been a useful learning resource you have used in your career?

I kicked off my studies on deep learning on tensorflow.org. It’s a basic first step, but a powerful one. There were so many blogs, code samples, examples, and tutorials for me to dive into. Both the Google Developer Group and TensorFlow communities also offered chances to bounce questions and ideas off other developers as I learned.

Between these technical resources and the person-to-person support, I was lucky to start working with the GDG community while also taking the first steps of my career. There were so many opportunities to meet people and grow all around.

What is your favorite part of the Google Developer Group community?

I love being in a large community with technology-oriented people. GDG is a network of professionals who support each other, and that enables people to develop. I am continuously sharing my knowledge with other programmers as they simultaneously mentor me. The chance for us to collaborate together is truly fulfilling.

What is unique about being a developer in your country/region?

The number of women supported in science, technology, engineering, and mathematics (STEM) is low in Turkey. To address this, I partner with Women Techmakers (WTM) to give educational talks on TensorFlow and machine learning to women who want to learn how to code in my country. So many women are interested in ML, but just need a friendly, familiar face to help them get started. With WTM, I’ve already given over 30 talks to women in STEM.

What advice would you give to someone who is trying to grow their career as a developer?

Keep researching new things. Read everything you can get your eyes on. Technology has been developing rapidly, and it is necessary to make sure your mind can keep up with the pace. That’s why I recommend communities like GDG that help make sure you’re up to date on the newest trends and learnings.

Want to work with other developers like Irem? Then find the right Google Developer Group for you, here.

MediaPipe 3D Face Transform

Posted by Kanstantsin Sokal, Software Engineer, MediaPipe team

Earlier this year, the MediaPipe Team released the Face Mesh solution, which estimates the approximate 3D face shape via 468 landmarks in real-time on mobile devices. In this blog, we introduce a new face transform estimation module that establishes a researcher- and developer-friendly semantic API useful for determining the 3D face pose and attaching virtual objects (like glasses, hats or masks) to a face.

The new module establishes a metric 3D space and uses the landmark screen positions to estimate common 3D face primitives, including a face pose transformation matrix and a triangular face mesh. Under the hood, a lightweight statistical analysis method called Procrustes Analysis is employed to drive a robust, performant and portable logic. The analysis runs on CPU and has a minimal speed/memory footprint on top of the original Face Mesh solution.


Figure 1: An example of virtual mask and glasses effects, based on the MediaPipe Face Mesh solution.

Introduction

The MediaPipe Face Landmark Model performs a single-camera face landmark detection in the screen coordinate space: the X- and Y- coordinates are normalized screen coordinates, while the Z coordinate is relative and is scaled as the X coordinate under the weak perspective projection camera model. While this format is well-suited for some applications, it does not directly enable crucial features like aligning a virtual 3D object with a detected face.

The newly introduced module moves away from the screen coordinate space towards a metric 3D space and provides the necessary primitives to handle a detected face as a regular 3D object. By design, you’ll be able to use a perspective camera to project the final 3D scene back into the screen coordinate space with a guarantee that the face landmark positions are not changed.

Metric 3D Space

The Metric 3D space established within the new module is a right-handed orthonormal metric 3D coordinate space. Within the space, there is a virtual perspective camera located at the space origin and pointed in the negative direction of the Z-axis. It is assumed that the input camera frames are observed by exactly this virtual camera, and therefore its parameters are later used to convert the screen landmark coordinates back into the Metric 3D space. The virtual camera parameters can be set freely; however, for better results it is advised to set them as close to the real physical camera parameters as possible.


Figure 2: A visualization of multiple key elements in the metric 3D space. Created in Cinema 4D

Canonical Face Model

The Canonical Face Model is a static 3D model of a human face, which follows the 3D face landmark topology of the MediaPipe Face Landmark Model. The model serves two important functions:

  • Defines metric units: the scale of the canonical face model defines the metric units of the Metric 3D space. A metric unit used by the default canonical face model is a centimeter;
  • Bridges static and runtime spaces: the face pose transformation matrix is – in fact – a linear map from the canonical face model into the runtime face landmark set estimated on each frame. This way, virtual 3D assets modeled around the canonical face model can be aligned with a tracked face by applying the face pose transformation matrix to them, as sketched right after this list.
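
As a concrete illustration of that second point, here is a minimal sketch of our own (not MediaPipe's API) that applies a per-frame 4x4 face pose transformation matrix to the vertices of an asset authored around the canonical face model:

  # Minimal sketch (not MediaPipe code): aligning a virtual asset with a tracked face
  # by applying the per-frame 4x4 face pose transformation matrix to its vertices.
  import numpy as np

  def transform_asset(vertices_xyz, pose_matrix_4x4):
      """vertices_xyz: (N, 3) asset vertices authored around the canonical face model
      (in centimeters). pose_matrix_4x4: the face pose transformation matrix for the
      current frame. Returns the vertices expressed in the runtime metric 3D space."""
      n = vertices_xyz.shape[0]
      homogeneous = np.hstack([vertices_xyz, np.ones((n, 1))])   # (N, 4)
      transformed = homogeneous @ pose_matrix_4x4.T              # row-vector convention
      return transformed[:, :3]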

Face Transform Estimation

The face transform estimation pipeline is a key component, responsible for estimating face transform data within the Metric 3D space. On each frame, the following steps are executed in the given order:

  • Face landmark screen coordinates are converted into the Metric 3D space coordinates;
  • Face pose transformation matrix is estimated as a rigid linear mapping from the canonical face metric landmark set into the runtime face metric landmark set in a way that minimizes the difference between the two (see the sketch after this list);
  • A face mesh is created using the runtime face metric landmarks as the vertex positions (XYZ), while both the vertex texture coordinates (UV) and the triangular topology are inherited from the canonical face model.
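
To illustrate the second step, a rigid alignment of this kind can be estimated with a small SVD-based, Procrustes-style routine. The sketch below is our own illustration, not MediaPipe's internal implementation, and it ignores details such as landmark weighting:

  # Illustrative rigid (rotation + translation) alignment between two landmark sets,
  # in the spirit of Procrustes analysis; not MediaPipe's actual implementation.
  import numpy as np

  def estimate_rigid_transform(canonical, runtime):
      """canonical, runtime: (N, 3) arrays of corresponding metric 3D landmarks.
      Returns a 4x4 matrix mapping canonical landmarks onto the runtime set."""
      mu_c, mu_r = canonical.mean(axis=0), runtime.mean(axis=0)
      h = (canonical - mu_c).T @ (runtime - mu_r)        # 3x3 cross-covariance
      u, _, vt = np.linalg.svd(h)
      d = np.sign(np.linalg.det(vt.T @ u.T))             # guard against reflections
      rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
      translation = mu_r - rotation @ mu_c
      pose = np.eye(4)
      pose[:3, :3], pose[:3, 3] = rotation, translation
      return pose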

Effect Renderer

The Effect Renderer is a component that serves as a working example of a face effect renderer. It targets the OpenGL ES 2.0 API to enable real-time performance on mobile devices and supports the following rendering modes:

  • 3D object rendering mode: a virtual object is aligned with a detected face to emulate an object attached to the face (example: glasses);
  • Face mesh rendering mode: a texture is stretched on top of the face mesh surface to emulate a face painting technique.

In both rendering modes, the face mesh is first rendered as an occluder straight into the depth buffer. This step helps to create a more believable effect by hiding invisible elements behind the face surface.


Figure 3: An example of face effects rendered by the Face Effect Renderer.

Using Face Transform Module

The face transform estimation module is available as a part of the MediaPipe Face Mesh solution. It comes with face effect application examples, available as graphs and mobile apps on Android or iOS. If you wish to go beyond examples, the module contains generic calculators and subgraphs – those can be flexibly applied to solve specific use cases in any MediaPipe graph. For more information, please visit our documentation.

Follow MediaPipe

We look forward to publishing more blog posts related to new MediaPipe pipeline examples and features. Please follow the MediaPipe label on the Google Developers Blog and the Google Developers Twitter account (@googledevs).

Acknowledgements

We would like to thank Chuo-Ling Chang, Ming Guang Yong, Jiuqiang Tang, Gregory Karpiak, Siarhei Kazakou, Matsvei Zhdanovich and Matthias Grundman for contributing to this blog post.

Doubling down on the edge with Coral’s new accelerator

Posted by The Coral Team


Moving into the fall, the Coral platform continues to grow with the release of the M.2 Accelerator with Dual Edge TPU. Its first application is in Google’s Series One room kits, where it helps to remove interruptions and make the audio clearer for better video meetings. To help even more folks build products with Coral intelligence, we’re dropping the prices on several of our products. And for those looking to level up their at-home video production, we’re sharing a demo of a pose-based AI director that makes multi-camera video easier to produce.

Coral M.2 Accelerator with Dual Edge TPU

The newest addition to our product family brings two Edge TPU co-processors to systems in an M.2 E-key form factor. While the design requires a dual bus PCIe M.2 slot, it brings enhanced ML performance (8 TOPS) to tasks such as running two models in parallel or pipelining one large model across both Edge TPUs.

The ability to scale across multiple edge accelerators isn’t limited to only two Edge TPUs. As edge computing expands to local data centers, cell towers, and gateways, multi-Edge TPU configurations will be required to help process increasingly sophisticated ML models. Coral allows the use of a single toolchain to create models for one or more Edge TPUs that can address many different future configurations.

A great example of how the Coral M.2 Accelerator with Dual Edge TPU is being used is in the Series One meeting room kits for Google Meet.

The new Series One room kits for Google Meet run smarter with Coral intelligence


Google’s new Series One room kits use our Coral M.2 Accelerator with Dual Edge TPU to bring enhanced audio clarity to video meetings. TrueVoice®, a multi-channel noise cancellation technology, minimizes distractions to ensure every voice is heard with up to 44 channels of echo and noise cancellation, making distracting sounds like snacking or typing on a keyboard a concern of the past.

Enabling the clearest possible communication in challenging environments was the target for the Google Meet hardware team. The consideration of what makes a challenging environment was not limited to unusually noisy environments, such as lunchrooms doubling as conference rooms. Any conference room can present challenging acoustics that make it difficult for all participants to be heard.

The secret to clarity without expensive and cumbersome equipment is to use virtual audio channels and AI driven sound isolation. Read more about how Coral was used to enhance and future-proof the innovative design.

Expanding the AI edge

Earlier this year, we reduced the prices of our prototyping devices and sensors. We are excited to share further price drops on more of our products. Our System-on-Module is now available for $99.99, and our Mini PCIe Accelerator, M.2 Accelerator A+E Key, and M.2 Accelerator B+M Key are now available at $24.99. We hope this lower price will make our edge AI more accessible to more creative minds around the world. Later this month, our SoM offering will also expand to include 2GB and 4GB RAM options.

Multi-cam with AI


As we expand our platform and product family, we continue to keep new edge AI use cases in mind. We are continually inspired by our developer community’s experimentation and implementations. When recently faced with the challenges of multicam video production from home, Markku Lepistö, Solutions Architect at Google Cloud, created this real-time, pose-based multicam tool he so aptly dubbed AI Director.

We love seeing such unique implementations of on-device ML and invite you to share your own projects and feedback at [email protected].

For a list of worldwide distributors, system integrators and partners, visit the Coral partnerships page. Please visit Coral.ai to discover more about our edge ML platform.

Instant Motion Tracking with MediaPipe

Posted by Vikram Sharma, Software Engineering Intern; Jianing Wei, Staff Software Engineer; Tyler Mullen, Senior Software Engineer

Augmented Reality (AR) technology creates fun, engaging, and immersive user experiences. The ability to perform AR tracking across devices and platforms, without initialization, remains important for powering AR applications at scale.

Today, we are excited to release the Instant Motion Tracking solution in MediaPipe. It is built upon the MediaPipe Box Tracking solution we released previously. With Instant Motion Tracking, you can easily place fun virtual 2D and 3D content on static or moving surfaces, allowing them to seamlessly interact with the real world. This technology also powered MotionStills AR. Along with the library, we are releasing an open source Android application to showcase its capabilities. In this application, a user simply taps the camera viewfinder in order to place virtual 3D objects and GIF animations, augmenting the real-world environment.


Instant Motion Tracking in MediaPipe

Instant Motion Tracking

The Instant Motion Tracking solution provides the capability to seamlessly place virtual content on static or moving surfaces in the real world. To achieve that, we provide six-degrees-of-freedom tracking with relative scale in the form of rotation and translation matrices. This tracking information is then used in the rendering system to overlay virtual content on camera streams to create immersive AR experiences.

The core concept behind Instant Motion Tracking is to decouple the camera’s translation and rotation estimation, treating them instead as independent optimization problems. This approach enables AR tracking across devices and platforms without initialization or calibration. We do this by first finding the 3D camera translation using only the visual signals from the camera. This involves estimating the target region’s apparent 2D translation and relative scale across frames. The process can be illustrated with a simple pinhole camera model, relating translation and scale of an object in the image plane to the final 3D translation.


By finding the change in relative size of our tracked region from view position V1 to V2, we can estimate the relative change in distance from the camera.

Next, we obtain the device’s 3D rotation from its built-in IMU (Inertial Measurement Unit) sensor. By combining this translation and rotation data, we can track a target region with six degrees of freedom at relative scale. This information allows for the placement of virtual content on any system with a camera and IMU functionality, and is calibration free. For more details on Instant Motion Tracking, please refer to our paper.
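
Under a simple pinhole model, the relation is straightforward: the region's apparent scale is inversely proportional to its distance from the camera. Below is a toy sketch of our own (assuming a unit focal length; not MediaPipe's implementation) that recovers the relative 3D translation from the 2D position and scale change:

  # Toy illustration (assumed pinhole camera): recovering relative 3D translation
  # from the tracked region's image-plane position and apparent scale change.
  def relative_translation(u1, v1, s1, u2, v2, s2, f=1.0, z1=1.0):
      """(u, v): image-plane center of the tracked region; s: its apparent scale.
      Depth is inversely proportional to apparent scale, so z2 = z1 * s1 / s2."""
      z2 = z1 * (s1 / s2)                      # the region shrinks as it moves away
      x1, y1 = u1 * z1 / f, v1 * z1 / f        # back-project view V1
      x2, y2 = u2 * z2 / f, v2 * z2 / f        # back-project view V2
      return (x2 - x1, y2 - y1, z2 - z1)       # camera-relative 3D translation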

A MediaPipe Pipeline for Instant Motion Tracking

A diagram of the Instant Motion Tracking pipeline is shown below, consisting of four major components: a Sticker Manager module, a Region Tracking module, a Matrices Manager module, and lastly a Rendering System. Each of the components consists of MediaPipe calculators or subgraphs.


Diagram of Instant Motion Tracking Pipeline

The Sticker Manager accepts sticker data from the application and produces initial anchors (tracked region information) based on user taps, and user gesture controls for every sticker object. Initial anchors are then sent to our Region Tracking module to generate tracked anchors. The Matrices Manager combines this data with our device’s rotation matrix to produce six degrees-of-freedom poses as model matrices. After integrating any user-specified transforms like asset scaling, our final poses are forwarded to the Rendering System to render all virtual objects overlaid on the camera frame to produce the output AR frame.
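
Conceptually, each model matrix packs the IMU rotation and the tracked anchor translation into a single 4x4 pose. The simplified sketch below is ours (not the MediaPipe calculator code) and omits the per-sticker user transforms:

  # Simplified sketch (not the MediaPipe calculator code): composing a 6DoF model
  # matrix from the device's IMU rotation and the tracked anchor translation.
  import numpy as np

  def compose_model_matrix(imu_rotation_3x3, anchor_translation_xyz, asset_scale=1.0):
      model = np.eye(4)
      model[:3, :3] = asset_scale * imu_rotation_3x3   # rotation plus user scaling
      model[:3, 3] = anchor_translation_xyz            # translation from region tracking
      return model                                     # forwarded to the rendering system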

Using the Instant Motion Tracking Solution

The Instant Motion Tracking solution is easy to use by leveraging the MediaPipe cross-platform framework. With camera frames, device rotation matrix, and anchor positions (screen coordinates) as input, the MediaPipe graph produces AR renderings for each frame, providing engaging experiences. If you wish to integrate this Instant Motion Tracking library with your system or application, please visit our documentation to build your own AR experiences on any device with IMU functionality and a camera sensor.

Augmenting The World with 3D Stickers and GIFs

The Instant Motion Tracking solution allows you to bring both 3D stickers and GIF animations into Augmented Reality experiences. GIFs are rendered on flat 3D billboards placed in the world, introducing fun and immersive experiences with animated content blended into the real environment. Try it for yourself!


Demonstration of GIF placement in 3D

MediaPipe Instant Motion Tracking is already helping PixelShift.AI, a startup applying cutting-edge vision technologies to facilitate video content creation, track virtual characters seamlessly in the viewfinder for a realistic experience. Building upon Instant Motion Tracking’s high-quality pose estimation, PixelShift.AI enables VTubers to create mixed reality experiences with web technologies. The product is going to be released to the broader VTuber community later this year.


Instant Motion Tracking helps PixelShift.AI create mixed reality experiences

Follow MediaPipe

We look forward to publishing more blog posts related to new MediaPipe pipeline examples and features. Please follow the MediaPipe label on the Google Developers Blog and the Google Developers Twitter account (@googledevs).

Acknowledgement

We would like to thank Vikram Sharma, Jianing Wei, Tyler Mullen, Chuo-Ling Chang, Ming Guang Yong, Jiuqiang Tang, Siarhei Kazakou, Genzhi Ye, Camillo Lugaresi, Buck Bourdon, and Matthias Grundman for their contributions to this release.

Summer updates from Coral

Posted by the Coral Team

Summer has arrived along with a number of Coral updates. We’re happy to announce a new partnership with balena that helps customers build, manage, and deploy IoT applications at scale on Coral devices. In addition, we’ve released a series of updates to expand platform compatibility, make development easier, and improve the ML capabilities of our devices.

Open-source Edge TPU runtime now available on GitHub

First up, our Edge TPU runtime is now open-source and available on GitHub, including scripts and instructions for building the library for Linux and Windows. Customers running a platform that is not officially supported by Coral, including ARMv7 and RISC-V, can now compile the Edge TPU runtime themselves and start experimenting. An open source runtime is easier to integrate into your customized build pipeline, enabling support for creating Yocto-based images as well as other distributions.

Windows drivers now available for the Mini PCIe and M.2 accelerators

Coral customers can now also use the Mini PCIe and M.2 accelerators on the Microsoft Windows platform. New Windows drivers for these products complement the previously released Windows drivers for the USB accelerator and make it possible to start prototyping with the Coral USB Accelerator on Windows and then to move into production with our Mini PCIe and M.2 products.

New fresh bits on the Coral ML software stack

We’ve also made a number of new updates to our ML tools:

  • The Edge TPU compiler is now version 14.1. It can be updated by running sudo apt-get update && sudo apt-get install edgetpu, or by following the instructions here
  • Our new Model Pipelining API allows you to divide your model across multiple Edge TPUs. The C++ version is currently in beta and the source is on GitHub
  • New embedding extractor models for EfficientNet, for use with on-device backpropagation. Embedding extractor models are compiled with the last fully-connected layer removed, allowing you to retrain for classification. Previously, only Inception and MobileNet were available; now retraining can also be done on EfficientNet (a conceptual sketch of this flow follows this list)
  • New Colab notebooks to retrain a classification model with TensorFlow 2.0 and build C++ examples
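
As a rough conceptual sketch of that retraining flow (our own illustration, not the Coral API itself), you can run a headless embedding extractor with the TensorFlow Lite interpreter and fit a small softmax classifier on the embeddings. The model file name below is hypothetical:

  # Conceptual sketch of embedding-extractor retraining (not the Coral API itself).
  # The Edge TPU model path is hypothetical.
  import numpy as np
  from tflite_runtime.interpreter import Interpreter, load_delegate

  extractor = Interpreter(
      model_path='efficientnet_embedding_extractor_edgetpu.tflite',  # hypothetical name
      experimental_delegates=[load_delegate('libedgetpu.so.1')])
  extractor.allocate_tensors()

  def embed(image_uint8):
      # Returns the embedding vector produced by the headless model.
      inp = extractor.get_input_details()[0]
      extractor.set_tensor(inp['index'], image_uint8[np.newaxis, ...])
      extractor.invoke()
      return extractor.get_tensor(extractor.get_output_details()[0]['index'])[0]

  def train_softmax(embeddings, labels, num_classes, lr=0.1, steps=200):
      # Plain gradient descent on a softmax layer over the extracted embeddings.
      w = np.zeros((embeddings.shape[1], num_classes))
      onehot = np.eye(num_classes)[labels]
      for _ in range(steps):
          logits = embeddings @ w
          logits -= logits.max(axis=1, keepdims=True)          # numerical stability
          probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
          w -= lr * embeddings.T @ (probs - onehot) / len(labels)
      return w   # classify a new image with np.argmax(embed(img) @ w)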

Balena partners with Coral to enable AI at the edge

We are excited to share that the Balena fleet management platform now supports Coral products!

Companies running a fleet of ML-enabled devices on the edge need to keep their systems up-to-date with the latest security patches in order to protect data, model IP and hardware from being compromised. Additionally, ML applications benefit from being consistently retrained to recognize new use cases with maximum accuracy. Together, Coral and balena bring simplicity and ease to the provisioning, deployment, updating, and monitoring of your ML project at the edge, moving early prototyping seamlessly toward production environments with many thousands of devices.

Read more about all the benefits of Coral devices combined with balena container technology or get started deploying container images to your Coral fleet with this demo project.

New version of Mendel Linux

Mendel Linux (5.0 release Eagle) is now available for the Coral Dev Board and SoM and includes a more stable package repository that provides a smoother updating experience. It also brings compatibility improvements and a new version of the GPU driver.

New models

Last but not least, we’ve recently released BodyPix, a Google person-segmentation model that was previously only available for TensorFlow.js, as a Coral model. This enables real-time, privacy-preserving understanding of where people (and body parts) are on a camera frame. We first demoed this at CES 2020 and it was one of our most popular demos. Using BodyPix, we can remove people from the frame, display only their outline, and aggregate over time to see heat maps of population flow.

Here are two possible applications of BodyPix: Body-part segmentation and anonymous population flow. Both are running on the Dev Board.

We’re excited to add BodyPix to the portfolio of projects the community is using to extend our models far beyond our demos—including tackling today’s biggest challenges. For example, Neuralet has taken our MobileNet V2 SSD Detection model and used it to implement Smart Social Distancing. Using the bounding boxes from person detection, they can compute a region for safe distancing and let a user know if social distance isn’t being maintained. The best part is that this is done without any sort of facial recognition or tracking; with Coral, we can accomplish this in real time in a privacy-preserving manner.
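
A conceptual sketch of that distance check (our own illustration, not Neuralet's implementation), using detected person boxes and an assumed pixels-per-meter calibration:

  # Conceptual sketch (not Neuralet's code): flag pairs of detected people that are
  # closer than a safe distance, using box centers and an assumed ground-plane scale.
  import itertools
  import math

  def too_close_pairs(person_boxes, pixels_per_meter, safe_distance_m=2.0):
      """person_boxes: list of (xmin, ymin, xmax, ymax) pixel boxes from the detector."""
      feet = [((x0 + x1) / 2.0, y1) for x0, y0, x1, y1 in person_boxes]  # bottom-center
      violations = []
      for (i, a), (j, b) in itertools.combinations(enumerate(feet), 2):
          distance_m = math.dist(a, b) / pixels_per_meter
          if distance_m < safe_distance_m:
              violations.append((i, j, distance_m))
      return violations   # e.g. raise an on-screen alert if this list is non-empty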

We can’t wait to see more projects that the community can make with BodyPix. Beyond anonymous population flow, there are endless possibilities with background and body part manipulation. Let us know what you come up with at our community channels, including GitHub and StackOverflow.

________________________

We are excited to share all that Coral has to offer as we continue to evolve our platform. For a list of worldwide distributors, system integrators and partners, including balena, visit the Coral partnerships page. Please visit Coral.ai to discover more about our edge ML platform and share your feedback at [email protected].

13 Most Common Google Cloud Reference Architectures

Posted by Priyanka Vergadia, Developer Advocate

Google Cloud is a cloud computing platform that can be used to build and deploy applications. It allows you to take advantage of the flexibility of development while scaling the infrastructure as needed.

I’m often asked by developers to provide a list of Google Cloud architectures that help to get started on the cloud journey. Last month, I decided to start a mini-series on Twitter called “#13DaysOfGCP” where I shared the most common use cases on Google Cloud. I have compiled the list of all 13 architectures in this post. Some of the topics covered are hybrid cloud, mobile app backends, microservices, serverless, CI/CD and more. If you were not able to catch it, or if you missed a few days, here is the summary!

Series kickoff: #13DaysOfGCP

  • #1: How to set up hybrid architecture in Google Cloud and on-premises
  • #2: How to mask sensitive data in chatbots using the Data Loss Prevention (DLP) API?
  • #3: How to build mobile app backends on Google Cloud?
  • #4: How to migrate Oracle Database to Spanner?
  • #5: How to set up hybrid architecture for cloud bursting?
  • #6: How to build a data lake in Google Cloud?
  • #7: How to host websites on Google Cloud?
  • #8: How to set up a Continuous Integration and Continuous Delivery (CI/CD) pipeline on Google Cloud?
  • #9: How to build serverless microservices in Google Cloud?
  • #10: Machine Learning on Google Cloud
  • #11: Serverless image, video or text processing in Google Cloud
  • #12: Internet of Things (IoT) on Google Cloud
  • #13: How to set up the BeyondCorp zero trust security model?
  • Wrap up with a puzzle

We hope you enjoy this list of the most common reference architectures. Please let us know your thoughts in the comments below!

Building a more resilient world together

Posted by Billy Rutledge, Director of the Coral team

UNDP Hackster.io COVID19 Detect Protect Poster

Recently, we’ve seen communities respond to the challenges of the coronavirus pandemic by using technology in new ways to effect positive change. It’s increasingly important that our systems are able to adapt to new contexts, handle disruptions, and remain efficient.

At Coral, we believe intelligence at the edge is a key ingredient towards building a more resilient future. By making the latest machine learning tools easy-to-use and accessible, innovators can collaborate to create solutions that are most needed in their communities. Developers are already using Coral to build solutions that can understand and react in real-time, while maintaining privacy for everyone present.

Helping our communities stay safe, together

As mandatory isolation measures begin to relax, compliance with safe social distancing protocol has become a topic of primary concern for experts across the globe. Businesses and individuals have been stepping up to find ways to use technology to help reduce the risk and spread. Many efforts are employing the benefits of edge AI—here are a few early stage examples that have inspired us.

woman and child crossing the street

In Belgium, engineers at Edgise recently used Coral to develop an occupancy monitor to aid businesses in managing capacity. With the privacy preserving properties of edge AI, businesses can anonymously count how many customers enter and exit a space, signaling when the area is too full.

A research group at the Sathyabama Institute of Science and Technology in India is using Coral to develop a wearable device to serve as a COVID-19 cough counter and health monitor, allowing medical professionals to better care for low-risk patients in an outpatient capacity. Coral’s Edge TPU enables biometric data to be processed efficiently, without draining the limited power resources available in wearable devices.

All across the US, hospitals are seeking solutions to ensure adherence to hygiene policy amongst hospital staff. In one example, a device incorporates the compact, affordable and offline benefits of the Coral modules to aid in handwashing practices at numerous stations throughout a facility.

And around the world, members of the PyImageSearch community are exploring how to train a COVID-19 face mask detector model using TensorFlow that can be used to identify whether people are wearing a mask. Open source frameworks can empower anyone to develop solutions, and with Coral components we can help bring those benefits to everyone.

Eliciting a global response

In an effort to rally greater community involvement, Coral has joined The United Nations Development Programme and Hackster.io, as a sponsor of the COVID-19 Detect and Protect Challenge. The initiative calls on developers to build affordable and reproducible solutions that support response efforts in developing countries. All ideas are welcome—whether they use ML or not—and we encourage you to participate.

To make edge ML capabilities even easier to integrate, we’re also announcing a price reduction for the Coral products widely used for experimentation and prototyping. Our Dev Board will now be offered at $129.99, the USB Accelerator at $59.99, the Camera Module at $19.99, and the Enviro Board at $14.99. Additionally, we are introducing the USB Accelerator into 10 new markets: Ghana, Thailand, Singapore, Oman, Philippines, Indonesia, Kenya, Malaysia, Israel, and Vietnam. For more details, visit Coral.ai/products.

We’re excited to see the solutions developers will bring forward with Coral. And as always, please keep sending us feedback at [email protected]

Alfred Camera: Smart camera features using MediaPipe

Guest post by the Engineering team at Alfred Camera

Please note that the information, uses, and applications expressed in the below post are solely those of our guest author, Alfred Camera.

In this article, we’d like to give you a short overview of Alfred Camera and our experience of using MediaPipe to transform our moving object feature, and how MediaPipe has made it easier to achieve our goals.

What is Alfred Camera?


Fig.1 Alfred Camera Logo

Alfred Camera is a smart home app for both Android and iOS devices, with over 15 million downloads worldwide. By downloading the app, users are able to turn their spare phones into security cameras and monitors directly, which allows them to watch their homes, shops, pets anytime. The mission of Alfred Camera is to provide affordable home security so that everyone can find peace of mind in this busy world.

The Alfred Camera team is composed of professionals in various fields, including an engineering team with several machine learning and computer vision experts. Our aim is to integrate AI technology into devices that are accessible to everyone.

Machine Learning in Alfred Camera

Alfred Camera currently has a feature called Moving Object Detection, which continuously uses the device’s camera to monitor a target scene. Once it identifies a moving object in the area, the app will begin recording the video and send notifications to the device owner. The machine learning models for detection are hand-crafted and trained by our team using TensorFlow, and run on TensorFlow Lite with good performance even on mid-tier devices. This is important because the app is leveraging old phones and we’d like the feature to reach as many users as possible.

The Challenges

We started building our AI features at Alfred Camera in 2017. In order to have a solid foundation to support our AI feature requirements for the coming years, we decided to rebuild our real-time video analysis pipeline. At the beginning of the project, the goals were to create a new pipeline that would be 1) modular enough that we could easily swap core algorithms with minimal changes in other parts of the pipeline, 2) designed with GPU acceleration in place, and 3) as cross-platform as possible, so there would be no need to create and maintain separate implementations for different platforms. Based on these goals, we surveyed several open source projects that had potential, but we ended up using none of them as they either fell short on features or did not provide the readiness and stability we were looking for.

We started a small team to prototype those goals first for the Android platform. What came later were some tough challenges well beyond what we originally anticipated. We ran into several major design changes as some key design basics were overlooked. We needed to implement utilities to do things that sounded trivial but required significant effort to make them right and fast. Dealing with asynchronous processing also led us into a bunch of timing issues, which took the team quite some effort to address. Not to mention that debugging on real devices was extremely inefficient and painful.

Things didn’t just stop here. Our product is also on iOS and we had to tackle these challenges once again. Moreover, discrepancies in the behavior between the platform-specific implementations introduced additional issues that we needed to resolve.

Even though we finally managed to get the implementations to the confidence level we wanted, it was not a very pleasant experience and we never stopped wondering if there was a better option.

MediaPipe – A Game Changer

Google open sourced the MediaPipe project in June 2019 and it immediately caught our attention. We were surprised by how perfectly it aligned with the goals we had set, and by functionality that we could not have developed with the engineering resources we had as a small company.

We immediately decided to start an evaluation project by building a new product feature directly using MediaPipe to see if it could live up to all the promises.

Migrating to MediaPipe

To start the evaluation, we decided to migrate our existing moving object feature to see what exactly MediaPipe can do.

Our current Moving Object Detection pipeline consists of the following main components:

  • (Moving) Object Detection Model
    As explained earlier, a TensorFlow Lite model trained by our team, tailored to run on mid-tier devices.
  • Low-light Detection and Low-light Filter
    Calculate the average luminance of the scene, and based on the result conditionally process the incoming frames to intensify the brightness of the pixels to let our users see things in the dark. We also control whether the detection should run, as the moving object detection model does not work properly when the frame has been processed by the filter.
  • Motion Detection
    Sending frames through Moving Object Detection still consumes a significant amount of power, even with a small model like the one we created. Running inferences continuously does not seem to be a good idea, as most of the time there may not be any moving object in front of the camera. We decided to implement a gating mechanism where frames are only sent to the Moving Object Detection model based on the movements detected in the scene. The detection is done mainly by calculating the differences between two frames, with some additional tricks that take the movement detected in the preceding few frames into consideration (a minimal sketch of this kind of gating appears after this list).
  • Area of Interest
    This is a mechanism to let users manually mask out the area where they do not want the camera to see. It can also be done automatically based on regional luminance that can be generated by the aforementioned low-light detection component.
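
Here is a minimal sketch of that kind of frame-difference gating, written by us for illustration; Alfred Camera's actual implementation runs as GPU shaders and uses additional tricks:

  # Minimal sketch (not Alfred Camera's GPU shaders): gate the object detector on
  # simple frame differencing, with a short memory of recent motion.
  import numpy as np

  class MotionGate:
      def __init__(self, pixel_threshold=25, area_threshold=0.01, memory_frames=5):
          self.prev = None
          self.pixel_threshold = pixel_threshold     # per-pixel intensity change
          self.area_threshold = area_threshold       # fraction of changed pixels
          self.memory = 0
          self.memory_frames = memory_frames         # keep detecting briefly after motion

      def should_run_detection(self, gray_frame):
          if self.prev is None:
              self.prev = gray_frame
              return True
          diff = np.abs(gray_frame.astype(np.int16) - self.prev.astype(np.int16))
          changed = np.mean(diff > self.pixel_threshold)
          self.prev = gray_frame
          if changed > self.area_threshold:
              self.memory = self.memory_frames
          elif self.memory > 0:
              self.memory -= 1
          return self.memory > 0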

Our current implementation takes the GPU into consideration as much as possible. A series of shaders are created to perform the tasks above, and the pipeline is designed to avoid moving pixels between the CPU and GPU frequently, to eliminate potential performance hits.

The pipeline involves multiple ML models that are conditionally executed, mixed CPU/GPU processing, etc. All the challenges here make it a perfect showcase for how MediaPipe could help develop a complicated pipeline.

Playing with MediaPipe

MediaPipe provides a lot of code samples for any developer to bootstrap with. We took the Object Detection on Android sample that comes with the project as our starting point because of its similarity to the back-end part of our pipeline. It did take us some time to fully understand the design concepts of MediaPipe and all the associated tools. But with the complete documentation and the great responsiveness of the MediaPipe team, we soon got up to speed and could do most of the things we wanted.

That being said, there were a few challenges we needed to overcome on the road to full migration. Our original Moving Object Detection pipeline takes input frames asynchronously, but MediaPipe has timestamp-bound limitations such that we cannot simply show the result asynchronously. Meanwhile, we needed to gather data through JNI in a specific data format. We came up with a workaround that overcame all of these issues under the circumstances, which will be mentioned later.

After wrapping our models and processing logic into calculators and wiring them up, we successfully transformed our existing implementation and created our first MediaPipe Moving Object Detection pipeline, shown in the figure below, running on Android devices:


Fig.2 Moving Object Detection Graph

We do not block the video frame in the main calculation loop, and we set the detection result as an input stream to show the annotation on the screen. The whole graph is designed as a multi-function process: the left chunk is the debug annotation and video frame output module, while the rest of the calculation (low-light detection, motion-triggered detection, cropping of the area of interest, and the detection process) occurs in the remainder of the graph. In this way, the graph naturally separates into real-time display and asynchronous calculation.

As a result, we are able to complete the full detection processing in under 40 ms on a device with a Snapdragon 660 chipset. MediaPipe’s tight integration with TensorFlow Lite gives us the flexibility to get even more performance gains by leveraging whatever acceleration techniques are available (GPU or DSP) on the device.

The following figure shows the current implementation working in action:


Fig.3 Moving Object Detection running in Alfred Camera

After getting things to run on Android, Desktop GPU (OpenGL-ES) emulation was our next target to evaluate. We are already using OpenGL-ES shaders for some computer vision operations in our pipeline. Having the capability to develop the algorithm on desktop and see it work in action before deploying it onto mobile platforms is a huge benefit to us. The feature was not ready at the time the project was first released, but the MediaPipe team soon added Desktop GPU emulation support for Linux in follow-up releases to make this possible. We have used this capability to detect and fix some issues in the graphs we created even before we put things on the mobile devices. Although it currently only works on Linux, it is still a big leap forward for us.

Testing the algorithms and making sure they behave as expected is also a challenge for a camera application. MediaPipe helps us simplify this by using pre-recorded MP4 files as input so we could verify the behavior simply by replaying the files. There is also built-in profiling support that makes it easy for us to locate potential performance bottlenecks.

MediaPipe – Exactly What We Were Looking For

The result of the evaluation and the feedback from our engineering team were very positive and promising:

  1. We are able to design and verify the algorithm and complete core implementations directly in the desktop emulation environment, and then migrate to the target platforms with minimal effort. As a result, the complexities of debugging on real devices are greatly reduced.
  2. MediaPipe’s modular design of graphs/calculators enables us to better split up the development into different engineers/teams, try out new pipeline design easily by rewiring the graph, and test the building blocks independently to ensure quality before we put things together.
  3. MediaPipe’s cross-platform design maximizes the reusability and minimizes fragmentation of the implementations we created. Not only are the efforts required to support a new platform greatly reduced, but we are also less worried about the behavior discrepancies on different platforms due to different interpretations of the spec from platform engineers.
  4. Built-in graphics utilities and profiling support saved us a lot of time creating those common facilities and making them right, and we could be more focused on the key designs.
  5. Tight integration with TensorFlow Lite really saves lots of effort for a company like us that heavily depends on TensorFlow, and it still gives us the flexibility to easily interface with other solutions.

With just a few weeks working with MediaPipe, it has shown strong capabilities to fundamentally transform how we develop our products. Without MediaPipe we could have spent months creating the same features without the same level of performance.

Summary

Alfred Camera is designed to bring home security with AI to everyone, and MediaPipe has significantly made achieving that goal easier for our team. From Moving Object Detection to future AI-powered features, we are focusing on transforming a basic security camera use case into a smart housekeeper that can help provide even more context that our users care about. With the support of MediaPipe, we have been able to accelerate our development process and bring the features to the market at an unprecedented speed. Our team is really excited about how MediaPipe could help us progress and discover new possibilities, and is looking forward to the enhancements that are yet to come to the project.

Microsoft Connected Vehicle Platform: trends and investment areas

This post was co-authored by the extended Azure Mobility Team.

The past year has been eventful for a lot of reasons. At Microsoft, we’ve expanded our partnerships, including Volkswagen, LG Electronics, Faurecia, TomTom, and more, and taken the wraps off new thinking such as at CES, where we recently demonstrated our approach to in-vehicle compute and software architecture.

Looking ahead, areas that were once nominally related now come into sharper focus as the supporting technologies are deployed and the various industry verticals mature. The welcoming of a new year is a good time to pause and take in what is happening in our industry and in related ones with an aim to developing a view on where it’s all heading.

In this blog, we will talk about the trends that we see in connected vehicles and smart cities and describe how we see ourselves fitting in and contributing.

Trends

Mobility as a Service (MaaS)

MaaS (sometimes referred to as Transportation as a Service, or TaaS) is about people getting to goods and services and getting those goods and services to people. Ride-hailing and ride-sharing come to mind, but so do many other forms of MaaS offerings such as air taxis, autonomous drone fleets, and last-mile delivery services. We inherently believe that completing a single trip—of a person or goods—will soon require a combination of passenger-owned vehicles, ride-sharing, ride-hailing, autonomous taxis, bicycle-and scooter-sharing services transporting people on land, sea, and in the air (what we refer to as “multi-modal routing”). Service offerings that link these different modes of transportation will be key to making this natural for users.

With Ford, we are exploring how quantum algorithms can help ease urban traffic congestion and develop a more balanced routing system. We’ve also built strong partnerships with TomTom for traffic-based routing as well as with AccuWeather for current and forecast weather reports to increase awareness of weather events that will occur along the route. In 2020, we will be integrating these routing methods together and making them available as part of the Azure Maps service and API. Because mobility constitutes experiences throughout the day across various modes of transportation (finding pickup locations, planning trips from home and work, and doing errands along the way), Azure Maps ties the mobility journey together with cloud APIs and iOS and Android SDKs to deliver in-app mobility and mapping experiences. Coupled with the connected vehicle architecture’s integration with federated user authentication, integration with the Microsoft Graph, and secure provisioning of vehicles, digital assistants can support mobility end-to-end. The same technologies can be used in moving goods and retail delivery systems.

The pressure to become profitable will force changes and consolidation among the MaaS providers and will keep their focus on approaches to reducing costs such as through autonomous driving. Incumbent original equipment manufacturers (OEMs) are expanding their businesses to include elements of car-sharing to continue evolving their businesses as private car ownership is likely to decline over time.

Connecting vehicles to the cloud

We refer holistically to these various signals that can inform vehicle routing (traffic, weather, available modalities, municipal infrastructure, and more) as “navigation intelligence.” Taking advantage of this navigation intelligence will require connected vehicles to become more sophisticated than just logging telematics to the cloud.

The reporting of basic telematics (car-to-cloud) is barely table-stakes; over-the-air updates (OTA, or cloud-to-car) will become key to delivering a market-competitive vehicle, as will command-and-control (more cloud-to-car, via phone apps). Forward-thinking car manufacturers deserve a lot of credit here for showing what’s possible and for creating in consumers the expectation that the appearance of new features in the car after it is purchased isn’t just cool, but normal.

Future steps include the integration of in-vehicle infotainment (IVI) with voice assistants that blend the in- and out-of-vehicle experiences, updating AI models for in-market vehicles for automated driving levels one through five, and of course pre-processing the telemetry at the edge in order to better enable reinforcement learning in the cloud as well as just generally improving services.

Delivering value from the cloud to vehicles and phones

As vehicles become more richly connected and deliver experiences that overlap with what we’ve come to expect from our phones, an emerging question is, what is the right way to make these work together? Projecting to the IVI system of the vehicle is one approach, but most agree that vehicles should have a great experience without a phone present.

Separately, phones are a great proxy for “a vehicle” in some contexts, such as bicycle sharing: they provide speed, location, and other probe data, and they supply connectivity (while subsidizing the associated costs) for low-powered electronics on the vehicle.

This is probably a good time to mention 5G. The opportunity 5G brings will have a ripple effect across industries. It will be a critical foundation for the continued rise of smart devices, machines, and things, which can speak, listen, see, feel, and act using sensitive sensor technology, data analytics, and machine learning algorithms without requiring “always on” connectivity. This is what we call the intelligent edge. Our strategy is to enable 5G at the edge through cloud partnerships, with a focus on security and developer experience.

Optimizations through a system-of-systems approach

Connecting things to the cloud, getting data into the cloud, and then bringing the insights gained through cloud-enabled analytics back to the things is how optimizations in one area can be brought to bear in another area. This is the essence of digital transformation. Vehicles gathering high-resolution imagery for improving HD maps can also inform municipalities about maintenance issues. Accident information coupled with vehicle telemetry data can inform better PHYD (pay how you drive) insurance plans as well as the deployment of first responder infrastructure to reduce incident response time.

As the vehicle fleet electrifies, the demand for charging stations will grow. In-car routing for an electric car today is based only on knowledge of the charging stations along the route, regardless of the current or predicted wait times at those stations. But what if the route could also be informed by historical usage patterns and live usage data for individual charging stations, so you avoid arriving to find three cars ahead of you? Suddenly, your 20-minute charge is actually a 60-minute stop, and an alternate route would have made more sense, even if, on paper, it means more miles driven.
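
As a minimal sketch of that idea, the snippet below compares candidate routes by total trip time, where each charging stop costs its charge time plus a wait predicted from live and historical station utilization. The route names, times, and wait estimates are all hypothetical.

```python
# Hypothetical sketch: pick an EV route by total trip time, where each charging
# stop costs its charge time plus a predicted wait at the station.
# All numbers and route names are illustrative.

def total_trip_minutes(route):
    """Drive time plus charge time and predicted wait at each planned stop."""
    total = route["drive_minutes"]
    for stop in route["charging_stops"]:
        total += stop["charge_minutes"] + stop["predicted_wait_minutes"]
    return total

routes = {
    "highway": {
        "drive_minutes": 180,
        "charging_stops": [
            # Three cars ahead of you: a 20-minute charge becomes a 60-minute stop.
            {"charge_minutes": 20, "predicted_wait_minutes": 40},
        ],
    },
    "scenic": {
        "drive_minutes": 205,  # more miles driven...
        "charging_stops": [
            {"charge_minutes": 20, "predicted_wait_minutes": 0},  # ...but no queue
        ],
    },
}

best = min(routes, key=lambda name: total_trip_minutes(routes[name]))
print({name: total_trip_minutes(r) for name, r in routes.items()}, "->", best)
# {'highway': 240, 'scenic': 225} -> scenic
```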

Realizing these kinds of scenarios means tying together knowledge about the electrical grid, traffic patterns, vehicle types, and incident data. The opportunities here for brokering the relationships among these systems are immense, as are the challenges to do so in a way that encourages the interconnection and sharing while maintaining privacy, compliance, and security.

Laws, policies, and ethics

The past several years of data breaches and election interference are evidence of the continuously evolving nature of the security threats we face. That kind of environment requires platforms that continuously invest in security as a fundamental cost of doing business.

Laws, regulatory compliance, and ethics must figure into the design and implementation of our technologies to as great a degree as goals like performance and scalability do. Smart city initiatives, where having visibility into the movement of people, goods, and vehicles is key to doing the kinds of optimizations that increase the quality of life in these cities, will confront these issues head-on.

Routing today is informed by traffic conditions but is still fairly “selfish”: routing for “me” rather than for “we.” Cities would like a hand in shaping traffic, especially if they can factor in deeper insights such as the types of vehicles on the road (sending freight one way and passenger traffic another), whether there is an upcoming sporting event or road closure, weather, and so on.

Doing this in a way that is cognizant of local infrastructure and the environment is what smart cities initiatives are all about.

For these reasons, we have joined the Open Mobility Foundation. We are also involved with Stanford’s Digital Cities Program, the Smart Transportation Council, the Alliance to Save Energy’s 50×50 Transportation Initiative, and the World Business Council for Sustainable Development.

With the Microsoft Connected Vehicle Platform (MCVP) and an ecosystem of partners across the industry, Microsoft offers a consistent horizontal platform on top of which customer-facing solutions can be built. MCVP helps mobility companies accelerate the delivery of digital services across vehicle provisioning, two-way network connectivity, and continuous over-the-air updates of containerized functionality. MCVP provides support for command-and-control, hot/warm/cold paths for telematics, and extension hooks for customer and third-party differentiation. Because it is built on Azure, MCVP inherits the hyperscale, global availability, and regulatory compliance that come as part of Azure. OEMs and fleet operators leverage MCVP as a way to “move up the stack” and focus on their customers rather than spend resources on non-differentiating infrastructure.
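
To make the car-to-cloud (“hot path”) telematics flow above concrete, here is a small sketch that sends one telemetry message from a device to Azure IoT Hub using the Python azure-iot-device SDK. This is not the MCVP API; the connection string, payload fields, and routing property are assumptions chosen only to show the shape of the flow.

```python
# Illustrative vehicle-to-cloud telemetry via the Azure IoT Hub device SDK.
# NOT the MCVP API: the connection string, payload fields, and "path" property
# are hypothetical.
import json
import os

from azure.iot.device import IoTHubDeviceClient, Message

# A per-vehicle device identity provisioned in IoT Hub (placeholder).
conn_str = os.environ["VEHICLE_IOTHUB_CONNECTION_STRING"]
client = IoTHubDeviceClient.create_from_connection_string(conn_str)

telemetry = {
    "vin": "HYPOTHETICAL-VIN-0001",
    "speed_kph": 87.5,
    "battery_soc": 0.64,
    "odometer_km": 18234.2,
}

msg = Message(json.dumps(telemetry))
msg.content_type = "application/json"
msg.content_encoding = "utf-8"
# A custom property a backend could use to route messages onto a
# hot, warm, or cold processing path.
msg.custom_properties["path"] = "hot"

client.send_message(msg)
client.shutdown()
```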

Innovation in the automotive industry

At Microsoft, and within the Azure IoT organization specifically, we have a front-row seat to the transformative work being done across many different industries, using sensors to gather data and develop insights that inform better decision-making. We are excited to see these industries trending toward converging, mutually beneficial paths. Our colleague Sanjay Ravi shares his thoughts from an automotive industry perspective in this great article.

Turning our attention to our customer and partner ecosystem, the traction we’ve gotten across the industry has been overwhelming:

The Volkswagen Automotive Cloud will be one of the largest dedicated clouds of its kind in the automotive industry and will provide all future digital services and mobility offerings across its entire fleet. More than 5 million new Volkswagen-specific brand vehicles are to be fully connected on Microsoft’s Azure cloud and edge platform each year. The Automotive Cloud subsequently will be rolled out on all Group brands and models.

Cerence is working with us to integrate Cerence Drive products with MCVP. This new integration is part of Cerence’s ongoing commitment to delivering a superior user experience in the car through interoperability across voice-powered platforms and operating systems. Automakers developing their connected vehicle solutions on MCVP can now benefit from Cerence’s industry-leading conversational AI, in turn delivering a seamless, connected, voice-powered experience to their drivers.

Ericsson, whose Connected Vehicle Cloud connects more than 4 million vehicles across 180 countries, is integrating their Connected Vehicle Cloud with Microsoft’s Connected Vehicle Platform to accelerate the delivery of safe, comfortable, and personalized connected driving experiences with our cloud, AI, and IoT technologies.

LG Electronics is working with Microsoft on its automotive infotainment systems, building management systems, and other business-to-business collaborations. LG will leverage Microsoft Azure cloud and AI services to accelerate the digital transformation of LG’s B2B growth engines, as well as Automotive Intelligent Edge, the in-vehicle runtime environment provided as part of MCVP.

Global technology company ZF Friedrichshafen is transforming into a provider of software-driven mobility solutions, leveraging Azure cloud services and developer tools to promote faster development and validation of connected vehicle functions on a global scale.

Faurecia is collaborating with Microsoft to develop services that improve comfort, wellness, and infotainment as well as bring digital continuity from home or the office to the car. At CES, Faurecia demonstrated how its cockpit integration will enable Microsoft Teams video conferencing. Using Microsoft Connected Vehicle Platform, Faurecia also showcased its vision of playing games on the go, using Microsoft’s new Project xCloud streaming game preview.

Bell has revealed AerOS, a digital mobility platform that will give operators a 360° view into their aircraft fleet. By leveraging technologies like artificial intelligence and IoT, AerOS provides powerful capabilities like fleet master scheduling and real-time aircraft monitoring, enhancing Bell’s Mobility-as-a-Service (MaaS) experience. Bell chose Microsoft Azure as the technology platform to manage fleet information, observe aircraft health, and manage the throughput of goods, products, predictive data, and maintenance.

Luxoft is expanding its collaboration with Microsoft to accelerate the delivery of connected vehicle solutions and mobility experiences. By leveraging MCVP, Luxoft will enable and accelerate the delivery of vehicle-centric solutions and services that will allow automakers to deliver unique features such as advanced vehicle diagnostics, remote access and repair, and preventive maintenance. Collecting real usage data will also support vehicle engineering to improve manufacturing quality.

We are incredibly excited to be a part of the connected vehicle space. With MCVP, our ecosystem partners, and our partnerships with leading automotive players, both vehicle OEMs and automotive technology suppliers, we believe we have a uniquely capable offering that enables, at global scale, the next wave of innovation in the automotive industry as well as in related verticals such as smart cities, smart infrastructure, insurance, transportation, and beyond.

MediaPipe on the Web

Posted by Michael Hays and Tyler Mullen from the MediaPipe team

MediaPipe is a framework for building cross-platform multimodal applied ML pipelines. We have previously demonstrated building and running ML pipelines as MediaPipe graphs on mobile (Android, iOS) and on edge devices like Google Coral. In this article, we are excited to present MediaPipe graphs running live in the web browser, enabled by WebAssembly and accelerated by the XNNPack ML inference library. By integrating this preview functionality into our web-based Visualizer tool, we provide a playground for quickly iterating over a graph design. Since everything runs directly in the browser, video never leaves the user’s computer and each iteration can be immediately tested on a live webcam stream (and soon, arbitrary video).

Figure 1: Running the MediaPipe face detection example in the Visualizer

MediaPipe Visualizer

MediaPipe Visualizer (see Figure 2) is hosted at viz.mediapipe.dev. MediaPipe graphs can be inspected by pasting graph code into the Editor tab or by uploading a graph file into the Visualizer. The user can pan and zoom into the graphical representation of the graph using the mouse and scroll wheel, and the graph reacts in real time to changes made in the editor.

Figure 2: MediaPipe Visualizer hosted at https://viz.mediapipe.dev

Demos on MediaPipe Visualizer

We have created several sample Visualizer demos from existing MediaPipe graph examples. These can be seen within the Visualizer by visiting the following addresses in your Chrome browser:

  • Edge Detection
  • Face Detection
  • Hair Segmentation
  • Hand Tracking

Each of these demos can be executed within the browser by clicking the little running-man icon at the top of the editor (it will be greyed out if a non-demo workspace is loaded). This will open a new tab that runs the current graph (a webcam is required).

Implementation Details

In order to maximize portability, we use Emscripten to compile all of the necessary C++ code directly into WebAssembly, a low-level, assembly-like format designed specifically for web browsers. At runtime, the web browser creates a virtual machine in which these instructions execute very quickly, much faster than traditional JavaScript code.

We also created a simple API for all necessary communications back and forth between JavaScript and C++, to allow us to change and interact with the MediaPipe graph directly from JavaScript. For readers familiar with Android development, you can think of this as a similar process to authoring a C++/Java bridge using the Android NDK.

Finally, we packaged up all the requisite demo assets (ML models and auxiliary text/data files) as individual binary data packages, to be loaded at runtime. And for graphics and rendering, we allow MediaPipe to automatically tap directly into WebGL so that most OpenGL-based calculators can “just work” on the web.

Performance

While executing WebAssembly is generally much faster than pure JavaScript, it is also usually much slower than native C++, so we made several optimizations to provide a better user experience. We utilize the GPU for image operations when possible and opt for the lightest-weight versions of all our ML models (giving up some quality for speed). However, since compute shaders are not widely available on the web, we cannot easily make use of TensorFlow Lite GPU inference, and the resulting CPU inference often becomes a significant performance bottleneck. To help alleviate this, we automatically augment our “TfLiteInferenceCalculator” to use the XNNPack ML inference library, which gives us a 2-3x speedup in most of our applications.
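
For readers who want to see the shape of the TensorFlow Lite CPU inference being accelerated, below is a minimal Python sketch using the TFLite interpreter with a placeholder model file and random input; recent TensorFlow Lite builds apply the XNNPack delegate to floating-point CPU inference by default, the same library that speeds up the browser version. The web pipeline performs the equivalent inference in C++ compiled to WebAssembly, so treat this as an analogy rather than the actual browser code path.

```python
# Minimal TensorFlow Lite CPU inference sketch (placeholder model and input).
# Recent TF Lite builds apply the XNNPack delegate to float CPU inference by
# default; the browser pipeline does the equivalent in C++/WebAssembly.
import numpy as np
import tensorflow as tf

interpreter = tf.lite.Interpreter(model_path="face_detection.tflite")  # placeholder
interpreter.allocate_tensors()

input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Fake input with the shape and dtype the model expects.
frame = np.random.random_sample(input_details[0]["shape"]).astype(
    input_details[0]["dtype"])

interpreter.set_tensor(input_details[0]["index"], frame)
interpreter.invoke()
detections = interpreter.get_tensor(output_details[0]["index"])
print(detections.shape)
```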

Currently, support for web-based MediaPipe has some important limitations:

  • Only calculators in the demo graphs above may be used
  • The user must edit one of the template graphs; they cannot provide their own from scratch
  • The user cannot add or alter assets
  • The executor for the graph must be single-threaded (i.e. ApplicationThreadExecutor)
  • TensorFlow Lite inference on GPU is not supported

We plan to continue building on this new platform to provide developers with much more control, removing many if not all of these limitations (for example, by allowing for dynamic management of assets). Please follow the MediaPipe tag on the Google Developers blog and the Google Developers Twitter account (@googledevs).

Acknowledgements

We would like to thank Marat Dukhan, Chuo-Ling Chang, Jianing Wei, Ming Guang Yong, and Matthias Grundmann for contributing to this blog post.