Coral, Google’s platform for Edge AI, chooses ASUS as OEM partner for global scale

We launched Coral in 2019 with a mission to make edge AI powerful, private, and efficient, and also accessible to a wide variety of customers with affordable tools that reliably go from prototype to production. In these first few years, we’ve seen strong growth in demand for our products across industries and geographies, and with that, a growing need for worldwide availability and support.

That’s why we’re pleased to announce that we have signed an agreement with ASUS IoT to help scale our manufacturing, distribution and support. With decades of experience in electronics manufacturing at a global scale, ASUS IoT will provide Coral with the resources to meet our growth demands while we continue to develop new products for edge computing.

ASUS IoT is a sub-brand of ASUS dedicated to the creation of solutions in the fields of AI and the internet of things (IoT). Their mission is to become a trusted provider of embedded systems and the wider AI and IoT ecosystem. ASUS IoT strives to deliver best-in-class products and services across diverse vertical markets, and to partner with customers in the development of fully-integrated and rapid-to-market applications that drive efficiency – providing convenient, efficient, and secure living and working environments for people everywhere.

ASUS IoT already has a long-standing history of collaboration with Coral, being the first partner to release a product using the Coral SoM when they launched the Tinker Edge T development board. ASUS IoT has also integrated Coral accelerators into their enterprise-class intelligent edge computers and was the first to release a multi-Edge TPU device with the award-winning AI Accelerator PCIe Card. Because of this history of collaboration, we know they share our strong commitment to innovation in edge computing.

ASUS IoT also has established manufacturing and distribution processes, along with a strong reputation in enterprise-level sales and support, so we’re excited to work with them to enable scale and long-term availability for Coral products.

With this agreement, the Coral brand and user experience will not change, as Google will maintain ownership of the brand and product portfolio. The Coral team will continue to work with our customers on partnership initiatives and case studies through our Coral Partnership Program. Those interested in joining our partner ecosystem can visit our website to learn more and apply.

Coral.ai will remain the home for all product information and documentation, and in the coming months ASUS IoT will become the primary channel for sales, distribution and support. With this partnership, our customers will gain access to dedicated teams for sales and technical support managed by ASUS IoT.

ASUS IoT will be working to expand the distribution network to make Coral available in more countries. Distributors interested in carrying Coral products will be able to contact ASUS IoT for consideration.

We continue to be impressed by the innovative ways in which our customers use Coral to explore new AI-driven solutions. And now with ASUS IoT bringing expanded sales, support and resources for long-term availability, our Coral team will continue to focus on building the next generation of privacy-preserving features and tools for neural computing at the edge.

We look forward to the continued growth of the Coral platform, and we are excited to have ASUS IoT join us on this journey.

AI Fest in Spain: Exploring the Potential of Artificial Intelligence in Careers, Communities, and Commerce

Posted by Alessandro Palmieri, Regional Lead for Spain Developer Communities

Google Developer Groups (GDGs) around the world are in a unique position to organize events on technology topics that community members are passionate about. That’s what happened in Spain in July 2021, where two GDG chapters decided to put on an event called AI Fest after noticing a lack of conferences dedicated exclusively to artificial intelligence. “Artificial intelligence is everywhere, although many people do not know it,” says Irene Ruiz Pozo, the organizer of GDG Murcia and GDG Cartagena. While AI has the potential to transform industries from retail to real estate with products like Dialogflow and Lending DocAI, “there are still companies falling behind,” she notes.

Image of Irene standing on stage at AI Fest Spain

Irene and her GDG team members recognized that creating a space for a diverse mix of people—students, academics, professional developers, and more—would not only enable them to share valuable knowledge about AI and its applications across sectors and industries, but it could also serve as a potential path for skill development and post-pandemic economic recovery in Spain. In addition, AI Fest would showcase GDGs in Spain as communities offering developer expertise, education, networking, and support.

Using the GDG network to find sponsors, partners, and speakers

The GDGs immediately got to work calling friends and contacts with experience in AI. “We started calling friends who were great developers and worked at various companies, we told them who we are, what we wanted to do, and what we wanted to achieve,” Irene says.

The GDG team found plenty of organizations eager to help: universities, nonprofit organizations, government entities, and private companies. The final roster included the Instituto de Fomento, the economic development agency of Spain’s Murcia region; the city council of Cartagena; Biyectiva Technology, which develops AI tools used in medicine, retail, and interactive marketing; and the Polytechnic University of Cartagena, where Irene founded and led the Google Developer Student Club in 2019 and 2020. Some partners also helped with swag and merchandising and even provided speakers. “The CEOs and different executives and developers of the companies who were speakers trusted this event from the beginning,” Irene says.

A celebration of AI and its potential

The event organizers lined up a total of 55 local and international speakers over the two-day event. Due to the ongoing COVID-19 pandemic, in-person attendance was limited to 50 people in a room at El Batel Auditorium and Conference Center in Cartagena, but sessions—speakers, roundtables, and workshops—were also live-streamed on YouTube on three channels to a thousand viewers.

Some of the most popular sessions included economics professor and technology lab co-founder Andrés Pedreño on “Competing in the era of Artificial Intelligence”; a roundtable on women in technology; Intelequa software developer Elena Salcedo on “Happy plants with IoT”; and Google Developer Expert and technology firm CEO Juantomás García on “Vertex AI and AutoML: Democratizing access to AI.” The sessions were also recorded for later viewing, and in less than a week after the event, there were more than 1,500 views in room A, over 1,100 in room B, and nearly 350 in the Workshops room.

The event made a huge impact on the developer community in Spain, setting an example of what tech-focused gatherings can look like in the COVID-19 era and how they can support more education, collaboration, and innovation across a wide range of organizations, ultimately accelerating the adoption of AI. Irene also notes that it has helped generate more interest in GDGs and GDSCs in Spain and their value as a place to learn, teach, and grow. “We’re really happy that new developers have joined the communities and entrepreneurs have decided to learn how to use Google technologies,” she says.

The effect on the GDG team was profound as well. “I have remembered why I started creating events–for people: to discover the magic of technology,” Irene says.

Taking AI Fest into the future—and more

Irene and her fellow GDG members are already planning a second installment of AI Fest in early 2022, where they hope to welcome more in-person attendees. The team would also like to organize events focused on topics such as Android, Cloud, AR/VR, startups, the needs of local communities, and inclusion. Irene, who serves as a Women Techmakers Ambassador, is particularly interested in using her newly expanded network to host events that encourage women to choose technology and other STEM areas as a career.

Finally, Irene hopes that AI Fest will become an inspiration for GDGs around the world to showcase the potential of AI and other technologies. It’s a lot of work, she admits, but the result is well worth it. “My advice is to choose the area of technology that interests you the most, get organized, relax, and have a good team,” she advises.

Machine Learning GDEs: Q2 ‘21 highlights and achievements

Posted by HyeJung Lee, MJ You, ML Ecosystem Community Managers

Google Developers Experts (GDE) is a community of passionate developers who love to share their knowledge with others. Many of them specialize in Machine Learning (ML).

Here are some highlights showcasing the ML GDEs’ achievements from last quarter, which contributed to the global ML ecosystem. If you are interested in becoming an ML GDE, please scroll down to see how you can apply!

ML Developers meetup @Google I/O

ML Developer meetup at Google I/O

At I/O this year, we held two ML Developers Meetups (America/APAC and EMEA/APAC). Merve Noyan/Yusuf Sarıgöz (Turkey), Sayak Paul/Bhavesh Bhatt (India), Leigh Johnson/Margaret Maynard-Reid (USA), David Cardozo (Colombia), Vinicius Caridá/Arnaldo Gualberto (Brazil) shared their experiences in developing ML products with TensorFlow, Cloud AI or JAX and also introduced projects they are currently working on.

I/O Extended 2021

Chart showing what's included in Vertex AI

After I/O, many ML GDEs posted recap summaries of I/O on their blogs. Chansung Park (Korea) outlined the ML keynote summary, while US-based Victor Dibia wrapped up the Top 10 Machine Learning and Design Insights from Google IO 2021.

Vertex AI was the topic of conversation at the event. Minori Matsuda from Japan wrote a Japanese article titled “Introduction of powerful Vertex AI AutoML Forecasting.” Similarly, Piero Esposito (Brazil) posted an article titled “Serverless Machine Learning Pipelines with Vertex AI: An Introduction,” including a tutorial on fully customized code. India-based Sayak Paul co-authored a blog post right after the Vertex AI announcement, discussing its key pieces and showing how to run a TensorFlow training job using Vertex AI.
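For readers who want a feel for what such a training job looks like, here is a minimal sketch using the google-cloud-aiplatform Python SDK; the project, bucket, script, and container names below are placeholders, and the linked posts cover the real details.

```python
from google.cloud import aiplatform

# Placeholder project/bucket/script values; see the posts above for full walkthroughs.
aiplatform.init(project="my-project",
                location="us-central1",
                staging_bucket="gs://my-staging-bucket")

job = aiplatform.CustomTrainingJob(
    display_name="tf-training-example",
    script_path="train.py",  # a local TensorFlow training script
    container_uri="us-docker.pkg.dev/vertex-ai/training/tf-cpu.2-4:latest",
)

# Runs the script on a managed worker; extra args are forwarded to train.py.
job.run(machine_type="n1-standard-4", replica_count=1, args=["--epochs=5"])
```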

Communities such as Google Developer Groups (GDG) and TensorFlow User Groups (TFUG) held extended events where speakers further discussed different ML topics from I/O, including China-based Song Lin’s presentation on TensorFlow highlights and application experiences from I/O, which drew 24,000 online attendees. Chansung Park (Korea) also gave a presentation on what Vertex AI is and what you can do with it.

Cloud AI


Leigh Johnson (USA) wrote an article titled “Soft-launching an AI/ML Product as a Solo Founder,” covering GCP AutoML Vision, GCP IoT Core, TensorFlow Model Garden, and TensorFlow.js. The article details the journey of a solo founder developing an ML product for detecting printing failures on 3D printers (more on this story is coming soon, so stay tuned!).

Demo and code examples from Victor Dibia (USA)’s New York Taxi project, Minori Matsuda (Japan)’s article on AutoML and AI Platform notebooks, Srivatsan Srinivasan (USA)’s video tutorials, Sayak Paul (India)’s Distributed Training in TensorFlow with AI Platform & Docker, and Chansung Park (Korea)’s curated personal newsletter were all published together on the Google Cloud blog.

Aqsa Kausar (Pakistan) gave a talk about Explainable AI in Google Cloud at the International Women’s Day Philippines event. She explained why it is important and where and how it is applied in ML workflows.

Learn agenda

Finally, ML Lab by Robert John from Nigeria introduces the ML landscape on GCP, covering everything from BigQuery ML through AutoML to TensorFlow and AI Platform.

TensorFlow

Image of TensorFlow 2 and Learning TensorFlow JS books

Eliyar Eziz (China) published the book “TensorFlow 2 with real-life use cases.” Gant Laborde from the US authored the O’Reilly book “Learning TensorFlow.js” and wrote the article “No Data No Problem – TensorFlow.js Transfer Learning” about seeking out new datasets to boldly train where no models have trained before. He also published “A Riddikulus Dataset,” which describes creating a Harry Potter dataset.

Iterated dilated convolutional neural networks for word segmentation

Hong Kong-based Guan Wang published a research paper, “Iterated Dilated Convolutional Neural Networks for Word Segmentation,” covering a state-of-the-art performance improvement, implemented in TensorFlow with Keras.

Elyes Manai from Tunisia wrote the article “Become a Tensorflow Certified Developer,” a guide to the TensorFlow certificate with preparation tips.

BERT model

Greece-based George Soloupis wrote a tutorial, “Fine-tune a BERT model with the use of Colab TPU,” on how to fine-tune a BERT model trained specifically on the Greek language to perform the downstream task of text classification, using Colab’s TPU (v2-8).
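For orientation, the Colab TPU setup that this kind of tutorial relies on typically looks like the sketch below; it is a generic TensorFlow TPUStrategy skeleton with a trivial stand-in model rather than the article’s actual BERT fine-tuning code.

```python
import tensorflow as tf

# Standard Colab TPU bootstrapping; the model built inside the strategy scope
# would normally be a pretrained BERT encoder with a classification head.
resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
strategy = tf.distribute.TPUStrategy(resolver)

with strategy.scope():
    # Trivial stand-in for a BERT classifier so the sketch runs on its own.
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(128,), dtype=tf.int32),  # token ids
        tf.keras.layers.Embedding(30522, 64),
        tf.keras.layers.GlobalAveragePooling1D(),
        tf.keras.layers.Dense(2, activation="softmax"),
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
# model.fit(train_ds, validation_data=val_ds, epochs=3) would follow with tokenized text.
```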

JAX

India-based Aakash Nain has published the TF-JAX tutorial series (Part 1, Part 2, Part 3, Part 4), aiming to teach everyone the building blocks of the TensorFlow and JAX frameworks.

TensorFlow with Jax thumbnail

The online meetup TensorFlow and JAX, by Tzer-jen Wei from Taiwan, covered an introduction to JAX and its use cases. It also touched upon different ways of writing TensorFlow models and training loops.

Neural Networks, with a practical example written in JAX

João Guilherme Madeia Araújo (Brazil) published the YouTube video “Neural Networks, with a practical example written in JAX,” likely the first JAX tech talk in Portuguese.

Keras

Keras logo

A number of Keras code examples were contributed by Sayak Paul from India.

Kaggle

Kaggle character distribution chart

The notebook “Simple Bayesian Ridge with Sentence Embeddings” by Ertuğrul Demir (Turkey) tackles a natural language processing task: BERT fine-tuning followed by simple linear regression on top of sentence embeddings generated by transformers.

TensorFlow logo screenshot from Learning machine learning and tensorflow with Kaggle competition video

Youhan Lee from Korea gave a talk about “Learning machine learning and TensorFlow with Kaggle competition”. He explained how to use the Kaggle platform for learning ML.

Research

Advances in machine learning and deep learning research are changing our technology, and many ML GDEs are interested in and contributing to them.

Learning Compositional Neural Programs for Continuous Control

Karim Beguir (UK) co-authored a paper with the DeepMind team covering a novel compositional approach that uses deep reinforcement learning to solve robotics manipulation tasks. The paper was accepted at a NeurIPS workshop.

Finally, Sayak Paul from India, together with Pin-Yu Chen, published a research paper, “Vision Transformers are Robust Learners,” covering the robustness of the Vision Transformer (ViT) against common corruptions and perturbations, distribution shifts, and natural adversarial examples.

If you want to know more about the Google Experts community and their global open-source ML contributions, please check the GDE Program website, visit the GDE Directory and connect with GDEs on Twitter and LinkedIn. You can also meet them virtually on the ML GDE’s YouTube Channel!

Developer updates from Coral

Posted by The Coral Team

We’re always excited to share updates to our Coral platform for building edge ML applications. In this post, we have some interesting demos, interfaces, and tutorials to share, and we’ll start by pointing you to an important software update for the Coral Dev Board.

Important update for the Dev Board / SoM

If you have a Coral Dev Board or Coral SoM, please install our latest Mendel update as soon as possible to receive a critical fix to part of the SoC power configuration. To get it, just log onto your board and install all available system updates.


Updating installs a patch from NXP for the Dev Board / SoM’s SoC; without it, the SoC may be overstressed and the lifetime of the device could be reduced. If you recently flashed your board with the latest system image, you might already have this fix (we also updated the flashable image today), but it never hurts to fetch all updates.

Note: This update does not apply to the Dev Board Mini.

Manufacturing demo

We recently published the Coral Manufacturing Demo, which demonstrates how to use a single Coral Edge TPU to simultaneously accomplish two common manufacturing use-cases: worker safety and visual inspection.

The demo ships with two specific videos and tasks (worker keepout detection and apple quality grading), but it is designed to be easily customized with different inputs and tasks. The demo, written in C++, requires OpenGL and is primarily targeted at x86 systems, which are prevalent in manufacturing gateways, although ARM Cortex-A systems, like the Coral Dev Board, are also supported.


WebCoral

We’ve been working hard to make ML acceleration with the Coral Edge TPU available for most popular systems. So we’re proud to announce support for WebUSB, allowing you to use the Coral USB Accelerator directly from Chrome. To get started, check out our WebCoral demo, which builds a webpage where you can select a model and run an inference accelerated by the Edge TPU.


New models repository

We recently released a new models repository that makes it easier to explore the various trained models available for the Coral platform, including image classification, object detection, semantic segmentation, pose estimation, and speech recognition. Each model family page lists the available models with details about training dataset, input size, latency, accuracy, model size, and other parameters, making it easier to select the best fit for the application at hand. Each page also includes links to training scripts and example code to help you get started. Or for an overview of all our models, you can see them all on one page.

Trained TensorFlow models for the Edge TPU
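Once you’ve picked a model, running it is only a few lines of Python. Here is a minimal classification sketch assuming the pycoral library is installed and a compiled *_edgetpu.tflite model plus its labels file have been downloaded from the repository; the file names are placeholders.

```python
# Minimal Edge TPU classification sketch (file names are placeholders).
from PIL import Image
from pycoral.adapters import classify, common
from pycoral.utils.dataset import read_label_file
from pycoral.utils.edgetpu import make_interpreter

# Load a model compiled for the Edge TPU and allocate its tensors.
interpreter = make_interpreter("mobilenet_v2_1.0_224_quant_edgetpu.tflite")
interpreter.allocate_tensors()

# Resize the input image to the model's expected input size and run inference.
image = Image.open("parrot.jpg").resize(common.input_size(interpreter))
common.set_input(interpreter, image)
interpreter.invoke()

# Print the top-3 classes with their labels and scores.
labels = read_label_file("imagenet_labels.txt")
for c in classify.get_classes(interpreter, top_k=3):
    print(labels.get(c.id, c.id), f"{c.score:.4f}")
```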

Transfer learning tutorials

Even with our collection of pre-trained models, it can sometimes be tricky to create a task-specific model that’s compatible with our Edge TPU accelerator. To make this easier, we’ve released some new Google Colab tutorials that allow you to perform transfer learning for object detection, using MobileDet and EfficientDet-Lite models. You can find these and other Colabs in our GitHub Tutorials repo.

We are excited to share all that Coral has to offer as we continue to evolve our platform. Keep an eye out for more software and platform related news coming this summer. To discover more about our edge ML platform, please visit Coral.ai and share your feedback at [email protected].

Announcing the 12 remarkable innovators selected for the upcoming Google for Startups Accelerator: Voice AI program


Posted by Jason Scott, Head of Startup Developer Ecosystem, USA & Saurabh Sharma, Head of Assistant Investments


In December 2020, we announced our inaugural Google for Startups Accelerator: Voice AI program, a 10-week digital accelerator designed to help North American voice technology startups take their businesses to the next level. Today, we are proud to announce our cohort of 12 companies – collectively leveraging voice user interfaces to solve complex challenges across accessibility, education, and care:

Babbly, Toronto, Ontario, Canada

Babbly provides parents real-time insights on their child’s speech and language skills and recommends personalized activities that promote their child’s development.

Bespoken, Seattle, Washington, United States

Bespoken is the leader in automated testing, training, and monitoring for voice applications and devices. If you can talk to it, Bespoken can test it!

conversationHEALTH, Toronto, Ontario, Canada

conversationHEALTH enables conversational agents for patients and healthcare professionals in clinical trials, medical affairs, and commercial lines of business.

Nēdl, Santa Monica, California, United States

nēdl is democratizing access to the microphone by giving everyone their own live call-in radio station that transcribes, amplifies, and monetizes the audio creator’s words as they speak.

OTO.AI, New York, New York, United States

OTO is building an acoustic engine capable of delivering non-semantic insights (intonation, emotions, laughter, etc.) from voice streams in real time, on a small compute footprint.

Piffle, San Francisco, California, United States

Piffle is a voice gaming platform that aims to nurture professional wellness through conversational gameplay.

Powow AI, New York, New York, United States

Powow is a SaaS platform which unleashes the power of AI in business meetings. Powow uses proprietary AI algorithms to transcribe and analyze meetings, transforming them into actionable insights.

SiMBi, Vancouver, British Columbia, Canada

SiMBi combines learners’ narrations with the text of a story to create an engaging audiovisual book that learners worldwide can read along to.

Talkatoo, Halifax, Nova Scotia, Canada

Talkatoo is a dictation software explicitly designed for veterinary and medical professionals, enabling them to save time in their practice.

Tinychef, New York, New York, United States

tinychef is a voice-first Culinary AI™ platform that helps consumers in the kitchen, from the dinner dilemma to grocery planning, grocery shopping, and cooking meals, with interactive experiences on smart speakers.

Voicify, Boston, Massachusetts, United States

Voicify’s SaaS platform allows brands and large enterprises to easily design, build, and deploy voice apps, chatbots, and other conversational experiences across voice assistants, chat platforms, and social media platforms.

Vowel, New York, New York, United States

Vowel brings the best of productivity and communication platforms into a single, integrated meeting tool.

The program kicks off on Monday, March 15th and will focus on product design, technical infrastructure, customer acquisition, and leadership development, granting our founders access to an expansive network of mentors, senior executives, and industry leaders.

We are incredibly excited to support this group of entrepreneurs over the next three months, connecting them with the best of our people, products, and programming to advance their companies and solutions.

We look forward to augmenting the work of these 12 innovators and to showcasing their accomplishments on Thursday, May 20th at 12:30pm EST at our Google for Startups Accelerator: Voice AI Demo Day.

A Google for Startups Accelerator for startups using voice technology to better the world


Posted by Jason Scott, Head of Startup Developer Ecosystem, U.S., Google

At Google, we have long understood that voice user interfaces can help millions of people accomplish their goals more effectively. Our journey in voice began in 2008 with Voice Search — with notable milestones since, such as building our first deep neural network in 2012, our first sequence-to-sequence network in 2015, launching Google Assistant in 2016, and processing speech fully on device in 2019. These building blocks have enabled the unique voice experiences across Google products that our users rely on every day.

Voice AI startups play a key role in helping build and deliver innovative voice-enabled experiences to users. And, Google is committed to helping tech startups deliver high impact solutions in the voice space. This month, we are excited to announce the Google for Startups Accelerator: Voice AI program, which will bring together the best of Google’s programs, products, people and technology with a joint mission to advance and support the most promising voice-enabled AI startups across North America.

As part of this Google for Startups Accelerator, selected startups will be paired with experts to help tackle the top technical challenges facing their startup. With an emphasis on product development and machine learning, founders will connect with voice technology and AI/ML experts from across Google to take their innovative solutions to the next level.

We are proud to launch our first ever Google for Startups Accelerator: Voice AI — building upon Google’s longstanding efforts to advance the future of voice-based computing. The accelerator will kick off in March 2021, bringing together a cohort of 10 to 12 innovative voice technology startups. If this sounds like your startup, we’d love to hear from you. Applications are open until January 28, 2021.

Coral makes edge AI even more accessible in 2020


Posted by the Coral team

Coral Dev Board Mini and Accelerator Module feature Google's Edge TPU co-processor to accelerate AI at the edge.

Since we launched Coral back in March 2019, we’ve added a number of new product form factors to accommodate the many ways users are adding on-device ML to their products. We’ve also streamlined the ML workflow and added capabilities like model pipelining with multiple Edge TPUs for an easier and more robust developer experience. And from this, we’ve helped enable amazing use cases from smart water meters that prevent water loss with Olea Edge, to systems for improving harvest yield with Farmwave, to noise cancellation in meetings in Google’s own Series One meeting kits.

This week, we’ll begin shipping the Coral Accelerator Module, a multi-chip module that combines the Edge TPU and its power circuitry into a solderable package. The module exposes PCIe and USB2 interfaces, which make it even easier to integrate Coral into custom designs. Several companies are already taking advantage of the compact size and capabilities with their new products coming to market. Read more about how Gumstix, STD, Siana Systems and IEI are using our module.

And in December, we’ll begin shipping the Dev Board Mini, a smaller, more power-efficient, and value-oriented board with a more traditional, flattened single-board computer design. The Dev Board Mini pairs a MediaTek 8167 SoC with the Coral Accelerator Module over USB 2 and is a great way to evaluate the module as the center of a project or deployment.

You can see the new Dev Board Mini and Accelerator Module in action in the latest episode of Level Up, where Markku Lepisto controls his studio lights with speech commands.

To get updates on when the board will be available for purchase and other Coral news, sign up for our newsletter.

Developing for the edge, now simplified

We recently announced a new version of the Coral ML APIs and tools. This release brings the C++ API into parity with Python and makes it more modular, reusable, and performant. At the same time, it eliminates unnecessary abstractions and surfaces, replacing them with native TensorFlow Lite APIs. This release also graduates the Model Pipelining API out of beta and introduces a new model partitioner that automatically partitions models based on profiling, delivering up to 10x better performance.

We’ve added a pre-trained version of MobileDet — a state-of-the-art object detection model for mobile systems — into our models portfolio. We’re migrating our model-development workflow to TensorFlow 2, and we’re including a handful of updated or new models based on the TF2 Keras framework. For details, check out the full announcement on the TensorFlow blog.

We’re also excited to see great developer tools coming from our ecosystem partners. For example, PerceptiLabs offers a visual API for building TensorFlow models and recently published a new demo which trains a machine learning model to identify sign language optimized for the edge with Coral.


The MRQ design from SigFox enables prototyping at the edge for low bandwidth IoT solutions with Coral

And SigFox released a radio transceiver board that stacks on either the Coral Dev Board or Dev Board Mini. This allows small data payloads to be transmitted across low power, long range radio networks for use cases like smart cities, fleet management, asset tracking, agriculture and energy. The PCB design will be offered as a free download on SigFox’s website. Google Cloud Solutions Architect Markku Lepisto will present the new design today, in the opening keynote at SigFox Connect.

Customers with a Coral edge


The tool, from Farmwave, includes custom-developed ML models, a harvester-mounted box with cameras, an in-cab display, and on-device AI acceleration from Coral.

Just in time for harvest, we wanted to share a story about how Farmwave is using Coral to improve the efficiency of farm equipment and reduce food waste. Traditional yield loss analysis involves hand-counting grains of corn left on the ground mid-harvest. It’s a time- and labor-intensive task, and not feasible for farmers who measure the value of their half-million-dollar combines in minutes spent running them.

By leveraging Coral’s on-device AI capabilities, Farmwave was able to build a system that automates the count while the machine is running, allowing farmers to make real-time adjustments to harvesting machines in response to conditions in the field, which can make a big difference in yield.


Kura Sushi designed their intelligent QA system using a Raspberry Pi paired with the Coral USB Accelerator

Kura Revolving Sushi Bar in Japan has always been committed to the highest standards of health and safety for its customers. Known for their tech-forward approach, Kura has dabbled in sushi-making robots, an automated prize machine called Bikkura-pon, and a patented dome-shaped dish cover, aptly dubbed Mr. Fresh. But most recently, Kura has used Coral to develop an AI-powered system that not only facilitates efficiency for better customer experiences, but also enables better tracking to prevent foodborne illnesses.

Making AI more accessible

While this year has presented the world with many obstacles, we’ve been impressed by the new ideas and innovations coming forward through technology. By providing the necessary tools and technology for edge AI, we strive to empower society to create affordable, adaptable, and intelligent systems.

We are excited to share all that Coral has to offer as we evolve our platform. For a list of worldwide distributors, system integrators and partners, visit the Coral partnerships page.

Please visit Coral.ai to discover more about our edge ML platform and share your feedback at [email protected]. To receive future Coral updates directly in your inbox, sign up for our newsletter.

MediaPipe 3D Face Transform

Posted by Kanstantsin Sokal, Software Engineer, MediaPipe team

Earlier this year, the MediaPipe Team released the Face Mesh solution, which estimates the approximate 3D face shape via 468 landmarks in real-time on mobile devices. In this blog, we introduce a new face transform estimation module that establishes a researcher- and developer-friendly semantic API useful for determining the 3D face pose and attaching virtual objects (like glasses, hats or masks) to a face.

The new module establishes a metric 3D space and uses the landmark screen positions to estimate common 3D face primitives, including a face pose transformation matrix and a triangular face mesh. Under the hood, a lightweight statistical analysis method called Procrustes Analysis is employed to drive a robust, performant and portable logic. The analysis runs on CPU and has a minimal speed/memory footprint on top of the original Face Mesh solution.


Figure 1: An example of virtual mask and glasses effects, based on the MediaPipe Face Mesh solution.

Introduction

The MediaPipe Face Landmark Model performs a single-camera face landmark detection in the screen coordinate space: the X- and Y- coordinates are normalized screen coordinates, while the Z coordinate is relative and is scaled as the X coordinate under the weak perspective projection camera model. While this format is well-suited for some applications, it does not directly enable crucial features like aligning a virtual 3D object with a detected face.

The newly introduced module moves away from the screen coordinate space towards a metric 3D space and provides the necessary primitives to handle a detected face as a regular 3D object. By design, you’ll be able to use a perspective camera to project the final 3D scene back into the screen coordinate space with a guarantee that the face landmark positions are not changed.

Metric 3D Space

The Metric 3D space established within the new module is a right-handed orthonormal metric 3D coordinate space. Within the space, there is a virtual perspective camera located at the space origin and pointed in the negative direction of the Z-axis. It is assumed that the input camera frames are observed by exactly this virtual camera, and therefore its parameters are later used to convert the screen landmark coordinates back into the Metric 3D space. The virtual camera parameters can be set freely; however, for better results it is advised to set them as close to the real physical camera parameters as possible.
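To make the geometry concrete, here is a rough, self-contained illustration (not MediaPipe’s actual conversion code) of how a normalized screen point can be unprojected into such a camera-centered metric space, given assumed camera parameters and an assumed depth:

```python
import numpy as np

def unproject(x_norm, y_norm, depth_cm, fov_y_deg=63.0, width=640, height=480):
    """Rough unprojection of a normalized screen point into a camera-centered
    metric space (camera at the origin, looking down -Z). Illustrative only."""
    aspect = width / height
    f_y = 1.0 / np.tan(np.radians(fov_y_deg) / 2.0)  # focal length in NDC units
    f_x = f_y / aspect
    # Map [0, 1] screen coordinates to [-1, 1] normalized device coordinates.
    x_ndc = 2.0 * x_norm - 1.0
    y_ndc = 1.0 - 2.0 * y_norm        # screen Y grows downward
    # Scale by depth to land on the ray through the pixel at the given distance.
    x_metric = x_ndc * depth_cm / f_x
    y_metric = y_ndc * depth_cm / f_y
    z_metric = -depth_cm              # the virtual camera looks along negative Z
    return np.array([x_metric, y_metric, z_metric])

print(unproject(0.5, 0.5, 40.0))  # a point at the screen center, 40 cm away
```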


Figure 2: A visualization of multiple key elements in the metric 3D space. Created in Cinema 4D

Canonical Face Model

The Canonical Face Model is a static 3D model of a human face, which follows the 3D face landmark topology of the MediaPipe Face Landmark Model. The model bears two important functions:

  • Defines metric units: the scale of the canonical face model defines the metric units of the Metric 3D space. A metric unit used by the default canonical face model is a centimeter;
  • Bridges static and runtime spaces: the face pose transformation matrix is – in fact – a linear map from the canonical face model into the runtime face landmark set estimated on each frame. This way, virtual 3D assets modeled around the canonical face model can be aligned with a tracked face by applying the face pose transformation matrix to them (see the sketch below).
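As a rough illustration of the second point, the sketch below applies a hypothetical 4x4 face pose transformation matrix to a few asset vertices defined in the canonical (centimeter) space; the matrix and vertex values are made up for the example.

```python
import numpy as np

# Hypothetical 4x4 face pose transformation matrix for one frame
# (rotation + translation in the metric 3D space; identity as a stand-in).
pose_matrix = np.eye(4)

# Vertices of a virtual asset modeled around the canonical face model,
# expressed in the canonical (centimeter) space as an (N, 3) array.
asset_vertices = np.array([
    [0.0, 5.0, 8.0],    # e.g. a point above the nose bridge
    [-7.0, 4.0, 2.0],   # e.g. near the left temple
    [7.0, 4.0, 2.0],    # e.g. near the right temple
])

# Promote to homogeneous coordinates, apply the pose, and drop back to 3D.
homogeneous = np.hstack([asset_vertices, np.ones((len(asset_vertices), 1))])
aligned = (pose_matrix @ homogeneous.T).T[:, :3]
print(aligned)
```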

Face Transform Estimation

The face transform estimation pipeline is a key component, responsible for estimating face transform data within the Metric 3D space. On each frame, the following steps are executed in the given order:

  • Face landmark screen coordinates are converted into the Metric 3D space coordinates;
  • Face pose transformation matrix is estimated as a rigid linear mapping from the canonical face metric landmark set into the runtime face metric landmark set, in a way that minimizes the difference between the two (a simplified version is sketched after this list);
  • A face mesh is created using the runtime face metric landmarks as the vertex positions (XYZ), while both the vertex texture coordinates (UV) and the triangular topology are inherited from the canonical face model.
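The rigid mapping in the second step can be illustrated with a standard least-squares (Kabsch-style) alignment; the sketch below is a simplified stand-in for MediaPipe’s Procrustes step, which additionally handles details such as landmark weighting.

```python
import numpy as np

def estimate_rigid_transform(canonical, runtime):
    """Least-squares rigid alignment (rotation + translation) from canonical
    landmarks to runtime landmarks, both given as (N, 3) arrays in metric space."""
    mu_c, mu_r = canonical.mean(axis=0), runtime.mean(axis=0)
    # Covariance between the two centered point sets.
    H = (canonical - mu_c).T @ (runtime - mu_r)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # avoid reflections
        Vt[-1, :] *= -1
        R = Vt.T @ U.T
    t = mu_r - R @ mu_c
    pose = np.eye(4)                  # pack into a 4x4 face pose transformation matrix
    pose[:3, :3], pose[:3, 3] = R, t
    return pose
```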

Effect Renderer

The Effect Renderer is a component that serves as a working example of a face effect renderer. It targets the OpenGL ES 2.0 API to enable real-time performance on mobile devices and supports the following rendering modes:

  • 3D object rendering mode: a virtual object is aligned with a detected face to emulate an object attached to the face (example: glasses);
  • Face mesh rendering mode: a texture is stretched on top of the face mesh surface to emulate a face painting technique.

In both rendering modes, the face mesh is first rendered as an occluder straight into the depth buffer. This step helps to create a more believable effect by hiding invisible elements behind the face surface.


Figure 3: An example of face effects rendered by the Face Effect Renderer.

Using Face Transform Module

The face transform estimation module is available as a part of the MediaPipe Face Mesh solution. It comes with face effect application examples, available as graphs and mobile apps on Android or iOS. If you wish to go beyond examples, the module contains generic calculators and subgraphs – those can be flexibly applied to solve specific use cases in any MediaPipe graph. For more information, please visit our documentation.

Follow MediaPipe

We look forward to publishing more blog posts related to new MediaPipe pipeline examples and features. Please follow the MediaPipe label on the Google Developers Blog and the Google Developers Twitter account (@googledevs).

Acknowledgements

We would like to thank Chuo-Ling Chang, Ming Guang Yong, Jiuqiang Tang, Gregory Karpiak, Siarhei Kazakou, Matsvei Zhdanovich and Matthias Grundman for contributing to this blog post.

Doubling down on the edge with Coral’s new accelerator

Posted by The Coral Team


Moving into the fall, the Coral platform continues to grow with the release of the M.2 Accelerator with Dual Edge TPU. Its first application is in Google’s Series One room kits, where it helps remove interruptions and makes the audio clearer for better video meetings. To help even more folks build products with Coral intelligence, we’re dropping the prices on several of our products. And for those folks who are looking to level up their at-home video production, we’re sharing a demo of a pose-based AI director that makes multi-camera video easier to produce.

Coral M.2 Accelerator with Dual Edge TPU

The newest addition to our product family brings two Edge TPU co-processors to systems in an M.2 E-key form factor. While the design requires a dual-bus PCIe M.2 slot, it brings enhanced ML performance (8 TOPS) to tasks such as running two models in parallel or pipelining one large model across both Edge TPUs.

The ability to scale across multiple edge accelerators isn’t limited to only two Edge TPUs. As edge computing expands to local data centers, cell towers, and gateways, multi-Edge TPU configurations will be required to help process increasingly sophisticated ML models. Coral allows the use of a single toolchain to create models for one or more Edge TPUs that can address many different future configurations.

A great example of how the Coral M.2 Accelerator with Dual Edge TPU is being used is in the Series One meeting room kits for Google Meet.

The new Series One room kits for Google Meet run smarter with Coral intelligence


Google’s new Series One room kits use our Coral M.2 Accelerator with Dual Edge TPU to bring enhanced audio clarity to video meetings. TrueVoice®, a multi-channel noise cancellation technology, minimizes distractions to ensure every voice is heard with up to 44 channels of echo and noise cancellation, making distracting sounds like snacking or typing on a keyboard a concern of the past.

Enabling the clearest possible communication in challenging environments was the target for the Google Meet hardware team. The consideration of what makes a challenging environment was not limited to unusually noisy environments, such as lunchrooms doubling as conference rooms. Any conference room can present challenging acoustics that make it difficult for all participants to be heard.

The secret to clarity without expensive and cumbersome equipment is to use virtual audio channels and AI driven sound isolation. Read more about how Coral was used to enhance and future-proof the innovative design.

Expanding the AI edge

Earlier this year, we reduced the prices of our prototyping devices and sensors. We are excited to share further price drops on more of our products. Our System-on-Module is now available for $99.99, and our Mini PCIe Accelerator, M.2 Accelerator A+E Key, and M.2 Accelerator B+M Key are now available at $24.99. We hope this lower price will make our edge AI more accessible to more creative minds around the world. Later this month, our SoM offering will also expand to include 2GB and 4GB RAM options.

Multi-cam with AI


As we expand our platform and product family, we continue to keep new edge AI use cases in mind. We are continually inspired by our developer community’s experimentation and implementations. When recently faced with the challenges of multicam video production from home, Markku Lepistö, Solutions Architect at Google Cloud, created this real-time, pose-based multicam tool he aptly dubbed AI Director.

We love seeing such unique implementations of on-device ML and invite you to share your own projects and feedback at [email protected].

For a list of worldwide distributors, system integrators and partners, visit the Coral partnerships page. Please visit Coral.ai to discover more about our edge ML platform.

Instant Motion Tracking with MediaPipe

Posted by Vikram Sharma, Software Engineering Intern; Jianing Wei, Staff Software Engineer; Tyler Mullen, Senior Software Engineer

Augmented Reality (AR) technology creates fun, engaging, and immersive user experiences. The ability to perform AR tracking across devices and platforms, without initialization, remains important for powering AR applications at scale.

Today, we are excited to release the Instant Motion Tracking solution in MediaPipe. It is built upon the MediaPipe Box Tracking solution we released previously. With Instant Motion Tracking, you can easily place fun virtual 2D and 3D content on static or moving surfaces, allowing them to seamlessly interact with the real world. This technology also powered MotionStills AR. Along with the library, we are releasing an open source Android application to showcase its capabilities. In this application, a user simply taps the camera viewfinder in order to place virtual 3D objects and GIF animations, augmenting the real-world environment.


Instant Motion Tracking in MediaPipe

Instant Motion Tracking

The Instant Motion Tracking solution provides the capability to seamlessly place virtual content on static or moving surfaces in the real world. To achieve that, we provide six degrees of freedom tracking with relative scale in the form of rotation and translation matrices. This tracking information is then used in the rendering system to overlay virtual content on camera streams to create immersive AR experiences.

The core concept behind Instant Motion Tracking is to decouple the camera’s translation and rotation estimation, treating them instead as independent optimization problems. This approach enables AR tracking across devices and platforms without initialization or calibration. We do this by first finding the 3D camera translation using only the visual signals from the camera. This involves estimating the target region’s apparent 2D translation and relative scale across frames. The process can be illustrated with a simple pinhole camera model, relating translation and scale of an object in the image plane to the final 3D translation.


By finding the change in relative size of our tracked region from view position V1 to V2, we can estimate the relative change in distance from the camera.

Next, we obtain the device’s 3D rotation from its built-in IMU (Inertial Measurement Unit) sensor. By combining this translation and rotation data, we can track a target region with six degrees of freedom at relative scale. This information allows for the placement of virtual content on any system with a camera and IMU functionality, and is calibration free. For more details on Instant Motion Tracking, please refer to our paper.
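As a rough numerical illustration of these two ideas (not the library’s actual code): under a pinhole model the apparent size of a region is inversely proportional to its distance, and the IMU rotation plus the visually estimated translation can then be packed into a single model matrix. All values below are made up for the example.

```python
import numpy as np

# 1) Depth change from apparent scale change: apparent size s is roughly
#    proportional to 1/d, so d2 = d1 * (s1 / s2).
d1 = 50.0                 # assumed distance at view V1 (relative units)
s1, s2 = 120.0, 150.0     # apparent region size (e.g. pixels) at V1 and V2
d2 = d1 * (s1 / s2)       # region grew on screen, so it moved closer (40.0)

# 2) Combine a 3x3 rotation from the IMU with the estimated translation
#    into a 4x4 model matrix for the rendering system.
R_imu = np.eye(3)                          # placeholder device rotation
t_visual = np.array([0.1, -0.05, -d2])     # placeholder translation
model_matrix = np.eye(4)
model_matrix[:3, :3] = R_imu
model_matrix[:3, 3] = t_visual
print(model_matrix)
```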

A MediaPipe Pipeline for Instant Motion Tracking

A diagram of the Instant Motion Tracking pipeline is shown below, consisting of four major components: a Sticker Manager module, a Region Tracking module, a Matrices Manager module, and lastly a Rendering System. Each of the components consists of MediaPipe calculators or subgraphs.


Diagram of Instant Motion Tracking Pipeline

The Sticker Manager accepts sticker data from the application and produces initial anchors (tracked region information) based on user taps, and user gesture controls for every sticker object. Initial anchors are then sent to our Region Tracking module to generate tracked anchors. The Matrices Manager combines this data with our device’s rotation matrix to produce six degrees-of-freedom poses as model matrices. After integrating any user-specified transforms like asset scaling, our final poses are forwarded to the Rendering System to render all virtual objects overlaid on the camera frame to produce the output AR frame.

Using the Instant Motion Tracking Solution

The Instant Motion Tracking solution is easy to use by leveraging the MediaPipe cross-platform framework. With camera frames, device rotation matrix, and anchor positions (screen coordinates) as input, the MediaPipe graph produces AR renderings for each frame, providing engaging experiences. If you wish to integrate this Instant Motion Tracking library with your system or application, please visit our documentation to build your own AR experiences on any device with IMU functionality and a camera sensor.

Augmenting The World with 3D Stickers and GIFs

The Instant Motion Tracking solution allows bringing both 3D stickers and GIF animations into augmented reality experiences. GIFs are rendered on flat 3D billboards placed in the world, introducing fun and immersive experiences with animated content blended into the real environment. Try it for yourself!


Demonstration of GIF placement in 3D

MediaPipe Instant Motion Tracking is already helping PixelShift.AI, a startup applying cutting-edge vision technologies to facilitate video content creation, track virtual characters seamlessly in the viewfinder for a realistic experience. Building upon Instant Motion Tracking’s high-quality pose estimation, PixelShift.AI enables VTubers to create mixed reality experiences with web technologies. The product is going to be released to the broader VTuber community later this year.


Instant Motion Tracking helps PixelShift.AI create mixed reality experiences

Follow MediaPipe

We look forward to publishing more blog posts related to new MediaPipe pipeline examples and features. Please follow the MediaPipe label on the Google Developers Blog and the Google Developers Twitter account (@googledevs).

Acknowledgement

We would like to thank Vikram Sharma, Jianing Wei, Tyler Mullen, Chuo-Ling Chang, Ming Guang Yong, Jiuqiang Tang, Siarhei Kazakou, Genzhi Ye, Camillo Lugaresi, Buck Bourdon, and Matthias Grundman for their contributions to this release.

ML Kit Pose Detection Makes Staying Active at Home Easier

Posted by Kenny Sulaimon, Product Manager, ML Kit; Chengji Yan and Areeba Abid, Software Engineers, ML Kit

ML Kit logo

Two months ago we introduced the standalone version of the ML Kit SDK, making it even easier to integrate on-device machine learning into mobile apps. Since then we’ve launched the Digital Ink Recognition API, and also introduced the ML Kit early access program. Our first two early access APIs were Pose Detection and Entity Extraction. We’ve received an overwhelming amount of interest in these new APIs and today, we are thrilled to officially add Pose Detection to the ML Kit lineup.

ML Kit Overview

A New ML Kit API, Pose Detection

Examples of ML Kit Pose Detection

ML Kit Pose Detection is an on-device, cross platform (Android and iOS), lightweight solution that tracks a subject’s physical actions in real time. With this technology, building a one-of-a-kind experience for your users is easier than ever.

The API produces a full-body, 33-point skeletal match that includes facial landmarks (ears, eyes, mouth, and nose), along with hand and foot tracking. The API was also trained on a variety of complex athletic poses, such as yoga positions.


Skeleton image detailing all 33 landmark points

Under The Hood

Diagram of the ML Kit Pose Detection Pipeline

The power of the ML Kit Pose Detection API is in its ease of use. The API builds on the cutting-edge BlazePose pipeline and allows developers to build great experiences on Android and iOS with little effort. We offer a full-body model, support for both video and static image use cases, and have added multiple pre- and post-processing improvements to help developers get started with only a few lines of code.

The ML Kit Pose Detection API utilizes a two step process for detecting poses. First, the API combines an ultra-fast face detector with a prominent person detection algorithm, in order to detect when a person has entered the scene. The API is capable of detecting a single (highest confidence) person in the scene and requires the face of the user to be present in order to ensure optimal results.

Next, the API applies a full body, 33 landmark point skeleton to the detected person. These points are rendered in 2D space and do not account for depth. The API also contains a streaming mode option for further performance and latency optimization. When enabled, instead of running person detection on every frame, the API only runs this detector when the previous frame no longer detects a pose.

The ML Kit Pose Detection API also features two operating modes, “Fast” and “Accurate”. With the “Fast” mode enabled, you can expect a frame rate of around 30+ FPS on a modern Android device, such as a Pixel 4 and 45+ FPS on a modern iOS device, such as an iPhone X. With the “Accurate” mode enabled, you can expect more stable x,y coordinates on both types of devices, but a slower frame rate overall.

Lastly, we’ve also added a per point “InFrameLikelihood” score to help app developers ensure their users are in the right position and filter out extraneous points. This score is calculated during the landmark detection phase and a low likelihood score suggests that a landmark is outside the image frame.

Real World Applications

Examples of a pushup and squat counter using ML Kit Pose Detection

Keeping up with regular physical activity is one of the hardest things to do while at home. We often rely on gym buddies or physical trainers to help us with our workouts, but this has become increasingly difficult. Apps and technology can often help with this, but with existing solutions, many app developers are still struggling to understand and provide feedback on a user’s movement in real time. ML Kit Pose Detection aims to make this problem a whole lot easier.

The most common applications for Pose detection are fitness and yoga trackers. It’s possible to use our API to track pushups, squats and a variety of other physical activities in real time. These complex use cases can be achieved by using the output of the API, either with angle heuristics, tracking the distance between joints, or with your own proprietary classifier model.
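As a small illustration of the angle-heuristic idea (with made-up landmark coordinates rather than real API output), the angle at a joint can be computed from three tracked points and compared against simple thresholds:

```python
import numpy as np

def joint_angle(a, b, c):
    """Angle at joint b (in degrees) formed by points a-b-c, e.g. the elbow
    angle computed from shoulder, elbow, and wrist landmarks."""
    ba, bc = np.asarray(a) - np.asarray(b), np.asarray(c) - np.asarray(b)
    cos_angle = np.dot(ba, bc) / (np.linalg.norm(ba) * np.linalg.norm(bc))
    return np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

# Hypothetical (x, y) landmark coordinates for shoulder, elbow, and wrist.
shoulder, elbow, wrist = (0.40, 0.30), (0.45, 0.45), (0.43, 0.60)
angle = joint_angle(shoulder, elbow, wrist)
print("pushup 'down' position" if angle < 90 else "pushup 'up' position", angle)
```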

To get you jump started with classifying poses, we are sharing additional tips on how to use angle heuristics to classify popular yoga poses. Check it out here.

Learning to Dance Without Leaving Home

Learning a new skill is always tough, but learning to dance without the aid of a real time instructor is even tougher. One of our early access partners, Groovetime, has set out to solve this problem.

With the power of ML Kit Pose Detection, Groovetime allows users to learn their favorite dance moves from popular short-form dance videos, while giving users automated real time feedback on their technique. You can join their early access beta here.

Groovetime App using ML Kit Pose Detection

Staying Active Wherever You Are

Our Pose Detection API is also helping adidas Training, another one of our early access partners, build a virtual workout experience that will help you stay active no matter where you are. This one-of-a-kind innovation will help analyze and give feedback on the user’s movements, using nothing more than your phone. Integration into the adidas Training app is still in the early phases of the development cycle, but stay tuned for more updates in the future.

How to get started?

If you would like to start using the Pose Detection API in your mobile app, head over to the developer documentation or check out the sample apps for Android and iOS to see the API in action. For questions or feedback, please reach out to us through one of our community channels.

Digital Ink Recognition in ML Kit

Posted by Mircea Trăichioiu, Software Engineer, Handwriting Recognition

A month ago, we announced changes to ML Kit to make mobile development with machine learning even easier. Today we’re announcing the addition of the Digital Ink Recognition API on both Android and iOS to allow developers to create apps where stylus and touch act as first class inputs.

Digital ink recognition: the latest addition to ML Kit’s APIs

Digital Ink Recognition is different from the existing Vision and Natural Language APIs in ML Kit, as it takes neither text nor images as input. Instead, it looks at the user’s strokes on the screen and recognizes what they are writing or drawing. This is the same technology that powers handwriting recognition in Gboard – Google’s own keyboard app, which we described in detail in a 2019 blog post. It’s also the same underlying technology used in the Quick, Draw! and AutoDraw experiments.

Handwriting input in Gboard

Turning doodles into art with Autodraw

With the new Digital Ink Recognition API, developers can now use this technology in their apps as well, for everything from letting users input text and figures with a finger or stylus to transcribing handwritten notes to make them searchable, all in near real time and entirely on-device.

Supports many languages and character sets

Digital Ink Recognition supports 300+ languages and 25+ writing systems, including all major Latin languages, as well as Chinese, Japanese, Korean, Arabic, Cyrillic, and more. Classifiers parse written text into a string of characters.

Recognizes shapes

Other classifiers can describe shapes, such as drawings and emojis, by the class to which they belong (circle, square, happy face, etc.). We currently support an autodraw sketch recognizer, an emoji recognizer, and a basic shape recognizer.

Works offline

Digital Ink Recognition API runs on-device and does not require a network connection. However, you must download one or more models before you can use a recognizer. Models are downloaded on demand and are around 20MB in size. Refer to the model download documentation for more information.

Runs fast

The time to perform a recognition call depends on the exact device and the size of the input stroke sequence. On a typical mobile device recognizing a line of text takes about 100 ms.

How to get started

If you would like to start using Digital Ink Recognition in your mobile app, head over to the documentation or check out the sample apps for Android and iOS to see the API in action. For questions or feedback, please reach out to us through one of our community channels.