Recommended strategies and best practices for designing and developing games and stories on Google Assistant


Posted by Wally Brill and Jessica Dene Earley-Cha

Illustration of pink car collecting coins

Since we launched Interactive Canvas, and especially in the last year we have been helping developers create great storytelling and gaming experiences for Google Assistant on smart displays. Along the way we’ve learned a lot about what does and doesn’t work. Building these kinds of interactive voice experiences is still a relatively new endeavor, and so we want to share what we’ve learned to help you build the next great gaming or storytelling experience for Assistant.

Here are three key things to keep in mind when you’re designing and developing interactive games and stories. These three were selected from a longer list of lessons learned (stay tuned to the end for the link for the 10+ lessons) because they are dependent on Action Builder/SDK functionality and can be slightly different for the traditional conversation design for voice only experiences.

1. Keep the Text-To-Speech (TTS) brief

Text-to-speech, or computer generated voice, has improved exponentially in the last few years, but it isn’t perfect. Through user testing, we’ve learned that users (especially kids) don’t like listening to long TTS messages. Of course, some content (like interactive stories) should not be reduced. However, for games, try to keep your script simple. Wherever possible, leverage the power of the visual medium and show, don’t tell. Consider providing a skip button on the screen so that users can read and move forward without waiting until the TTS is finished. In many cases the TTS and text on a screen won’t always need to mirror each other. For example the TTS may say “Great job! Let’s move to the next question. What’s the name of the big red dog?” and the text on screen may simply say “What is the name of the big red dog?

Implementation

You can provide different audio and screen-based prompts by using a simple response, which allows different verbiage in the speech and text sections of the response. With Actions Builder, you can do this using the node client library or in the JSON response. The following code samples show you how to implement the example discussed above:

candidates:
- first_simple:
variants:
- speech: Great job! Let's move to the next question. What’s the name of the big red dog?
text: What is the name of the big red dog?

Note: implementation in YAML for Actions Builder

app.handle('yourHandlerName', conv => {
conv.add(new Simple({
speech: 'Great job! Let's move to the next question. What’s the name of the big red dog?',
text: 'What is the name of the big red dog?'
}));
});

Note: implementation with node client library

2. Consider both first-time and returning users

Frequent users don’t need to hear the same instructions repeatedly. Optimize the experience for returning users. If it’s a user’s first time experience, try to explain the full context. If they revisit your action, acknowledge their return with a “Welcome back” message, and try to shorten (or taper) the instructions. If you noticed the user has returned more than 3 or 4 times, try to get to the point as quickly as possible.

An example of tapering:

  • Instructions to first time users: “Just say words you can make from the letters provided. Are you ready to begin?”
  • For a returning user: “Make up words from the jumbled letters. Ready?”
  • For a frequent user: “Are you ready to play?”

Implementation

You can check the lastSeenTime property in the User object of the HTTP request. The lastSeenTime property is a timestamp of the last interaction with this particular user. If this is the first time a user is interacting with your Action, this field will be omitted. Since it’s a timestamp, you can have different messages for a user who’s last interaction has been more than 3 months, 3 weeks or 3 days. Below is an example of having a default message that is tapered. If the lastSeenTime property is omitted, meaning that it’s the first time the user is interacting with this Action, the message is updated with the longer message containing more details.

app.handle('greetingInstructions', conv => {
let message = 'Make up words from the jumbled letters. Ready?';
if (!conv.user.lastSeenTime) {
message = 'Just say words you can make from the letters provided. Are you ready to begin?';
}
conv.add(message);
});

Note: implementation with node client library

3. Support strongly recommended intents

There are some commonly used intents which really enhance the user experience by providing some basic commands to interact with your voice app. If your action doesn’t support these, users might get frustrated. These intents help create a basic structure to your voice user interface, and help users navigate your Action.

  • Exit / Quit

    Closes the action

  • Repeat / Say that again

    Makes it easy for users to hear immediately preceding content at any point

  • Play Again

    Gives users an opportunity to re-engage with their favorite experiences

  • Help

    Provides more detailed instructions for users who may be lost. Depending on the type of Action, this may need to be context specific. Defaults returning users to where they left off in game play after a Help message plays.

  • Pause, Resume

    Provides a visual indication that the game has been paused, and provides both visual and voice options to resume.

  • Skip

    Moves to the next decision point.

  • Home / Menu

    Moves to the home or main menu of an action. Having a visual affordance for this is a great idea. Without visual cues, it’s hard for users to know that they can navigate through voice even when it’s supported.

  • Go back

    Moves to the previous page in an interactive story.

Implementation

Actions Builder & Actions SDK support System Intents that cover a few of these use case which contain Google support training phrase:

  • Exit / Quit -> actions.intent.CANCEL This intent is matched when the user wants to exit your Actions during a conversation, such as a user saying, “I want to quit.”
  • Repeat / Say that again -> actions.intent.REPEAT This intent is matched when a user asks the Action to repeat.

For the remaining intents, you can create User Intents and you have the option of making them Global (where they can be triggered at any Scene) or add them to a particular scene. Below are examples from a variety of projects to get you started:

So there you have it. Three suggestions to keep in mind for making amazing interactive games and story experiences that people will want to use over and over again. To check out the full list of our recommendations go to the Lessons Learned page.

Thanks for reading! To share your thoughts or questions, join us on Reddit at r/GoogleAssistantDev.

Follow @ActionsOnGoogle on Twitter for more of our team’s updates, and tweet using #AoGDevs to share what you’re working on. Can’t wait to see what you build!

Location-based search results with DynamoDB and Geohash

Searching for the nearest or farthest location is easy thanks to DynamoDB and a small geocoding NPM package

Many datasets now include geospatial information — especially if you are working with mobile apps or Google Maps. Geospatial data is a fancy way of describing items containing a latitude and longitude, but handling queries based on the nearest, furthest or within a certain distance has some complexity.

Why is this difficult? Imagine you have 10 retail stores and a customer wants to know the nearest location. The obvious way is to find the location of the customer and then calculate the distance between that location and each of the stores. What’s the problem?

Well, there are two problems —the first is that the previous calculation assumes the two points are on a flat surface and the world isn’t flat (really, it isn’t). You have to factor in the roundness of the rock we live on, which brings us to the Haversine formula.

While the first problem will lead to calculation errors, the second problem is much more problematic. As the number of retail stores increases, the speed of your search slows down — the dreaded O(n) — because you must compare the user’s location against every store. And what if the user is moving so the original location updates frequently?

If you broaden the problem to “find the nearest coffee shop”, our first solution would require us to compare the user’s location to every coffee shop on the planet. So we need a much better way of solving the question, and preferably one that can manage global scale. Basically, something like the way Google Maps does it.

The Power of Geohash

Geohashes are a little complicated — but the key thing to know is that it divides the world into a number of squares in a grid. By doing this, we can eliminate most of the data upfront and only focus on the square our potential targets are in. It’s a pretty elegant solution.

  • The hash is up to 12 characters in length — the longer the hash, the smaller the square it references.
  • The first character of the hash identifies one of 32 cells in the grid, roughly 5000km x 5000km on the planet.
  • The second character identifies one of the 32 squares in the first cell, so the resolution of the two characters combined is now 1250km x 1250km.
  • This continues until the twelfth character, which can identify an area as small as just a couple of square inches on Earth.
For a detailed explanation, see https://www.movable-type.co.uk/scripts/geohash.html.

If you use a hash that is too long — in other words, the squares are too small — you will end up searching more squares for the results. If you use a hash that’s too short, you will end up returning too many items in the square and your search won’t be very fast. So you need to know the resolution of your search before selecting the length of the hash.

That might sound difficult — but it’s easier than you’d think. For our example of finding retail stores and coffee shops, it’s not necessary to have a 2-inch grid of the planet. We can resolve 1–10 kilometer searches with 5–7 characters.

The Geo Library for Amazon DynamoDB

There’s an AWS blog post from 2013 that details the Geo Library but it’s a Java-specific solution that needs a server. More recently there’s an NPM package that ports this work across to Node and can be used within a serverless architecture.

To test this out, I found a list of all the Starbucks locations in the US and then discovered a link where someone had turned this into JSON. Following the documentation in the NPM package, I cobbled together this script to configure the DynamoDB table and load it with data.

Let me walk you through the highlights.

First, let’s create the table which spins up a new DynamoDB table with the partitionKey and indexes needed to make the search work.

const ddbGeo = require('dynamodb-geo')

const config = new ddbGeo.GeoDataManagerConfiguration(ddb, 'yourTableName')
config.hashKeyLength = 5
const myGeoTableManager = new ddbGeo.GeoDataManager(config)
const setupTable = () => {

const createTableInput = ddbGeo.GeoTableUtil.getCreateTableRequest(config)

createTableInput.ProvisionedThroughput.ReadCapacityUnits = 5
console.dir(createTableInput, { depth: null })
  ddb.createTable(createTableInput).promise()
.then(function () { return ddb.waitFor('tableExists', { TableName: config.tableName }).promise() })
.then(function () { console.log('Table created and ready!') })
}

Next we load up the data:

const loadTable = () => {

const data = require('../data/starbucks_us_locations')

const putPointInputs = data.map(function (location) {
return {
RangeKeyValue: { S: uuid.v4() }
GeoPoint: {
latitude: location.position.lat,
longitude: location.position.lng
},
PutItemInput: {
Item: {
name: { S: location.name },
address: { S: location.address }
}
}
}
})

const BATCH_SIZE = 25
const WAIT_BETWEEN_BATCHES_MS = 1000
let currentBatch = 1

function resumeWriting() {
if (putPointInputs.length === 0) return Promise.resolve()
const thisBatch = []
for (let i = 0, itemToAdd = null; i < BATCH_SIZE && (itemToAdd = putPointInputs.shift()); i++) {
thisBatch.push(itemToAdd)
}
console.log('Writing batch ' + (currentBatch++) + '/' + Math.ceil(data.length / BATCH_SIZE))

return myGeoTableManager.batchWritePoints(thisBatch).promise()
.then(function () {
return new Promise(function (resolve) {
setInterval(resolve,WAIT_BETWEEN_BATCHES_MS)
})
})
.then(function () {
return resumeWriting()
})
}
return resumeWriting().catch(function (error) {
console.warn(error)
})
}

This loads the Starbucks locations from the json file, creates an array of items to insert into the tables, and uploads into DynamoDB in batches of 25 items. There is a delay introduced between each batch to slow down the insertion process, and reduce the burn on the Write Capacity Units (WCUs).

Once uploaded, you can see all the locations together with their geohash values and hash keys, which were calculated by the library automatically:

With the data loaded, querying is simple — we just supply the latitude and longitude of the search location, together with the radius of the search area:

const AWS = require('aws-sdk')
AWS.config.update({region: 'us-east-1'})

const ddb = new AWS.DynamoDB()
const ddbGeo = require('dynamodb-geo')

const config = new ddbGeo.GeoDataManagerConfiguration(ddb, 'yourTableName')
config.hashKeyLength = 5

const myGeoTableManager = new ddbGeo.GeoDataManager(config)

myGeoTableManager.queryRadius({
RadiusInMeter: 1000,
CenterPoint: {
latitude: 40.7769099,
longitude: -73.9822532
}
})
.then((locations) => console.log(locations))

Let’s try it out!

I built a simple front-end that uses the Starbucks data list to show the nearest Starbucks within 1 kilometer — it defaults to New York City since this looks like the most densely packed location for their stores.

If you click around the map, it fetches the list from DynamoDB based upon the new center of the map:

Visit https://caffeinate.jbes.dev/

Overall, the advantage of using DynamoDB for this solution is the extremely steady performance under load. And with on-demand capacity now available for DynamoDB, it can handle spiky traffic without you needing to manage provisioned throughput.

Let me know what you think in the comments below!


Location-based search results with DynamoDB and Geohash was originally published in A Cloud Guru on Medium, where people are continuing the conversation by highlighting and responding to this story.