Build Talking Apps for Alexa
Creating Voice-First, Hands-Free User Experiences
by Craig Walls
Voice recognition is here at last. Alexa and other voice assistants are
now mainstream. Is your app ready for voice
interaction? Learn how to develop your own voice applications for Amazon
Alexa. Start with techniques for building conversational user interfaces
and dialog management. Integrate with existing applications and visual
interfaces to complement voice-first applications. The future of
human-computer interaction is voice, and we’ll help you get ready for
it.
For decades, voice-enabled computers have only existed in the realm of
science fiction. But now the Alexa Skills Kit (ASK) lets you develop
your own voice-first applications. Leverage ASK to create engaging and
natural user interfaces for your applications, enabling them to listen
to users and talk back. You’ll see how to use voice and sound as
first-class components of user-interface design.
We’ll start with the essentials of building Alexa voice applications,
called skills, including useful tools for creating, testing, and
deploying your skills. From there, you can define parameters and dialogs
that will prompt users for input in a natural, conversational style.
Integrate your Alexa skills with Amazon services and other backend
services to create a custom user experience. Discover how to tailor
Alexa’s voice and language to create more engaging responses and speak
in the user’s own language. Complement the voice-first experience with
visual interfaces for users on screen-based devices. Add options for
users to buy upgrades or other products from your application. Once all
the pieces are in place, learn how to publish your Alexa skill for
everyone to use.
Create the future of user interfaces using the Alexa Skills Kit today.
What You Need
You will need a computer capable of running the latest version of
Node.js, plus a Git client and internet access.
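To give a flavor of what the book covers, here is a minimal sketch (not taken from the book) of the JSON exchange behind an Alexa skill: Alexa sends a request describing what the user said, and the skill's backend returns a response containing the speech to say back. The intent name and speech text are hypothetical examples; a real skill would typically use the ASK SDK rather than hand-built JSON.

```javascript
// Handle an incoming Alexa request object and return a response object.
// The request/response shapes follow the Alexa skill JSON format.
function handleRequest(request) {
  // A LaunchRequest arrives when the user opens the skill by name.
  if (request.type === 'LaunchRequest') {
    return buildResponse('Welcome! What would you like to do?');
  }
  // An IntentRequest carries the intent matched from the user's utterance.
  // "HelloWorldIntent" is a hypothetical intent name for illustration.
  if (request.type === 'IntentRequest' &&
      request.intent.name === 'HelloWorldIntent') {
    return buildResponse('Hello, world!');
  }
  return buildResponse("Sorry, I didn't catch that.");
}

function buildResponse(speechText) {
  // Plain-text output speech; shouldEndSession closes the conversation.
  return {
    version: '1.0',
    response: {
      outputSpeech: { type: 'PlainText', text: speechText },
      shouldEndSession: true,
    },
  };
}

// Simulate an IntentRequest as Alexa would deliver it.
const reply = handleRequest({
  type: 'IntentRequest',
  intent: { name: 'HelloWorldIntent' },
});
console.log(reply.response.outputSpeech.text); // "Hello, world!"
```

The book builds this kind of handler with the ASK CLI and SDK tooling, which take care of routing requests to the right handler for you.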
Resources
Releases:
- P1.0 2022/04/23
- B9.0 2022/04/11
- B8.0 2022/01/06
- B7.0 2021/07/16
- Introduction
- Who Should Read This Book?
- About This Book
- Online Resources
- Alexa, Hello
- How Alexa Works
- Dissecting Skills
- Installing the ASK CLI
- Creating Your First Alexa Skill
- Deploying the Skill
- Wrapping Up
- Testing Alexa Skills
- Considering Skill Testing Styles
- Semi-Automated Testing
- Automated Testing with Alexa Skill Test Framework
- Automated Testing with Bespoken’s BST
- Wrapping Up
- Parameterizing Intents with Slots
- Adding Slots to an Intent
- Fetching Entity Information
- Creating Custom Types
- Extending Built-In Types
- Enabling Flexibility with Synonyms
- Handling Multi-Value Slots
- Wrapping Up
- Creating Multi-Turn Dialogs
- Adding Dialogs and Prompts in the Interaction Model
- Eliciting Missing Slot Values
- Validating Slot Values
- Confirming Slots
- Explicitly Handling Dialog Delegation
- Wrapping Up
- Integrating User Data
- Accessing a User’s Amazon Info
- Linking with External APIs
- Wrapping Up
- Embellishing Response Speech
- Getting to Know SSML
- Testing SSML with the Text-to-Speech Simulator
- Changing Alexa’s Voice
- Adjusting Pronunciation
- Inserting Breaks in Speech
- Adding Sound Effects and Music
- Applying SSML in Skill Responses
- Escaping Reserved Characters
- Writing Responses with Markdown
- Wrapping Up
- Mixing Audio
- Introducing APL for Audio
- Authoring APL-A Templates
- Making a Sound
- Combining Sounds
- Applying Filters
- Defining Custom Components
- Returning APL-A Responses
- Wrapping Up
- Localizing Responses
- Translating the Interaction Model
- Localizing Spoken Responses
- Testing Localization
- Fixing Language Pronunciation
- Using Language-Specific Voices
- Wrapping Up
- Complementing Responses with Cards
- Embellishing Responses with Cards
- Returning Simple Cards
- Rendering Images on Cards
- Wrapping Up
- Creating Visual Responses
- Introducing the Alexa Presentation Language
- Creating a Rich APL Document
- Applying Styles
- Defining Resource Values
- Injecting Model Data
- Handling Touch Events
- Wrapping Up
- Sending Events
- Publishing Proactive Events
- Sending Reminders
- Wrapping Up
- Selling Stuff
- Creating Products
- Handling Purchases
- Upselling Products
- Refunding a Purchase
- Wrapping Up
- Publishing Your Skill
- Tying Up Loose Ends
- Completing the Skill Manifest
- Submitting for Certification and Publication
- Promoting Your Skill with Quick Links
- Gathering Usage Metrics
- Wrapping Up
- Defining Conversational Flows
- Introducing Alexa Conversations
- Starting a New Alexa Conversations Project
- Getting to Know ACDL
- Defining the Conversation Model
- Simplifying the Conversation Model
- Defining Response Templates
- Handling the Action Request
- Deploying and Testing the Conversation
- Wrapping Up
- Running and Debugging Skill Code Locally
- Deploy Your Skill
- Running Skill Code Locally
- Debugging Skill Code
- Troubleshooting Your Skill
- “Hmmm. I don’t know that one.” or “Hmmm. I have a few skills
that can help.”
- A Skill Other Than the One You Expect Is Launched
- An Utterance Is Handled by the Wrong Intent Handler
- You See “<Audio Only>” in the Response
- “There was a problem with the requested skill’s response.”
- “You just triggered {Some Intent}.”
- Skill ID Not Found when Deploying a Skill
- “Sorry, I had trouble doing what you asked. Please try again.”
Author
Craig Walls is a principal software engineer at Pivotal, a popular
author, an enthusiastic supporter of Spring Framework and voice-first
applications, and a frequent conference speaker.