Last year, Apple announced SiriKit, which gives developers many opportunities to offer functionality to users through Siri. However, only apps in certain domains can use it, and if your app isn’t part of those domains, you can’t get much out of it. This year, Apple announced more domains, which means new opportunities for developers. The Payments domain has been updated, so users can now send or transfer money between their accounts, pay bills and search through their accounts. A new domain called Visual Codes has also been introduced, which lets your app show a meaningful visual code in the Siri context. This can be very handy when your app stores digital tickets (for transport, cinemas, sports events, etc.): when the user is near the place where the ticket is validated, they can just say ‘Hey Siri, show my MovieTicketsApp ticket’ and Siri will ask your app to provide the ticket. Another new domain is Lists and Notes, which can be used for adding items to or removing them from a to-do list, or for adding notes. It’s a really handy domain, which we will explore in more detail in this post. We will create an app that adds and removes items on a grocery list, similar to what we did using the Speech framework and Api.ai in this post.
It’s that time of the year: Apple’s annual Worldwide Developers Conference (WWDC), where new technologies and updates to the existing ones in Apple’s platforms are announced to the public. It’s the time when iOS developers start testing their apps to see which of the introduced changes might have broken their products. It’s also the time innovators eagerly anticipate, when they can start exploring the new technologies and how these might be used in their existing apps or in completely new ideas. It’s an exciting time with lots of expectations and wishes, and Apple always delivers (and sometimes even surprises us). So let’s see what’s new in this edition.
From 18th to 20th April, I had the chance to attend the CodeMobile conference in Chester, UK. This was the first edition of the conference and in this post I will share my impressions of what was happening in those 3 days.
The organizers had the idea to hold a conference in Chester, since there are not many developer conferences outside of London. Chester is a lovely town in north-west England, around 40 miles from Manchester. Getting there is pretty easy: we took a plane to Manchester and then a train to Chester, which was about an hour’s ride. The town itself has interesting architecture, with bits of Roman influence.
At WWDC 2016, Apple announced SiriKit, which enables developers to provide functionality that can be executed directly from Siri, without opening the app. This is another step towards new, innovative ways of interacting with users through conversational interfaces, simplifying the whole user experience. Your app can now provide functionality to Siri directly from the lock screen, even when the app is not running. However, as is usually the case with Apple, there are some limitations. You can use SiriKit only for certain predefined domains (check the Siri programming guide for reference):
– VoIP calling
– Messaging
– Payments
– Ride booking
– CarPlay (automotive vendors only)
– Restaurant reservations (requires additional support from Apple).
So if your app is not solving problems in one of those domains, you will need to wait for Apple to add the domain that your app needs (or even suggest it to them). In this post, we will look at the “Ride booking” domain. We will build a simple app that reserves a (fake) ride between two locations provided by the user. So let’s get started!
Api.ai is a conversational user experience platform, recently acquired by Google. It uses natural language processing and machine learning algorithms to extract entities and actions from text. The best thing is that it has a web application, through which you can train your intents with custom sentences and, based on that, get a JSON response with the recognized data. This brings a whole new set of opportunities for developers, since natural language processing and machine learning are not trivial tasks: they require a lot of expertise and research to get right. On top of that, the service is currently free for developers. As we will see, api.ai offers a lot of powerful features and it’s definitely worth a look.
In this post, we will extend the grocery list app we were developing in Playing with Speech and Text to speech with synthesizers, so make sure to check those two posts first. One thing we did very naively in those two posts was the extraction of the words in a sentence: it was done by plain string matching against predefined words hardcoded in our app. It didn’t take into consideration the context in which the keywords were spoken. For example, if you said something like “I don’t need chicken anymore”, it would still add chicken to the list, although it’s clear that we have to remove it. Let’s solve this and put some intelligence in our app by using api.ai!
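To make the limitation concrete, here is a minimal sketch of what such plain string matching might look like. It is not the code from the earlier posts: the `knownProducts` list and the `naiveProductsToAdd(in:)` function are hypothetical names made up for this illustration.

```swift
import Foundation

// Hypothetical product catalogue; the real app's predefined words are not shown here.
let knownProducts = ["chicken", "milk", "eggs"]

/// Naive extraction: returns every known product that appears in the sentence,
/// completely ignoring the words around it (and therefore the speaker's intent).
func naiveProductsToAdd(in sentence: String) -> [String] {
    let lowercased = sentence.lowercased()
    return knownProducts.filter { lowercased.contains($0) }
}

// Works as long as the user really wants to add something...
print(naiveProductsToAdd(in: "Please add milk and eggs"))

// ...but "chicken" is matched here too, even though the user wants it removed.
print(naiveProductsToAdd(in: "I don't need chicken anymore"))
```

Because the function only checks whether a keyword occurs, both sentences look like "add" requests to it. Telling them apart requires understanding the intent of the sentence, which is the part we want to delegate to api.ai.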
We’ve seen in the previous post how an iOS device can understand and transcribe the voice commands we give to it (speech to text). In this post, we will see the opposite: how the device can communicate information we have as a string in our app through speech. We will extend the grocery list app from the previous post (make sure to check that one out first) by adding functionality that tells the user which remaining products they need to buy from the list. We will also provide a way to customize the voice that does the speaking, through a settings page.
In order to accomplish this, we will need a different class (AVSpeechSynthesizer) from a different framework (AVFoundation). As the Apple docs tell us, this class produces synthesized speech from text on an iOS device, and provides methods for controlling or monitoring the progress of ongoing speech – which is exactly what we need, so let’s get started!
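As a first taste, here is a minimal sketch of how speaking the remaining products could look with AVSpeechSynthesizer. The `GroceryListSpeaker` class, the `remainingProductsSentence(_:)` helper, the wording of the sentence and the default `"en-US"` language code are all assumptions made for this illustration, not the app's actual code.

```swift
import AVFoundation

/// Hypothetical helper: builds the sentence to be spoken from the remaining products.
func remainingProductsSentence(_ products: [String]) -> String {
    guard !products.isEmpty else { return "Your grocery list is empty." }
    return "You still need to buy " + products.joined(separator: ", ") + "."
}

/// Hypothetical wrapper around AVSpeechSynthesizer for the grocery list app.
final class GroceryListSpeaker {
    // Keep the synthesizer as a property so it stays alive while it is speaking.
    private let synthesizer = AVSpeechSynthesizer()

    /// Speaks the remaining products. `languageCode` is an assumed setting
    /// that a settings page could let the user change.
    func speak(remaining products: [String], languageCode: String = "en-US") {
        let utterance = AVSpeechUtterance(string: remainingProductsSentence(products))
        // The initializer returns nil for an unknown language code;
        // a nil voice falls back to the system default.
        utterance.voice = AVSpeechSynthesisVoice(language: languageCode)
        utterance.rate = AVSpeechUtteranceDefaultSpeechRate
        synthesizer.speak(utterance)
    }
}
```

A view controller would typically hold a `GroceryListSpeaker` in a property and call something like `speaker.speak(remaining: ["milk", "eggs"])` when the user asks what is left on the list.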
At the latest WWDC (2016), Apple announced SiriKit, which enables developers to extend Siri with their apps’ functionality. We will talk about SiriKit in other posts. Now we will focus on another brand new framework, which was probably overshadowed by SiriKit: the Speech framework. Although it only had one short (11 minutes) prerecorded video at WWDC, the functionality it offers might be very interesting to developers. The Speech framework is actually built on the same voice recognition system that Siri uses.
What does the Speech framework offer? It recognizes both live and prerecorded speech, creates transcriptions and alternative interpretations of the recognized text, and reports confidence levels that indicate how accurate the transcription is. Sounds similar to what Siri does, so what’s the difference between SiriKit and the Speech framework?