Let's dictate

In my favorite movie, Blade Runner, there is a scene where Deckard enhances a photograph he found in Leon’s room.
Unlike characters in most sci-fi movies, he doesn’t talk to an all-knowing AI entity but instead dictates his way to the information he’s after.

With all the hype around Alexa, Siri and Google Assistant, I started Googling for Blade Runner style speech recognition and came across the Web Speech API, which is still at an experimental stage. A further search led me to the annyang JavaScript library. Yet another piece of evidence supporting Atwood’s Law:

Any application that can be written in JavaScript, will eventually be written in JavaScript.

It was tempting to write a Blade Runner style app where you can upload an image and then zoom around it with dictation. But I was put off by having to replicate the feedback sounds, so I figured that the dictation would be enough of a challenge to start with.

A slightly more practical project is a helper to keep score in a card game we call Kani in Iceland (pronounced like “can” in American English, followed by the “i” in “imbecile”).

Four players are dealt 13 cards each in one hit. With all cards dealt, they take turns bidding for tricks, with a minimum bid of 8 tricks and a special bid of Kani for all 13 tricks. The bid’s winner then selects the trump suit and a playing partner: whoever holds a card of his choosing from that suit.
For example, the bid is 10 tricks, the trump suit is spades and his partner is the player who holds the ace of spades.

So the app challenge was to dictate the selection of tricks and suit.

kani

This is a screenshot of cKani responding to my dictation. To keep this as simple as possible, the app is programmed to recognise next and last to rotate between suits, and high and low to change the bid. It also recognises the suits hearts, spades, clubs and diamonds, and the bids 8 to 13, plus 50 for Kani.
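The post does not show how next, last, high and low are handled behind the actions, so the following is only a minimal sketch of one way to do it. The names SUIT_ORDER, BID_ORDER, rotateSuit and stepBid are hypothetical, the suit order is assumed, and so is the choice to wrap around for suits but stop at the ends for bids.

```typescript
// Hypothetical names and ordering; the app's real reducer is not shown in the post.
const SUIT_ORDER = ["hearts", "spades", "diamonds", "clubs"] as const;
type Suit = typeof SUIT_ORDER[number];

// Bids 8 to 13, plus 50 standing in for Kani (all 13 tricks).
const BID_ORDER = [8, 9, 10, 11, 12, 13, 50] as const;
type Bid = typeof BID_ORDER[number];

// "next" (+1) and "last" (-1): rotate through the suits, wrapping at either end.
function rotateSuit(current: Suit, step: 1 | -1): Suit {
  const i = SUIT_ORDER.indexOf(current);
  const n = SUIT_ORDER.length;
  return SUIT_ORDER[(i + step + n) % n] as Suit;
}

// "high" (+1) and "low" (-1): move one bid up or down, stopping at the ends.
// Assumption: bids do not wrap around.
function stepBid(current: Bid, step: 1 | -1): Bid {
  const i = BID_ORDER.indexOf(current);
  const j = Math.min(BID_ORDER.length - 1, Math.max(0, i + step));
  return BID_ORDER[j] as Bid;
}
```

With this ordering, saying next while on clubs lands back on hearts, and saying high while on 13 moves to the Kani bid.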

Before you give this a try, there are some compatibility issues to keep in mind. More specifically, this works with the latest versions of Chrome on Mac and Windows 10 (I’ve tested it).
This does not work with Chrome on Android and you can forget about IE and Safari (but at least Microsoft is trying to keep up with Edge).
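Given the patchy support, it may be worth checking for the Web Speech API before wiring anything up. The sketch below is not from the app; hasSpeechRecognition and SpeechGlobals are hypothetical names. It relies only on the fact that browsers expose the recognition constructor as SpeechRecognition or, in Chrome, as the prefixed webkitSpeechRecognition.

```typescript
// The two globals a browser may expose; either, both, or neither can be present.
interface SpeechGlobals {
  SpeechRecognition?: unknown;
  webkitSpeechRecognition?: unknown;
}

// Returns true when some speech recognition constructor is available.
// The globals object is passed in so the check can be tried outside a browser.
function hasSpeechRecognition(globals: SpeechGlobals): boolean {
  return Boolean(globals.SpeechRecognition || globals.webkitSpeechRecognition);
}
```

In a page you would pass the window object and show a plain fallback message when the result is false. Note that the presence of the constructor does not guarantee that recognition actually works on a given device.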

Besides the fun of programming it, I believe that there is great potential in dictation as a form of user interaction.

The most successful user interface to date is Google, just a text box where you type in what you want. The next step must surely be where you just say what you want.
And interestingly, programming the Kani interface with annyang is far easier than any keyboard or mouse interface I can think of.

This is the menu logic:

  cmds: CommandOption = {
    HEARTS: () => this.store.dispatch(new NavActions.SuitAction(SUITS.hearts)),
    SPADES: () => this.store.dispatch(new NavActions.SuitAction(SUITS.spades)),
    DIAMONDS: () => this.store.dispatch(new NavActions.SuitAction(SUITS.diamonds)),
    CLUBS: () => this.store.dispatch(new NavActions.SuitAction(SUITS.clubs)),
    NEXT: () => this.store.dispatch(new NavActions.NextSuitAction()),
    LAST: () => this.store.dispatch(new NavActions.LastSuitAction()),
    HIGH: () => this.store.dispatch(new NavActions.NextBidAction()),
    LOW: () => this.store.dispatch(new NavActions.LastBidAction()),
    EIGHT: () => this.store.dispatch(new NavActions.BidAction(BIDS.eight)),
    NINE: () => this.store.dispatch(new NavActions.BidAction(BIDS.nine)),
    TEN: () => this.store.dispatch(new NavActions.BidAction(BIDS.ten)),
    ELEVEN: () => this.store.dispatch(new NavActions.BidAction(BIDS.eleven)),
    TWELVE: () => this.store.dispatch(new NavActions.BidAction(BIDS.twelve)),
    THIRTEEN: () => this.store.dispatch(new NavActions.BidAction(BIDS.thirteen)),
    FIFTY: () => this.store.dispatch(new NavActions.BidAction(BIDS.KANI)),
    YES: () => this.sb.open('Ok', '', { duration: 3000 }),
    NO: () => this.sb.open('Try again then', '', { duration: 3000 })
  };

The clever thing about this is that the response to HEARTS is the command SuitAction(SUITS.hearts); you don’t have to worry about putting a widget somewhere on the screen for the user to type or gesture HEARTS. And the user is saved from the often mind-boggling task of figuring out where that widget is.
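For completeness, a map like cmds still has to be handed to annyang, which is done with its addCommands and start methods. The sketch below is not the app's actual wiring: startDictation, VoiceEngine and Commands are hypothetical names, and the interface covers only the small slice of annyang used here so the function can be exercised without a browser. The start options shown are an assumption about sensible settings, not something the post specifies.

```typescript
// Phrase-to-handler map, the same shape as the cmds object above.
type Commands = { [phrase: string]: () => void };

// Minimal slice of the annyang API used in this sketch.
interface VoiceEngine {
  addCommands(commands: Commands): void;
  start(options?: { autoRestart?: boolean; continuous?: boolean }): void;
}

// Registers the commands and starts listening.
// Returns false when no engine is available, since annyang is null
// in browsers without speech recognition support.
function startDictation(
  engine: VoiceEngine | null | undefined,
  commands: Commands
): boolean {
  if (!engine) {
    return false;
  }
  engine.addCommands(commands);
  // Assumed settings: restart after silence, one phrase at a time.
  engine.start({ autoRestart: true, continuous: false });
  return true;
}
```

In the app this would be called once, for example as startDictation(annyang, this.cmds), with a fallback message shown when it returns false.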

Blade Runner is a movie that hasn’t just passed the test of time, but exceeded it. What has to be one of the greatest cinematic scenes of all time is more relevant now than in the early ’80s. What is it to be human?