How Apple Fumbled the Voice First Future
Xerox fumbled the future when it invented and then ignored the personal computer. With the Macintosh, Apple created the personal computer the Xerox Alto might have been.
Apple is also fumbling the future: the Voice First future. Voice First simply means our primary mode of interacting with computers in the future will be with our voice. When Apple bought Siri it had a solid five-year lead in voice control. Now Amazon’s Alexa and Google’s Assistant have not only caught Siri, they’ve surpassed her.
The story of how Apple is fumbling the Voice First future is passionately told by Brian Roemmele in a great interview with Rene Ritchie on Rene's Vector podcast, Why Siri needs to be a platform.
Brian covers a lot of ground in the interview, but there are a few main themes: Voice First is the Future; Apple Fumbled Voice First; Engineering First Cultures Suck at Product; Apple Needs to Lose the iPhone Tax and Build Siri as a Platform.
In each section I paraphrase quotes Brian made in the interview to explain the theme. I think you'll find it fun and provocative. Brian is an interesting guy.
Voice First is the Future
Brian mentions several times that he's been thinking about voice interfaces for a long time. If you are interested in more of his work he has his own site: Voice First Expert.
We think in a voice in our head. Anyone trying to type has to first put it in a voice in their head before typing. You’re transcribing your inner voice onto the keyboard. When you speak it’s quicker. Over 60% of text iMessages are composed using voice.
We’ve reached peak app. Voice will replace the app.
Some say humans are lazy. I say humans are always tool builders and are trying to make their life more productive.
Ninety percent of what we do is sifting and sorting Google results. An assistant will know how to do this. When you really analyze the work to be done, we have become the machine at the end of a 9-million-result Google search. You have to spend an hour sifting through 9 million results.
Google’s algorithm gets better all the time? No it doesn’t. Even though it knows what’s in your Gmail, and a lot about your context, it still isn’t good enough. It’s not deeply contextual to you in the way a personal assistant would be. What we're ultimately going for is a personal assistant. None exist today. What exists today are voice front-ends to AI.
AI will be on one chip. The question is what abstraction layers are built on top of that? Everyone said Steve, you need to buy a cell phone company. He had the wisdom to say no, I’m going to build an abstraction layer on their dumb pipes. The dumb pipes of AI are natural language processing, intent extraction, and all the other stuff. The entrepreneur will create an abstraction on top. The next social networks.
The next generation will have grown up with voice all around them. Older folks don’t touch apps anymore. Voice gets rid of apps.
Every appliance will just take a command from you. You don’t want to download an app to talk to a device.
Children grow up with iOS devices. They expect every screen should be manipulable with their fingers. Stop the philosophical bullshit about laptops not needing touch. See the world with the eyes of a child. They want to go to a screen and move something. Every computer should hear them, understand them, and interact with them.
Apple Fumbled Voice First
The last act of Steve Jobs as CEO was to acquire Siri. He saw Siri as Apple’s future, more important than the iPhone, iPad, and Mac combined. That’s how big he thought voice would become. Steve saw a future where people don’t need to be in front of screens all the time. We should be able to tell our systems the work we want done. Does this mean no more screens? No. We’ll use our screens less, but voice will be first. The future is Voice First. In the AR and VR future we won’t be typing.
Siri felt like the same moment I touched the first iPhone. Little hairs went up on my back. I’m interacting with something historic.
In some ways Siri was more powerful as a standalone system than after Apple integrated it. There was great anticipation when it was being acquired. We didn’t know Steve wasn’t going to be around.
The dark ages came. A lot of behind the scenes promises were made to the people who built Siri, that Siri was going to be a platform, not just an appendage. Platform vs OS appendage is a philosophical construct that has really hurt Apple.
Siri died on the vine. Some of the best minds left the company. The main Siri people left and started Viv.
Apple had the opportunity to buy Viv but someone at Apple decided Viv wasn’t of value and let Samsung acquire it. Apple gave their chief competitor the most powerful AI tool I’ve seen in my life. I don’t know what kind of thinking was going on other than the philosophical divide inside a company that’s aging. Everything gets old, you have to reinvent yourself. How do you do that in the post-Jobs world?
You start drinking too much of your own Kool-Aid and you start believing the future is going to look like the past. The surfaces and things you carry around in your pocket, the ones you’ve gotten very used to and very rich making, you don’t want them to go away. Even though we’ve reached peak app, nobody wants to say that. The average person downloaded fewer than three apps last year.
If you are Apple and your vision is thinner, faster, more feature-rich devices, and someone wakes you up one day and says your device is going to go away and most of your work is going to be done using your voice, then the advantage of having a beautiful OS and a functionally more beautiful device goes away. You don’t want the voice world. You say we need a device. Yeah, voice is interesting, but people are going to type because that’s what they’ve done in the past. That’s not how history has ever worked out.
Amazon has 12,000 people working on Alexa. That’s more than Google, Microsoft, Apple, everybody.
After CES a lot of noted analysts are saying Apple is glaringly behind. Apple made a bad mistake not taking Siri as a platform seriously.
Bottom line: people are buying Echos and are using them. People are buying Echos in packs of half a dozen. That means people are sticking them in every room.
People are listening to music and setting timers, but they are also getting things done. The average person adopted Echo before the tech world.
Siri could have been number one. The reason it isn’t is that Apple isn’t using a technology that’s really its own; it is borrowing technologies from other companies.
Siri teams said to Apple this is just a demo platform. We need to make a self-programming platform. We need to create an AI that writes its own code using voice to mediate. This is where Viv is heading. Viv builds on its own ontologies and taxonomies. What if building an app is your kid talking to the assistant and building it in real-time?
Engineering First Cultures Suck at Product
This is my favorite theme from the talk. He’s right. Engineers will putter around their digital garden forever. Someone has to have the vision of when it’s time to harvest. If you don't harvest you don't eat.
Engineers are going to be too careful. They think: in this use case it might break. You need a leader to say I don’t care, we made something beautiful, we’re shipping it. Every product needs a leader to say we’re shipping it. It’s good enough.
The expectation was that Google would be where Amazon is. The reason Amazon won is because the Echo was built by a merchant, not an engineer. It was built by someone who has to satisfy people in real-time. When you’re a merchant, if you don’t sell stuff you’re out of business. Steve was a merchant. When he got on stage he was a merchant; he was doing a sales seminar. We don’t have that. Jeff Bezos is the closest we have. There’s a rationalism. People have to prove it with their wallet. Steve was always number two. He was always fighting a bigger company. He had to make sure he was satisfying people and delighting people to a level that was beyond their expectation.
You need to have a balance with the real world. The reason Steve did so well walking into the Palo Alto Research Center is that he walked into an engineering-only operation. That computer was done. The Alto was ready to go but the engineers wouldn’t let go of it. Steve said I only saw three things and I should have seen ten and that gave me the Mac. They said it wasn’t ready. He said what are you talking about? I’m going to slap them together and get it out. It’s ready.
You need someone who transcends engineering. Someone who says let’s go with it. It ain’t perfect but it’s better than what's out there. If you live and breathe by the engineering culture you have a problem.
Google is engineers. If you believe the thing that’s going to make you successful is only engineering talent then good luck with that.
Apple Needs to Lose the iPhone Tax and Build Siri as a Platform
Microsoft fumbled their future because of the Windows Strategy Tax. Windows was the cash cow and anything internally that threatened the cash cow had its air supply cut off. Look how well Microsoft is doing now that they've stopped that nonsense.
What I hadn't considered is that Siri languished because Apple has an iPhone tax. Everything has to serve the iPhone ecosystem, even the new HomePod, which makes no sense at all. Brian makes a persuasive case that Apple held Siri back from being all it could be because it would compete with iOS. As everyone knows, that's a dead end.
When you’re getting disrupted by an interface that doesn’t allow you to showcase the greatness of your company you don’t want to accept it. You don’t want to think things will be controlled by a disembodied voice. The struggle will be over the personal assistant that bonds with us better, that understands us better, that we trust more not to use our information to sell us toasters. Apple is in the best position to deliver this. The company needs a reset. It needs to say this is its own platform. It’s going to mediate everything Apple does. It needs to have SiriOS. It needs to have its own development team. Amazon already employs most of the experts.
The problem is the debate inside Apple is whether Siri is a platform or not.
We need SiriOS. It’s its own platform. It’s going to live and die on its own, but it’s going to touch everything we do from now into the future. It’s an AI mediated OS. It connects all the ontologies and taxonomies that we’re building. Voice will mediate it. Open it up to the developer community to a level no other voice based system has ever been opened up before. Need to allow developers to build in real-time what Workflow promises. This is the real-time ability to build solutions based on the intent of the user. The ability to pull from the cloud in real-time. All the apps will be in the cloud. Downloading and invoking the app will not last. Downloading an app will be antiquated. The OS creates the context and continuity. What did the person just ask me? Is it in the same context as what they just asked me? Is it a continuity of what I just did? That’s where the low level OS functions. Carry along the conversation wherever it goes. It’s not general AI. It’s threading the context of the ontologies you need and solving the problem you need. It remembers these contexts in a “neuron” that grows and is added to over time. Note: there's more to the explanation in the podcast, but it's very difficult to accurately paraphrase.
Don’t care about general AI or about making people think they’re talking to another human. Care about extracting contexts so they can make a command and get a lot of work done.
HomePod requires an iPhone around. It has no intelligence unless the iPhone is around. Someone inside Apple won the argument where the HomePod is just an appendage to an iPhone. Need an iPhone tethered to it. Bad decision. See Where Are the Siri Apps? for some background.
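To make the "context and continuity" idea a little more concrete, here is a toy sketch of my own, not anything Brian or Apple described. Every name in it (Context, ConversationThread, hear) is hypothetical, and simple keyword overlap stands in for the real natural language processing and intent extraction a system like this would need. It only shows the shape of the questions above: is this in the same context as what was just asked, or does it pick up an earlier thread?

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """One remembered context: keywords seen so far plus the turns that built it."""
    words: set = field(default_factory=set)
    turns: list = field(default_factory=list)


class ConversationThread:
    """Toy continuity tracker. Keyword overlap is an assumed stand-in for intent extraction."""

    STOP_WORDS = {"the", "a", "an", "to", "in", "is", "it", "for", "me",
                  "and", "of", "what", "about"}

    def __init__(self):
        self.contexts = []   # every context remembered so far
        self.current = None  # the context of the most recent utterance

    def _keywords(self, utterance):
        tokens = (w.strip("?.,!").lower() for w in utterance.split())
        return {w for w in tokens if w and w not in self.STOP_WORDS}

    def hear(self, utterance):
        words = self._keywords(utterance)
        if self.current is not None and words & self.current.words:
            # Same context as what the person just asked.
            ctx = self.current
        else:
            # Otherwise try to resume an earlier context, else start a new one.
            ctx = next((c for c in self.contexts if words & c.words), None)
            if ctx is None:
                ctx = Context()
                self.contexts.append(ctx)
        # The context grows as the conversation adds to it.
        ctx.words |= words
        ctx.turns.append(utterance)
        self.current = ctx
        return ctx


thread = ConversationThread()
trip = thread.hear("Book a flight to Boston")
same = thread.hear("What about a hotel in Boston?")  # continues the trip context
music = thread.hear("Play some jazz")                # starts a new context
back = thread.hear("Is the flight refundable?")      # resumes the trip context
```

In this sketch the second and fourth requests land in the same context as the first, while the music request gets its own. A real assistant would need far more than word matching, but the bookkeeping question is the same one Brian describes.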
Apple can dominate by leveraging privacy. Promising data won’t be used in ways users can’t imagine.
Apple hobbled Siri on AirPods. Didn’t give Siri and Vocal IQ teams. Real time contextual programming. Powerful. Not seeing the results of that.
Stop saying Siri is an appendage to an OS. Let it become its own platform. Let it grow. If it ends the iPhone then the iPhone was supposed to end. Have a rich and vital developer ecosystem.
The fallacy of Amazon is that using skills and keywords is a dead end.
Google has their own problem. They see the assistant as an appendage to the search arm.
The business model of Voice First is not pay-per-click ads, it’s voice commerce. There’s no brand when you’re ordering toilet paper via voice. Amazon doesn’t care. They just hope people buy more paper towels.
This is the vision programmers dreamed of when Apple bought Siri. Let's hope there's still time for this dream to come true.
Related Articles
- On HackerNews
- Learn more about the Voice First future in the podcast Why Siri needs to be a platform.
- ‘I’m Not Sure I Understand’—How Apple’s Siri Lost Her Mojo