AI Assistants: The Next User Interface Paradigm
Paul Sagawa / Artur Pylak
203.901.1633 /.1634
psagawa@ / firstname.lastname@example.org
SEE LAST PAGE OF THIS REPORT FOR IMPORTANT DISCLOSURES
July 13, 2016
AI powered virtual assistants are poised to replace the “App” paradigm as the primary user interface for electronic devices and the internet. The capabilities of these products will improve dramatically with developments in deep learning technology that will come in four areas: 1. Recognizing queries – voice, gestures, facial expressions, queries embedded in other apps, etc.; 2. Interpreting queries – natural language decoding, context awareness, etc.; 3. Anticipating needs – pattern analysis, context awareness, etc.; and 4. Executing tasks – marshalling resources, managing requests, selecting optimal responses, etc. Long term success will depend upon the progress each company can make in these areas, as well as execution on non-AI factors – 1. Data resources; 2. Engagement with large user bases; 3. Platform integration; and 4. Ecosystem development. GOOGL, uniquely strong across all AI and non-AI factors, has an extraordinary opportunity to re-establish its primacy as a user gateway, not just for information, but, this time, for services as well. MSFT is weak in the consumer market, but well positioned to push Cortana as a unifying enterprise interface tying together its own applications and those of a 3rd party ecosystem. AAPL’s Siri must overcome AI and data weaknesses to sustain the competitive positioning of the iOS platform, while staving off 3rd party challenges from GOOGL, AMZN and FB to displace it amongst iPhone users. Without anchor device platforms to drive primary usage, AMZN’s Alexa and FB’s M could be somewhat marginalized despite excellent AI and, at least in FB’s case, extensive user data and massive engagement. For independent apps without AMZN’s and FB’s user lock-in, AI assistants could prove to be an existential threat.
- AI assistants to get much more powerful. We believe that AI will soon drive a revolution in the way users interact with devices and cloud-based services, obviating the app-based GUI that has been, thus far, the hallmark of the mobile era. AI software will accurately interpret queries – spoken and written commands, even gestures and facial expressions – and intuit unexpressed needs – based on context and data rich user profiles – to answer questions and fulfill service requirements with increasing expediency and effectiveness. These assistants will link directly into apps to find information or execute a task, providing a consistent user interface across as many devices as the assistant can reach.
- Deep learning fills 4 key roles. 1. Recognition – Accurately identifying user queries as they happen, turning sounds into words, gestures into commands, etc., while filtering out irrelevant noise. 2. Interpretation – Determining the true meaning of inputs that may be imprecise, grammatically irregular, dependent on context, or otherwise ambiguous, in order to arrive at actionable queries. 3. Inference – Predicting user needs based on context (time, place, proximity to other people, etc.), interests/obligations (schedule, previous requests, social connections, etc.), and information (traffic, weather, product availability, etc.), and recommending, or even taking, actions. 4. Execution – Performing jobs by finding the best answers to questions and employing the most appropriate tools to complete requested tasks.
- Critical non-AI requirements. 1. Data – AI developers need massive databases to design and train their solutions, while the resulting products will need access to broad personal data to operate effectively. 2. Engagement – Integration with applications that already deliver regular user engagement will accelerate reach and adoption. 3. Platform – Integration with device platforms – smartphones, home appliances, game consoles, PCs, etc. – can put AI assistants in front of app interfaces while pushing availability closer to ubiquity. 4. Ecosystem – The value of the assistant is dependent on the range of services that it can command, both those directly controlled and those provided by 3rd party partners.
- GOOGL strikes back. A paradigm shift to a single AI powered user interface reverses a trend toward independent apps which had begun to blunt the strength of GOOGL’s search franchise. GOOGL is uniquely strong across all AI and non-AI ingredients for winning the virtual assistant race, with a combination of cutting edge technical excellence, massive data assets, and long reach through its dominant applications and leading Android platform. GOOGL Assistant is likely to deliver capabilities that none of its rivals will be able to match, and could drive share gains for Android devices and gain traction on competitive platforms.
- MSFT Cortana key to enterprise strategy. MSFT’s much discussed weakness in engagement and platform reach amongst consumers is more than offset by its strength in the enterprise. Cortana becomes an interface paradigm fronting not just for Windows, which may become irrelevant, but also for Office365, Dynamics, LinkedIn, and other applications, which certainly will not. Managing the primary user interface as work moves to the cloud greatly advances MSFT’s strategy to deliver value to enterprise customers through intelligent analysis and management of workflows within the organization and out to customers and business partners.
- Siri awaits the attacking hordes. AAPL is vulnerable – it has relatively weak AI development capabilities and has eschewed collecting the sort of usage data that its rivals will use to design, train and implement their competing virtual assistants. It will use tight integration to the on-board functions of its devices, its well-established ecosystem of developers, and its iron-fisted control of the 3rd party technology approved to run on its platform to promote Siri to its loyal users while subtly impairing the effectiveness of alternatives. The risk is that superior AI assistants on rival platforms will take device market share and erode AAPL’s ability to command premium prices.
- AMZN and FB banging on the door. AMZN’s Alexa is available to a few million consumers via the company’s Echo and Fire products. FB’s M (in beta testing) requires that Messenger be launched before it can be used, slotting it behind platform-integrated solutions like Google Assistant, Siri or Cortana. While both companies have strong AI development teams with access to extensive data resources, and (particularly in FB’s case) impressive engagement from a huge base of users, it will be a reach for either to emerge as a leading primary interface solution vs. the integrated alternatives with direct access to hardware and default software and deep linking into 3rd party apps.
- Bad news for apps. AMZN and FB are powerful enough to resist deep linking, and can maintain their strong, independent engagement on their rivals’ platforms despite the paradigm shift to AI assistants, even if their dreams of gaining primacy are unrealistic. However, less frequently used apps with more questionable user loyalty face an existential threat, as enabling deep linking erodes differentiation, while blocking it invites irrelevance.
Opening the Pod Bay Doors
Both Larry Page and Mark Zuckerberg talk of being inspired by the computer in “Star Trek”, which could take spoken commands from Captain Kirk and unerringly find answers and execute tasks. While we are still quite far from “unerring”, virtual assistants are making dramatic progress in their ease of use and efficacy. We believe that services like GOOGL’s Assistant, AAPL’s Siri, MSFT’s Cortana, AMZN’s Alexa and FB’s M will replace the App user interface paradigm with a greatly enhanced search-like facility that intuits the meaning of user queries and reaches into apps and web sites to answer questions and perform requested tasks directly. Spoken commands, text entries, and clicks will be enhanced by gesture controls, eye tracking and facial expression analysis, and user needs will be inferred from context and previous experience to provide timely information and services without explicit queries.
Deep learning AI is the key to all of this, filling 4 key roles: 1. Recognition – identifying user commands amidst a world of noise; 2. Interpretation – intuiting meaning from words and gestures; 3. Inference – anticipating user needs without explicit commands; and 4. Execution – determining the best course of action and tasking the most appropriate resource. The more advanced the AI, the better a virtual assistant will perform in these roles. Of course, overall performance will depend on factors beyond the deep learning: 1. Data – the more data an AI has to train and operate, the more accurate its performance; 2. Engagement – assistants integrated with applications that already reach large and engaged user bases have an obvious advantage; 3. Platform – integration with an OS gives an assistant privileged access to the local functions and default services tied to it, while putting would-be rivals at an access disadvantage; 4. Ecosystem – the more 3rd party apps that allow direct links to the information and services within, the more powerful the assistant.
This coming paradigm shift plays directly to GOOGL’s strength, and gives it a golden opportunity to reverse a perceived deterioration in the importance of its search franchise. The company has a substantial lead in its AI development capabilities, which can be brought to bear against enormous data resources, the engagement of 7 different services with 1 billion+ users, and the world’s dominant mobile platform. Expect Google Assistant to offer obvious and significant service differentiation from its would-be rivals. MSFT can zig where GOOGL zags, using its own ample AI resources to focus Cortana toward the enterprise applications where it can leverage similarly substantial advantages.
We are less certain about AAPL, AMZN and FB, which have considerable ambition for their own AI assistant initiatives. AAPL has far less deep learning development firepower than any of its rivals, and is hamstrung by its lack of data and its focus on the device rather than the cloud. Still, its strict control over 3rd parties can keep competition at bay as long as Siri isn’t so weak as to damage the relative appeal of iOS devices. AMZN and FB lack platforms (Alexa serves only a few million Echoes and Fires), and as such, slot in behind solutions like GOOGL Assistant and Siri. This will limit the range of their engagement with users to interactions within AMZN and FB apps – opportunities for sure, but likely to be significantly less comprehensive than those available to the platform players. For app companies with less certain engagement (and AI know-how) than AMZN and FB, the paradigm shift to AI assistants will prompt a hard choice between allowing deep linking to the information and services available within, thus obscuring brand and inviting commoditization, or not, which could be the path to oblivion. Names like PCLN, EXPE, TRIP, YELP, GRUB, YHOO, and others come to mind.
With the first iPhone, Apple changed the way that users accessed the internet. iOS offered a browser, but the action was in apps – one touch, and right into a customized interface tailored by the company that provided the app. Each user curated their own roster of apps acquired from the app store and slotted into precious open slots in the rows and columns provided on the main pages of the device, or thrown into the app drawer, perhaps never to be seen again. The system offered a convenient short cut to the most often used apps, and many app publishers used the freedom of the paradigm to design unique and compelling user experiences. However, as mobile devices grow ever more capable and the list of key functions, and thus apps, grows ever longer, the app model has grown cumbersome. The few apps used regularly are front and center, but infrequently used apps are hard to find when they are needed and new apps find it nearly impossible to get attention. Moreover, answers and actions may be several touches away, and comparing and contrasting across apps is a minefield.
Apple was perhaps prescient when it bought Siri in 2010, although talk at the time was mostly about replacing Google Search for those times when you didn’t have an App. Siri was a bit underwhelming at first, frustrating users with poor speech recognition, a limited palette of functions and strict syntax requirements, but it got a lot better with each new iOS release. A recent survey by the AI company MindMeld reports that more than half of US respondents use voice commands on their smartphones at least weekly, up 6 percentage points in just 3 months and a shocking contrast to the 15% of Apple users that reported regularly using Siri just 30 months before (Exhibit 1). 72% suggested that they would increase their use of voice-based AI assistants if performance issues, such as misunderstood queries and required syntax, could be resolved (Exhibits 2-3). We expect these obstacles to erode dramatically in the next couple of years, not just for Siri but for the cohort of similar AI-based virtual assistants that have hit (or will soon hit) the market – Google Now (soon to be called Assistant), Microsoft Cortana, Amazon Alexa, and Facebook’s M (in Beta).
The next generation of assistants will understand most queries the first time and will be forgiving of thick accents or irregular syntax. They will take not just typed or spoken commands, but may respond to gestures, facial expressions or eye movements as well. Often, the assistant will anticipate requests before they are given, based on context and experience. No matter how they are determined, tasks will be executed based on the best available information and using the best available resources. Assistants will reach into individual apps, even accessing apps that haven’t been downloaded to a device and running just the part that is needed to respond to a user query. By fronting for the apps, these assistants can greatly improve the user experience but, at the same time, sap the apps of their ability to brand and differentiate.
Exh 1: Frequency of Voice Assistant Use, 1Q 16 versus 4Q 15
Exh 2: Voice Assistant Improvement Wishlist
Exh 3: Voice Assistant Improvement and Potential Engagement Upside
Deep learning technology has entered a golden age, with the top commercial players (Alphabet, Microsoft, IBM, Facebook, Amazon, Baidu, and Alibaba) fielding impressive teams of scientific talent, each with access to the huge data sets and massive computing platforms necessary to drive dramatic improvement in the capabilities of AI-based solutions (Exhibit 4). For virtual assistants, deep learning will play four specific roles (Exhibit 5).
Exh 4: Artificial Intelligence Scientists by Organization and Number of Citations, Major TMT Companies
Exh 5: Four Roles for AI in Virtual Assistants
Recognition – Speech recognition has been with us for a long time, showing up first for consumers in telephone voice response systems that were able to distinguish “yes”, “no” and “operator” … at least some of the time. Improvements in AI have meant a much broader vocabulary of words and sounds that a computer can distinguish and accurately transcribe, with much greater flexibility with regard to accents, voice tone, ambient noise, and other factors that introduce ambiguity to the task. Recognition will also increasingly include non-verbal cues, such as gestures, eye movements, motion, and facial expressions, that could serve as inputs to a holistic user interface mechanism that augments touches and text entry. All of these will be enabled by AI generated algorithms.
Interpretation – The first round of AI virtual assistants cheated on natural language interfaces, requiring users to phrase their queries in strict syntax with specific key words. The system didn’t have to understand the meaning of a whole sentence, as long as the key words “order”, “Uber”, “Airport” and “5 minutes” were included in the right sequence. “Book me a ride for my flight” would not be understood as a query. The next phase of development will see assistants that are much, much better able to parse requests, no matter how carefully they are phrased. Queries may also have different meanings depending on context – time, place, schedule, previous queries, known preferences, etc. – and AI will help virtual assistants adapt their interpretation of queries based on these factors. A good assistant would know that “Book me a restaurant for my next meeting” means breakfast for four at 8AM in Chicago at a place with good vegetarian options, and a virtual assistant should be able to figure that out as well. Finally, interpretation may also mean finding meaning in the non-verbal cues that the system can recognize – a grimace or thumbs down are obvious signs of dissatisfaction, a lingering stare on a photograph could prompt a device to zoom in.
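To make the "strict syntax" limitation concrete, a minimal sketch of the kind of ordered key word matching described above might look like the following. This is purely illustrative: the function name, the keyword list, and the matching rule are our own assumptions and do not represent how any shipping assistant is actually implemented.

```python
# Hypothetical key word template taken from the example in the text.
# Real first-generation assistants were more elaborate; this only
# illustrates why rigid templates reject natural phrasing.
REQUIRED_KEYWORDS = ["order", "uber", "airport", "5 minutes"]


def matches_strict_syntax(query: str) -> bool:
    """Return True only if every key word appears, in the required order."""
    text = query.lower()
    position = 0
    for keyword in REQUIRED_KEYWORDS:
        found = text.find(keyword, position)
        if found == -1:
            return False  # a key word is missing or out of sequence
        position = found + len(keyword)
    return True


if __name__ == "__main__":
    print(matches_strict_syntax("Order an Uber to the airport in 5 minutes"))
    print(matches_strict_syntax("Book me a ride for my flight"))
```

The first query satisfies the template, while the second, which a human assistant would understand immediately, is rejected because it contains none of the expected key words. Closing that gap is the interpretation problem that deep learning is expected to address.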
Inference – A good assistant should be able to anticipate needs without being explicitly asked. Those needs will vary depending on context (What time is it? Where are you? Who are you with?), on a user’s established interests and responsibilities (What is on the to-do list? What appointments are upcoming? What have you asked previously? What do you like? Who do you know?), and on available information (What is the weather going to be like? How is the traffic? What is the price of this product?). The assistant would combine all of the information and provide alerts, make recommendations or even take specific actions. Ultimately, we believe that the quality of these inferences may prove to be the biggest differentiator as AI virtual assistants grow more mature.
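As a simplified illustration of combining context (the current time), obligations (an upcoming appointment) and available information (a traffic-adjusted travel estimate) into a proactive alert, consider the sketch below. The function, its parameters and the 10 minute buffer are hypothetical choices of ours; a real assistant would presumably learn such thresholds from user behavior rather than hard-code them.

```python
from datetime import datetime, timedelta
from typing import Optional


def departure_alert(
    now: datetime,
    appointment_start: datetime,
    travel_minutes: int,
    buffer_minutes: int = 10,  # assumed safety margin, not from the report
) -> Optional[str]:
    """Return an unprompted alert if it is time to leave, otherwise None."""
    leave_by = appointment_start - timedelta(minutes=travel_minutes + buffer_minutes)
    if now >= leave_by:
        return f"Leave now: {travel_minutes} minutes of travel to your next appointment."
    return None


if __name__ == "__main__":
    now = datetime(2016, 7, 13, 7, 0)
    meeting = datetime(2016, 7, 13, 8, 0)
    print(departure_alert(now, meeting, travel_minutes=45))  # light traffic
    print(departure_alert(now, meeting, travel_minutes=55))  # heavy traffic
```

With light traffic the assistant stays silent; when the travel estimate rises, the same schedule triggers an alert without the user having asked anything.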
Execution – Having determined the meaning of a query, an AI virtual assistant must determine the best way to respond to it – with an answer, a set of recommendations or a direct action. The first step is to decide the best application or device function with which to complete the task. In many cases, the queries are information searches, against which both Google and Microsoft (Bing) have already invested extensive deep learning resources to push the efficacy of their applications further. Other queries may involve linking into various apps and evaluating the most effective way to complete a task – for example, choosing Lyft over Uber because of a shorter estimated pick-up interval or the imposition of surge pricing. Understanding the trade-offs in making accurate choices requires AI to understand the user’s inherent preferences and to act accordingly.
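The Lyft versus Uber trade-off can be sketched as a simple scoring problem. In the illustrative code below, the data structure, the scoring formula and the `minutes_per_surge_point` preference weight are all assumptions we introduce for the example; they stand in for the user preferences that an assistant would have to learn.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RideOption:
    provider: str
    pickup_minutes: float
    surge_multiplier: float  # 1.0 means no surge pricing


def choose_ride(options: List[RideOption], minutes_per_surge_point: float = 10.0) -> RideOption:
    """Pick the option with the lowest combined wait and surge penalty.

    minutes_per_surge_point is a hypothetical preference weight: how many
    extra minutes of waiting the user would accept to avoid a 1.0x
    increase in the surge multiplier.
    """
    def cost(option: RideOption) -> float:
        surge_penalty = minutes_per_surge_point * (option.surge_multiplier - 1.0)
        return option.pickup_minutes + surge_penalty

    return min(options, key=cost)


if __name__ == "__main__":
    options = [
        RideOption("Uber", pickup_minutes=4, surge_multiplier=1.8),
        RideOption("Lyft", pickup_minutes=6, surge_multiplier=1.0),
    ]
    print(choose_ride(options).provider)
```

Under the default weight, the surge penalty outweighs the slightly shorter pick-up interval and the sketch selects Lyft; a price-insensitive user (a weight of zero) would get Uber instead. The point is that the "right" answer depends on preferences the assistant must infer.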
Exh 6: Roles of Deep Learning Positioning Among Major Virtual Assistants
In our previous AI work (http://www.ssrllc.com/publication/a-deep-learning-primer-the-reality-may-exceed-the-hype/, http://www.ssrllc.com/publication/ai-where-the-brains-are/) we have examined the strength of the AI research being done at all of the companies that have lined up to compete in the AI virtual assistant arena (Exhibit 6). Alphabet, with overwhelming advantage in the size and quality of its scientific effort, is the obvious leader, a status that is evident in the improvements already made to its flagship Google Search franchise and in the extensions added to nearly every other business in the portfolio. We see Microsoft as the clear number two in AI technology, a position that is not yet apparent in Cortana, a service that has been held back by the company’s poor positioning with consumers rather than by the quality of the AI. Facebook and Amazon are viewed as strong contenders in deep learning technology, while Apple is not. Indeed, we see the relative success of Siri as driven more by non-AI considerations than by the technology embedded in it.
Exh 7: Four Critical non-AI Requirements for Virtual Assistants
AI isn’t Enough
Dramatic advancements in AI are the key driver for this new user interface paradigm, but AI is not the only critical success factor. Indeed, ordinary AI may be sufficient if the rest of the solution can hit its marks. We see four other business elements as necessary for success (Exhibit 7).
Data – Data is the fuel that propels deep learning forward. By constant iteration against huge and growing data sets, learning systems extract the insights that allow them to make optimal choices in the face of ambiguity. Companies with access to substantial data resources will be able to develop better AI solutions more quickly in the contexts provided by that data. Here, as in the AI technology itself, Google is in a class by itself with search history, image and video archives, emails and schedules, location data and the world’s best maps, amongst many other things. Facebook can also claim huge data archives, but its resources are a bit less germane to many of the tasks that may be asked of a digital assistant. Microsoft has good data assets through Bing, Skype, Xbox Live and other consumer facing services, but has its best position in the enterprise, where the transition to Office 365 and the addition of LinkedIn give it a unique picture of life on the job. Amazon’s data on consumers is a bit idiosyncratic – best-in-class for shopping and buying habits, extensive for its 60M+ Prime members, but spotty for non-commerce related topics. Apple is positioned to gather a lot of useful data from its tightly controlled user group, but has vocally opposed collecting it on the basis of user privacy and lacks many of the resources that would be helpful in analyzing it even if it chose to change its policy.
Engagement – The more people that use a company’s services, and the more frequently that they use them, the more chances that company will have to engage consumers with the capabilities of an AI based interface (Exhibit 8). Growing accustomed to the usefulness of the AI virtual assistant in one context may entice them to use it for more. This is Facebook’s calling card, and with more than 1.5B global users on its flagship app and with WhatsApp, Messenger and Instagram all on trajectories to hit the billion user mark on their own, all with industry best marks for frequency of visits and time spent, it will have ample opportunity to build interest in its soon-to-launch M digital assistant. Alphabet, with more than a billion users each for Android, Google Play, Gmail, Chrome, Search, Maps, and YouTube, is similarly blessed with engagement. Apple boasts of a billion iOS and Mac OS devices in use, although there is undoubtedly significant overlap amongst the iPhone, iPad and Mac customer bases. These users are deeply engaged with Apple’s suite of services, including iMessage, Apple Music, iCalendar, iCloud and others. Microsoft boasts of 1.2B people using its Office productivity suite, including 400M using cloud-based Outlook email, on nearly 2B PCs. 350M PCs have upgraded to Windows 10. In contrast, Amazon counts 300M active customers, of whom more than 60M are paying Prime members, presumably with significantly higher than average engagement.
Exh 8: Number of Monthly Active Users by Property
Platform – The platform governs the first interface that a user sees upon opening a device. The platform sets the rules by which other applications can engage users, reserving certain functionality for itself and controlling access to physical elements of the device. 3rd party assistants sit behind the platform, invokable only when the associated app is running. The power of the platform has the potential to entirely neuter or block would-be 3rd party AI virtual assistant challengers, while the platform owner can tightly integrate the functionality of its assistant with the OS. This power is at the center of Apple’s Siri strategy, but it has the potential to be a double-edged sword. If other AI virtual assistants prove to be clearly superior to Siri, an aggressively closed strategy could impair the overall relative value of Apple’s devices – a potentially catastrophic outcome. Google is much less aggressive in asserting its platform power with Android, but nonetheless enjoys significant advantages gained with the integration of Google Now (and soon Assistant) with the platform. Microsoft is even less able to drive bundling with Cortana onto its Windows devices, given the power of corporate buyers, but benefits from a lack of real competition in the enterprise space. Amazon has platform power over the fewer than 50M combined Fire tablets and Echo home speakers that have been sold to date. This is a very modest foundation on which to build Alexa. Finally, Facebook’s ambitions for M will certainly be circumscribed by its lack of a platform, pushing it to emphasize specific use cases tied to its dominant messaging applications.
Ecosystem – The utility of any AI virtual assistant will be heavily dependent upon the range of services to which it has access, and no one company will be self-sufficient. Google, Apple, Microsoft, Amazon and Facebook will negotiate terms with 3rd parties for allowing their assistants to link to data and services contained within apps and websites. Microsoft and Amazon have established open APIs for 3rd parties to allow Cortana and Alexa access to their apps and have actively recruited marquee partners like Spotify and Uber. Google will go a step further with the upcoming Assistant, allowing developers to use the AI’s knowledge of context to trigger automatic notifications and other interactions from the App through the API, an extension to the deep linking capabilities introduced with Android Marshmallow. Apple just announced an API for Siri during its recent WWDC developers conference, but notably continues to exclude applications that compete with its own services, such as Apple Music. Moving forward, engaging 3rd parties productively in ways that enhance the overall functionality of the AI virtual assistant while preserving upside for the partners will likely be needed to move past the tipping point for an interface paradigm shift. 3rd parties will have incentive to cut reasonable deals early and gain share vs. holdouts (Exhibit 9).
Exh 9: Non-AI Requirement Positioning Among Major Virtual Assistants
Google Assistant – The 800 Lb Gorilla
Google has been adding AI fueled functionality to its core products for years. Six years ago, it launched Google Instant, which predicted likely search terms as the user typed, saving an average of 5 seconds per query. Two years later, it introduced Google Now, which delivered information to users without prompting, based on context and prior interests. In 2015, it launched Google Photos, which automatically categorized pictures based on an AI analysis of their content. Recently, Google business CEO Sundar Pichai indicated that deep learning would be inherent in every product and service provided by the company. More than 10% of the company’s 25,000 engineers are proficient in the science of deep learning AI, with more than 10% of the world’s top experts in deep learning on the payroll (Exhibit 10).
Exh 10: Virtual Assistant Summary – Google Now
Assistant is consistent with that vision. Google has leadership in every branch of AI that is important for an AI virtual assistant. It has the machine vision and audio processing capabilities needed for recognition. It is the world leader in natural language processing, with the acknowledged top scholar in the field, Dr. Geoffrey Hinton, leading the effort. Its predictive system analysis capabilities are apparent in its AlphaGo system, which beat the world champion of Go, supposedly the world’s most complex competitive reasoning game. Those scientific chops can chew on massive and matchless data resources that are more directly germane to the needs of a virtual assistant than any competitor can possibly bring to bear.
Assistant is an extension of Google Now, which is already integrated into hundreds of millions of devices that run Android and available to hundreds of millions more who could download the app to their iOS devices. Assistant will run on a connected home speaker, directly rivaling Amazon’s Echo, and will power a new messaging platform, Allo, which aims to take on Facebook on its home turf. Assistant will take queries – spoken, typed or tapped – and provide answers or execute tasks, while also anticipating needs and following users as they move from device to device.
Now is already considered best in class amongst the existing competition, but Assistant is expected to set the bar much higher. We expect that it will deliver superior recognition of natural language commands in a range of languages, and that its proactive functions will be unmatched. It will enhance the functionality of popular Google applications – Maps, Photos, YouTube, Translate, Gmail, etc. For Android users, with calendars, email accounts, and other services integrated directly into Assistant, there will be added benefits – speed, better anticipation, more functions available directly from a command, an ability to deep link directly into 3rd party apps. We believe that Google Assistant will reassert and expand upon the company’s historic position as the primary gateway to the internet, certainly on its own platform and as the primary alternative for all others. This will open up new avenues for monetization, both in enabling new services to consumers but also in providing privileged access to 3rd parties willing to pay for it through advertising or revenue sharing.
Exh 11: Virtual Assistant Summary – Microsoft Cortana
Microsoft Cortana – An Enterprise Tool
Microsoft’s strength in AI is poorly appreciated. It has been at it a long time, spurred through competition in search with Google, and its particular interest in gesture control and natural language processing. While it has 40% fewer AI experts than Alphabet, it has more than twice as many as anybody else, giving it a considerable critical mass of talent, most of it working in areas directly relevant to Cortana (Exhibit 11). Amongst consumers, Cortana gets high marks for its recognition and interpretation of commands, but gets dinged for relatively poor performance in returning specific answers and executing tasks. This is likely due to the inherent weakness in the positioning of its platforms with consumers.
Microsoft may have lost the battle for consumer engagement, but it is the dominant leader in the enterprise. As its customers transition to the cloud-based Office 365, Microsoft can begin providing valuable insight into the workflows in each organization and provide substantial new value through Cortana. Emails automatically informing calendars can automatically reserve meeting rooms, issue meeting invitations, manage travel, and handle a further variety of the administrivia that often hinders actual productivity. Cortana can help employees find and marshal information within the organization, and with the addition of LinkedIn to Dynamics CRM, aid in the building of relationships outside of the organization. For employees whose work schedules take precedence over their family schedules, this may also be a bridge for Cortana into the consumer market.
Exh 12: Virtual Assistant Summary – Apple Siri
Apple Siri – Arming the Walls of the Garden
Despite confident talk, Apple does not have a deep learning development talent base that compares to its rivals. Siri was the product of an acquisition, and its founders left Apple as soon as they were free to do so. All five of the scientists at Apple who have been cited in academic literature at least 1,000 times have come from more recent acquisitions, and given the company’s restrictive and secretive nature, we would not be surprised to see them follow the Siri founders once their lock-ups expire (Exhibit 12).
Still, Apple has made real improvements to Siri since its launch. It is better at recognizing words. It is better at interpreting commands, including follow up commands that require it to retain context from a previous query. It has been trained to control more functions of Apple’s devices and to deliver direct answers to a wider range of questions. At the recent WWDC, Apple announced that Siri would begin to proactively offer content suggestions and reminders, a la Google Now. None of this is cutting edge AI, but Apple has been deft at narrowing the solution set to improve efficacy within it, following its rivals’ innovations as needed, and tempering possible user frustration with pre-programmed humor. Thus, Siri was early enough to tag Apple as an innovator and just good enough not to detract from overall user satisfaction.
The job is likely to get more difficult as competitors step up their game. It would seem unlikely that Apple could go toe to toe with Google on AI functionality – Now is already better on recognition, interpretation and inference, with Assistant likely due to deliver step function improvement. Moreover, Apple’s stand against collecting and using customer data to train and operate its systems, and its insistence on running most of its solution on the device rather than in the cloud, will be further burdens for Siri. Apple’s strategy seems to be to use its strict control of 3rd party apps to hamstring competitive AI virtual assistants looking to run on the iOS platform, while keeping Siri’s functionality close enough to stop Android from establishing a meaningful advantage with Assistant that could threaten the iPhone’s market share or price premium. We see this as a substantial risk.
Amazon Alexa – The First in the Home
Amazon’s Jeff Bezos has made a substantial commitment to deep learning, acquiring AI talent in deals, poaching top thinkers from university posts, and even loosening his previous restrictions on academic publishing and conferences in an effort to make the company more expert friendly. We rate Amazon as the number 5 company in terms of its overall deep learning research team, but its ambition runs higher than that. Alexa, launched on the Amazon Echo connected speaker, was a smart and aggressive move, leveraging streaming music as the killer app for the first viable AI assistant for the home (Exhibit 13).
The Echo combines Alexa’s strong voice recognition AI with excellent long-distance microphones to enable users to voice queries and commands from across the room. Alexa’s interpretation capabilities are pedestrian, with a narrow range of functions and a required syntax that limit both mistakes and overall usefulness. Still, an open API is attracting 3rd parties to the platform that should expand functionality, while Amazon’s development resources should drive significant improvement to Alexa in future iterations.
Amazon’s real problem is the very limited reach provided by its hardware platforms. It has likely sold fewer than 10 million Echoes, and could extend Alexa to the fewer than 50 million Fire tablets that have been sold to date without coming within an order of magnitude of the Android or Apple devices in use. This leaves it vulnerable, having to engage most of its customers via platforms controlled by rivals and, likely, through AI virtual assistants that threaten its direct connection to those users. With Google poised to counter the Echo with its own connected home speaker system loaded with its Assistant solution, and Apple very likely to follow, Amazon’s early mover advantage may be near an end, leaving Alexa as a modest sideline for the company. Given Amazon’s enormous advantages in its e-commerce business and the strength of its Prime program, we do not see the rise of platform-integrated assistants as particularly threatening to Amazon’s core, but highlight the enormous obstacles for any would-be 3rd party AI assistant solution.
Exh 13: Virtual Assistant Summary – Amazon
Facebook M – New Kid on the Block
Facebook’s AI virtual assistant solution, M (in beta tests with customers in Silicon Valley), will not feature speech recognition out of the box. This is almost certainly a concession to reality – Facebook does not control any device platform and can hardly hope to displace either Google Assistant or Siri from their prime position. Rather, Facebook plans to aim its ample AI knowhow at its billion plus messaging users, allowing text queries from Messenger and WhatsApp. At present, Facebook is augmenting its AI interpretation and execution with human concierges that can step in when M’s AI bots are stymied by a query or command. Over time, as the M deep learning solution gains experience, the human touch will be phased out (Exhibit 14).
Like Amazon, Facebook is likely privately frustrated by the disadvantage posed by the lack of a major device platform upon which to hang its AI virtual assistant. Still, the astounding engagement for Facebook’s apps – which may account for as much as a quarter of all time spent online with mobile devices around the world – leaves the company an important point of leverage. This is particularly relevant for emerging markets, where users are less likely to have the premium smartphones able to support all of the bells and whistles of the platform-integrated rivals. There, a practical down-market solution using bandwidth-friendly text input through the already wildly popular messaging platforms could be a big winner. In this, Facebook is mirroring the strategy of China’s Tencent, its only real global messaging competition.
Exh 14: Virtual Assistant Summary – Facebook M
Again like Amazon, Facebook is far too big and important to users for either Google or Apple to elbow away with their platform-integrated AI assistants, but their presence will limit M’s ability to target more than the functionality specifically available within the context of the messaging app itself. Hence Facebook’s focus on commercially funded customer service bots. We would expect Facebook to extend M to its social networking apps as well, perhaps enabling self-managed news feed filters, more accurate media distribution and ad placement, and more powerful tools to help coordinate groups of friends. Still, platform-integrated assistants, poised to become the primary interfaces for devices, carrying control over key functions like payments, navigation, security, calendar, and search, and holding advantages in functions important to Facebook, like photo archiving and messaging, are a real threat to the company’s hegemony.
3rd Party Apps – Losing the Spotlight
Apple’s app model was a massive paradigm shift for accessing the internet. Unlike the desktop browser, which dictated the methodology for finding, viewing and navigating internet content, apps allowed 3rd parties leeway in innovating the way that their products worked and gave them clear mechanisms for generating revenues via upfront payment, subscriptions, in-app purchases and advertising. While the cost was high – 30% to Apple off the top – the freedom was valuable, allowing app publishers to control the basis of their relationships with their customers without pesky search engines offering competitive alternatives.
Fast forward 9 years and the downsides of the app model are becoming more apparent. Smartphone users have app fatigue. 90% of usage is concentrated in a user’s top 6 apps, while 9 of 10 apps that are downloaded are used no more than once before being relegated to the scrap heap. Most users do not download even a single new app in a given month. Rather than opening doors for developers and service providers, the app model is becoming an obstacle for competition – good if you are well positioned on the inside, but very bad indeed if you are not.
Exh 15: Share of Time Spent on Apps by Rank
AI virtual assistants will change things. In 2015, Google introduced deep linking, which enables Search and Google Now to reach inside of apps to find pieces of information and service functionality of interest to users. For apps long ago banished from the home screen to the app drawer, this could be new life, but to the icons enshrined on page one, it is a threat. This year, Google announced Android Instant Apps, which allows it to grab content from apps that have not even been installed and temporarily run just the portion of the app necessary to show the information or provide the service requested. Now layer in Assistant as the primary interface to the device – request a service and Google can find the best solution based on the user’s stated preferences and demonstrated needs, whether or not the user has even downloaded the app to their smartphone.
Popular apps will lose the inertia factor that brings users back for convenience even when the app may not be the best way to find information or complete a task. They will lose brand consciousness that may lead users to believe that the services being rendered are better values than they really are. They will bleed engagement which will limit their ability to market ancillary services and deliver ads. They may lose leverage over suppliers who will find cheaper ways to reach their customers. This will be good for users but bad for popular apps. Long term, this is a serious problem for companies like PCLN, EXPE, TRIP, GRUB, YELP, YHOO, EBAY, and others who have benefited from the app model and could see significant competition enabled by the rise of AI virtual assistants.