AI as a Service: Deep Learning is Fundamental
SEE LAST PAGE OF THIS REPORT Paul Sagawa / Artur Pylak
FOR IMPORTANT DISCLOSURES 203.901.1633 /.1634
psagawa@ / firstname.lastname@example.org
October 6, 2016
Deep learning is game-changing technology that will significantly enhance nearly every category of software going forward. However, the development tools for building AIs remain fairly rudimentary – akin to the “machine language” programming that dominated early software development – and the expertise required to create and train complicated learning systems is in desperately short supply. This is a substantial opportunity for the top cloud hosting companies – AMZN, MSFT, and GOOGL – who, along with IBM, are levering their own deep learning expertise to build sophisticated development platforms that will make it easier for less capable organizations to design and train AI systems. To that end, these companies are investing to offer: 1. AI optimized data center capacity; 2. Higher level AI development libraries; 3. Pre-trained modules for widely used elements, like natural language recognition or image identification; 4. Proprietary data sets; and 5. Customized development support. While we expect all 4 companies to find strong demand for these capabilities, MSFT and GOOGL are a bit ahead of IaaS market leader AMZN and may be able to use AI as a differentiator to gain share in the rapidly growing market. IBM’s Watson offers a more bespoke product that is gaining traction with enterprises with more complex needs.
- AI can improve almost any software. Deep learning allows software to find optimal results to ambiguous questions, often hidden in massive data sets. With this tech, applications can better predict future outcomes, can deliver better analytical conclusions, can be personalized to specific users and adjust to specific circumstances, and can communicate more naturally and infer intentions.
- Basic AI tools require scarce expertise. The rise of deep learning is relatively recent, and the programming tools available are granular, requiring special training and long practice to address more than simple problems. This is analogous to programming in the machine language used by computer hardware, prior to the development of more sophisticated compiled languages like Fortran or Cobol. Because of this, and because of the iterative nature of “training” an AI, experienced talent is critical for more complicated systems. This talent is in very short supply and concentrated in a few organizations.
- Top IaaS platforms looking to leverage their AI skills. The leading commercial webscale platform providers – AMZN, MSFT, and GOOGL – are amongst the top 5 global leaders in AI talent, as is IBM, which hopes to use AI to gain better traction in hosting. All of these companies are building deep learning support capabilities for their IaaS platforms, hoping to enable widespread AI development by their less talent rich customers and to offer optimized infrastructure to run the resulting applications. We see five basic areas of investment.
- 1 – AI optimized data centers. Typical processors are optimized for flexibility, able to run a wide array of compute jobs with steady efficiency. In contrast, deep learning systems will run the same basic algorithm through millions of iterations, rewarding processors optimized for speed, such as Nvidia’s Tesla GPU. The big 4 have all deployed Nvidia GPUs in their commercial data centers, with GOOGL transitioning to its internally developed TPU ASIC chips, and MSFT supplementing the GPUs with flexible FPGAs. We wrote about AI hardware recently: http://www.ssrllc.com/publication/37440/
- 2 – AI development tools. In November, GOOGL published its TensorFlow deep learning development libraries as open source software on GitHub. MSFT followed with its Computational Network Tool Kit (CNTK) in January. AMZN added its DSSTNE (pronounced Destiny) tools in May. IBM supports both TensorFlow and CNTK, and has hinted at a future framework of its own. These once proprietary rubrics will compete with unaffiliated AI platforms developed by academic researchers – such as the FB-favored Torch. All of these frameworks aim to improve the efficiency of deep learning development and remove barriers for less experienced programmers. GOOGL, IBM, MSFT and AMZN are all hoping that their tools can become de facto standards that could pull enterprise customers to their platforms. Arguably, GOOGL has a big edge here, with MSFT #2.
- 3 – Pre-trained modules. All of the major AI hosts offer APIs for pre-trained AI functions. GOOGL offers speech-to-text, language translation, image classification, and predictive analytics as services. MSFT provides speech-to-text, translation, image classification, and facial recognition APIs. AMZN has API access to the decision engine behind its product recommendations. IBM, which acquired AI API company AlchemyAPI in 2015, offers more than a dozen machine learning functions as services to its customers, with more specialization than its rivals.
- 4 – Proprietary data sets. GOOGL has taken the lead in providing pre-scrubbed, pre-labeled data sets, offering 8M YouTube videos and 9M image files to train AI models, and has many other valuable data assets – searches, maps/traffic, etc – that could someday be made available, at least on a limited basis. Through acquisition, IBM offers access to proprietary databases containing detailed historical weather data and more than 100M scrubbed patient medical records. Both AMZN and MSFT have valuable data assets, but, thus far, have not moved to offer them to 3rd parties.
- 5 – Customized development support. While all of the AI cloud leaders have worked with major partners on specific AI projects, IBM has made co-developing bespoke solutions with clients a priority. It has helped build analytic systems to solve problems ranging from predicting crime to discovering new drugs. This is an obvious next step for MSFT and a possible one for GOOGL and AMZN.
- Substantial value added atop commodity processing/storage. As basic computing and storage become more commoditized through container technology and management tools, AI will provide a venue for service differentiation, driving not just AI specific revenues but data center volumes as well. Deep learning development is noted for its massive data sets and processing demands, and the resulting production systems may be tuned to a specific hardware platform, making it difficult to multisource.
- MSFT and GOOGL to take IaaS share, IBM with unique niche. MSFT and GOOGL have the most AI expertise and are pursuing platform opportunities as a high priority. We believe both will be able to lever deep learning to take cloud hosting share. AMZN is somewhat behind – it recently hired Alex Smola from the faculty of the CMU computer science department to head up its AWS AI development team. IBM’s more customized approach has its merits, particularly given the relative lack of rivalry in the emerging market. We believe that other companies hoping to use AI as a lever to gain traction in the enterprise cloud will be disappointed, although we see BIDU, BABA and Tencent as formidable players in Asia.
AI, machine learning, deep learning – these are the buzzwords of the day, and nearly every company with even the slightest tech angle claims to be in the middle of it all. Of course, most of that is hype. The tools of deep learning remain rudimentary, AI scientific talent is scarce, and building the complex systems needed to address the biggest problems – i.e. computers that can converse in natural language, vehicles that can drive themselves, models that can predict the weather or diagnose disease, or devices that are uncannily able to anticipate user needs – is far beyond all but a handful of enterprises. Fortunately for the technological hoi polloi, GOOGL, MSFT, AMZN and IBM are looking to bring more capable AI to the masses in 5 ways.
First, the top cloud hosts are providing deep learning optimized hardware on their IaaS platforms. AI systems train and run dramatically faster on processors, like GPUs, narrowly designed to iterate the same computational tasks as quickly as possible. Interested customers can buy time on Nvidia’s Tesla AI processors on AWS, MSFT Azure or IBM Watson Cloud. GOOGL takes it a step further, having designed a custom ASIC processor, specifically for its internally developed TensorFlow deep learning framework. MSFT is augmenting its GPU infrastructure with FPGAs, programmable chips that can be configured more exactly to specific deep learning algorithms for speed and efficiency.
Second, deep learning development tools are still at an early stage – akin to the arcane machine language that early computer programmers used before the advent of efficient compiled languages like Fortran or Cobol. The top AI platforms want to change that, and have invested in deep learning toolkits to help engineers without long experience address more sophisticated problems. GOOGL’s TensorFlow framework is gaining momentum, as is MSFT’s CNTK. AMZN offers DSSTNE, used internally to develop its recommendations engine, and recently poached AI guru Alex Smola from Carnegie Mellon’s computer science department to lead a team developing a more comprehensive framework. IBM has pledged to deliver its own fully realized development environment, but currently supports TensorFlow and CNTK, along with open source tools, like Torch, contributed by academic researchers.
Third, the top AI hosts also offer pre-trained modules as APIs for clients to incorporate in their applications. GOOGL and MSFT have similar sets, supporting general natural language and image recognition tasks. IBM goes further, offering more than a dozen fairly specific AI APIs to its customers. AMZN is a bit behind, offering only access to its recommendations engine core.
Fourth, the AI hosts are beginning to offer proprietary data sets. GOOGL offers access to 8M annotated YouTube videos and 9M labeled image files as fodder for AI training, and has the potential to tap its vast data resources for its hosting clients in the future. Through acquisition, IBM has gained some unique data sets – e.g. 100M scrubbed patient medical records, detailed historical weather data – that are available to Watson customers. Fifth, IBM offers bespoke development support, a service area its rivals have yet to enter.
We believe that these investments offer real differentiation and customer lock-in atop more commodity cloud processing and storage services. In particular, we see MSFT and GOOGL likely to take some market share in the fast growing IaaS market with these capabilities. IBM, with its focus on more customized solutions, has found a lucrative niche, with only MSFT positioned to challenge it. AMZN has some catch-up to do, but has time and committed investment on its side. The Chinese internet giants, in particular BIDU, should be substantial players in their regional sphere.
Exh 1: Timeline of Select Artificial Intelligence Milestones, 1950-Present
Better Software with Deep Learning
Deep learning is an old idea (Exh 1). Back in the 1950s, early computer scientists postulated that an algorithm could mimic the mechanics of a human neuron, reacting to the stimulus of data by adjusting itself so that the next time it confronted similar data, it could return a calculation closer to the optimal result. Over the course of many iterations, the algorithm/neuron would train itself to perform the calculation perfectly. A network of these algorithms, working as a team, could break a larger problem into digestible bits, all informing one another such that the whole system homed in on a more optimal result with time. This was the birth of neural networks, and the elegant concept went exactly nowhere for a long time.
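The self-adjusting neuron described above can be illustrated with a minimal sketch of a perceptron, one of the earliest such algorithms. The task (learning a logical AND), the learning rate and the number of passes over the data are illustrative assumptions of ours rather than anything taken from the exhibits.

```python
# Minimal sketch of one self-adjusting "neuron" (a perceptron).
# The AND task, learning rate and epoch count are illustrative assumptions.

def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum of the inputs exceeds zero, else stay quiet (0)."""
    total = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > 0 else 0


def train(samples, epochs=20, learning_rate=0.5):
    """Show the neuron the data repeatedly, nudging its weights after each mistake."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in samples:
            # error is 0 when the neuron was right, +1 or -1 when it was wrong
            error = target - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Each sample pairs two binary inputs with the desired output (logical AND).
AND_SAMPLES = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

if __name__ == "__main__":
    weights, bias = train(AND_SAMPLES)
    for inputs, target in AND_SAMPLES:
        print(inputs, "->", predict(weights, bias, inputs), "target:", target)
```

A single neuron of this kind can only separate data with a straight line, which is one reason networks of many neurons are needed for harder problems.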
Neural networks required three things – massive amounts of data for training, extraordinary computing resources to run the iterations needed to really get to useful results, and very talented engineers able to design the initial system and tweak it over time to get to the best answers as quickly as possible (Exh 2). The first two were viewed as almost insurmountable obstacles to the concept of deep learning neural networks, and thus, the third became a self-fulfilling prophecy. For many years, machine learning was the province of impractical and stubborn eggheads too fascinated with the elegant concepts to accept the pragmatic guidance of university computer science departments eager to see the most promising students focusing on database architectures or interface design.
Exh 2: Requirements for AI and Neural Networks
However, time changes all things, and the twin obstacles of data and processing capacity have become far less imposing. The rise of the internet and the subsequent explosion of smartphones have taken care of the data, while the advent of hyperscale datacenter architecture – invented by Google out of necessity, when the 1990s paradigm of clustered servers proved inadequate to the task of indexing the entire web – along with the inexorable march of Moore’s Law, has made the necessary computing resources less extraordinary (Exh 3).
Exh 3: Moore’s Law – Processor Transistor Counts Over Time
Now deep learning is suddenly everywhere, and with good reason. Just as nearly any sandwich can be made better with the addition of bacon, software applications are almost inevitably improved by AI. With deep learning, applications can offer more practical analyses and insight, more accurately predict future outcomes and give users better ways to deal with those future outcomes. Learning applications can adjust their interfaces and tailor their functionality to the needs of individual users. These systems can communicate ever more naturally in any human language, and interpret ambiguous information, like sounds and images, with uncanny fidelity. Deep neural networks can be trusted to control messy processes with myriad inputs and decision points, like driving a car, and can anticipate needs or actions – useful for offering user assistance or beating a Go champion. These capabilities make enterprise applications more powerful and efficient, and make consumer applications more personal and useful.
Exh 4: The Basic Deep Learning Reinforcement Learning Algorithm
Building a Lego Deathstar without Instructions
The basic building blocks of deep learning systems – those neural algorithms with their feedback loops – are fairly simple, but the networks needed to address the big problems being solved with deep learning are anything but simple (Exh 4). These systems are comprised of many thousands of algorithms stacked in a dozen or more layers, each layer feeding increasingly integrated conclusions to the layers above it, with feedback loops reinforcing the learning by driving subtle changes in the individual algorithms (Exh 5). The way that those simple building blocks are wired together, how the feedback loops are designed, and importantly, how the system may be adjusted as innumerable iterations across the data begin to reveal insights, is both an art and a science, requiring serious knowledge, experience and talent to make it work.
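As a rough illustration of layers feeding conclusions upward while a feedback loop adjusts the individual algorithms, the sketch below stacks two small layers and trains them by backpropagation. The network size (2-3-1), the XOR-style toy data, the seed, the learning rate and the step count are all assumptions chosen for brevity; production systems are far larger and are built with frameworks such as those discussed later in this note.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def make_network(sizes, seed=0):
    """Build one (weights, biases) pair per layer with small random weights."""
    rng = random.Random(seed)
    return [
        ([[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_out)],
         [0.0] * n_out)
        for n_in, n_out in zip(sizes, sizes[1:])
    ]


def forward(layers, inputs):
    """Pass inputs up through every layer; return the activations of each level."""
    activations = [list(inputs)]
    for weights, biases in layers:
        below = activations[-1]
        activations.append([
            sigmoid(b + sum(w * a for w, a in zip(row, below)))
            for row, b in zip(weights, biases)
        ])
    return activations


def train_step(layers, samples, learning_rate):
    """One pass over the data with backpropagation; returns the pre-update error."""
    grads = [([[0.0] * len(row) for row in w], [0.0] * len(b)) for w, b in layers]
    loss = 0.0
    for inputs, targets in samples:
        acts = forward(layers, inputs)
        out = acts[-1]
        loss += sum((o - t) ** 2 for o, t in zip(out, targets))
        # Error signal at the top layer (derivative of squared error and sigmoid).
        deltas = [2 * (o - t) * o * (1 - o) for o, t in zip(out, targets)]
        for idx in range(len(layers) - 1, -1, -1):
            weights, _ = layers[idx]
            below = acts[idx]
            grad_w, grad_b = grads[idx]
            for j, d in enumerate(deltas):
                grad_b[j] += d
                for i, a in enumerate(below):
                    grad_w[j][i] += d * a
            if idx > 0:
                # Feedback loop: pass the error down to the layer underneath.
                deltas = [
                    a * (1 - a) * sum(weights[j][i] * deltas[j]
                                      for j in range(len(deltas)))
                    for i, a in enumerate(below)
                ]
    n = len(samples)
    for (weights, biases), (grad_w, grad_b) in zip(layers, grads):
        for j in range(len(biases)):
            biases[j] -= learning_rate * grad_b[j] / n
            for i in range(len(weights[j])):
                weights[j][i] -= learning_rate * grad_w[j][i] / n
    return loss / n


# Toy data: output is 1 when exactly one input is 1 (XOR).
XOR_SAMPLES = [((0, 0), (0,)), ((0, 1), (1,)), ((1, 0), (1,)), ((1, 1), (0,))]

if __name__ == "__main__":
    network = make_network([2, 3, 1])
    errors = [train_step(network, XOR_SAMPLES, learning_rate=0.2)
              for _ in range(2000)]
    print("mean squared error, first step:", errors[0])
    print("mean squared error, last step: ", errors[-1])
```

Even at this toy scale, the choices of layer width, learning rate and number of passes are judgment calls, which hints at why larger systems demand experienced designers.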
The state of deep learning development is analogous to the very early days of more traditional computer programming. Instructions for the processor were written in granular machine language and debugging a program required elbow grease and sheer luck. With the introduction of compiled computer languages, like Cobol or Fortran, programming shifted from an esoteric academic exercise to a business necessity, a transition further accelerated with the advent of ever more powerful and easier to use languages over time.
Exh 5: Hierarchy of Deep Neural Networks
Right now, there is no Cobol or Fortran for deep learning – widely taught standard rubrics comprehensive enough to allow relative beginners to be commercially productive in banging out AI code. Deep learning still requires experienced engineers for relatively simple projects and acknowledged gurus, who have been through the process again and again, to lead the difficult ones. These gurus are in very short supply. Yoshua Bengio of IBM and the University of Montreal, one of the handful of scientists on the Mount Rushmore of deep learning technology, has suggested that there are only about 50 individuals on the face of the earth that are truly expert in the field. We have identified 850 scientists in AI who have been cited at least 5,000 times in peer reviewed academic journals. Of these, fewer than 40% work in private industry, and nearly 60% of those work for the top 5 employers – Google, Microsoft, IBM, Facebook and Amazon (Exh 6-7).
In this environment, newly minted deep learning Ph.D.s are fielding seven figure offers, and experienced talent commands even more. Google acquired deep learning think tank DeepMind and its ~70 employees for $500M in 2014, and it would almost certainly command a much higher premium today. Ordinary companies cannot play on this level. Even if they had the massive datasets and hyperscale datacenter infrastructure, building and training deep learning systems at the highest level would be beyond them.
Exh 6: Distribution of Highly Cited (5,000+) AI Scientists by Organization
Exh 7: AI Citation Summary – Tech Companies
Exh 8: Deep Learning Services Offered by the Big 4
Heroes for Hire
Fortunately, ordinary companies can get help. To varying degrees, Google, Microsoft, IBM and Amazon all provide deep learning support as value added services atop their cloud hosting platforms. That support comes in five areas: 1. Infrastructure; 2. Tools; 3. Pre-trained modules; 4. Proprietary datasets; and 5. Bespoke development (Exh 8).
Infrastructure – Not only does training deep learning systems require hyperscale datacenter resources, but those systems also run best on specialized hardware. We wrote about AI hardware recently in the following note: http://www.ssrllc.com/publication/37440/. A few years back, researchers discovered that the iterative nature of machine learning algorithms was much better suited to specialized graphics processing unit (GPU) chips, designed to calculate pixel changes at lightning speed, rather than CPUs, which were designed for computational flexibility. Microsoft, IBM, and Amazon have all made substantial investments to augment their datacenters with GPUs, typically made by Nvidia, and offer GPU capacity to 3rd party customers on their IaaS clouds. Google has taken the further step of designing a custom AI processor chip, called a Tensor Processing Unit (N.B. Tensor is the technical term for the multidimensional data arrays that are analyzed at the core of deep learning systems), tailored specifically to its internal deep learning development framework, TensorFlow. This is only practical because of the huge number of Google internal projects that use the TensorFlow framework, and gives an incremental boost in speed and efficiency vs. the more generic GPU architecture. Google cloud customers that use the increasingly popular rubric can take similar advantage. Microsoft has started to deploy Field Programmable Gate Array (FPGA) chips that can be reprogrammed to the specific and often changing needs of AI systems. Currently, the FPGAs are tuned to internal development, but could be an advantage for Azure customers in the future.
Tools – In the absence of higher level AI programming languages, academic and commercial researchers have built libraries of commonly used algorithms and frameworks for connecting them as tool kits for building deep learning systems. These tool kits increase the efficiency of experienced development teams and provide an easier entry point for engineers new to the field. All of the AI hosts support the most popular frameworks, with Google (TensorFlow) and Microsoft (Computational Network Tool Kit or CNTK) having offered their own powerful internally developed tools as open source software. Google and Microsoft are also able to tune their infrastructure and pre-trained algorithms to their internally developed frameworks. IBM has promised to offer its own comprehensive deep learning development solution, in partnership with Nvidia, but thus far, has focused its support on the open source options, including both TensorFlow and CNTK. Amazon has offered a limited set of subroutines that it developed for its own recommendation engines, called the Distributed Scalable Sparse Tensor Network Engine or DSSTNE (pronounced Destiny), and has poached Alex Smola, an academic rock star previously on the faculty of Carnegie Mellon’s prestigious computer science department, to head development of a more comprehensive leading edge deep learning platform for AWS.
Pre-trained Modules – Some of the functionality enabled by deep learning is broadly applicable to many different applications – basic speech recognition, natural language interpretation, language translation, facial recognition, simple photo labeling, and others. The leading AI hosts have begun to offer these capabilities as modules, accessible on their platforms via application programming interfaces (APIs) that can be integrated into application software. Google and Microsoft offer fairly similar sets of APIs, while IBM, which acquired deep learning API startup AlchemyAPI in 2015, offers a much broader range of more closely defined modules. Amazon is a bit behind here: it offers an API based on its recommendation engine as an analytic tool for hosted customers, but lacks a more comprehensive set.
Proprietary Datasets – Google, with its obviously rich trove of data in a wide variety of categories, has been a leader in offering pre-scrubbed and labeled datasets. Currently, it has made a proprietary database of 8 million YouTube videos that have been annotated with thousands of different descriptors available to developer customers looking to train their deep learning systems. Similarly, it has a database of 9 million still image files with an even finer degree of descriptive annotation. It also hosts many available datasets that have been open sourced by academic researchers. IBM has also moved in this direction, having spent $2.6B on Truven Health to gain access to the health records for more than 200M patients. Earlier this year, it also closed its acquisition of The Weather Company and its proprietary database of detailed US weather data. Thus far, neither Microsoft nor Amazon has moved to offer proprietary data assets, but both host most open source datasets that are in circulation for the use of their customers.
Bespoke Development – Custom deep learning application development resources are at the core of IBM’s Watson business. Big Blue has worked with 3rd party customers as disparate as local police departments and commercial banks to build AI enhanced applications. While the other top AI platform companies are open to major partnerships – Google’s DeepMind unit is currently working with the British National Health Service to improve diagnosis and triage – IBM is the only one that has built bespoke development into a line of business.
Exh 9: Worldwide Cloud Infrastructure Services Forecast, 2014-2020
A Race to the Top
Cloud hosted Infrastructure-as-a-Service (IaaS) is a roughly $22B run rate market world-wide, growing at a better than 34% annual clip. IaaS addresses enterprise datacenter spending for hardware, software, communications, personnel, power, real estate and other costs that currently run to more than $3T on a global basis, offering huge advantages in cost, performance and flexibility. Given the TAM and intrinsic advantages, we believe the IaaS market will easily exceed $100B before decade end, and could top $1T by 2030. Amazon’s AWS is the big kahuna chasing this giant wave, with more than 45% of the total business, nearly quadrupling the share of Microsoft, which follows it at about 12% (Exh 9). Thus far, Google has been more talk than action, with revenues less than a tenth of AWS, but its hire of VMware co-founder Diane Greene to run its enterprise business unit and high profile customer wins at Snapchat and PayPal have made it a relevant part of the discussion. IBM’s SoftLayer-derived IaaS cloud business is an also-ran – we believe that it lacks the scale or sophistication to compete effectively on either price or performance vs. the big three. The worry for all of these players is that as the hosting market matures, customers will be able to move their business from platform to platform to chase better prices, transforming the IaaS market into a commoditized race to the bottom.
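For readers who want to test the market sizing above against their own assumptions, the following is a small compound growth sketch. The $22B base, the 34% growth rate, the $100B and $1T milestones and the approximate AWS and Microsoft shares come from the paragraph above; holding the growth rate constant and the choice of a four year horizon are simplifying assumptions of this sketch, not a forecast.

```python
import math


def project(base, annual_growth, years):
    """Market size after compounding at a constant annual growth rate."""
    return base * (1 + annual_growth) ** years


def years_to_reach(base, target, annual_growth):
    """Years of constant compounding needed to grow from base to target."""
    return math.log(target / base) / math.log(1 + annual_growth)


def required_growth(base, target, years):
    """Constant annual growth rate needed to reach target in the given years."""
    return (target / base) ** (1 / years) - 1


if __name__ == "__main__":
    base_bn = 22.0   # current IaaS run rate, $B (from the text)
    growth = 0.34    # annual growth rate (the text says "better than 34%")
    for target_bn in (100.0, 1000.0):
        years = years_to_reach(base_bn, target_bn, growth)
        print(f"${target_bn:,.0f}B reached after {years:.1f} years at {growth:.0%}")
    # Growth rate needed to hit $100B within an assumed four year horizon.
    print(f"required rate over 4 years: {required_growth(base_bn, 100.0, 4):.0%}")
    # Ratio of the approximate AWS and Microsoft shares cited in the text.
    print(f"AWS share / Microsoft share: {45 / 12:.2f}x")
```

Changing `growth` or the horizon shows how sensitive the milestone dates are to the assumed rate.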
We believe that deep learning support services will be a major growth business for cloud hosts and a significant source of differentiated value. While all four players have a similar baseline in the AI services that they offer – optimized infrastructure and support for open source development tool kits – there are clear differences as well – hardware/software optimized platforms, proprietary APIs, datasets, and customized development support – that will resist commoditization. The nature of deep learning development – massive datasets, millions of iterations – makes AI systems difficult to multisource or move across competing host platforms, and generates huge (and lucrative) compute and storage demands to boot.
The biggest beneficiary of the growth in Deep Learning as a Service will be Google, which, to date, has struggled to play catch up to IaaS behemoth AWS. With acknowledged thought leadership within the AI community, a big lead in overall AI talent, the popular TensorFlow framework with its optimized hardware platform, and unbeatable data assets, the company has the weapons to take market share via a growing demand for deep learning support (Exh 10-11). The big drawback for Google is its relatively poor reach into enterprise IT departments.
Exh 10: Net Plant Property and Equipment, 2010-2015
Exh 11: Capex Spending, 2010-2015
Microsoft has no such problems, with its huge and effective salesforce spread all over the globe. It also brings substantial resources to bear to support AI development, with a well architected infrastructure, a strong tool kit, and a strong number two position in scientific talent. We expect that Azure will appeal to the many, many enterprises already working with Microsoft’s platform software. IBM also can leverage its strong existing relationships. Although we believe its basic cloud platform lacks the scale and sophistication to be cost and performance competitive with AWS, Azure or Google for more traditional applications, its deep learning assets and more hands-on approach to its customers’ AI development needs give it an important leg up. We see Watson as a ray of light in an otherwise dark, dark blue sea.
Finally, Amazon, the dominant leader in IaaS hosting, is playing a bit of catch-up in regard to deep learning support. Yes, customers can buy time on banks of Nvidia GPUs and yes, all of the most widely used development frameworks are supported, but its rivals are doing more to support and attract AI to their platforms. New hire Alex Smola is supposed to reverse this, empowered to define a next generation deep learning development environment, which could include even more powerful and comprehensive development tool kits, better pre-trained API services, and optimized hardware. However, Google, Microsoft and IBM will not be sitting still, and we see AI services as an ongoing opening for these rivals to take modest amounts of market share from AWS over the next few years.
We also note that the triumvirate of Chinese internet leaders, led by BIDU, will almost certainly also play a key role, at least on a regional basis. We believe that these companies have AI talent not easily identified by our methodology, with BIDU likely to be competitive with Amazon or even IBM in terms of its roster of scientific talent.
Exh 12: The SSR AI Heatmap
©2016, SSR LLC, 225 High Ridge Road, Stamford, CT 06905. All rights reserved. The information contained in this report has been obtained from sources believed to be reliable, and its accuracy and completeness is not guaranteed. No representation or warranty, express or implied, is made as to the fairness, accuracy, completeness or correctness of the information and opinions contained herein. The views and other information provided are subject to change without notice. This report is issued without regard to the specific investment objectives, financial situation or particular needs of any specific recipient and is not construed as a solicitation or an offer to buy or sell any securities or related financial instruments. Past performance is not necessarily a guide to future results.