Storage: All Roads Lead to the Cloud
SEE LAST PAGE OF THIS REPORT FOR IMPORTANT DISCLOSURES
Paul Sagawa / Artur Pylak
203.901.1633 / .1634
sagawa@ / firstname.lastname@example.org
October 25, 2012
- The massive paradigm shift in the TMT landscape has profound implications for data storage and the companies involved in storage technologies. Web-based applications for both consumers and enterprises are driving a rapid expansion in data. At the same time, the rise of flash storage equipped portable devices is crowding out PCs with more capacious hard drive storage, while enterprises are coping with the storage requirements of their virtualized data centers. We believe that these trends portend a substantial long-term shift to public cloud-based storage solutions based on commodity components and proprietary implementations of open-source software. As such, we expect flash demand to flatten, commodity disk drive demand to reaccelerate, and storage system demand to peak and decline. These changes will also create significant opportunities for cloud data center operators and IT services consultants who can facilitate the transition for enterprises, and for services and applications that can exploit the cost and performance advantages of cloud-based storage.
- PC hard disks and enterprise RAID arrays have dominated storage for 20 years. The first PCs relied on removable “floppy” disks for storage, but soon adopted the more reliable and capacious hard disk as standard. In the enterprise data center, hard disks had been the norm for mass storage since their introduction in the 1960’s, but really flourished with the adoption of RAID (Redundant Array of Independent Disks) techniques, introduced during the late ‘80’s, for combining multiple disk drives into a system with built-in mechanisms for backup storage and failure recovery. This paradigm of hard disk equipped devices supported by self-contained data center storage systems is a linchpin of the client/server architecture that rose in the early ‘90’s and that has been the primary principle behind enterprise computing ever since.
- Data growth, driven by networked applications, is prodigious and expected to continue. In 2011, 437 billion GB of hard disk storage and 2.7 billion GB of flash memory were shipped. Over the next five years, we believe that industry projections of better than 40% annual growth in demand for storage capacity will prove conservative, driven by prodigious growth in the creation of data by users and in the capture of data by enterprises.
- Consumers are shifting from PCs to portable devices, and with that, from local storage to cloud storage. The rapid shift from PCs to smartphones and tablets as the primary means for consumers to access information has catalyzed an acceleration of user created data, and has driven growth in portable and power-friendly flash storage at the expense of disk drives. However, flash capacity is over 11 times more expensive than hard disk – a strong impetus for users to take advantage of cheap, plentiful, and reliable disk storage in the cloud accessible via nearly ubiquitous high speed wireless access and for device flash configurations to level off. Given that cloud operators typically keep four redundant copies of each file in order to assure reliability, this shift should significantly increase demand for disk storage capacity.
- Enterprise data centers are overwhelmed by “big data” and will eventually shift to lower cost, higher performance public cloud solutions for most future storage. McKinsey estimates that enterprises stored 7 billion Gigabytes of new information in 2010, with data growing at a better than 40% annual pace. The RAID systems favored by enterprises are expensive, inefficient and inflexible compared with public cloud “Storage as a Service” or storage bundled with cloud-based applications. Enterprise concerns about data security and transition costs vis a vis the public cloud are easing, and the shift will be cautious but pervasive.
- Massive growth in cloud storage will commoditize storage technologies. The paradigm shift underway is funneling both consumer and enterprise storage demand toward the cloud. Most of this demand will be served by disk drives bought in bulk by cloud operators, assembled to their specs by contract manufacturers, and managed by proprietary software. Meanwhile, integrated storage system vendors, like EMC, IBM, HP and NetApp, will see enterprise data center demand crest and decline. Both cloud and enterprise data centers will make increasing use of flash for specific niche applications where fast storage access is critical, but prices will not threaten disk drives for volume applications for decades.
- Winners: enterprise disk drives and SSD, cloud operators, IT consultants; Losers: storage system vendors, device disk drives and flash. Device disk drives are roughly 75% of industry sales today, making the transition toward portable platforms painful, yet the potential for growth from the cloud is underappreciated. Given slight valuations, even in the face of major industry consolidation, we believe that drive makers and their suppliers are eventual winners from the paradigm shift. Likewise, we see the biggest cloud operators – Google, Amazon, and Microsoft – as broadly advantaged by their scale and skill in delivering on-line storage to both enterprises and consumers. As enterprises navigate the shift to the cloud, we see IT consultants, such as IBM, Accenture, and Rackspace, as playing an important role. On the flip side, the shift to the cloud will be painful for storage system vendors. Flash memory vendors are enjoying an immediate boom from portable device growth, but we fear that price competition and limits to the amount of flash needed per device may yield future disappointments.
To Make a Long Story Short …
The TMT sector is in the midst of a massive, once-in-a-generation paradigm change to portable platforms, based on integrated software platforms with tight control over apps used to access internet-based resources running on distributed cloud data centers via increasingly fast wireless networks. This tectonic plate shift has profound implications for every part of the sector landscape, and storage is no exception. The more than 40% growth in the data being generated each year is getting a boost from devices that connect people wherever and whenever. Yet the portability of these devices precludes shock-sensitive, power-hungry and bulky hard disk drives. Flash memory, the storage of choice, is resilient, power efficient and tiny, but more than 14 times more expensive than disk drives, making it unrealistic to match PCs gigabyte for gigabyte. Enterprise data centers are also awash under the relentless tide of data, upgrading and adding capacity to their integrated storage systems at a madcap pace, yet coping with capital spending pressures that are only getting more intense. For both consumers and enterprises the inevitable destination is “The Cloud”.
The best cloud operators – Google, Amazon and Microsoft – do not buy turn-key storage solutions with built-in failure recovery technology, sophisticated storage management and monitoring capabilities, and fat vendor margins. Rather, they buy commodity disk drive components direct from manufacturers, hire contract manufacturers to install them onto server boards, and write their own software to manage a vast sea of disk drives as one gigantic storage system. Reliability is delivered through greater redundancy, rather than monitoring systems. Thus, cloud data centers buy MORE storage capacity to store data with far greater redundancy than would enterprises or individuals, but buy far less processing hardware and software to manage it, and operate at a scale and utilization that is unattainable for more traditional shops. With access to cloud-based storage growing faster and more nearly ubiquitous, the cost and performance advantages become more and more compelling.
So consumers are getting used to keeping their stuff in the cloud, with automatic back-up, practically unlimited capacity and access from any device, any time, anywhere. Enterprises are dipping their toes in the water with cloud storage back-up and “software as a service” applications, but show every indication of moving resolutely to the cloud for processing and storage intensive applications once their lingering security and transition cost questions are answered.
The implications are severalfold: First, device disk drive demand will fall off as enterprise drive demand accelerates, a shift that we believe will play out more quickly than most estimates given still unrealistic projections of PC unit growth and the robust trajectory of cloud investment. This scenario would yield short-term disappointments but long-term opportunity for drive makers. Second, the implications are the mirror image for flash memory, with short-term growth from the shift to portable devices tapering off as the storage nexus shifts to the cloud, where we believe solid state storage will complement disk storage in niche applications. Third, demand for integrated storage systems will crest and decline as enterprises cautiously transition applications to the public cloud. Finally, the biggest, most sophisticated cloud operators will dominate the market given their substantial cost and performance advantages, while enterprises will increasingly rely on IT consultants to help them manage the transition.
The Old Way
Before the PC, data storage was left to the professionals – computer geeks who minded the notoriously finicky integrated disk drive systems and tape drives in the raised floor cold rooms that used to define high technology in movies of a certain vintage. Data storage for the masses came first in the form of the 5.25 inch “floppy” drives included with the first mass market PCs, which used removable diskettes of up to 1.2MB that were actually floppy, protected only by a thin plastic sleeve and, thus, comically easy to corrupt. In the mid-‘80’s 3.5 inch floppy disks sealed in a rigid plastic case became the popular choice – smaller in size, but better protected and higher in capacity. At the same time, integrated hard disks became ubiquitous in serious PCs, freeing software developers from the constraints of holding application programs and related storage on a single floppy and freeing users from managing a library of applications on disk. Over time, hard disk capacities grew, and floppies went away as a removable media option, replaced by CD-ROMs at first before eventually giving way to USB drives and internet downloads.
Meanwhile, the disk drives in the raised floor computer rooms changed too. Historically, enterprise grade hard disks were very large, spun very fast and were vulnerable to failure, which typically resulted in major system downtime to replace the failed disk unit, reload the data from backup and test for system integrity. In 1986, IBM introduced a concept called “checksum” as part of its S/38 operating system, which simplified the tests of data integrity and shortened the recovery from drive failure. This concept was the precursor to RAID (Redundant Array of Independent Disks) systems, which applied the concept to storage systems built of a larger number of smaller, cheaper disk drives. RAID has many levels, but common to all of them is the scheme to divide and replicate data across multiple physical drives contained in a logical system. RAID systems became a key piece of the client/server architecture which rose to dominance in the 1990’s as a means to support user PCs with more powerful centralized computing resources. These systems, produced by EMC, IBM, NetApp, HP and others, remain the bulwark of enterprise data storage.
Over the past 25 years, the maximum capacity of a hard disk drive has increased at a geometric rate, roughly growing by a factor of 10 every 5 years (Exhibit 1). In a 2011 article published in Science, Martin Hilbert and Priscila Lopez estimated that the digital portion of the world’s information grew from just 3% in 1993 to more than 94% in 2007, and that the total digital information amounted to 295 Exabytes (that’s 295 BILLION gigabytes) in 2007, of which more than half was stored on hard disks (Exhibit 2). IDC estimates that the universe of digital data hit 1.8 Zettabytes in 2011 (that’s 1.8 TRILLION gigabytes), still outpacing the growth in the capacity of individual disk drives. IDC goes on to predict that the total amount of information in the “Digital Universe” will grow to more than 35 Zettabytes by 2020, a nearly 20-fold increase from 2011 and a roughly 40% annual growth rate. This jibes with the growth scenario laid out by Gartner, McKinsey and several other industry analysts, and fits with the better than 40% yearly growth expected for total disk drive capacity shipped through 2015.
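The growth rates quoted above can be cross-checked with a compound annual growth rate calculation. The sketch below is illustrative only: it uses just the endpoints cited in the text (1.8 ZB in 2011, 35 ZB in 2020, and a tenfold rise in maximum drive capacity every 5 years), and the function name `cagr` is our own.

```python
def cagr(start, end, years):
    """Compound annual growth rate, returned as a fraction (0.40 == 40%)."""
    return (end / start) ** (1.0 / years) - 1.0

# IDC endpoints quoted in the text, in zettabytes.
idc_multiple = 35.0 / 1.8
idc_rate = cagr(1.8, 35.0, 2020 - 2011)

# "A factor of 10 every 5 years" for maximum drive capacity.
drive_rate = cagr(1.0, 10.0, 5)

print(f"Digital universe 2011-2020: {idc_multiple:.1f}x, {idc_rate:.1%} per year")
print(f"Max drive capacity: {drive_rate:.1%} per year")
```

The same helper can be applied to any other pair of endpoints in this report to test whether a stated multiple and a stated annual rate agree with each other.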
Exh 1: Hard Drive Maximum Capacity over Time
Exh 2: Optimally compressed storage by media, 1986-2007
This explosion in data comes from more digital devices creating more digital information (e.g. photographs, emails, search histories, location records, transactions, etc.), and accessing more digital information (file downloads, media streaming, social networking, etc.), while enterprises capture more information about their customers and products. The rise of portable platforms, which put users on-line wherever they are and whenever they like, supercharges the creation of data and works against any natural deceleration that might otherwise be expected. This puts a lot of pressure on individuals and enterprises to store and manage the data.
PCs Have Hard Drives, Tablets Don’t
Since the launch of the iPad in April 2010, more than 120M tablets have been sold world-wide, with 2012 tablet unit shipments expected to approach 2/3rds of total consumer PCs and reach parity by the end of 2014 (Exhibit 3). Meanwhile, the worldwide smartphone user base has hit 1 billion according to Strategy Analytics, vastly outstripping the number of homes with PCs. The point is unmistakable – for the vast majority of consumers, portable devices are becoming the primary internet access method, supplanting PCs in the process. From a storage perspective, this has interesting implications, since PCs typically have substantial hard disk drives, while smartphones, tablets and ultrabooks do not. In the short run, this shift has been a boon for flash memory demand and a drag on hard disks, although there is reason to believe that this dynamic will be somewhat short lived.
Exh 3: Tablet versus Home PC Shipments, 2010-2016
Portable devices rely on flash memory for storage. Flash memory is very small, requires little power, offers extremely fast data access and is largely indifferent to physical jostling – all significant advantages relative to hard disk drives in portable devices. On the flip side, flash memory is over 13 times more expensive per gigabyte than disk storage for devices, a price gap that is only expected to narrow to 11 times by 2015 (Exhibit 4). The cost of flash memory is clear to consumers who are asked to pay $100 more for an iPhone with 64GB of storage vs. the 32GB model. iSuppli Research estimates that flash memory makes up 18% of the high end iPhone 5’s bill of materials (Exhibit 5). In comparison, a PC maker could buy a 1 Terabyte laptop disk drive for roughly the same cost. The portable era does not afford the luxury of carrying around all of your songs, photos and videos wherever you go.
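As a rough illustration of the arithmetic in this paragraph, the sketch below works out the retail premium per gigabyte implied by the $100 step from 32GB to 64GB, and the amount of disk capacity that the same component spend would buy at the 13 times and 11 times price multiples quoted above. The variable names are ours, and the calculation assumes the multiples apply uniformly per gigabyte.

```python
# Retail price step between the 32GB and 64GB iPhone configurations.
upgrade_price_usd = 100.0
extra_flash_gb = 64 - 32
retail_premium_per_gb = upgrade_price_usd / extra_flash_gb

# Flash-to-disk price multiples per GB quoted in the text.
multiple_today = 13
multiple_2015 = 11

# Disk capacity purchasable for the cost of 64GB of flash.
flash_gb = 64
disk_gb_today = flash_gb * multiple_today
disk_gb_2015 = flash_gb * multiple_2015

print(f"Retail premium: ${retail_premium_per_gb:.3f} per GB of added flash")
print(f"64GB of flash costs about as much as {disk_gb_today}GB of disk today")
print(f"At the 2015 multiple: {disk_gb_2015}GB of disk")
```

At a multiple of just over 13, the 64GB of flash in a high-end phone buys the better part of a terabyte of disk, which is consistent with the 1 Terabyte comparison above.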
Exh 4: NAND Flash versus Mobile HDD cost per GB
Exh 5: iPhone 5 Bill of Materials Breakdown
The solution is to carry only a small subset of one’s personal data in flash memory and store the rest somewhere else where you can get it when you need it. Increasingly, that somewhere else is the “cloud” (Exhibit 6). Pioneered in the consumer market by Dropbox, internet-based services like Google Drive, Apple iCloud, Amazon Cloud Drive and Microsoft SkyDrive offer to store files in their own distributed data centers, accessible from most web connected devices and automatically backed up to statistically predictable levels of reliability. The price for these services is currently 50-60 cents per gigabyte per year, with discounts available for storage above a 100 GB threshold. In comparison, simple home network attached disk drive systems can be bought for less than 10 cents per gigabyte, but these are not web accessible, require installation, are not backed up and are notoriously prone to failure. As the cost of cloud storage comes down, and as cloud operators begin to include more free capacity along with their devices and cloud applications, we expect the popularity of online storage solutions to accelerate.
Exh 6: Share of Consumer Cloud and Local Storage 2011 v. 2016
At the same time, adoption of cloud storage will reduce the need for flash storage on the device itself. The original iPhone, introduced in 2007, maxed out at 8GB of storage. A 16GB version was introduced 7 months later, and the 3GS took the upper threshold to 32GB about 17 months after that. In comparison, it took an additional 27 months for the iPhone 4S to usher in 64GB, which remains the upper limit a year later. The iPhone 5 holds to the 64GB maximum, which is the same upper limit for the 4th generation flagship iPad just announced yesterday (Exhibit 7). The obvious deceleration in the expansion of top-end flash memory configurations for Apple’s smartphones and tablets is emblematic of user requirements for locally available storage. The rest of it can be stored on disks in the cloud.
Exh 7: Apple Device Memory Component Cost Estimates ($ per GB)
Exh 8: Global Forecast for Network Attached Storage, 2010-16
Beyond the Raised Floor
The demand for enterprise storage systems has been strong – Gartner reported 19% revenue growth in external controller based (ECB) storage systems for 2011 and projects double digit growth for 2012, despite the headwind of double digit price declines – as enterprises have spent to cope with wave after wave of data (Exhibit 8). Storage has risen to 28.4% of total global data center IT systems spending from 24.5% 2 years ago. This has translated to strong growth for storage focused companies like EMC and NetApp, and has been a particular bright spot for broader players like IBM, HP and Oracle. Nonetheless, the pressures of the global economic malaise overhang IT budgets and sales growth has decelerated sharply over the past few quarters. With forecasts of better than 40% annual data growth as far as the eye can see, enterprise IT departments face a challenge to balance future capacity needs with budget realities.
This squeeze is a major opportunity for public cloud operators. The design and operation of a leading edge distributed cloud-based data center offers dramatic cost advantages relative to well run enterprise IT shops. Operators like Google or Amazon do not buy external controller based storage systems with integrated logic and software to monitor and manage storage reliability. Rather, they buy commodity disk drives direct from Seagate and Western Digital, and have them installed onto server boards of their own design by contract manufacturers, eliminating the logic, software, cooling, power supply, physical enclosures and healthy margins added by traditional IT vendors. Storage reliability is managed by higher levels of redundancy and by state-of-the-art infrastructure software developed internally by cloud operators atop open source standards. These distributed data centers operate on a massive scale, allowing economies on hardware, software, communications, real estate, and other capital and operating costs. Superior design allows them to run with dramatically less power, fewer workers and lower maintenance expenses. Finally, the most sophisticated cloud data center operations – Google, Amazon, Facebook, and Microsoft – have been the employers of choice for top computer science talent for years, amassing all-star teams of engineers to design, implement and manage their operations that are far, far beyond the reach of ordinary enterprise IT shops, who typically rely on their vendors or IT consultants for technical advice.
In this context, Forrester estimates that 100 TB of cloud storage could save enterprises more than 70% per year relative to the full cost of equivalent capacity on systems bought and managed in house, with most of the difference coming from the need to buy redundant capacity for reliability and from the cost of the engineers needed to manage and maintain the systems (Exhibit 9). Of course, there are caveats. First, many applications rely on quick access to storage, so moving the data to the cloud is impractical unless the application moves as well. This is a viable solution for many applications, but it adds to the cost and disruption of transitioning to the cloud. Alternatively, the application could be replaced by a cloud-based SaaS solution, in many cases a superior option but one that will require even more disruption in transition. Second, cloud storage vendors typically require that data be defined as objects, each tagged with an identifier with attributes to describe the contents of the object and a security mechanism. While this policy is efficient for operators and secure for customers, it may require that the data and application be modified to accommodate – potentially an expensive proposition.
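To make the second caveat concrete, the sketch below shows one way data might be wrapped as an object carrying an identifier, descriptive attributes and a simple access check. It is a hypothetical illustration, not any vendor's actual API: the class `StoredObject`, the helper `wrap_as_object` and the attribute names are all assumptions made for this example.

```python
import uuid
from dataclasses import dataclass, field

@dataclass
class StoredObject:
    """Hypothetical object record: payload, attributes, and an access list."""
    data: bytes
    attributes: dict
    allowed_readers: set
    # Identifier generated locally for illustration; real services assign their own.
    object_id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def can_read(self, principal: str) -> bool:
        # Minimal stand-in for a security mechanism: an allow-list lookup.
        return principal in self.allowed_readers

def wrap_as_object(data, attributes, allowed_readers):
    """Copy the inputs so later changes by the caller do not alter the object."""
    return StoredObject(data=bytes(data),
                        attributes=dict(attributes),
                        allowed_readers=set(allowed_readers))

invoice = wrap_as_object(b"invoice body",
                         {"content_type": "text/plain", "department": "finance"},
                         {"finance-app"})
print(invoice.object_id, invoice.attributes, invoice.can_read("finance-app"))
```

An application written against files or database rows would need an adapter of roughly this shape for each record type, which is where the modification cost described above arises.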
Exh 9: Costs of 100TB of Storage, Internal versus Cloud
Despite the caveats, the cost savings are an enormous incentive for enterprise IT to begin to offload some of their storage needs to the cloud. The most obvious candidate is archival data, of which there is a lot, but applications that do not require high speed access to structured data, such as office productivity suites, messaging, and shared files, are also likely targets for the cloud. Enterprises will also evaluate opportunities to move entire applications into the cloud, as the savings for processing power are of the same magnitude as those for storage. Where the application moves, the data will move also, with the best candidates being applications with particularly heavy demand on processing and storage capacity, such as “big data” analysis, business intelligence, and back office business processes. Over time, we believe that current concerns, such as high transition costs and data security, will be alleviated, and that only applications that are site specific, such as shop floor control systems, and/or extremely latency sensitive, such as securities trading, will remain steadfast to the in-house data center.
Exh 10: HDD versus SSD Capacity Shipped, 2010-2016
Solid State of Mind
On a pure capacity basis, hard disk drives for all applications are expected to outsell flash memory storage by more than 30 to 1 in 2012, translating to a better than 60% revenue advantage, on average prices that currently run nearly 95% cheaper per GB relative to fully configured solid state drives (Exhibit 10). However, demand for solid state storage is significantly skewed toward devices, with more than 50% of demand from mobile phones and tablets, another 17% from PCs and just 16% of total revenues coming from enterprise storage applications. (The rest of the market is comprised of industrial applications and external storage devices like USB drives). In contrast, 27% of disk drive sales are related to enterprise storage, and around 55% from PCs. (The rest are used for other applications, such as DVRs, gaming consoles and automobiles).
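The relationship between the capacity ratio, the per-gigabyte price discount and the revenue advantage quoted at the start of this paragraph can be sketched as follows. This is a simplified model under our own assumptions: a single average price per gigabyte for each medium, and function names that are purely illustrative.

```python
def revenue_ratio(capacity_ratio, hdd_discount):
    """HDD revenue divided by flash revenue.

    capacity_ratio: HDD gigabytes shipped per flash gigabyte shipped.
    hdd_discount:   fraction by which HDD price per GB is below flash (0.95 == 95%).
    """
    return capacity_ratio * (1.0 - hdd_discount)

def discount_for_ratio(capacity_ratio, target_revenue_ratio):
    """Discount at which a given capacity ratio yields the target revenue ratio."""
    return 1.0 - target_revenue_ratio / capacity_ratio

# Exactly 30:1 on capacity at exactly 95% cheaper per GB.
base_case = revenue_ratio(30, 0.95)

# Discount implied by a 60% revenue advantage (ratio 1.6) at 30:1 on capacity.
implied_discount = discount_for_ratio(30, 1.6)

print(f"Revenue ratio at 30:1 and 95% cheaper: {base_case:.2f}x")
print(f"Discount implied by a 1.6x revenue ratio: {implied_discount:.1%}")
```

The result is sensitive to small changes in the discount: each percentage point of price gap shifts the revenue ratio by 0.3x at a 30:1 capacity ratio, which is why "nearly 95%" and "more than 30 to 1" matter as qualifiers.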
Purely from a device perspective, the industry paradigm shift is good for solid state storage and bad for disk drives, as we expect traditional PC shipments with their associated hard drives to begin a long, slow and inexorable decline in favor of flash equipped devices, like tablets or ultrabooks. Industry forecasts for both disk drive and flash memory demand reflect this transition, which we believe may play out even faster than many expect (Exhibit 11). However, we are not convinced that projections of increasingly large amounts of storage capacity per device will play out. First, we expect that the unit growth from smartphones and tablets will be fastest at the low end of the price scale, where flash configurations are less generous. Second, the growth in storage capacity at the high end has decelerated over time, as evidenced by the maximum configurations for iPhones and iPads. With the emergence of consumer cloud storage services, we do not expect significant expansions in future generations.
Exh 11: PC/device HDD versus NAND Revenue, 2010-2016
There’s no RAID in the Cloud
The shift of enterprise applications and storage to the cloud is bad news for storage system vendors, like EMC, NetApp, IBM, HP, Oracle and others, but good news for commodity disk drive manufacturers. Scale operators, like Google, Amazon and Microsoft, do not buy configured systems with their expensive controllers, management software, maintenance contracts or vendor margins, and as organizations begin to take advantage of the dramatic cost savings from cloud storage, much of the savings comes out of the market addressable by those traditional systems vendors.
However, the change in architecture does not reduce the demands of data growth. In fact, cloud operators tend to rely more heavily on redundancy to achieve storage reliability than does the average enterprise, suggesting that the paradigm shift could accelerate the demand on raw disk drive capacity. Moreover, the growth of consumer oriented cloud applications like social networking, location based services, media streaming, and e-commerce, along with the shift of user-generated data to the cloud sparked by the adoption of flash-based portable devices, is a further catalyst to strong storage demand from these online distributed data centers. We believe this phenomenon will play out faster than forecasts assume, a sweet consolation prize to disk drive makers suffering from the shift away from their bread and butter PC market.
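The effect of redundancy on raw drive demand can be illustrated with a small calculation. The four-copy figure comes from the summary at the top of this report; the RAID 5 comparison, including the eight-drive group size, is our own assumption chosen only to show the contrast, and the function names are illustrative.

```python
def raw_tb_replicated(logical_tb, copies):
    """Raw capacity needed when every file is stored as full copies."""
    return logical_tb * copies

def raw_tb_raid5(logical_tb, drives_per_group):
    """Raw capacity for RAID 5, where one drive's worth per group holds parity."""
    return logical_tb * drives_per_group / (drives_per_group - 1)

logical_tb = 100.0

# Cloud-style replication: four copies of each file, as cited in this report.
cloud_raw = raw_tb_replicated(logical_tb, 4)

# Enterprise-style RAID 5 with an assumed group of 8 drives.
raid_raw = raw_tb_raid5(logical_tb, 8)

print(f"Cloud replication: {cloud_raw:.0f} TB raw for {logical_tb:.0f} TB of data")
print(f"RAID 5 (8 drives): {raid_raw:.1f} TB raw for {logical_tb:.0f} TB of data")
print(f"Raw capacity multiple, cloud vs RAID 5: {cloud_raw / raid_raw:.2f}x")
```

Under these assumptions the same data set consumes several times more raw disk in a replicated cloud architecture than in a parity-based array, before counting any backup copies an enterprise keeps separately.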
Solid State is a Complementary Niche Market in the Data Center, Cloud or Otherwise
The cost of solid state storage is coming down more quickly than the cost of disk drives, but the relative trajectories will not allow flash storage to narrow the pricing gap enough to be competitive on primary enterprise or cloud storage applications for many, many years. Alternative solid state memory technologies in development laboratories are promising, but a decade or more from commercial reality. For the foreseeable future, disk drives will provide the vast majority of storage for data centers.
Exh 12: Data Center Costs of Storage and Disk Types Commonly Used by Application
That said, solid state storage already plays an interesting role in many sophisticated data centers, where the high cost can be justified for its other attributes – e.g. extremely fast access times and very low power usage – and where its technical downsides – e.g. capacity degradation under heavy read/write loads – are not a particular liability (Exhibit 12). For example, Facebook has deployed flash storage from Fusion-io to accelerate access to commonly requested files and images. GigaOM reports that Facebook is developing technology to avoid unreliable areas within flash memory, and thus make use of memory that has been rejected for other uses, conjecturing that lower cost, salvaged flash could be used to archive its massive collection of user posted photos, giving users fast access to infrequently used files.
Even with a severe disk drive shortage caused when flooding in Thailand destroyed factories and disrupted supply lines, more than 80 Exabytes of disk storage was expected to ship for enterprise applications in 2012, vs. less than 2 Exabytes of flash. In 2013, enterprise disk storage is forecast to rebound to more than 150 Exabytes vs. 4 Exabytes of flash. Given the relative sizes of the two media and the considerable price premium for flash, it is hard to characterize enterprise grade solid state storage as directly competitive, even if it can maintain its considerable growth rate (Exhibit 13).
Exh 13: Enterprise HDD versus SSD Revenue, 2010-2016
Exh 14: Hard Disk Drive Revenue Share by Segment, Q212
Despite the near term drag of weak PC sales, we believe that the disk drive industry is well positioned for growth and profitability. We expect disk drive shipments for cloud-based data centers to run well ahead of expectations. Moreover, in 2011, Seagate acquired Samsung’s disk drive operations, while rival Western Digital acquired Hitachi GST, leaving the market with just three remaining suppliers, including Toshiba. After industry capacity was constrained by the Thailand flooding, price competition in the disk drive market has remained orderly, suggesting that the violent cyclicality that has plagued the industry from its birth may be less violent going forward. Given these considerations, we see disk drive makers as significant beneficiaries of the industry paradigm shift, with Seagate particularly well positioned, given its leadership in the enterprise market (Exhibit 14). Key suppliers to the disk drive industry, such as Nidec and TDK, would also likely benefit.
Exh 15: Winners and Losers
In contrast, the flash memory market, which has been in a cycle of constrained capacity as well, is attracting significant investment with competition driving prices lower – NAND flash ASPs dropped nearly 40% this year. Nonetheless, industry analysts are optimistic for 2013, with the belief that a mix shifting from discrete chips to integrated solid state storage solutions embedding memory and controller with a standard interface may allow more differentiation and thus more stable prices. In the context of devices, we are skeptical that pricing can hold to the degree forecast, particularly given our view that device storage capacity may plateau. However, solid state drive products, designed for enterprise applications with additional embedded logic and higher capacities, may prove to grow faster than estimates assume, although it appears that cloud giant Google may be ramping its own initiative to have commodity NAND flash packaged into SSDs of its own design. Should the other cloud operators follow Google’s lead or should the flood of well-funded SSD start-ups raise the competitive temperature, the opportunity to differentiate at the SSD level and hold the line on pricing may prove to be less attractive than analysts currently assume.
On a broader basis, the shifting storage market plays into the hands of the lowest cost cloud operators – Google, Amazon, Microsoft, and potentially, Facebook – and to IT consultants who may be able to facilitate the transition to the cloud for their enterprise clients. On the downside, we are concerned that longer term expectations for enterprise storage systems vendors continue to assume market growth, given the likelihood of an accelerating shift to the cloud.