The shrinking middle class of storage. While reviewing my recent performance analysis work, this phrase came to mind to describe a long-standing trend in storage systems. The rise of tiering/caching software, paired with the falling cost of flash media and the increasing capacity density of 7.2K/10K RPM disk drives, has considerably reduced the need for the “middle class”: the small- to medium-capacity 10K/15K disk drive.
However, when considering this trend as it applies to designing a storage system, it is more important than ever to have enough fast media available to handle the size and rate of change of the working set of data. Many issues can come up if an effort isn’t made up front to quantify this, and I wanted to share an example of how this can look in an actual production system.
My focus here applies to the more long-standing enterprise storage systems on the market, such as EMC Symmetrix VMAX, HDS VSP G1000, etc. I’ll leave discussion of how the working set applies to the newer group of all flash and hybrid systems to a future blog post.
Quantifying the Working Set
At the simplest level, the working set in a storage system is the subset of data that generates the majority of I/O activity over time. This is also sometimes referred to as skew, which describes the degree to which I/O activity varies by unit of capacity.
All data has a lifecycle that varies by application. For example, a new sales transaction on a website might comprise many data elements that are frequently accessed for several hours in support of business processes, e.g. using the total dollar amount for a credit card authorization, using item quantities and UPC codes to drive warehouse picking and shipping, using a customer ID to drive a data analytics process to make a promotional offer, etc. After this flurry of activity, I/O activity related to this transaction quickly falls off and new transactions take its place. A storage system design must provide sufficient resources to handle the workload generated by the aggregated working set of data for all applications it is intended to support.
Quantifying the working set can typically be done in two ways. Both methods require performance metrics by time and logical device (LUN/volume), covering both typical processing periods and, if possible, less frequent peak processing such as a month-end close. The simplest method is to use a tool or service provided by the manufacturer, e.g. EMC Tier Advisor or other array-specific sizing tools that can calculate and factor in skew. If a tool isn’t available, which is rare nowadays, the working set can be calculated in a spreadsheet. Note that I’ve had to use the latter method only a couple of times over the past several years, all involving IBM DS8xxx systems using text data extracted from Tivoli Storage Productivity Center (TPC).
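As a rough illustration of the spreadsheet method, the sketch below ranks volumes by I/O density (IOPS per GB) and reports how much of the total capacity generates a given share of the total I/O. The `Volume` record, its field names, and the 80% default are assumptions made for this example, not the output format of TPC or any vendor tool. It also works at whole-volume granularity, whereas array tiering software typically tracks much smaller units.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    """Hypothetical per-volume record built from exported performance data."""
    name: str
    capacity_gb: float
    avg_iops: float


def capacity_fraction_for_io(volumes, io_fraction=0.8):
    """Return the fraction of total capacity that generates `io_fraction` of total I/O."""
    usable = [v for v in volumes if v.capacity_gb > 0]
    total_cap = sum(v.capacity_gb for v in usable)
    total_io = sum(v.avg_iops for v in usable)
    if total_cap <= 0 or total_io <= 0:
        raise ValueError("need positive total capacity and total I/O")

    # Busiest capacity first: sort by I/O density (IOPS per GB), descending.
    ranked = sorted(usable, key=lambda v: v.avg_iops / v.capacity_gb, reverse=True)

    cum_cap = 0.0
    cum_io = 0.0
    for v in ranked:
        cum_cap += v.capacity_gb
        cum_io += v.avg_iops
        if cum_io >= io_fraction * total_io:
            break
    return cum_cap / total_cap
```

A small result (say, 10% of capacity driving 80% of I/O) indicates high skew and a small working set relative to installed capacity; a result near the I/O fraction itself indicates a nearly uniform workload. Running it separately on typical and peak periods shows how much the working set grows at month-end.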
Most enterprise storage systems have the ability to track data at a fairly fine level of granularity and make physical data placement decisions based on utilization over time, e.g. EMC Fully Automated Storage Tiering (FAST) or Hitachi Dynamic Tiering (HDT). Implementations have two or three tiers of physical storage media available and the percentage of installed capacity across these tiers varies depending on the skew calculations/assumptions that were used when sizing.
Most enterprise storage systems also have the ability to use flash media to extend the available capacity for caching data beyond the DRAM in the controllers. Each implementation has a specific theory of operations that defines how appropriate data gets moved into the cache, so review the details to see how much benefit there may be for a given workload. Server-side DRAM/flash caching with a product like PernixData’s FVP is also a very powerful capability.
It is important to note that while tiering and caching algorithms can do a great job of anticipating and reacting to application I/O patterns, no algorithm is perfect. This is the primary reason why synthetic benchmarks must be taken with a grain of salt, but that is another topic! Being conservative is a good rule of thumb.
I recently came across an EMC VMAX 40K with the following complement of disks (includes spares):
33 x 400GB Flash
294 x 300GB/15K RPM
108 x 2TB/7.2K RPM
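To put that drive list in perspective, the short sketch below computes the raw capacity per tier and each tier's share of the total. It is only arithmetic on the counts above: it assumes 2TB means 2,000GB, includes the spares, and ignores RAID protection overhead, so the figures are raw rather than usable capacity.

```python
# Drive complement from the list above: tier -> (drive count, GB per drive).
# Assumes 2TB = 2000GB; counts include spares, so this is raw capacity.
TIERS = {
    "400GB Flash": (33, 400),
    "300GB/15K": (294, 300),
    "2TB/7.2K": (108, 2000),
}


def raw_capacity_by_tier(tiers):
    """Return {tier: (raw_gb, share_of_total_raw)} for each tier."""
    raw = {name: count * size_gb for name, (count, size_gb) in tiers.items()}
    total = sum(raw.values())
    return {name: (gb, gb / total) for name, gb in raw.items()}


if __name__ == "__main__":
    for name, (gb, share) in raw_capacity_by_tier(TIERS).items():
        print(f"{name:12s} {gb:>8,d} GB  {share:6.1%}")
```

By raw capacity this works out to roughly 4% flash, 28% 15K, and 68% 7.2K, so about two thirds of the installed capacity sits on the slowest drives.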
This system had the following component utilization chart for the sample period I observed (6 days at 15-minute intervals):
The bottom two thirds of this utilization chart shows the individual disks. The disks shown in red are the 2TB/7.2K drives, which are consistently over 90% utilized. The disks shown in blue-green are the flash and 15K drives, which are consistently below 40% utilized.
So why the disparity in utilization? Shouldn’t the tiering algorithms figure this out? A quick review of capacity utilization by tier points to the primary cause:
|Pool|Usable GBs|Free GBs|Used GBs|Full %|
While there is plenty of performance headroom in the flash and 15K pools as shown in the utilization chart, there is not enough capacity. This is forcing data segments that would otherwise be placed on faster tiers to fall down to the 7.2K tier and the drives can’t keep up. There are not enough resources to handle the working set in this real-world example and the best remedy is to add to the flash and/or 15K tiers in future upgrades.
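The effect can be illustrated with a deliberately simplified model: place the hottest data on the fastest tier that still has free capacity and see where the I/O lands. This is not how FAST or any vendor's algorithm actually works; the function name, the extent tuples, and the numbers in the usage note are all hypothetical.

```python
def place_by_heat(extents, tiers):
    """Greedy placement sketch (hypothetical, not a vendor algorithm).

    extents: list of (size_gb, iops) tuples.
    tiers:   list of (name, free_gb) tuples, ordered fastest first.
    Returns {tier name: total IOPS that landed on that tier}.
    """
    free = [free_gb for _, free_gb in tiers]
    load = {name: 0.0 for name, _ in tiers}

    # Hottest data first, measured as IOPS per GB.
    for size_gb, iops in sorted(extents, key=lambda e: e[1] / e[0], reverse=True):
        for i, (name, _) in enumerate(tiers):
            if free[i] >= size_gb:
                free[i] -= size_gb
                load[name] += iops
                break
        else:
            raise ValueError("not enough free capacity in any tier")
    return load
```

With extents of `[(100, 5000), (100, 3000), (100, 200)]` and tiers of `[("flash", 100), ("15K", 50), ("7.2K", 1000)]`, the second-hottest extent does not fit in the remaining fast capacity and lands on the 7.2K tier along with the cold data. The slow drives end up carrying 3,200 IOPS instead of 200, even though the fast tiers have performance to spare.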
A little up-front work to figure out how much fast media is needed for a given implementation is critical for success. You can leverage AHEAD’s extensive design experience for your next project.