A big exciting day! – click on the link to subscribe to one of the sessions…
Earlier I posted on how EMC has been tracking the host attach rate to our systems, and how 2 years ago, in the open-systems category, ESX hosts started to just ramp up furiously. That had the engineers thinking about what would characterize a storage subsystem for a 100% virtualized datacenter. Over the intervening 2 years, they’ve been off working on the next generation. The CX4 was the first to incorporate some of this thinking (large scale multicore, 64-bit transition, ability to change I/O interfaces non-disruptively via Ultraflex I/O, Virtual Provisioning), the Celerra updates came next (similar hardware considerations including the modular Ultraflex I/O design, deduplication, vCenter plugins), and now the 3rd shoe drops.
Readers – you know that I’m a big fan of small, and home-brew, and using Iomega for VMware storage – that is EMC. So is this – just at the other extreme.
So – what characterizes a “Storage for a Private Cloud”? Well – it needs to be:
- Efficient – for any given workload, it needs to be as efficient as possible. For some, that means capacity-saving techniques like Virtual Provisioning. For others, it means IO-saving techniques like EFD and large caches.
- Control – the ability to have vCenter-integrated control (which EMC covers end-to-end – EMC Storage Viewer for the VMware admin, ControlCenter being VMware Ready for the storage admin, and the new provisioning model on the Symmetrix V-Max has a built-in facility for “VMware Cluster-wide provisioning” all at once), and control at the hyper-scale of the VMware environments we’re seeing within datacenters and at service providers. It has to deliver the control of 24 x forever operation.
- Choice – the choice to have a common infrastructure literally scale to any workload, any app, any use case, and across geographic boundaries. You need to be able to literally change ANY attribute (software OR hardware configuration – from layout to brains, to IO) totally non-disruptively. In the mid-range, that means a certain scaling factor, but in the high-end – it means a different scaling factor.
First – a quick summary of Symmetrix V-Max (and I’ll leave a lot of detail to Barry – I’m more interested in the VMware related elements). It’s a new product – which can be summarized as 2-3 times faster/larger/stronger than the previous generation – which normally would be enough for any new product announcement. But it’s also a new architectural design, one that ties together things we’ve been working on for a long time, and one we think will redefine storage-land for a while.
A V-Max Engine is the building block of a Symmetrix V-Max config. It has 8 Intel x86-64 cores, 16 high density front end ports (which can be hot-swapped – note the theme of anything can be changed non-disruptively) and 128GB of memory. The “Global Memory” thing is important for reasons that will be clear in a second… bear with me. You can take these V-Max engines and start with one and scale out. How far out? Let’s use the “out of the gate” configuration. Here – with 8 V-Max engines, we now have an array that has:
- hundreds of high performance cores
- hundreds of ports
- TBs of cache
- supports thousands of spindles
If you’re curious what this would look like – check it out… Let me explain the Virtual Matrix design and that global memory design. Literally, the memory of these engines is all one global memory address space, and all nodes can read and write to that common memory – supported by the virtual matrix: EMC-developed ASICs which handle memory access, and a very high-speed, low-latency interconnect. But the architecture has been designed to scale to 256 engines, and the virtual matrix is designed to span geographic distances. Here – now, across geographically dispersed datacenters, you have:
- thousands of high performance cores
- thousands of ports
- hundreds of TBs of cache
- hundreds of Petabytes of storage
- tens of millions of IOPs (something has to feed the 6 Million IOP vSphere cluster – right?)
The next element is the idea of FAST (Fully Automated Storage Tiering), which can tier within the array (which, remember, is now modular and can be geographically dispersed) completely transparently.
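To make the tiering idea a bit more concrete, here’s a minimal sketch in Python. To be clear, this is not how FAST is implemented – it’s a toy greedy placement, and the `Extent` type, the `place_extents` function, the two tier names (`fast` and `capacity`) and the sample numbers are all made up for illustration. The only idea it borrows is the general one: rank data by how hot it is, and put the hottest data on the fastest tier until that tier is full.

```python
from dataclasses import dataclass


@dataclass
class Extent:
    """A chunk of data the array could move between tiers (hypothetical model)."""
    name: str
    size_gb: int   # must be > 0
    iops: float    # recently observed I/O rate (illustrative metric)


def place_extents(extents, fast_capacity_gb):
    """Greedy placement: hottest extents (IOPS per GB) go to the fast tier
    until it is full; everything else lands on the capacity tier."""
    placement = {}
    remaining = fast_capacity_gb
    # Rank by I/O density, hottest first.
    for ext in sorted(extents, key=lambda e: e.iops / e.size_gb, reverse=True):
        if ext.size_gb <= remaining:
            placement[ext.name] = "fast"
            remaining -= ext.size_gb
        else:
            placement[ext.name] = "capacity"
    return placement


# Illustrative workload, not measured data.
extents = [
    Extent("db-log", 10, 5000.0),
    Extent("db-data", 100, 8000.0),
    Extent("file-archive", 500, 200.0),
    Extent("vm-boot", 20, 400.0),
]
plan = place_extents(extents, fast_capacity_gb=50)
for name, tier in plan.items():
    print(name, "->", tier)
```

A real array would re-evaluate this continuously as workloads change and would have to account for the cost of moving data – the sketch only shows a single placement decision.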
Ok – take a look at this slide from the VMware/Cisco/EMC keynote at VMworld 2009. Now I can start to apply the decoder ring to what we were talking about (BTW – we continue to do these – I’m doing one on Wednesday with Cisco at VMware Partner Exchange – with each event, we’re going to keep a 1-2 year window on what we’re talking about, so now you can see how literal we are, you can apply the same filter for subsequent events).
- The Moore’s law point meant that it was clear – our focus on Intel and x86 in the mid-range was the right bet, and was becoming a no-brainer for Symmetrix. On its surface, it’s the smallest part of the Symmetrix V-Max launch to a customer, but it’s important. It’s one of the reasons we’re able to get 2-3x more in every possible vector (performance, number of objects/replicas). It’s also one of the important ingredients as we continue to scale up in the current generation and the future generations and contain cost at the same time – the Xeon roadmap isn’t slowing down right now, it’s accelerating, and VMware and hyper-consolidated arrays are two of the few things that can chew up all those cycles. BTW – this was a three way presentation, and we were all trying to show “where we are thinking” – note that for Cisco, “Processors/Servers designed for VMware” clearly meant UCS.
- Something to consider… Purely from an engineering feat standpoint – think about this: the Symmetrix engineering team managed to completely change the hardware layer supporting the platform (including a full instruction set change and moving from a direct matrix to a virtual matrix model) while keeping DMX 4 the clear technology/market leader at the high end during the multi-year effort, AND maintaining full look/feel/feature consistency in the Symmetrix V-Max. That’s quite an engineering achievement. For you VMware readers, that’s analogous to VI3.5 going to vSphere while going from PowerPC to x86 – obviously not the case, but it makes you think about the software engineering challenge.
- “Every App that can be a VM should be (and will be!) a VM” – we’re hammering this (because it’s true, and when understood it’s a key to datacenter transformation) – you can literally virtualize almost every x86 workload. Are there exceptions? Sure – but they aren’t worth arguing over. And with VMware’s own shoe to drop, the bar will clearly be raised again. This means that all the infrastructure to support it needs to be able to scale to that sort of workload. Sure, it’s not for everyone, but it is for those of you that say “yes, I need a configuration with thousands of spindles, and hundreds of storage engines, and hundreds of ports – and I need to start small but be able to scale up to that with one point of management”.
- “DRS for Storage and Network” and “Aggregate workloads demand I/O QoS” – now it’s clear what we’re talking about: the storage array will be able to move the elements that compose VMs (even parts of VMs!) between tiers and between configurations, and apply storage-engine QoS mechanisms completely dynamically and automatically. FAST (Fully Automated Storage Tiering) is huge – particularly in the coming world with two very divergent tiers (hyper-fast EFD and hyper-large SATA). Now, I’ve talked about this in a few places before, but the picture is clearer now – this will work without any external policy engine (working to optimize with no external input), but this will come to full fruition with vApp metadata and vStorage APIs in the vSphere roadmap. For example, you may be OK with virtual machine A getting lower performance than virtual machine B – so “balance all available resources” is good, but not as good as “prioritize this VM to deliver these performance characteristics”.
- “Technology must contain cost” – efficiency needs to operate at every level – Virtual Provisioning (be thin and fluid), hit capacity goals as effectively as possible (for some workloads, dedupe, for others SATA, for others compression), hit IOps goals as effectively as possible (large caches – but more than anything EFD is changing the game here) – but there’s another factor: Consolidation and Management.
Think about that…. Hyper scale, hyper consolidated environments like the idea of:
- A single vSphere cluster that consists of 64 ESX hosts, with 4096 cores, 32TB of RAM, and 6 Million IOPs – the largest thing of its kind, and all centrally managed as an aggregate pool
- A single Cisco UCS design can have 320 blades with 2590 cores, 62TB of RAM – the largest unified compute system of its kind, and all centrally managed as an aggregate pool
- A single Symmetrix V-Max that can have thousands of cores, thousands of ports, hundreds of TB of memory, hundreds of PB of storage – the largest hyper-consolidated single array (that can span sites) – that can be managed as an aggregate pool.
Even better – through vStorage and vNetwork integration, increasingly these can be managed as a unit (as can anyone that designs against these philosophies and APIs).
For those of you that aren’t Symmetrix V-Max scale but like what you see….
EMC spent $1.8B last year in R&D – more than the rest of our industry put together. That’s not to disparage any others. Heck, frankly, many projects are easier when you’re small! What it does mean is that you may see good cross-pollination here, and it’s a two-way street. Things like FAST, or the dedupe/compression techniques on Celerra, or the ____ on CLARiiON (wait for it!) all cross-pollinate.
Like I said when I started - an exciting new day!