Saturday, June 17, 2006

The Data Dungeon

This idea came from reading "I, Cringely" of 17 Nov 2005. Cringely says Google is creating a "datacenter in a box" - shipping container, really - "5000 Opteron processors and 3.5 petabytes of disk". That's pretty impressive.

People have been buying servers and building datacenters for years - why should this be exciting? Because it has the potential to lower the cost of the whole datacenter radically - my guess, without doing any calculations, is by a factor of 5 to 10.

I thought about how I'd do it: no fans (they break), which means liquid cooling; no UPS, just direct DC; a single A/C unit; no walk spaces; and no cases for equipment unless they're needed for cooling, RFI or containment. OH&S isn't an issue while it's running - the box is sealed.

And you throw away the key. It's the next logical move from "lights out" or "dark datacenters"...
[You may even weld the doors shut.]

A reasonable technological and physical lifetime, without maintenance, would be 3-4 years.

And vendors like Sun, Dell, IBM and HP could make these things - and either sell or lease them. And being the owners, could

Google has a very special workload - it scales linearly and is generally made up of short transactions that can be rerun.

Normal commercial workloads are things like:
- webserver (Transaction based, restartable, load-balancer friendly)
- database (long-connection time, persistent, need clustering for high-reliability/availability)
- filer/file server (more like a database)
- email - client or server. Both need reliable data storage, but can take restarts.
- and probably way more...

The whole point of a "Data Dungeon" is replicating at the systems level, not the component level...
You don't need hot-swap power supplies, dual NICs, yadda-yadda-yadda, if you have two complete systems that hot-swap.
Commodity hardware is *cheap*. You have to be inventive with your software/systems to design around break-able parts.
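
To put rough numbers on the "two complete systems" argument, here is a minimal sketch. It assumes the two systems fail independently and that failover is instant and perfect, which real systems only approximate. The 99% figure for a single cheap box is made up for illustration, as are the function names.

```python
HOURS_PER_YEAR = 24 * 365


def parallel_availability(single: float, copies: int) -> float:
    """Availability of `copies` independent systems, any one of which can carry the service."""
    if not 0.0 <= single <= 1.0:
        raise ValueError("availability must be between 0 and 1")
    if copies < 1:
        raise ValueError("need at least one copy")
    # The service is down only when every copy is down at the same time.
    return 1.0 - (1.0 - single) ** copies


def downtime_hours_per_year(availability: float) -> float:
    """Expected hours of outage per year for a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


if __name__ == "__main__":
    cheap_box = 0.99  # hypothetical: one commodity system, up 99% of the time
    pair = parallel_availability(cheap_box, 2)
    print(f"one box : {downtime_hours_per_year(cheap_box):.1f} h/year down")
    print(f"two boxes: {downtime_hours_per_year(pair):.2f} h/year down")
```

Under those assumptions the second box buys two extra "nines", which is the sort of gain component-level redundancy is normally bought to deliver.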

And the parts don't all have to be the same - you'd want some really low-power fanless CPUs for some types of service, and enough top-end high-power CPUs in the mix for those times when too much grunt is not enough...
It's not going to be a box full of just the one thing...

So a "Data Dungeon" - would you ever just have ONE? Nope - the breakable design dictates at least two... Which you can stack in a car-space out the back [shipping containers, remember?]. And when it's time to upgrade, wheel in another one or two, mirror the data, migrate the persistent processes and take away the old ones - all done live in prime-time...
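
The order of operations in that upgrade matters, so here is a small sketch of it as a plan generator. The container names and the `rolling_upgrade` function are hypothetical. The only rule it enforces is the "at least two" one: every new container is mirrored and in service before any old one is drained.

```python
MIN_LIVE = 2  # the "at least two" rule from the breakable design


def rolling_upgrade(live, new):
    """Return (ordered steps, containers left in service) for a live swap-out."""
    if len(new) < MIN_LIVE:
        raise ValueError(f"need at least {MIN_LIVE} new containers to retire the old ones")
    steps = []
    in_service = list(live)
    # Bring every new container up first, so capacity only ever grows here.
    for container in new:
        steps.append(f"wheel in {container}")
        steps.append(f"mirror data to {container}")
        in_service.append(container)
    # Only then drain and remove the old ones, one at a time.
    for old in live:
        steps.append(f"migrate persistent processes off {old}")
        in_service.remove(old)
        steps.append(f"take away {old}")
    return steps, in_service


if __name__ == "__main__":
    plan, remaining = rolling_upgrade(["old-1", "old-2"], ["new-1", "new-2"])
    for step in plan:
        print(step)
    print("still in service:", remaining)
```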

Part of the scheme is running everything in Virtual Machines, with only one service to a virtual machine (webserver, email, DB, ...).
That makes it easy to migrate a service onto a different physical processor if you have load or serviceability problems.
[VMware have some neat new Enterprise tools to do this now.]
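
As a toy illustration of one-service-per-VM migration, the sketch below moves VMs off an overloaded host onto whichever other host has the most spare capacity. Everything in it (the `VM` and `Host` classes, the load units, the "smallest VM first" policy) is an assumption made for the example. It is not how VMware's tools work, only a model of the idea.

```python
from dataclasses import dataclass, field


@dataclass
class VM:
    service: str  # exactly one service per VM, e.g. "webserver"
    load: float   # hypothetical load units


@dataclass
class Host:
    name: str
    capacity: float
    vms: list = field(default_factory=list)

    def used(self) -> float:
        return sum(vm.load for vm in self.vms)

    def headroom(self) -> float:
        return self.capacity - self.used()


def migrate(vm: VM, source: Host, target: Host) -> None:
    """Move one VM between hosts, refusing moves that would overload the target."""
    if vm not in source.vms:
        raise ValueError(f"{vm.service} is not running on {source.name}")
    if target.headroom() < vm.load:
        raise ValueError(f"{target.name} has no room for {vm.service}")
    source.vms.remove(vm)
    target.vms.append(vm)


def rebalance(hosts: list) -> list:
    """Drain overloaded hosts, smallest VM first; return the moves made."""
    moves = []
    for source in hosts:
        while source.vms and source.used() > source.capacity:
            vm = min(source.vms, key=lambda v: v.load)
            candidates = [h for h in hosts
                          if h is not source and h.headroom() >= vm.load]
            if not candidates:
                break  # nowhere to put it; leave the host overloaded
            target = max(candidates, key=lambda h: h.headroom())
            migrate(vm, source, target)
            moves.append((vm.service, source.name, target.name))
    return moves


if __name__ == "__main__":
    a = Host("a", 10, [VM("webserver", 6), VM("email", 3), VM("db", 4)])
    b = Host("b", 10, [VM("filer", 2)])
    print(rebalance([a, b]))
```

Because each VM carries a single service, a move is always "this one service goes over there", with no untangling of what else shares the machine.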

And with Mac on Intel, running VMs means you get to run all the major commercial apps:
- all flavours of Windows desktop - via VNC or Citrix remote client to host legacy apps.
- Windows server
- Mac OS X
- z/OS [IBM mainframe - only via an emulator such as Hercules, since z/OS doesn't run natively on Intel]
- Solaris, BSD, and Linux [for those who need a Unix]
