Long Distance Data Centers: The Hard Part

In a previous post, I described why EMC IT is migrating applications from one of our main Massachusetts data centers to a new facility in North Carolina as part of our company’s journey toward building our own Private Cloud. Simply put, we need more distance to protect critical business systems from really big, region-wide disasters. But doing so also adds complexity and cost. More important, it forces us at EMC to sort out which of the applications we run are truly mission-critical. In other words, which must survive a disaster with every last transaction intact? Which can afford to lose a few minutes, hours, or even a day’s worth of data?

Why hasn’t this come up before? Actually, it did. But the decision was a lot simpler when our data centers were both in the “Metro-West” area outside of Boston. Applications were either disaster-recovery (DR) protected across multiple sites, or they weren’t. If they were, they were protected down to the last transaction.

Now we’re spreading our data centers much farther apart. In order to capture data changes up to the last possible instant, we need to add a “bunker site” within 200 kilometers of each main site. (See this solution guide for a detailed technical explanation and example for Oracle.)
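
To see why distance matters for capturing every last change, a rough back-of-the-envelope sketch may help. It assumes that the bunker site is fed by synchronous replication (each write waits for a remote acknowledgment) and that signals travel through fiber at roughly 200,000 km/s. Both are my assumptions for illustration rather than figures from the solution guide, and real links add switching and protocol overhead on top of this floor.

```python
# Rough lower bound on the delay added to a write that must wait for a
# remote acknowledgment. Assumption: ~200,000 km/s signal speed in fiber.
FIBER_SPEED_KM_PER_S = 200_000.0


def round_trip_ms(distance_km: float) -> float:
    """Minimum round-trip propagation delay, in milliseconds."""
    return 2 * distance_km / FIBER_SPEED_KM_PER_S * 1000


if __name__ == "__main__":
    # Illustrative distances only; they are not actual EMC site distances.
    for km in (50, 200, 1000):
        print(f"{km:>5} km: at least {round_trip_ms(km):.1f} ms added per write")
```

Under these assumptions, 200 kilometers costs each write about 2 ms of waiting before any equipment overhead, and the penalty grows linearly with distance, which is one way to read the need for a nearby bunker.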

But not every application needs last-possible-instant protection. Sorting out which ones truly do, and which truly don’t, will make a huge difference in EMC IT equipment costs and operational expense. Sounds pretty straightforward, right? Each application we designate as not needing maximum protection costs less to operate, so conversations with the applications’ “owners” (the groups within EMC’s businesses that use them) should be equally simple.
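
The sorting exercise itself can be sketched in a few lines. This is a minimal illustration, not how EMC IT actually classifies anything: the tier names, the one-hour and 24-hour cutoffs, and the application names are all hypothetical, chosen only to mirror the "last transaction, minutes, hours, or a day" question raised earlier.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class App:
    name: str
    # How much recent data the business says it can afford to lose.
    tolerable_loss: timedelta


def protection_tier(app: App) -> str:
    """Map tolerable data loss to a protection tier (hypothetical cutoffs)."""
    if app.tolerable_loss == timedelta(0):
        return "last-transaction"   # would need the nearby bunker site
    if app.tolerable_loss <= timedelta(hours=1):
        return "minutes"
    if app.tolerable_loss <= timedelta(hours=24):
        return "hours-to-a-day"
    return "beyond-a-day"


if __name__ == "__main__":
    # Made-up portfolio for illustration.
    portfolio = [
        App("order-entry", timedelta(0)),
        App("internal-wiki", timedelta(hours=12)),
        App("reporting", timedelta(minutes=30)),
    ]
    for app in portfolio:
        print(f"{app.name}: {protection_tier(app)}")
```

The code is the easy part. The hard part is getting each owner to supply an honest value for the tolerable loss.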

Unfortunately, that’s where we hit a potentially nasty snag. At EMC, our IT organization does not use chargeback accounting for company-wide services like disaster recovery. We’ve saved quite a bit doing things this way. But it also means there’s no incentive for an EMC business unit to examine each application looking for savings.

Imagine you’re one of those business unit “owners.” If an application is DR-protected today, then your business obviously considers it mission-critical, right? Why shouldn’t it get the same protection in the new data center? Calculating various risk scenarios would take a non-trivial amount of effort, and the expense of protecting your app portfolio is dwarfed by the company’s aggregate costs. Why not let the other businesses deal with it? I don’t know about you, but I don’t envy the IT person who has to ask our company’s business unit “owners” to sort out their applications like this.

Last week, I met someone who’d just found out she’d been charged with doing exactly that.

I’ll keep you posted on how things go.

As always, I look forward to your thoughtful comments.

About the Author: David Freund