10 Things I Learned Shipping an Ancient Data Center to AWS (Part 1)

By John Fahl, IOD Expert
When we got to the consulting gig, they told us they wanted to do DevOps. They said they wanted to be like <insert “awesome streaming video provider” here>. What they had was several data centers filled with servers running operating systems more than 10 years old (some more than 20).
They also relied on some real relics like NIS (yup … this is still a thing …), had deep vendor lock-in, massive, old, stagnant clusters, manually built applications (most created a decade earlier), and armies of contractors overseeing it all.

Despite many hurdles, we managed to succeed with this project and get them to the cloud.
This is not a post about how to migrate your data center. It’s more about how NOT to migrate it. Here are the first 5 of 10 lessons I learned from shipping an ancient data center to AWS. (I’ll share the other five in my next post!)

1. Windows Server 2003, RHEL 5: Seriously, Don’t Put Relics in the Cloud

Why does putting relics in the cloud get special mention? Because it still happens. Would you believe I moved over 200 of them just over a year ago? And that’s leaving out RHEL 4 and 3, Gentoo, Solaris, HP-UX, Windows 2000, and the rest, which I changed on the way since they don’t even run on AWS. Some of them never made the move (and for good reason: they should be retired). Are Windows Server 2003 and RHEL 5 supported on AWS? Well … technically. But should you move them? No way. You want instance issues? Tons of dreaded 1/2 health checks? If you do, go ahead, move the servers your OS vendor no longer supports and extend their lives, but know this:
They will fail. They will have kernel issues. The Citrix PV driver issues in Windows Server 2003 alone will bring you to the brink of madness.
AWS provides tools (Migration Hub and Server Migration Service) to move these old systems, provided you’re in the right region and meet the right conditions, respectively. Otherwise, you’ll have to resort to third-party vendor support. If you must move these servers because of a crufty application that a developer built 15 years ago, do yourself these favors:

  • Upgrade the OS first. I recommend a minimum upgrade to Windows Server 2008 R2 and RHEL 6. They are much more AWS-friendly. The virtualization drivers are better. They are actually supported operating systems.
  • If you can’t upgrade the OS, at least apply the newest patches (just kidding, they don’t get patches anymore) and update the kernel to the latest available version.
  • Install the latest Citrix network drivers. This is extremely important. Without this step, your instances will fall offline often.
  • If you’re lucky enough to be dealing with relics in VMware or Hyper-V, try to use their migration tools to move your servers to a hypervisor in the cloud directly.
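Before you move anything, it helps to sort your inventory against those minimums. Here is a minimal Python sketch of that triage. The `Server` record, the `triage` function, and the three labels are hypothetical names I made up for illustration, and the inventory itself would come from whatever CMDB or spreadsheet you actually have. Windows versions are compared by NT kernel version (Windows 2000 is 5.0, Server 2003 is 5.2, Server 2008 R2 is 6.1).

```python
from dataclasses import dataclass

# Minimum versions recommended above: Windows Server 2008 R2 (NT 6.1) and RHEL 6.
MINIMUMS = {
    "windows": (6, 1),
    "rhel": (6, 0),
}


@dataclass(frozen=True)
class Server:
    name: str
    os_family: str  # e.g. "windows", "rhel", "solaris"
    version: tuple  # (major, minor); NT kernel version for Windows


def triage(server):
    """Return 'migrate', 'upgrade-first', or 'review' for one server."""
    minimum = MINIMUMS.get(server.os_family.lower())
    if minimum is None:
        # OS family not covered by the minimums (Solaris, HP-UX, ...):
        # a human has to decide whether to rebuild or retire it.
        return "review"
    if server.version >= minimum:
        return "migrate"
    return "upgrade-first"


if __name__ == "__main__":
    # Made-up inventory, purely for illustration.
    inventory = [
        Server("legacy-ad01", "windows", (5, 2)),  # Windows Server 2003
        Server("app01", "rhel", (6, 9)),
        Server("sun01", "solaris", (10, 0)),
    ]
    for server in inventory:
        print(server.name, triage(server))
```

Anything that lands in "upgrade-first" or "review" is a candidate for retirement before it is a candidate for migration.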

2. Create a Simple and Clean DNS

The data center I moved had several “sets” of DNS naming schemes representing different eras. It’s 2018 and you’re going to the cloud. Why are you managing DNS? Rather than having CNAMEs reference CNAMEs that reference A records that equate to a poor man’s round robin on an antiquated VIP, just fix the namespace. Moving these problems, or worse, extending them, makes troubleshooting much more difficult.
Create logical Amazon Route 53 namespaces. Push the definitions to your servers through DHCP scope options and/or scripts. As you’re prepping to move your systems to AWS, prep them to switch.

  • Create CNAMEs for your on-premises records that reference your Route 53 A records. At first, have each Route 53 record point to the on-premises IP.
  • When you move a server, make the switch: point its Route 53 A record at the new address in AWS.
  • Reduce the old zones to legacy stubs or pointers.
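To make “the switch” concrete, here is a rough Python sketch that builds the change batch for repointing one A record. The dictionary follows the shape of the Route 53 ChangeResourceRecordSets request; the function name, the 60-second TTL, and the example hostname and address are my own assumptions, not anything from the original project.

```python
import ipaddress


def build_cutover_batch(record_name, new_ip, ttl=60):
    """Build a Route 53 change batch that repoints an A record at new_ip."""
    # Raises ValueError on anything that is not a valid IPv4 address.
    ipaddress.IPv4Address(new_ip)
    return {
        "Comment": f"Cut {record_name} over to AWS",
        "Changes": [
            {
                # UPSERT updates the record if it exists, creates it otherwise.
                "Action": "UPSERT",
                "ResourceRecordSet": {
                    "Name": record_name,
                    "Type": "A",
                    "TTL": ttl,  # keep it short so the cutover takes effect quickly
                    "ResourceRecords": [{"Value": new_ip}],
                },
            }
        ],
    }


if __name__ == "__main__":
    # Hypothetical record and address.
    print(build_cutover_batch("app01.corp.example.com.", "10.20.0.15"))
```

With boto3, a batch like this would be passed as the `ChangeBatch` argument of `change_resource_record_sets`, together with your hosted zone ID. Because the on-premises CNAME already references the Route 53 record, this one change is the whole cutover.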

You don’t want to be in the business of troubleshooting hop after hop, trying to figure out where things are going wrong in DNS. Also, why incur that latency?
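If you want to know how bad the hops already are, you can audit an exported zone before you move it. The sketch below assumes the records have been loaded into a plain dictionary mapping each name to a `(type, value)` pair; that format, the `resolve_chain` name, and the sample records are all made up for illustration.

```python
def resolve_chain(records, name, max_hops=8):
    """Follow CNAMEs from name and return every name visited, in order."""
    chain = [name]
    seen = {name}
    while True:
        rtype, target = records.get(chain[-1], (None, None))
        if rtype != "CNAME":
            # Reached an A record (or a name that is not in the zone): stop.
            return chain
        if target in seen:
            raise ValueError(f"CNAME loop at {target}")
        if len(chain) > max_hops:
            raise ValueError(f"more than {max_hops} hops from {name}")
        chain.append(target)
        seen.add(target)


if __name__ == "__main__":
    # Hypothetical zone: two eras of naming stacked in front of one VIP.
    zone = {
        "app.corp.example.com": ("CNAME", "app-2009.dc1.example.com"),
        "app-2009.dc1.example.com": ("CNAME", "vip7.legacy.example.com"),
        "vip7.legacy.example.com": ("A", "192.0.2.17"),
    }
    for name in zone:
        hops = len(resolve_chain(zone, name)) - 1
        if hops > 1:
            print(f"{name}: {hops} CNAME hops, flatten this")
```

Any name that needs more than one hop to reach an A record is a good candidate for flattening into a single Route 53 record.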

3. Make File Services Simple or Go S3

Samba, CIFS, NFS, DFS(-R), CFS, Gluster, oh my … are you ready for it? Moving your file services to AWS isn’t hard; moving them so that they match the way you’ve managed them on-premises can be, though. In fact, it’s a big mistake.
Unless you’re leveraging NetApp with their ONTAP Cloud offering, you need to seriously reconsider how your applications/users access data. If you can’t move your data to Amazon S3 (typically requiring a refactor in your application), or in very specific cases, Amazon EFS, then you’ll find yourself in the quagmire of slogging data into a file server or third-party offering.
Amazon does not have the on-premises data center concept of shared storage. Whatever you do, don’t play the game of setting up “cloud-like” HA for file services: you will be sorry. By sorry, I mean when replication or reconciliation fails (as happened to me with millions of files, many times), you will want to delete all the files in a fit of rage.
Instead, do the following:

  • Migrate your data with rsync and/or Robocopy to a single server.
  • Use a tool like CPM to take frequent snapshots for quick recovery.
  • For servers, you don’t need to back everything up like you did on-premises. Back up snowflakes, data servers, and your other pets. If you’re simply moving pets to their new pen in the cloud (i.e., you need to back up everything), I’m sorry.
  • Whatever you do, keep it simple. File services can become problematic if you’re managing 50TB for a heterogeneous environment.
    • Consider this an interim step to buy yourself time to update your application to use S3. (Guess what, NetApp Cloud Sync can move that data for you.)
  • Remember, how you managed data in your onsite file server cluster is not cloud-like, not even a little bit. Abandon that model, because after you work through a replication failure and try to reconcile 150M files in a backlog burndown, you will know that you’ve passed through the gate and there is no return.
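For the rsync step, a small wrapper that defaults to a dry run saves a lot of grief. This is only a sketch: the function name, the choice of flags, and the example paths and hostname are my assumptions, so adjust them to your environment. The flags themselves are standard rsync options.

```python
import subprocess


def build_rsync_command(source, destination, dry_run=True):
    """Return the rsync argument list for a one-way copy to a single server."""
    cmd = [
        "rsync",
        "--archive",          # preserve permissions, times, owners, symlinks
        "--hard-links",       # keep hard links as hard links
        "--numeric-ids",      # don't remap UIDs/GIDs by name on the target
        "--partial",          # keep partially transferred files for resuming
        "--itemize-changes",  # log what changed for each file
    ]
    if dry_run:
        cmd.append("--dry-run")  # report what would be copied, copy nothing
    # A trailing slash copies the directory's contents rather than the
    # directory itself, which avoids a surprise extra level on the target.
    cmd += [source.rstrip("/") + "/", destination]
    return cmd


if __name__ == "__main__":
    # Hypothetical paths and hostname.
    command = build_rsync_command("/srv/share", "fileserver01:/data/share")
    print(" ".join(command))
    # After reviewing the dry run, rebuild with dry_run=False and execute:
    # subprocess.run(build_rsync_command(..., dry_run=False), check=True)
```

One source, one destination, one direction. Run it once for the bulk copy, then again at cutover to pick up the changes since.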

4. Don’t Subscribe to Tech Debt

Nobody likes tech debt. It’s the mountain of misery that chases you, smothers you, and many of us believe it can’t be defeated. On the contrary, tech debt can be controlled:

  • Train (personnel) first, plan (your project) second, build third. If you do it in any other order (I’ve done so, twice) you will be subscribing to tons of tech debt.
  • I’ve seen “Ready, Fire, Aim” used a lot lately in social media. These people must have never moved to the cloud, or at least, didn’t have to do the technical work. In short, don’t rush it.
  • If you are moving your ancient servers to the cloud, come up with the steps to decommission them in favor of the next-gen solution before you move them. Don’t put it off until next quarter, or next year, or they will live on in expensive infamy like a failed venture capitalist meme.
  • If and when you start migration conversations with, “to do this the right way …” stop right there and take note of the right way to do your move. Realize any other decision (based on time, money, resources) means it will cost you more of any or all of those things in the long run. 
  • If you are shooting for CI/CD, DevOps, all the cool stuff, stop what you’re doing and do not move your data center. Build for cloud, migrate the services, and then decom your data center as you make progress. Does it sound slow and expensive? Wrong. It’s actually the cheapest solution.

5. Your Old Vendors Aren’t Always the Best

You may have a 10-year relationship with your backup or database vendor. You might rely on these solutions as lifeblood. You may be years deep in a storage contract. But you’re moving to the cloud. You will make new partnerships and be on the hook for new objectives. I’ll say it again: managing your environment in the cloud is not the same as managing it on-premises.
Once you have your tech debt-minimized road map set for the cloud, seek out new vendors and solutions that will help you get there and operate there. Don’t call up your old buddies at Vendor X who have floated your data center with sweet line items on a contract that requires a Master’s degree to understand. Do your research, evaluate, and trust your engineers to find the right solutions. Not doing so will slow your progress, and it will keep you embedded in those large legacy contracts. If it isn’t a button click on the AWS Marketplace and a pay-by-the-hour model, then be certain you really want it. Lock-in is for suckers.


Summary

This is the first half of the best advice I can give those making the move to cloud. This can be a difficult endeavor, but it will be much less painful if you abandon the rules you’ve followed for the last few decades. It is time to embrace new technologies and meet the cloud head on. Do it right the first time.
