Upgrading a VDI vCenter 5.5 on Windows to VCSA 6.5

I recently worked with a customer to upgrade their Horizon VDI environment's Windows vCenter 5.5 server to the vCenter Server Appliance running 6.5.  I knew from an earlier experience that such a migration could be challenging, but I hoped that things would go more smoothly this time, since that old issue predated the introduction of the Migrate option.  This customer also had a smaller, completely isolated DR VDI environment that we could upgrade first to prove out our process.  So, that's what we did!

The migration of the DR environment went without a hitch.  We even spun up about 20 desktops and had a few IT staff log into and use them during the upgrade, so that we could be confident that we'd identify any issues that might impact the users during the production migration.  Everything went great, so we confidently moved forward with the production migration.  You can probably guess what happened next.

Fortunately, we didn't run into any catastrophic issues and the users experienced no downtime.  Unfortunately, we had to roll back our change after our first attempt.  Our core issue was simple enough: this was a small environment with about 500 desktops, but the Migration Assistant detected it as an X-Large environment, meaning that it wanted 24 vCPUs and 48 GB of RAM.  We poked around at it a bit, but decided that we should get Support on the phone before we broke something.

The support engineer was very friendly and walked us through a few processes for cleaning up the vCenter database.  After each one, we tried the migration again, but it kept insisting that we'd need an X-Large vCenter (with a Large storage footprint).  After each failed attempt, we dug deeper and deeper into that database, running SQL queries that I'd never seen before (note: I'm no DBA, so that's not saying much), removing entries and who knows what else.

At the end of the day, since we were unable to get the migration process to detect the correct size, the support engineer approved our request to move forward with the X-Large system for the migration, then shut down the appliance afterward and reduce its compute footprint to something more reasonable (we weren't overly concerned about the storage footprint).  So, we thanked them for their help and moved on with that process ourselves.

The process completed (after quite a long wait), but the appliance that resulted was completely worthless.  We could authenticate to it, but it never managed to load the inventory.  With a nervous feeling, we decided to power the original machine back on... and saw the exact same behavior there.  Good thing we took backups!  Since we knew that our aggressive changes had all been to the SQL database, we restored that DB and then tried again.  Fortunately, that got the source machine back online.  By this point, we were nearing the end of our change window, so we elected to make sure that the environment was stable and come back at it another time.

On try #2, we decided that we'd move forward with the same deploy-X-Large-then-shrink approach, but without breaking our database first this time.  That plan eventually worked just fine, but we came across one minor issue.  When we started up the migration service on the vCenter server, it failed due to a lack of free space.  It said that it needed 17.6 GB, but our drive only had 17.4 GB available.  So close!  We looked around for some easily deletable files but were only able to clear up a couple hundred megs.  While that seemed like it would technically be enough space, we really didn't want to fill up the C:\ drive and so chose to expand our VMDK.
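
As an aside, if you want to check that headroom yourself before kicking off the Migration Assistant, a quick PowerShell one-liner will do it.  This is just a minimal sketch, assuming Windows Server 2012 or newer where the Storage module's Get-Volume cmdlet is available (on older builds, Get-PSDrive reports similar numbers):

    # Report the size and free space of each local volume, in GB.
    Get-Volume | Where-Object DriveLetter |
        Select-Object DriveLetter,
            @{Name = 'SizeGB'; Expression = { [math]::Round($_.Size / 1GB, 1) }},
            @{Name = 'FreeGB'; Expression = { [math]::Round($_.SizeRemaining / 1GB, 1) }} |
        Sort-Object DriveLetter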

We removed our snapshot, added 20 GB of space to that VMDK, then extended the volume in Windows.  Then, we took a new snapshot and resumed our process.  As it went, I kept an eye on our free space, just out of curiosity.  The process took quite a long time to run (about 3 hours), and by the end it had actually consumed about 24 GB on the C:\ drive!  So, long story short, make sure that you've got some extra room on C:\ when performing this upgrade (or point the migration service at another, larger drive)!
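
If you'd rather script that disk expansion than click through it, a rough PowerCLI sketch of the same steps might look like the following.  The server name, the VM name 'vcenter55', and the disk name 'Hard disk 1' are placeholders for whatever your environment actually uses, and the in-guest commands assume Windows Server 2012 or newer:

    # Connect to the environment (vCenter is still up at this point).
    Connect-VIServer -Server vcenter55.example.com

    # Snapshots prevent growing a VMDK, so drop the existing one first.
    Get-VM -Name 'vcenter55' | Get-Snapshot | Remove-Snapshot -Confirm:$false

    # Grow the system disk by 20 GB (CapacityGB is the new total size, not the increment).
    $disk = Get-VM -Name 'vcenter55' | Get-HardDisk -Name 'Hard disk 1'
    Set-HardDisk -HardDisk $disk -CapacityGB ($disk.CapacityGB + 20) -Confirm:$false

    # Take a fresh snapshot before resuming the migration.
    Get-VM -Name 'vcenter55' | New-Snapshot -Name 'Pre-VCSA-migration'

    # The next two commands run inside the Windows guest itself:
    # rescan the disks, then extend C: into the newly added space.
    Update-HostStorageCache
    Resize-Partition -DriveLetter C -Size (Get-PartitionSupportedSize -DriveLetter C).SizeMax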

Fortunately, after that long runtime the process completed successfully!  After it was done, we shut down the VM, reduced it to a "medium" size, then powered it back up.  The VCSA came back up with no problems and, most importantly, Horizon did not lose track of its persistent user-to-desktop mappings, so everything ended up going very smoothly!
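
For anyone repeating this, the shrink itself is just a guest shutdown, a reconfigure, and a power-on, so something along these lines works.  This is a hypothetical sketch: 'vcsa65' and the host name are placeholders, you connect to the ESXi host directly since vCenter itself is the thing going down, and 8 vCPU / 24 GB matches VMware's published "medium" sizing for 6.5:

    # Connect directly to the host running the appliance (vCenter will be offline).
    Connect-VIServer -Server esxi01.example.com

    # Cleanly shut down the guest OS and wait for the VM to power off.
    Shutdown-VMGuest -VM (Get-VM -Name 'vcsa65') -Confirm:$false
    while ((Get-VM -Name 'vcsa65').PowerState -ne 'PoweredOff') { Start-Sleep -Seconds 10 }

    # Drop the compute footprint from X-Large (24 vCPU / 48 GB) to medium (8 vCPU / 24 GB).
    Set-VM -VM (Get-VM -Name 'vcsa65') -NumCpu 8 -MemoryGB 24 -Confirm:$false

    # Power it back on and let the services come up.
    Start-VM -VM (Get-VM -Name 'vcsa65')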
