Since I started with Fairwinds back in 1999, we’ve been a Novell shop. Even when we brought in Citrix servers (and therefore Microsoft Server software), we still stayed solidly in the Novell camp. Over the past few years, however, it has become increasingly difficult to find Novell technicians, particularly good ones, that we can contract. Because of this, and our heavy reliance on Citrix and terminal services, we’ve elected to migrate from a Novell environment to a Microsoft environment.
Okay, I know that the Novell guys are screaming right now. I’ve heard it all… yes there are more Microsoft techies out there, and yes, that may be because more are needed to support the systems. The reality is that in my operations, I need support quickly. Where we live, we just can’t get it with Novell. I’ve also suffered numerous issues with trying to tie Novell in to Citrix, and it needs to end. Since Citrix and Microsoft are quite tight, I simply don’t need Novell any more. In addition, my version of Novell is out of date, and to upgrade is going to cost almost as much as putting in MS software anyway. So we bit the bullet on it.
We started on the project two weeks ago, and the main conversion finished at 2AM on Saturday morning. In all, I put in 82 hours of work last week, leaving the office at 2:00 AM on three out of five days. It was an exhausting project.
In addition to the main migration, we also added another layer to this project. I consolidated several servers in my server room, and we are running the whole server system on VMWare ESX 3.5.
Before I talk a little more about what we’ve gone to, here’s what we came from:
- Our Directory Services and File Storage were managed by 2 (physical) Novell 6.5 Small Business Servers in a clustered environment.
- We housed our own email server using Novell Groupwise 6.5, which resided in the cluster
- We had a Citrix farm consisting of two Citrix terminal servers, which were load balanced
- We also had one web access server to leverage Citrix remotely, a backup server, and a separate machine which we used to manage the network
So, in total, I had 7 physical machines running in my server room. Most of our users attach to the network using thin clients, but we have about 9 that use Windows XP terminals.
Now, what we’ve gone to:
We actually have 2 physical servers for VMWare, as well as one “utility” server. And that’s it. So immediately, we’re cutting 4 physical machines out of our network, resulting in a bit of power savings. (I may add back one more server if there are any performance issues, but that remains to be seen.) What’s cool, though, is that I’m actually going to be running more servers at the end, despite having fewer physical machines. Here’s what will be up and running within the VMWare environment:
- Two Active Directory servers (Domain Controller, DHCP & DNS)
- One File & Print server
- One SQL 2005 server
- Two Citrix servers
- One Remote Gateway server
- One Exchange server
In addition to this, we have a pretty skookum box left over. This machine will run our “Virtual Centre”, which we’ll use to manage our ESX cluster, as well as act as our backup server. The long and short of this is that I’ll be running 9 servers on 3 machines, instead of 7 on 7.
The actual migration was a mixture of good and bad, challenges and victories, frustration and elation. I don’t think that any of our end users really appreciate what was involved in this. Not because they don’t appreciate what I do, far from it, but simply because they don’t know just how many decisions need to be made to make everything work. While I worked for the last two weeks with an extremely competent consultant on this, he made recommendations, but every key decision was ultimately mine.
It actually started off well enough. We pulled down a Novell server, slapped in some new drives, and built an ESX server. A few new drives in our SAN (storage area network), and we were off to the races building new servers. My week was a little screwed up with appointments, but by the end of the week we had our first ESX server, and the following virtual servers built: Active Directory, SQL, Exchange and a Citrix server. We also had a virtual system centre built, had captured screen shots of every screen in the build process (for disaster recovery purposes), and had made a great many of the design decisions.
The second week was the one that was destined to be much tougher. My memory of everything could be a little sketchy, as the week was pretty long, but here goes…
This was the week that we needed to move all of our data off the SAN, onto a temporary device, so that we could blow away the Novell formatted drives and get them into NTFS format. I don’t know why, but this ended up being a much bigger issue than we originally thought. After a full day of building another ESX server and other required machines, we shut down our resort at 6:00PM, kicked off the data copying process and went for dinner. Upon returning, though, we found that the tool had worked through a verification, only to find that the verification (before the copy) failed. We fought the issue with this particular product for the rest of the evening, to no satisfactory conclusion. At about 10:00PM, we elected to fall back to plain old ‘Robocopy’ and set it to go. Our intention, originally, had been to move the data to the temporary array and point the users to run on that for Tuesday. As it happened, we couldn’t do that, so we had to let the users run on the original data, electing for a differential copy on Tuesday night.
We also spent that evening doing a P2V (physical to virtual) process to virtualize our main Citrix server. This would allow us to free up the box that would become our stand alone virtual centre. All went well until we started it up in the virtual environment. It was a mess… we spent a couple of hours trying to clean it up, finally got it to where it booted okay. It wouldn’t accept remote connections. After fighting it for an hour, we decided to fail back to the physical machine. It wouldn’t accept connections either. Thank God that Norman (my consultant) knew his stuff. He resolved the issue on the physical machine and we called it a night, going home exhausted and defeated.
The crappy part was that we were both so tired that we left Citrix running on the physical machine. This does display some of the cool features of VMWare, though. We turned off the NIC, started the virtualized Citrix server, changed the name and IP, and set the NIC back to on. This gave us the ability to correct the issue with denied logons and test it, without affecting anyone. So at 2:00PM, we kicked everyone off the physical Citrix server, booted the virtual version, and let them connect. We pulled a drive from the physical machine (I always run a mirrored drive set), set it aside, formatted the primary, and built a physical virtual centre, according to the best practices designed by Norman’s company.
Sometime before dinner (could have been a day before, I can’t remember), we’d also built a specific virtual machine to migrate the Novell Groupwise data into Exchange. After shutting everyone down at 6:00PM again, we did a differential Robocopy, and checked every top level folder from the Novell side against the matching folder on the Microsoft servers. We compared the size, size on disk, number of subfolders and number of files, to ensure that they were the same. Differences were corrected with another Robocopy until we knew we had it right.
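For anyone who would rather script that folder-by-folder check than click through Properties dialogs, here is a rough sketch in Python. It is not what we ran that night; the function names and the paths in the usage note are made up for illustration. It compares total bytes, subfolder count and file count for each top level folder, and it deliberately skips “size on disk”, since that figure depends on how each file system allocates space.

```python
import os
from pathlib import Path


def folder_stats(folder):
    """Return (total bytes, subfolder count, file count) for one folder tree."""
    total_bytes = subfolders = files = 0
    for root, dirnames, filenames in os.walk(folder):
        subfolders += len(dirnames)
        files += len(filenames)
        for name in filenames:
            total_bytes += os.path.getsize(os.path.join(root, name))
    return total_bytes, subfolders, files


def compare_top_level(source_root, dest_root):
    """Compare each top level folder under source_root with its namesake under dest_root.

    Returns a dict of folder name -> (source stats, destination stats) for every
    folder whose stats differ. Destination stats are None if the folder is missing.
    """
    mismatches = {}
    for entry in sorted(Path(source_root).iterdir()):
        if not entry.is_dir():
            continue  # only top level folders are checked, as in the manual process
        src = folder_stats(entry)
        target = Path(dest_root) / entry.name
        dst = folder_stats(target) if target.is_dir() else None
        if src != dst:
            mismatches[entry.name] = (src, dst)
    return mismatches
```

Calling something like `compare_top_level(r"N:\Data", r"\\fileserver\Data")` (both paths hypothetical) returns an empty dict when every top level folder matches. Anything it does return is the list of folders to hit with another Robocopy pass.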
We took the old SAN down, reformatted it, and built it as we wanted for the new environment. A lot of tweaking took place on the size and locations of the Virtual Machines, and their data drives, and we started the process of migrating Groupwise to Exchange. (It would take almost 24 hours.) Somewhere in all that we found time for dinner, and we eventually left the office at 2:00AM, feeling that we were back in a decent place, as Wednesday and Thursday were scheduled to be completely offline for systems access all day.
I am honestly having a hard time remembering what happened on Wednesday. It was another long day, although we ended up calling it quits around 11:00PM. Due to some frustrations, I remember not being quite as far along as we’d have liked, as I was hoping to start bringing users back online on Thursday afternoon.
I spent a lot of time patching PC’s to the latest service packs, and installing Outlook 2007 on them. Joining people to the new domain was also becoming an issue, as we started experiencing problems with our DNS and DHCP servers. We believe that part of it had to do with our migration machine needing to be a domain controller. Group policies didn’t seem to be replicating, and other issues kept popping up.
Then came Thursday. It started with a drive failure in our SAN device as we were re-organizing the drive set. For a bit, we thought we’d lost everything, but it turned out okay after some time, high blood pressure and sweat. That ate up more time that we hadn’t planned on, and we still hadn’t found time to actually install all the required applications on Citrix. Also, our printers became an issue. They just weren’t showing up in the domain and working the way they should. We fought against those issues, and many others, and worked another late night. Before we ended, however, we lost our virtual centre due to another drive failure. Another 2:00AM departure, and again a frustrating one, although we had been able to get one point of sale station up and running.
Friday was kind of the panic day for me. We felt that we’d given ample window to do the work, not counting on O/T, so this sucked. I was able to bring a few people online early, providing they had full PC’s and were patient. They didn’t have printers, but at least they had email and file access. Our priority then became getting our POS stations back up and running, and wouldn’t you know it… another printer issue! We ended up online with the vendor, and finally worked out that our migrated printers (we’d given up and migrated the Citrix printers) did not have proper ownerships set up on them. Even though you could see them in the admin profiles, the published application sessions couldn’t see them. Even more wasted time, but we got the POS guys up and running before lunch.
This left our golf shop point of sale. With the target of being up on Thursday evening, we finally got them back online at 6:00PM, but the printing here was an issue as well. The greatest thing about this piece, though, is that this program is a bit of a monster. In the past we always had to install the app for every single user, and it would only work on one server. If they connected to our other Citrix server, the profile would corrupt, and we had to reinstall the program. Norman, who is a freaking genius, managed to fix that for us. For the first time in our history, we have a golf shop POS that will actually run on either of two Citrix servers, allowing us the redundancy and backup we need.
By the time we got here, with the printing mostly fixed, it was 7:30PM. We broke for dinner, as a celebration of sorts. Things had finally come together for the most part, and we had a functional system. We decided to push through, get the secure gateway back up and running, and I also pulled together some documentation (with screen shots) on how to log in and set up a new profile. The final thing we did was test a VMWare feature at 12:30AM… which didn’t work. 1.5 hours later, after a VMWare support call (for the whole time) with an extremely pleasant fellow in Ireland named Ronan, things were good. We called it a night again at 2:00AM.
All in all, 82 hours for me that week, only slightly less (on-site) for Norman. (I know he’s still been working on some things remotely today.) We migrated over 250,000 email messages and 125GB of data.
The good news though… only two phone calls for support all weekend. I’m sure that they’re being nice to me, but the systems are up and running at this point, which is great.
And the best part of all of this? I now have a SQL server set up, with Analysis Services installed. So in between learning about the nuances of VMWare ESX 3.5, Virtual System Center 2.5, Active Directory, Exchange 2003, etc… I can now cut my teeth on BI and SQL 2005. This is a HUGE step forward for us, as our native systems have very weak BI abilities. I’m very stoked on the ability to start snapping some SQL compatibility into my applications so that I can start generating information from our data.
At any rate, that’s enough for me right now. If you’ve been looking for me on email, hopefully you understand why I haven’t answered.
Oh… and one thing to note… My wife and child didn’t see me for more than an hour all told last week. I’m very lucky that my wife, while not thrilled about it, does understand what I do and why.