Updated 1/6 13:40 to clarify a point made by email
I thought I’d document (albeit in not too much detail) the delights I discovered this weekend.
Being self-employed, my network is a bit of a test and dev environment – at times I am bleeding edge, running beta code; at others, a bit behind the curve. The latter is true of my production Exchange infrastructure, but as email is so critical, that was OK.
But by now, although Exchange 2007 was working and, on the “if it ain’t broke, don’t fix it” theory, I had left well alone, the upgrade to Exchange 2010 was overdue, and I thought I’d make some progress on it. But…
First off, Exchange 2010 requires that you upgrade your environment to Exchange 2007 SP2 as a minimum before 2010 will install. I’d not done that, largely because finding a period of downtime that would not impact my business email or my wife’s business’s co-located email setup has been tricky for some months, due to the long-term illnesses and now recent deaths of our fathers. But now, having downloaded (again – just to be sure) the modest (!) 800MB+ service pack file, I set to the task (along with Update Rollup 4, the latest for Exchange 2007).
The installation went well, and the SP2 install also applies all the schema changes required for your network to support Exchange 2010, so that would save a step on the 2010 server install. Good. I started with my HUBCAS server (I have a separate MBX server).
However, as soon as I rebooted the server after the update, things went a tad wrong. Several services refused to start – Microsoft Exchange File Distribution, Service Host, Transport and Transport Log Search. Active Directory Topology was experiencing lots of errors, and the whole thing stank. It began to feel like my last problem (Exchange 2007 slow start up fixed), so I quickly checked out the servers and confirmed that no spurious IP addresses existed in DNS or on the Domain Controllers.
So, some further investigation was required. As is my usual practice, I archived and cleared the event logs, and rebooted the server. This gives me a clean event log to check through, and from a clean boot, so I know that the event list I am looking at is directly the result of the issue and not clouded by other stuff (FWIW – for years I have also fully rebooted servers [if possible] before upgrades and archived/cleared the event logs then, so that a) I know the machine boots OK, b) the event logs don’t overflow or show me old rubbish).
So the errors were still piling up, and the services were still not starting. I was seeing events 2114, 2604 and 1014 from the AD Topology, and found these 2 links:
These pages report that, after Update Rollup 5 on Exchange 2007, the certificate revocation checks that the managed code performs during service start-up can time out and cause services to fail to start. That looked promising, as I do have some connectivity issues, being over 6km from the BT exchange for my broadband, so I went down that route. I changed the HOSTS files as suggested for both the IPv4 and IPv6 addresses (127.0.0.1 and ::1), but that had no impact.
So then I reverted the HOSTS changes and instead assigned specific IPv6 addresses to the servers and tried that – no good – again.
By this time (given that I clear logs and reboot between each test) I was starting to get more than a bit annoyed with all the time I was losing. I headed over to eventid.net instead to research some of the event IDs. Eventually… after searching for ages over various event IDs I came across this page: Notes on 2114 error in MSExchangeDSAccess. Joe Richards’ comment intrigued me, so I hopped over to a DC and discovered that my site’s IP range (and its assignment to the site) in Active Directory Sites and Services was missing. SHOCK HORROR. *Updated - however, although all was working well recently, I've no idea (for now) when this disappeared; I'd really like to blame the SP2 AD upgrade :-). If I find out, I'll add more.
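In hindsight, a quick sanity check of server addresses against the subnets defined in Active Directory Sites and Services would have saved me hours. As I understand it, a server whose IP address falls outside every defined subnet cannot reliably work out which AD site it belongs to, which would fit the topology errors I was seeing. Below is a rough Python sketch of that check. It does not query AD: the subnet list is assumed to be copied by hand from Sites and Services, and the server names and addresses are made up for illustration.

```python
import ipaddress


def uncovered_hosts(site_subnets, host_addresses):
    """Return the names of hosts whose IP is not inside any listed subnet.

    site_subnets:   iterable of CIDR strings, typed in by hand from
                    AD Sites and Services (this script does not read AD).
    host_addresses: dict mapping a host name to its IP address string.
    """
    networks = [ipaddress.ip_network(s) for s in site_subnets]
    missing = []
    for name, addr in host_addresses.items():
        ip = ipaddress.ip_address(addr)
        # Only compare like with like (IPv4 against IPv4, IPv6 against IPv6).
        covered = any(
            ip.version == net.version and ip in net for net in networks
        )
        if not covered:
            missing.append(name)
    return sorted(missing)


if __name__ == "__main__":
    # Hypothetical values: one subnet is defined, but the Exchange
    # servers live in a range that has gone missing from the site.
    subnets = ["192.168.10.0/24"]
    hosts = {
        "DC1": "192.168.10.5",
        "HUBCAS": "192.168.20.11",
        "MBX": "192.168.20.12",
    }
    print(uncovered_hosts(subnets, hosts))
```

Any name it prints is a server with no matching subnet, and so a prompt to go and look at Sites and Services before blaming anything else.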
I quickly fixed this (and replicated around the network), but disappointingly the services did not play ball. I restarted DNS everywhere, but no dice. Ever the optimist (!), I took the decision to reboot the DCs and then the Exchange boxes to force everything through (I could not be bothered to stop/start services until it played).
Hallelujah. The services all came up cleanly.
So some 8 or 9 hours later I was finally on track to install an Exchange 2010 server…