Sunday story - Strange AD/DNS issue

Discussion in 'Software' started by LukeP, Mar 20, 2011.

  1. LukeP

    LukeP Gigabyte Poster

    Just thought I'd share what happened to me today.

    I went in to work this morning to roll out Forefront TMG and reconnect our internet connections in a load-balancing/failover configuration. This went fine and was done and dusted in about 1-2 hours.

    While I was there, I thought I'd check whether our UPS shutdown procedures still work as expected.

    Just to give you an idea, we have a 2-host Hyper-V failover cluster with some virtual machines on it. We also have other servers, some of which run Hyper-V while others don't. In total we have 3 domain controllers (1 of which runs on the cluster). The point is that the cluster needs a domain controller to be available before it can start up.

    The way it's set up, the domain controllers send a Wake-on-LAN request to both cluster hosts once they're up.
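
For anyone who hasn't scripted this before, here is a minimal Python sketch of sending a Wake-on-LAN magic packet. It is not the script used in this setup (the post doesn't say what tool sends the request). The MAC address, broadcast address and UDP port 9 are placeholder assumptions. The packet layout itself is the standard one: 6 bytes of 0xFF followed by the target MAC repeated 16 times.

```python
import socket


def build_magic_packet(mac: str) -> bytes:
    """Return the 102-byte Wake-on-LAN magic packet for a MAC address."""
    hex_digits = mac.replace(":", "").replace("-", "")
    if len(hex_digits) != 12:
        raise ValueError(f"expected 12 hex digits in MAC, got {mac!r}")
    mac_bytes = bytes.fromhex(hex_digits)  # raises ValueError on non-hex input
    # 6 x 0xFF, then the MAC repeated 16 times
    return b"\xff" * 6 + mac_bytes * 16


def send_wol(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Broadcast the magic packet over UDP (address and port are assumptions)."""
    packet = build_magic_packet(mac)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        sock.sendto(packet, (broadcast, port))
```

Calling `send_wol("00:11:22:33:44:55")` once per cluster host (with the real MAC addresses) after the domain controller has finished booting would cover the step described above.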

    This worked great when I last tested it (about 3-6 months ago). I'm not sure what has changed since then: nothing in our environment apart from updates and, recently, SP1 (all our servers run Windows Server 2008 R2).

    I have to say I was surprised when the cluster didn't come up after the power cycle. The problem was with the 2 domain controllers outside the cluster. They were starting, but neither DNS nor AD was initialising. Funny thing, this: DNS was not starting because it was waiting for the AD initial sync to finish, and AD couldn't perform the sync because DNS wasn't working.

    I'm aware that this is not a problem if you have only 1 domain controller in your environment.
    Found this:
    http://social.technet.microsoft.com.../thread/805d69e6-46c0-4570-bd00-5c7daed2eeae/
    which is exactly the problem we were experiencing. This problem applies to physical domain controllers too (as long as you have more than 1 DC in total on your network).

    I've noticed that it does sort itself out once the timeout expires, but the timeout is 15-25 minutes by default. That's definitely not quick enough, as the cluster hosts get woken up well before then.

    I just want to say that this wasn't a problem when I last tested it a few months ago, so it might be related to some updates (SP1 maybe).

    What happens is that when the server starts, it tries to start the DNS server (retrying every 2 minutes by default). The DNS server waits for AD to finish its initial sync with the other domain controllers, which is not possible because DNS isn't working anywhere on the network.

    I found a Microsoft KB article on how to solve this problem:
    http://support.microsoft.com/kb/2001093

    which, given its publication date, suggests it's a new issue.
    It explains what to change in the registry to skip the initial sync when AD is starting. (The only time there wouldn't be a DC on the network is after a power outage, when everything has been gracefully shut down and is waiting for power to come back.)
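
The post doesn't quote the setting itself. From memory, the value that KB article describes is `Repl Perform Initial Synchronizations` (a DWORD set to 0) under the NTDS `Parameters` key, so treat the key path and value name below as assumptions and check them against the article before touching a domain controller. The sketch just builds the matching `reg add` command line; the function names are made up for illustration.

```python
import subprocess

# Assumed from KB 2001093 as remembered; verify against the article before use.
NTDS_PARAMS_KEY = r"HKLM\SYSTEM\CurrentControlSet\Services\NTDS\Parameters"
VALUE_NAME = "Repl Perform Initial Synchronizations"


def build_reg_command(enabled: bool = False) -> list[str]:
    """Build the reg.exe arguments that set the value to 0 (skip) or 1 (perform)."""
    return [
        "reg", "add", NTDS_PARAMS_KEY,
        "/v", VALUE_NAME,
        "/t", "REG_DWORD",
        "/d", "1" if enabled else "0",
        "/f",  # overwrite an existing value without prompting
    ]


def apply_setting(enabled: bool = False) -> None:
    """Run the command; only meaningful on a Windows DC, from an elevated prompt."""
    subprocess.run(build_reg_command(enabled), check=True)
```

Printing `" ".join(build_reg_command())` first is a cheap way to review the exact command before running `apply_setting()` on each domain controller.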

    This worked and it's now OK.

    I find it funny how Microsoft suggest getting a power generator as an alternative solution.

    Just wanted to let you guys know, as my setup was able to cope with a power outage a few months back, but something between then and now broke it. I wouldn't have known if I hadn't run the UPS test again today.

    The way to test it is to shut down all the DNS servers and DCs on the network and then start just one of them. After it boots, see whether AD and DNS are running properly. On mine, neither was.
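
A rough first check for that test could be scripted along these lines. This is only a sketch: the hostname is hypothetical, and it merely tests whether the freshly booted DC accepts TCP connections on the standard DNS (53) and LDAP (389) ports. A listening port doesn't prove the AD-integrated zones have loaded, so the event logs are still worth a look.

```python
import socket


def port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, timed out, or name not resolved
        return False


def dc_services_up(host: str, timeout: float = 3.0) -> dict[str, bool]:
    """Check the standard DNS and LDAP ports on a domain controller."""
    ports = {"dns": 53, "ldap": 389}
    return {name: port_open(host, port, timeout) for name, port in ports.items()}
```

For example, `dc_services_up("dc1.example.local")` (a made-up name) returns something like `{"dns": ..., "ldap": ...}`, which can be polled every minute or so to see how long the DC takes to become usable after a cold start.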
     
    Last edited: Mar 20, 2011
    WIP: Uhmm... not sure
