Problem Format Domain!

Discussion in 'Software' started by DC Pr0Mo, Nov 13, 2010.

  1. DC Pr0Mo

    DC Pr0Mo Kilobyte Poster

    268
    9
    41
    Hi Guys

    Nightmare scenario at work. I got a call today from my boss: somehow someone had remoted in to our network and deleted every user and computer object in the domain apart from 2 random workstations, our head office ISA Server and all the domain controller computer objects. Not good. Even worse, our most recent system state backup is not restoring, and the only other system state backup is more than a year old (I know, bad practice, but that’s for another day just now). As you can guess, this needs to be resolved ASAP.

    We have our head office with the majority of our servers (2 DCs, file server, anti-virus, VPN, DHCP etc.) and 10 sites, each with a server running as a domain controller and an ISA/VPN server (I know this is not ideal, but it is out of my control). We are running at the Server 2003 R2 domain functional level in a single-domain environment.

    The ray of shining light is that one of our remote sites never received the changes to Active Directory and in effect has a full working copy of our domain from before all the objects were deleted. Our job tomorrow is to try and salvage the domain from this server and get the network back up and running.

    I have already visited the site, imaged the DC and created a system state backup in case things don’t go to plan tomorrow. My plan for tomorrow is as follows:


    Take the domain controller to our head office.
    Configure network settings
    Seize the FSMO roles
    Uninstall ISA Server
    Move the server to our head office site in AD Sites and Services
    Perform an authoritative restore
    Reboot
    Force replication with other domain controllers within our head office site

    I know we need to address other issues such as how did this happen and how we can prevent it, as well as our Disaster Recovery being inadequate, but as you can guess the most important thing just now is to get our system back online for 9am Monday morning.

    Has anyone got any suggestions, or can anyone see an issue that my plan above will run into?

    Any help would be much appreciated.

    A rather nervous IT Administrator
     
    Certifications: MCDST | BSc Network Computing | 365 Fundamentals
  2. Sparky

    Sparky Zettabyte Poster Moderator

    10,718
    543
    364
    Nightmare.

    First thing. ISA on a DC? WTF?!?

    Do you know why the DC didn’t pick up the changes? Has it been tombstoned or something like that? You need to check the event logs to find out what's happening.

    I would switch off all the remaining DCs so that nothing can replicate to the DC that does have the full copy of AD.
    Complete the steps you have noted and bring each DC online one by one and force replication.
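
    The staged bring-up described above could be sketched as a small plan generator. This is only a rough sketch, not a tested procedure: the DC names are hypothetical, the script just prints the steps rather than running anything, and the repadmin and dcdiag invocations are shown in their barest form, so check the switches against the tool documentation for your server version before relying on them.

```python
def staged_bringup(good_dc, other_dcs):
    """Return ordered steps: isolate every other DC, then add them back one at a time."""
    steps = [f"Power off or disconnect: {', '.join(other_dcs)}"]
    steps.append(f"Complete the recovery steps on {good_dc}")
    for dc in other_dcs:
        # Only one DC joins per round, so a bad replication partner is easy to spot.
        steps.append(f"Bring {dc} online (only this one)")
        steps.append(f"repadmin /syncall {dc}")    # force replication
        steps.append(f"repadmin /showrepl {dc}")   # review replication status
        steps.append(f"dcdiag /s:{dc}")            # general DC health check
    return steps


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for n, step in enumerate(staged_bringup("SITE-DC", ["HO-DC1", "HO-DC2"]), 1):
        print(f"{n}. {step}")
```

    Keeping the plan as data makes it easy to print and tick off, which also helps with documenting what was done and when.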
     
    Last edited: Nov 14, 2010
    Certifications: MSc MCSE MCSA:M MCSA:S MCITP:EA MCTS(x5) MS-900 AZ-900 Security+ Network+ A+
    WIP: Microsoft Certs
  3. zebulebu

    zebulebu Terabyte Poster

    3,748
    330
    187
    ...Ouch

    That sucks

    The plan you've outlined is good - I'd add taking a copy of the image, just in case you rinse it again because of something you've overlooked when you bring the first of the other DCs online and force replication. As Sparky says, check out the event logs of the DC you think is still OK - there must have been a reason that it didn't get the AD changes replicated to it. Hopefully it will be something like a network outage (which would probably be the only time you'll want to kiss your WAN provider for their crappy service!). If not, you could be restoring a non-working DC, in which case you'll have to rebuild.

    Oh - and get ISA off the DC first - the last thing you want to do is get it up and running, seize roles and change the network config only to find that ISA is a PITA to get rid of.

    One more thing you're going to have to consider is completely disconnecting your internal network from the outside world. Whenever you have a major breach like this you have to consider that the ***** who has done it has left themselves a neat little backdoor to get back in. You say a couple of 'random' computer accounts have been left untouched? How certain are you that they're 'random'? How would you like to restore everything and then find that the same muppet remoted straight back into the other workstation and did the same thing? I'd disable that computer account and make the passwords for any domain admin accounts double spiteful.

    I presume you've got logs you can work through on your firewall? Take a copy of those ASAFP - you'll need them to track down where the breach occurred.

    Above all, don't panic. In a situation like this, any company worth its salt will let you get on with restoring everything to working order without breathing down your neck. Afterwards, you can use it to your advantage to try and stress the importance of investing money in IT - more staff, better systems, better security, dedicated DR etc etc
     
    Certifications: A few
    WIP: None - f*** 'em
  4. Shinigami

    Shinigami Megabyte Poster

    896
    40
    84
    That definitely doesn't sound good. Cut the internal network off from the internet asap.

    Then find out WHY the DC in the satellite office didn't receive the latest changes. You must be 100% sure that this DC is in perfect working order before shutting it off and moving it around. You also don't want to bring it into the head office and suddenly have it receive the "deletions" from the other DCs when you turn it on...

    And if it's got older info, how old is this info? Did it just fail to replicate in the last 24 hours? Could it be that whatever broke your head office DCs broke them to the point where they couldn't replicate? If so, I would probably reinstall them (to ensure no rootkit/backdoor/virus remains in the system), then seize the FSMO roles to the new DC, run an ntdsutil metadata cleanup of the old DCs, and finally DCPROMO them so that they receive the objects from the remaining DC.

    In any case, you'll have quite some work planned out for yourself. And try not to mix and match otherwise complicated software on DCs (such as ISA). But you already said you understand that this is not ideal.

    Anyway, good luck, stay calm, stay meticulous, and work hard. You'll get it back up with a little work. You may want to open a PSS ticket or contact another consulting firm for an extra pair of hands if you need to do DC checks on your remaining machine.

    A definite nightmare scenario indeed. I think if I got this news on a Sunday, I wouldn't wait until Monday to start fixing it; I would be there all night, working the next 48 hours straight if I had to.
     
    Certifications: MCSE, MCITP, MCDST, MOS, CIW, Comptia
    WIP: Win7/Lync2010/MCM
  5. Kitkatninja

    Kitkatninja aka me, myself & I Moderator

    11,143
    559
    383
    Good luck mate, hope all goes well...

    -Ken
     
    Certifications: MSc, PGDip, PGCert, BSc, HNC, LCGI, MBCS CITP, MCP, MCSA, MCSE, MCE, A+, N+, S+, Server+
    WIP: MSc Cyber Security
  6. AJ

    AJ 01000001 01100100 01101101 01101001 01101110 Administrator

    6,897
    182
    221
    Can I just add, and this may be a bit too obvious, but make sure you document EVERYTHING. Not only for future DR requirements, but in case there is any police involvement at a later time.

    Good luck mate and please let us know how you get on.
     
    Certifications: MCSE, MCSA (messaging), ITIL Foundation v3
    WIP: Breathing in and out, but not out and in, that's just wrong
  7. jamin100

    jamin100 Byte Poster

    154
    1
    22
    You need to make sure that the USN on the server with the OK AD structure is higher than the one on the server with the trashed structure, otherwise the good DC will still replicate with the forked DC and get all the deletions.

    There is a command you run on the good DC which will add 100,000 to the USN...

    They should replicate ok then
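
    Why a higher number lets the good copy win can be shown with a toy model. This is a simplified sketch, not how AD is implemented: as far as I understand it, conflicts are settled per attribute by comparing a version number first, then the time of the change, then the originating DC, and an authoritative restore raises the version numbers of the restored objects rather than the USN itself, so check the ntdsutil documentation for the exact behaviour. The `Stamp` class, the DC names and the values are made up for illustration; the 100,000 increment is the figure quoted above.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Stamp:
    version: int    # per-attribute version number
    timestamp: int  # time of the originating write (arbitrary units)
    origin: str     # originating DC, used only as the final tie-break


def winner(a: Stamp, b: Stamp) -> Stamp:
    """Pick the change that survives: higher version, then later time, then origin."""
    return max(a, b, key=lambda s: (s.version, s.timestamp, s.origin))


def mark_authoritative(s: Stamp, increment: int = 100_000) -> Stamp:
    """Raise the version so this copy outranks any later change, such as a deletion."""
    return replace(s, version=s.version + increment)


good = Stamp(version=3, timestamp=1000, origin="site-dc")    # object still present
deleted = Stamp(version=4, timestamp=2000, origin="ho-dc1")  # later deletion

print(winner(good, deleted).origin)                      # ho-dc1: the deletion wins
print(winner(mark_authoritative(good), deleted).origin)  # site-dc: the good copy wins
```

    Without the bump, the untouched copy loses simply because the deletion is the newer change, which is why the good DC must not replicate before it has been marked authoritative.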
     
    Last edited: Nov 15, 2010
    WIP: 70-680
  8. zebulebu

    zebulebu Terabyte Poster

    3,748
    330
    187
    How are you getting on with this fella?
     
    Certifications: A few
    WIP: None - f*** 'em
  9. DC Pr0Mo

    DC Pr0Mo Kilobyte Poster

    268
    9
    41
    Hi guys, cheers for the reply, as you can guess it's been a long busy couple of days for me.

    As it stands the domain is almost fully functional with only a few issues needing cleaned up.

    I followed my original plan on Sunday and all went well: we brought the known good Domain Controller to head office. Of the two domain controllers in our head office, one replicated from the known good Domain Controller while the other would not play ball; we spent a good couple of hours trying to get them to talk, but to no avail. The remote sites were all replicating the changes (TBH it was going too well). Then we changed the domain Administrator account name and password (this was the account that was compromised) and all hell broke loose. In hindsight we should have done this first thing, before we tried to replicate to any other server.

    Every remote site we checked seemed to be fine, they had the new administrator account name and password but our two servers at head office were throwing up Active directory errors left right and center.

    We spent a good amount of time trying to resolve this but eventually decided it would be better to use a remote server that was now known good and replicate from it to a reimaged server at head office. All our VPNs need to go through our head office, so in doing this we then needed to fix replication, which has been the major headache for the last two days. All servers seem to be replicating now bar a few sites, where we just need to change some settings manually to get them working. Another thing we are experiencing problems with is group membership: some groups are empty where before they had many members. This is causing issues with our ISA server rules that filter on groups, and with our file shares, which are denying access.
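
    One way to see which groups lost members is to compare a membership listing taken from the known good copy with the current one. The sketch below assumes both listings have already been exported into plain dictionaries of group name to member names; how they are exported is not covered here, and the function name, group names and user names are all hypothetical.

```python
def lost_members(before, after):
    """Return {group: sorted missing members} for groups whose membership shrank."""
    lost = {}
    for group, members in before.items():
        # A group absent from the current listing counts as having no members.
        missing = set(members) - set(after.get(group, ()))
        if missing:
            lost[group] = sorted(missing)
    return lost


# Hypothetical data: one group used by an ISA rule has been emptied.
before = {"ISA-Web-Access": ["amy", "bob"], "Finance-Share": ["cat"]}
after = {"ISA-Web-Access": [], "Finance-Share": ["cat"]}
print(lost_members(before, after))
```

    The result gives a worklist of groups to repopulate, which can be matched against the ISA rules and file shares that are currently failing.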

    To sum it up, we have a few sites that need replication fixed, some ISA rules that are not functioning and some file shares that are not allowing access. All in all, not too bad considering what we walked into on Saturday morning.

    After this is done, obviously there are other issues we need to address, particularly how they got access and how they obtained login details, not to mention our DR process.


    I know, I agree, I'm not a fan of this setup either. What are the main reasons that this is not advised? As far as I can see, it's that Microsoft don't support it, your DC is sitting on the internet, and your DC is multihomed. Lol, anything else apart from that?

    I'm unsure. I'm guessing it's because every object was deleted that it somehow disrupted replication (our VPN links are software-based and use an AD username); all our sites replicate through our head office. When my boss got in on Saturday morning, all the VPN connections had disconnected.

    We have a good idea who it was, but I can't really go into it in an open forum. We also know the computer that was used to gain access to the network, and the user account used to cause the mass deletion of objects. At this point, what we don't know is how they got access to that user account's details.

    I'm pretty sure that's the case, with the VPN server being down replication would have been impossible, though I will be calling our Antivirus provider to ask for advice on this in case there is something lingering on the network.

    First thing I thought of, authoritative restore should do the trick.

    Again thanks for the help guys, hopefully have this sorted shortly :)
     
    Certifications: MCDST | BSc Network Computing | 365 Fundamentals
  10. zebulebu

    zebulebu Terabyte Poster

    3,748
    330
    187
    That's really good news (considering how bad it could have been). I guess from what you're saying it was an 'inside' job - former employee perhaps? (PM me if you're not comfortable discussing that aspect of it here). 90% of incidents of this type are people who used to work for a firm. Whilst I've often had the hump with former employers, there's no way on Earth I would ever dream of doing something like this. Not only would my professional pride not allow it (not to mention the moral aspects of doing something this sucky), but it's just so, so easy to get caught doing it.

    Glad you're back up and running. With your current woes, now might be a really good time to get acquainted with some cool AD replication utilities (I think I mentioned them on an earlier post) like Sonar or Ultrasound. They'll be invaluable to you in troubleshooting the replication issues you're encountering, and there's nothing like an actual disaster to put them to the best use. If you're interested, Sonar can be found here, and Ultrasound here
     
    Certifications: A few
    WIP: None - f*** 'em
  11. DC Pr0Mo

    DC Pr0Mo Kilobyte Poster

    268
    9
    41
    Looks very much like a former employee. Cheers for the replication tools, zeb. Still having some issues with replication. Promoted two DCs to replace the servers we used from other sites during the recovery. They promoted with no errors, but when opening AD Users and Computers the MMC console would connect to another DC rather than itself. I noticed that on both servers the SYSVOL and NETLOGON folders were not shared, so I cleared the BurFlags, which shared the SYSVOL folder, but it's empty, no group policies are present on them, and there is still no NETLOGON share. Don't you just love this IT business. :D
     
    Certifications: MCDST | BSc Network Computing | 365 Fundamentals
  12. Sparky

    Sparky Zettabyte Poster Moderator

    10,718
    543
    364
    Hope you get the police involved...
     
    Certifications: MSc MCSE MCSA:M MCSA:S MCITP:EA MCTS(x5) MS-900 AZ-900 Security+ Network+ A+
    WIP: Microsoft Certs
  13. Shinigami

    Shinigami Megabyte Poster

    896
    40
    84
    Glad to hear you're starting to sort through the mess. If it was an inside job and you find the culprit (did you remember to save the event logs on your DCs for further analysis before you formatted them?), you may get enough punitive damages back to pay for an ADRAP evaluation of your AD from MCS :)

    Anyway, on a more serious note, as was mentioned, the checks on each DC (followed by health checks on all member servers and clients) need to be performed with utter care. dcdiag and the like need to be run on each DC and every error corrected. Event logs need to be analysed minutely and every error fixed. Make a task list and work on fixing one DC at a time thoroughly (or go through a single type of task per DC, round-robin style, until you get back to the original DC, then start a new type of checkup task). If you come across an error you cannot fix within an allocated time limit, move on and prioritize the other tasks, unless it proves to be a very important task that you cannot skip.
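
    The two ways of working through the task list can be written down as two orderings of the same DC and task pairs. This is just an illustrative sketch: the DC names and task names are hypothetical placeholders, not a recommended checklist.

```python
def dc_at_a_time(dcs, tasks):
    """Finish every task on one DC before moving on to the next DC."""
    return [(dc, task) for dc in dcs for task in tasks]


def round_robin(dcs, tasks):
    """Do one type of task on every DC, then start the next type of task."""
    return [(dc, task) for task in tasks for dc in dcs]


# Hypothetical names, for illustration only.
dcs = ["HO-DC1", "HO-DC2", "SITE-DC"]
tasks = ["dcdiag", "event logs", "replication status"]

for dc, task in round_robin(dcs, tasks):
    print(f"{task:20} on {dc}")
```

    Both orderings cover exactly the same work; the round-robin order surfaces a problem that affects every DC sooner, while the one-DC-at-a-time order gets a single fully checked DC sooner.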

    Eventid.net is quite nice for analysing some event errors...

    And take note of anything strange, things that require further attention later on. Pretty sure you've already built a resolution methodology which you're good with. And this place has a knowledgeable bunch if you need answers to things you're not entirely sure of.

    In the end we all work differently during crisis situations, so keep at it and good luck.
     
    Certifications: MCSE, MCITP, MCDST, MOS, CIW, Comptia
    WIP: Win7/Lync2010/MCM
  14. Shinigami

    Shinigami Megabyte Poster

    896
    40
    84
    P.S. Make a backup every time you hit a milestone.
     
    Certifications: MCSE, MCITP, MCDST, MOS, CIW, Comptia
    WIP: Win7/Lync2010/MCM
