Problem Rebooting ESXi server kills our ASA 5505

Discussion in 'Networks' started by ThomasMc, Apr 6, 2011.

  1. ThomasMc

    ThomasMc Gigabyte Poster

    Afternoon chaps, got a bit of an odd one: every time we reboot an ESXi server our ASA 5505 falls on its arse (only the ESXi servers do it; I've tried rebooting some other servers and they don't have any effect at all on the ASA). After trawling through the logs I've found the output below.

    Code:
    06/04/2011	12:50:24	Local4	Notice	10.x.x.1	:Apr 06 12:39:33 GMT/BDT: %ASA-config-5-111008: User 'Config' executed the 'passive-interface dmz' command.
    06/04/2011	12:48:32	Local4	Notice	10.x.x.1	:Apr 06 12:37:46 GMT/BDT: %ASA-eigrp-5-336010: EIGRP-IPv4: PDM(355 5: Neighbor 10.x.x.248 (Vlan100) is down: holding time expired
    06/04/2011	12:48:32	Local4	Debug	10.x.x.1	:Apr 06 12:37:46 GMT/BDT: %ASA-sys-7-711002: Task ran for 20064 msec, Process = dhcp_daemon, PC = 835eab8, Traceback =   0x0835EAB8  0x0835A146  0x0835A343  0x08054637  0x0805A50C  0x08965099  0xDD7A76D5  0xDD6A61E0  0x0819934E  0x08199A18  0x0818B92E  0x0818BEB9  0x0818C749  0x0819160A
    06/04/2011	12:48:32	Local4	Debug	10.x.x.1	:Apr 06 12:37:46 GMT/BDT: %ASA-sys-7-711002: Task ran for 20064 msec, Process = dhcp_daemon, PC = 835eab8, Traceback = 
    
    Seems like the DHCP service is crapping out and taking the whole thing down :( Hoping one of the Cisco/networking guys can shed some light, thanks.
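
    For anyone wanting to pull the same details out of a syslog export, here is a minimal Python sketch that extracts the facility, severity and message ID from lines like the ones above, and picks out the "Task ran for N msec" figure where present. The function and field names (`parse_asa_line`, `hog_msec`) are made up for illustration, and the patterns are only fitted to the four lines shown in this post, not to the full ASA syslog format.

```python
import re

# Matches the Cisco-style tag, e.g. "%ASA-sys-7-711002: ..." gives
# facility "sys", severity 7, message ID "711002".
ASA_TAG = re.compile(
    r"%ASA-(?:(?P<facility>[a-z]+)-)?(?P<severity>\d)-(?P<msg_id>\d+):\s*(?P<text>.*)"
)

# Matches the long-running-task report, e.g.
# "Task ran for 20064 msec, Process = dhcp_daemon".
HOG = re.compile(r"Task ran for (?P<msec>\d+) msec, Process = (?P<process>\w+)")


def parse_asa_line(line):
    """Return a dict describing one ASA syslog line, or None if no tag is found."""
    tag = ASA_TAG.search(line)
    if tag is None:
        return None
    record = {
        "facility": tag.group("facility"),
        "severity": int(tag.group("severity")),
        "msg_id": tag.group("msg_id"),
        "text": tag.group("text"),
    }
    hog = HOG.search(record["text"])
    if hog is not None:
        # Only present on lines that report how long a task held the CPU.
        record["hog_msec"] = int(hog.group("msec"))
        record["process"] = hog.group("process")
    return record
```

    Run against the dhcp_daemon lines above, this reports a task that ran for 20064 msec, i.e. roughly 20 seconds, which lines up with the EIGRP neighbour's holding time expiring in the same second.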

    ASA Version 8.0(4)

    P.S. Sorry for messing with the board layout :D
     
    Last edited: Apr 6, 2011
    Certifications: MCDST|FtOCC
    WIP: MCSA(70-270|70-290|70-291)
  2. SimonD
    Honorary Member

    SimonD Terabyte Poster

    Thomas, is this a single server or any ESXi box?

    If it's a single box, have you tried changing the NIC to see if that's causing the issue? That's the first thing that jumped into my head when I read this.
     
    Certifications: CNA | CNE | CCNA | MCP | MCP+I | MCSE NT4 | MCSA 2003 | Security+ | MCSA:S 2003 | MCSE:S 2003 | MCTS:SCCM 2007 | MCTS:Win 7 | MCITP:EDA7 | MCITP:SA | MCITP:EA | MCTS:Hyper-V | VCP 4 | ITIL v3 Foundation | VCP 5 DCV | VCP 5 Cloud | VCP6 NV | VCP6 DCV | VCAP 5.5 DCA
  3. ThomasMc

    ThomasMc Gigabyte Poster

    Hi Simon, all 3 of our ESXi boxes do it to the ASA although they all use Dual Intel ET cards.
     
  4. danielno8

    danielno8 Gigabyte Poster

    How is this all hanging together? It's unlikely I will be able to help, I'm just very curious how your ASA could be affected by this! Also, what exactly do you mean by crapped out? All interfaces unresponsive? At what point of the ESXi reboot do things die?
     
    Last edited: Apr 6, 2011
    Certifications: CCENT, CCNA
    WIP: CCNP
  5. ThomasMc

    ThomasMc Gigabyte Poster

    The path from the firewall to the ESXi hosts is shown in the attached diagram. I've been having a look over some of the release notes and there have been some issues related to dhcpd that have been fixed in the newer versions (I seem to be 2 versions behind), so I'm hoping that will resolve it. The ASA is in routed mode, and this has highlighted that maybe I ask too much of it :D I might need to lighten its load or even pop in a partner for it. To recreate the issue:

    1) Reboot one of our ESXi servers (in maintenance mode).
    2) Ping the ESXi host and watch for it to drop off.
    3) The ESXi host starts responding to pings again.
    4) Between 3 and 10 seconds later the syslog server reports the dhcpd traceback (after the EIGRP-IPv4 message, no more messages are received until step 6).
    5) About 1 minute later the management software reports that the default gateways on all VLANs that the ASA serves are offline.
    6) About 2 minutes after that the ASA comes back up (with its uptime reset).
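
    The timings in the steps above can be captured a bit more precisely with a small script instead of watching a ping window. This is only a sketch: `ping_once` assumes a Linux-style `ping` that accepts `-c` and `-W`, and the function names and host names are placeholders, not anything from our setup.

```python
import subprocess
import time


def ping_once(host):
    """Return True if a single ICMP echo gets a reply.

    Assumes a Linux-style `ping` (-c = count, -W = timeout in seconds).
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", host],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def record_transitions(probe, hosts, rounds, interval=1.0,
                       clock=time.time, sleep=time.sleep):
    """Probe each host once per round and log every change of state.

    Returns a list of (timestamp, host, "up" | "down") tuples. The first
    sample for each host is always recorded so the starting state is known.
    """
    state = {}
    events = []
    for _ in range(rounds):
        for host in hosts:
            up = probe(host)
            if state.get(host) != up:
                events.append((clock(), host, "up" if up else "down"))
                state[host] = up
        sleep(interval)
    return events
```

    Something like `record_transitions(ping_once, ["esxi-host", "asa-gateway"], rounds=600)` (placeholder names) started before step 1 would give timestamps for the ESXi host dropping off and coming back, and for the ASA going away and returning.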
     

    Attached Files:

    Last edited: Apr 6, 2011
  6. danielno8

    danielno8 Gigabyte Poster

    Hmm, pretty strange. Can you verify your spanning-tree config on the switches, and what the switches are doing with the trunk links when an ESXi host is rebooted (whether they are learning, forwarding, etc.)?
     
  7. craigie

    craigie Terabyte Poster

    Not sure if I'm reading it wrong, but why do you have dhcpd turned on, on the ASA?

    My guess is that on reboot the ESXi hosts are requesting a DHCP address, causing the ASA to fail; not sure of the reason.

    Could you mirror an ESXi port and do a Wireshark capture to see if my assumption is correct?

    We can then take it from there mate.
     
    Certifications: CCA | CCENT | CCNA | CCNA:S | HP APC | HP ASE | ITILv3 | MCP | MCDST | MCITP: EA | MCTS:Vista | MCTS:Exch '07 | MCSA 2003 | MCSA:M 2003 | MCSA 2008 | MCSE | VCP5-DT | VCP4-DCV | VCP5-DCV | VCAP5-DCA | VCAP5-DCD | VMTSP | VTSP 4 | VTSP 5
  8. ThomasMc

    ThomasMc Gigabyte Poster

    Cheers lads, I'll need to wait until after 5 to do some more testing.
     
  9. ThomasMc

    ThomasMc Gigabyte Poster

    @danielno8, this is what both of the switches are doing when I restart the ESXi host:
    g1=ASA
    g2=ESXi NIC 1
    g3=ESXi NIC 2

    Code:
    17:58:42 Informational	%LINK-I-Up: 1/g1
    17:58:39 warning	%LINK-W-Down: 1/g1
    17:58:35 Informational	%LINK-I-Up: 1/g1
    17:58:33 warning	%LINK-W-Down: 1/g1
    17:55:09 warning	%STP-W-PORTSTATUS: 1/g3: STP status Forwarding
    17:55:09 warning	%STP-W-PORTSTATUS: 1/g2: STP status Forwarding
    17:54:39 Informational	%LINK-I-Up: 1/g3
    17:54:39 Informational	%LINK-I-Up: 1/g2
    17:54:36 warning	%LINK-W-Down: 1/g3
    17:54:36 warning	%LINK-W-Down: 1/g2
    
    @craigie, I'll report back soon
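
    To put a number on the gap in the switch log above, here is a rough Python sketch that measures the time from the ESXi NICs coming back up to the first link drop on the ASA port. The function names are made up for illustration; the parsing assumes the newest-first, whitespace-separated layout shown above and that all entries fall on the same day.

```python
from datetime import datetime

# Switch log copied from the post above (newest entry first).
SWITCH_LOG = """\
17:58:42 Informational %LINK-I-Up: 1/g1
17:58:39 warning %LINK-W-Down: 1/g1
17:58:35 Informational %LINK-I-Up: 1/g1
17:58:33 warning %LINK-W-Down: 1/g1
17:55:09 warning %STP-W-PORTSTATUS: 1/g3: STP status Forwarding
17:55:09 warning %STP-W-PORTSTATUS: 1/g2: STP status Forwarding
17:54:39 Informational %LINK-I-Up: 1/g3
17:54:39 Informational %LINK-I-Up: 1/g2
17:54:36 warning %LINK-W-Down: 1/g3
17:54:36 warning %LINK-W-Down: 1/g2
"""


def parse_switch_log(text):
    """Return a list of (time, tag, port) tuples, one per log line."""
    events = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 4:
            continue  # skip blank or malformed lines
        when = datetime.strptime(parts[0], "%H:%M:%S")
        tag = parts[2].rstrip(":")   # e.g. "%LINK-W-Down"
        port = parts[3].rstrip(":")  # e.g. "1/g1"
        events.append((when, tag, port))
    return events


def seconds_until_asa_flap(events, asa_port="1/g1",
                           host_ports=("1/g2", "1/g3")):
    """Seconds from the last host link-up to the first ASA link-down."""
    host_up = [t for t, tag, port in events
               if tag == "%LINK-I-Up" and port in host_ports]
    asa_down = [t for t, tag, port in events
                if tag == "%LINK-W-Down" and port == asa_port]
    if not host_up or not asa_down:
        return None
    return (min(asa_down) - max(host_up)).total_seconds()
```

    With the log above, `seconds_until_asa_flap(parse_switch_log(SWITCH_LOG))` gives 234.0, so g1 first drops just under four minutes after the ESXi NICs come back up.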
     
  10. ThomasMc

    ThomasMc Gigabyte Poster

    I've narrowed it down to the SLP daemon (slpd) on the ESXi hosts, which is what initially causes it, and I can now trigger it by running:

    Code:
    /etc/init.d/slpd restart
    
     
    Last edited: Apr 15, 2011
  11. joshlarsen

    joshlarsen New Member

    I am having this same issue. Has anyone figured out how to stop this from happening?

    Josh
     
