RAID 5 fail

Discussion in 'Hardware' started by Apexes, Jun 1, 2011.

  1. Apexes

    Apexes Gigabyte Poster

    1,055
    78
    141
    Anyone recognize this? Adaptec piece of ****e

    Basically the server won't boot, the array won't rebuild, i think the controller's naffed - and there's 3tb worth of data on there i've got a horrible feeling i'm going to be restoring from backups...

    Anyone offer any advice on how i may be able to get around this without losing data?

    Error pic attached

    IMAG0105.jpg
     
    Certifications: 70-243 MCTS: ConfigMgr 2012 | MCSE: Private Cloud
  2. ThomasMc

    ThomasMc Gigabyte Poster

    1,507
    49
    111
    Does it tell you how many drives are fooked when you check the array status?
     
    Certifications: MCDST|FtOCC
    WIP: MCSA(70-270|70-290|70-291)
  3. BB88

    BB88 Kilobyte Poster Gold Member

    383
    13
    76
    Is the system beeping at all, consistently? What do you see in the Config Utility?
     
    Certifications: CompTIA A+, CompTIA Network+, MCSA: Office 365,, 70-410, 70-680
    WIP: CompTIA: Security+
  4. Sparky

    Sparky Zettabyte Poster Moderator

    10,718
    543
    364
    It says that your RAID 5 array has problems. Go into the RAID card config and it should highlight which disk(s) have failed. If only one disk has failed then accept the config and the server should boot ok.

    Replace the failed disk when you can....
     
    Certifications: MSc MCSE MCSA:M MCSA:S MCITP:EA MCTS(x5) MS-900 AZ-900 Security+ Network+ A+
    WIP: Microsoft Certs
  5. Apexes

    Apexes Gigabyte Poster

    1,055
    78
    141
    I've gone back to Berlin, and this server had been transferred since i left.

    Accepting the config change does nothing, it tries to boot but gets stuck on the splash screen.

    Going into the config tells me that it's "Degraded" but doesn't specify which disk is wrong, identifying the faulty one does nothing - i'm back in tomorrow so will double check all over what you've said :)

    BB88 - it's beeping constantly when i boot it up, accepting configuration changes makes no difference and it reboots and does the same thing
     
    Certifications: 70-243 MCTS: ConfigMgr 2012 | MCSE: Private Cloud
  6. ThomasMc

    ThomasMc Gigabyte Poster

    1,507
    49
    111
    if none of that works you could always try updating the firmware as well

    Adaptec - Bios Updates and Other Downloads

    I'm also a little surprised that it won't show you which disks are degraded or the status of the disks
     
    Last edited: Jun 2, 2011
    Certifications: MCDST|FtOCC
    WIP: MCSA(70-270|70-290|70-291)
  7. zebulebu

    zebulebu Terabyte Poster

    3,748
    330
    187
    Try pulling the disks one by one and using compressed air to blow the slots free of dust. You'll be surprised just how many times that works. If you had monitoring set up for the array and received no warnings about impending disk failure prior to the FUBAR, then it may well be the raid controller (you're not likely to get two simultaneous disk failures, and that's the only other thing that would prevent a RAID5 array from booting under normal circumstances). The good news in that situation is most RAID controllers nowadays don't have the array info stored on the controller, so provided it's not as old as the hills you should be able to get a replacement controller, slot it in there and fire it up.
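
    For anyone wondering why one dead disk is survivable but two are not, here is a minimal sketch of the XOR parity idea behind RAID 5. It is a toy model, not how the Adaptec firmware actually works: the block contents, the five-disk layout and the helper name xor_blocks are all made up for illustration, and a real controller rotates parity across the disks rather than keeping it on one.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Hypothetical stripe: four data blocks plus one parity block (five disks).
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD"]
parity = xor_blocks(data)
stripe = data + [parity]

# Lose any ONE disk: XOR-ing the four survivors gives back the missing block.
lost = 2
survivors = [blk for i, blk in enumerate(stripe) if i != lost]
rebuilt = xor_blocks(survivors)
assert rebuilt == stripe[lost]
```

    With two blocks missing from the same stripe there is only one parity equation for two unknowns, so nothing can be rebuilt. That is the point at which the array goes from degraded to failed.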

    Also, as that's a 31205 (quite a common card) I've come across it before - I'm pretty sure the FW on it that's listed is at least three years old, so update that as well - lots of RAID cards have little bugs in firmware that only rear their heads after a set number of seconds uptime, or when a specific failure event occurs (and if you're getting to see the Windows splash screen, then there's a distinct possibility that either only one or even none of the disks themselves are failed - only the array card thinks that they are)

    Seriously though - try the compressed air trick first.

    PS: I hate looking at servers that have the O/S installed on a RAID 5 array along with everything else. It makes me physically itch. Building a server with the O/S not on its own RAID controller with a mirrored pair is one of the worst sins imaginable as far as I'm concerned. I have rebuilt entire server infrastructures because of this.
     
    Certifications: A few
    WIP: None - f*** 'em
  8. Apexes

    Apexes Gigabyte Poster

    1,055
    78
    141
    Thanks Thomas and Zeb,

    I've just got into the office now in Berlin (It's a bank holiday over here, and no word of a lie, it's called "Man Day" every other bastard is out on the piss except me sat in this office on my own cos of a stupid cheap server!!)

    Zeb i agree with you about a separate controller - they bought this 2 years ago on the cheap, around 3k (euros) and it's failed already, shows that you get what you pay for

    Just unracked the thing and moving it to my desk, will update shortly! :)
     
    Certifications: 70-243 MCTS: ConfigMgr 2012 | MCSE: Private Cloud
  9. Apexes

    Apexes Gigabyte Poster

    1,055
    78
    141
    Making progress!

    Updated the firmware and bios on the server after just about managing to get it to boot, finally identified the failed disk (Have to wait until tomorrow now to get a new one to replace it, in fact even maybe on monday it'll arrive)

    I can access all the data with the other drive removed, and performance is 10x better than what it was with the failed drive in. I'm currently copying it all across to another server, which'll take around 10-12 hours i think, so i'll leave it running tonight and shift all local backups onto another server whilst i get this one repaired.

    Think i'll also replace the controller in it as well while it's still working - at the moment i'm not confident it'll continue working in the future.

    Thanks for all your replies and help guys :thumbleft
     
    Certifications: 70-243 MCTS: ConfigMgr 2012 | MCSE: Private Cloud
  10. zebulebu

    zebulebu Terabyte Poster

    3,748
    330
    187
    That's weird - either the drive has failed or it hasn't - if it has, then performance will be poor, but it shouldn't make any difference to performance whether the drive is still in the server or not, as the card has already detected it as faulty and will have removed it from the array. Did the FW upgrade make a difference? I went and checked after my last post - the FW you were running originally was four years old, so I bet there were oodles of bugs in it.
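
    As a rough illustration of why a degraded RAID 5 array is slow at all, the sketch below counts how many disks have to be read to serve a single block. It is a simplified model under stated assumptions: the function name disks_touched and the disk numbering are hypothetical, and it ignores caching, parity rotation and anything controller-specific.

```python
def disks_touched(n_disks, block_disk, failed_disk=None):
    """Return the set of disks that must be read to serve one block.

    Healthy array, or block on a surviving disk: read that one disk.
    Block on the failed disk: read every surviving disk and XOR them.
    """
    if failed_disk is None or block_disk != failed_disk:
        return {block_disk}
    return {d for d in range(n_disks) if d != failed_disk}

# Hypothetical five-disk array with disk 2 marked as failed.
print(disks_touched(5, block_disk=0, failed_disk=2))
print(disks_touched(5, block_disk=2, failed_disk=2))
```

    In this model the cost depends only on which disk is marked failed, not on whether it is still physically in the bay, which is why a speed-up from pulling the drive is unexpected.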

    Glad you're getting it sorted - nothing like resurrecting a seemingly dead server!
     
    Certifications: A few
    WIP: None - f*** 'em
  11. Apexes

    Apexes Gigabyte Poster

    1,055
    78
    141
    That's what i thought,

    It's running on 4 drives at the moment (Frantically awaiting the 1tb spare to arrive, why the hell they didn't have a spare on site i've got no idea, i gave them a bollocking for that) but the performance increased tenfold after removing it, so not quite sure.
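
    For reference, the usual RAID 5 capacity sum is sketched below. It assumes the array is five equal 1tb drives, which is a guess based on the four survivors plus the 1tb spare on order; the function name raid5_usable_tb is just for illustration.

```python
def raid5_usable_tb(n_disks, disk_tb):
    """Usable capacity of a RAID 5 array: one disk's worth goes to parity."""
    if n_disks < 3:
        raise ValueError("RAID 5 needs at least three disks")
    return (n_disks - 1) * disk_tb

# Assumed layout: five 1 TB drives.
print(raid5_usable_tb(5, 1.0))
```

    Under that assumption the usable space is 4tb whether the array is healthy or degraded. Running on four drives loses the redundancy, not the capacity.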

    The FW upgrade did make a huge difference, it was taking ages to boot the controller on initial boot up, it's now within a few seconds

    Either way i'm happy to at least have the data intact, and not having to pull off 2.6tb of backup data over the WAN!

    I'm going to get the controller replaced anyway, this whole Adaptec lark over the last week has me worried about it, everyone i've told that it's an Adaptec has been like "wtf, replace that crap!" lol

    Any recommendations on a replacement one?
     
    Certifications: 70-243 MCTS: ConfigMgr 2012 | MCSE: Private Cloud
  12. ThomasMc

    ThomasMc Gigabyte Poster

    1,507
    49
    111
    Nowt wrong with Adaptec :) we use the Adaptec RAID 5805, which I find to be great on performance
     
    Certifications: MCDST|FtOCC
    WIP: MCSA(70-270|70-290|70-291)
  13. Apexes

    Apexes Gigabyte Poster

    1,055
    78
    141
    It's more the reliability of it i'm worried about, the performance was fine when it was working ok - but i'm just worried it's gonna play up again in the future - it basically couldn't decide whether the disk had failed or not, it was like a kid deciding between a sherbet straw and a cola cube! "That one, no that one, i don't like it, i want that one, no it's not, that one"
     
    Certifications: 70-243 MCTS: ConfigMgr 2012 | MCSE: Private Cloud
  14. ThomasMc

    ThomasMc Gigabyte Poster

    1,507
    49
    111
    LOL! I see what you mean but you would probably get good use out of that card now your FW is updated. If you're looking for an easy upgrade then stick to Adaptec and just migrate the array to the new card.

    p.s RAID 5 still sucks :)
     
    Certifications: MCDST|FtOCC
    WIP: MCSA(70-270|70-290|70-291)
  15. Apexes

    Apexes Gigabyte Poster

    1,055
    78
    141

    I know it does, try telling a German that! lol! back to England friday, so my workload will be cut by 75% yaaaaayy, plus i'm going to Ascot races on tuesday to spend the leftover expenses i have :mrgreen:
     
    Certifications: 70-243 MCTS: ConfigMgr 2012 | MCSE: Private Cloud
  16. wizard

    wizard Petabyte Poster

    5,767
    42
    174
    You're going to spend 20p on the gee gees? :twisted:
     
    Certifications: SIA DS Licence
    WIP: A+ 2009
  17. Apexes

    Apexes Gigabyte Poster

    1,055
    78
    141
    lol, maybe 40p!! :D nah i got a couple hundred quid spare, so there's about 50 of us going from my local back home - tis always a good day out, but their 4 cans of beer limit aint so good - so i've now resorted to vodka and coke, in a 2 litre bottle :D
     
    Certifications: 70-243 MCTS: ConfigMgr 2012 | MCSE: Private Cloud
  18. onoski

    onoski Terabyte Poster

    3,120
    51
    154


    Glad to hear you've got to the bottom of this problem, however I would still replace the faulty disk as soon as possible. All the best and thanks for sharing:)
     
    Certifications: MCSE: 2003, MCSA: 2003 Messaging, MCP, HNC BIT, ITIL Fdn V3, SDI Fdn, VCP 4 & VCP 5
    WIP: MCTS:70-236, PowerShell
  19. Apexes

    Apexes Gigabyte Poster

    1,055
    78
    141
    Still waiting, it hasn't arrived yet.. i'm getting nervous!
     
    Certifications: 70-243 MCTS: ConfigMgr 2012 | MCSE: Private Cloud
  20. wizard

    wizard Petabyte Poster

    5,767
    42
    174
    Do they have a PC World in Berlin? :D
     
    Certifications: SIA DS Licence
    WIP: A+ 2009
