The postings on this site are my own and do not represent my Employer's positions, advice or strategies.

LifeAsBob - Blog

 

5 in 1 day! 4/23/2009 10:53:29 AM

Windows has historically had a reliability problem in the enterprise (remember daily reboots?), and SQL Server got tagged with the same reputation right along with it.  Over many years, starting with the release of SQL Server 2000, Microsoft has been working to correct this.  Now we need to start working on our vendors.


Yesterday 5 Windows servers crashed in one day, and of course all of them were running SQL Server.  I've got 146 SQL Servers and go months with no issues, then BAM!  These were all different "flavors" of SQL (2K, 2K5 and different service packs).  Most likely this was not a SQL issue or even a Windows issue, but regardless, the product takes a hit and so do I.

The machines were so "unresponsive" that the only way to recover was to kill the power.  Of course there were no logs and no indication of what the problem was.  I highly suspect one of the many pieces of 3rd party crap that we run on these windoze machines: backup, anti-virus, SAN drivers, monitoring programs, etc., though we'll never know for sure what was to blame.

Even our Microsoft Clustering did not save me!

No fail-over, no nothing; I had to reboot both nodes of the cluster.  There is still plenty of room for the 3rd party clustering utilities.  They have failures as well, but I seem to get higher uptime with them... try Never Fail, Golden Gate, HP Polyserve or Veritas.

We still have a long way to go to increase stability.  We've got to keep harping on the 3rd party vendors that provide the services not built into windoze.  I just wish we knew which one to call for this outage.

