Horkay Blog
The postings on this site are my own and do not represent my Employer's positions, advice or strategies.
Thursday, 25 February 2016

Recently had some SQL Agent jobs that run SSIS packages start failing, with an error of "Failing Pending Execution".

Could find no real issues.

Thanks Tom !,

Finally found some relevant links related to Group Policy being changed that caused this.

  • Impersonate a client after authentication
  • Replace a process level token
  • Adjust memory quotas for a process
  • http://stackoverflow.com/questions/23643967/pending-execution

    http://sirsql.net/content/2015/03/10/ssis-job-steps-fail-pending-execution/

    Thursday, 25 February 2016 09:22:28 (Central Standard Time, UTC-06:00) | Comments [0] | SQL Server |  SQL Agent |  SSIS | Web_Blog#
    Thursday, 07 March 2013

    Pivot Job History
    Pivot Job History Across multiple SQL Server Instances

    I have several hundred SQL Servers, many of these are clusters of federated SQL Servers working together for one particular "thing".  I regularly need to check certain jobs on the server and also compare run times across servers.  This started due to a EMC Clone (BCV) job, that needed to checked across a particular farm.

    We have a custom website called SQL Monitor (lame name alert), we wanted to display this job information, pivoted by runtime of the job and server, color coding it for success or failure and be able to drill into the information and pivot the step history of each job and it's average run time.

    All the information is there, you just need to query it and display it.  Doing Pivots in SQL Server has gotten easier, but is still poor in my opionion, especially when you are handling an unknown number of data elements to pivot on.  I generally do not do pivots in SQL Server, but prefer to "offload" them to the application layer, and allow the application layer to complete the pivot. 

    It provides a lot more control for the developer.  It is possible to do pivots in SQL Server, but for this particular requirement it would not have been (especially since it goes across multiple SQL Instances, I guess you could enter the world of linked servers, but when dealing with 100's of servers, that wouldn't be much fun).

    Included is a simple Visual Studio 10 web project, that sets up a web page and shows the results.  The queries are all in there, though I've included a seperate file of the queries.  The Pivot is actually done in the c# code so, don't look for it in the TSQL.

    Of particulare pain in the ass was setting up the query to to find the start time of each job and then getting the subsequent steps.  As the goal was to pivot by the start time, but each step in a particular job has it's own start time, and no real corresponding "key" back to the "start time".

    Another pain was the run duration for each step is in the format of HHHmmss, so this requires parsing that value to handle time span calculations for average run time of each step.

    I stayed away from using sp_help_job, as this stored procedure is way too "heavy" for such a simple query, and capturing the results of sp_help_job and then manipulating them, just increased complexity that wasn't necessary.

     

    Thursday, 07 March 2013 13:05:16 (Central Standard Time, UTC-06:00) | Comments [0] | SQL Server |  SQL Agent | Web_Blog#
    Friday, 10 August 2012

    A follow up to this email.  Rick found a very important detail that I left out of this “fix”.  When adding the SQL Server account to the permissions, you have to add it as

    NT Service\MSSQL${Instance} .  You also need to change the “location” from the domain to the Servername. 

     

    SQL Agent jobs using xp_cmdshell that delete files failing.


    Kayden Kross inspires a bit of friday morning contemplation . . . Also, there's a SQL Agent housekeeping issues we've finally solved.   SQL Agent delete steps failing, for backups and replication jobs. (Actually Tom Reeves, thankyou!).

    In the past the quick and dirty way of solving this was by adding “everyone” to the folder permissions with modify rights. 

     

    Below is an error I copied from one of the distribution jobs, as you can see it leads you to believe that SQLAgent${Instance} is having the issue.  I ran a profiler capture while I ran the job and it even stated that the job was being run as SQLAgent${Instance}.  I added SQLAgent${Instance} to the folder permissions to no avail, it still didn’t work.  I then added MSSQL${Instance} to the folder permissions with Modify rights and this seemed to have fixed it!  In this example I added the permissions to the folder E:\Replication\unc\{server}${Instance}_{Instance}12_{Instance}12

      

    We only seem to have this issue with the newer Windows Server 2008 machines, due to the UAC (User Access Controls). 

    ----------------------------------------------------------------------------

    Step ID                 1

    Server                   {server removed}\{Instance Removed}

    Job Name                            Distribution clean up: distribution

    Step Name                         Run agent.

    Duration                              00:00:00

    Sql Severity        16

    Sql Message ID 14152

    Operator Emailed           

    Operator Net sent          

    Operator Paged               

    Retries Attempted          0

     

    Message

    Executed as user: NT SERVICE\SQLAgent${Instance}. Could not remove directory 'E:\Replication\unc\{Server}$ACCOUNTMASTER_ACCOUNTMASTER12_ACCOUNTMASTER12\20120809000011\'. Check the security context of xp_cmdshell and close other processes that may be accessing the directory. [SQLSTATE 42000] (Error 20015)  Replication-@rowcount_only parameter must be the value 0,1, or 2. 0=7.0 compatible checksum. 1=only check rowcou: agent distribution@rowcount_only parameter must be the value 0,1, or 2. 0=7.0 compatible checksum. 1=only  scheduled for retry. Could not clean up the distribution transaction tables. [SQLSTATE 01000] (Message 14152).  The step failed.

     

    Friday, 10 August 2012 07:50:16 (Central Standard Time, UTC-06:00) | Comments [0] | SQL Server |  SQL Agent#
    Wednesday, 24 June 2009

    This script copies jobs from one msdb database to another, useful for when SQL Server is re-installed, and the DBA Forgot to script out the sql agent jobs; but you still have the old msdb mdf and ldf. 

    Simply attach the database as msdb_old.

    It is also possible to use this via a linked server and copy some sqlagent jobs from one server to another; this script is specific to sql server 2005. 

    One item I had to adjust is that some of the jobs appeared "greyed" out afterwards and I had to open each job and adjust the target server to "local". 

    But it was still quicker than re-creating all the jobs.  (I was unable to restore the previous version of the msdb database because it was at a lower service pack level than what the new server was built at - had I been smart as the new server was built there would have been a restore of the msdb database completed when the new server was at the same level, but I missed that step and was then left with the task of recreating all the jobs, argh).

    I'd be cautious using this script if there were a lot of reporting subscription jobs (as there would have to be some synchronizations with the report database and website), but in this particular case all the jobs were custom backups, tlogs, exports, imports and other purely tsql and ssis related tasks.

    http://www.lifeasbob.com/Code/ScriptVault.aspx?script_id=80 

     

    Wednesday, 24 June 2009 09:43:24 (Central Standard Time, UTC-06:00) | Comments [0] | SQL Server |  SQL Agent#
    Wednesday, 20 May 2009

    Affinity mask keeping SQL Server Agent from starting.

    Recently had a unique issue in one of our HP Polyserve clustered environments.  One of the unique aspects of HP Polyserve is that you can cluster different types of hardware.  This can reduce costs and risks (as a bug in firmware or hardware on one machine is also most likely going to exist in the other node, especially if they are the same make, model, and usually purchased together).


    We had an affinity mask set on a 16 processor machine, the instance was intentionally re-hosted to another server in the cluster, which was an 8 way machine.  The instance did start, but SQL Agent would not.


    The following error messages were found in the sql errorlog:

    initconfig: Warning: affinity mask specified is not valid. Defaulting to no affinity. Use sp_configure 'affinity mask' or 'affinity64 mask' to configure the system to be compatible with the CPU mask on the system. You can also configure the system based on the number of licensed CPUs.


    SQL Server blocked access to procedure 'dbo.sp_sqlagent_has_server_access' of component 'Agent XPs' because this component is turned off as part of the security configuration for this server. A system administrator can enable the use of 'Agent XPs' by using sp_configure. For more information about enabling 'Agent XPs', see "Surface Area Configuration" in SQL Server Books Online.

    A quick bit of research showed a similiar issue in a Microsoft Cluster, where the one node did not have the correct permissions for the "lock pages in memory" setting, causing the same result of SQL Agent not starting, you can see a great write up of this in Suhas' blog.


    Seems if the "start up" process for a SQL Server instance is interupted that the settings changed from the defaults in sp_configure are not set and the process does not continue.  From what I can determine this "start up" process is the settings contained in sp_configure, if any of these fail along the way the process just stops or rolls them back and the other settings like "Agent XPs" (required to start sql agent), will not occur.


    I think this is a behavior that should not exist in SQL Server, why does an incorrect affinity mask keep SQL Server from finishing running the other settings set by sp_configure ?  These options should be mutually exclusive and one failure should not preclude others from running.

    I've had similar issues with how SQL Server starts up affecting SQL Agent, and it does not appear to be a priority to fix, see my post on SQL Agent will not start when a user database is in recovery


    This was easy to fix, but required setting a correct affinity mask and stopping and starting the service.  Generally we don't use an affinity mask on many instances.  HP Polyserve provides pre and post start up scripts which we'll use to address the issue in the future, though in extreme cases of failure no pre shutdown script will be effective.  Plan your failover capacity carefully.

    Wednesday, 20 May 2009 14:26:52 (Central Standard Time, UTC-06:00) | Comments [0] | Polyserve | SQL Server |  SQL Agent#
    Wednesday, 22 October 2008

    SQL Agent job listing, jobs, steps and tsql / commands executed.

    Not really useful, but helpful when someone wants to see all the jobs and the steps they are executing.

    Save as an excel file, or publish to rdl and push to a reporting site so they can always see it themselves....

    select sj.name,sj.description,sjs.step_name,
     sjs.subsystem,sjs.command,sjs.database_name,
     sjs.output_file_name
     from sysjobs sj
     inner join sysjobsteps sjs on sj.job_id = sjs.job_id
    -- where sj.name like 'cy%'
     order by sj.job_id,sjs.step_id

     

    Wednesday, 22 October 2008 12:05:02 (Central Standard Time, UTC-06:00) | Comments [0] |  SQL Agent#
    Monday, 31 March 2008

    Began receiving alerts on a job failing.  The job was "suspended".  The error in the history event log is:

    Unable to start execution of step 2 (reason: The ActiveScripting subsystem failed to load [see the SQLAGENT.OUT file for details]; The job has been suspended).  The step failed.

    A quick review of the server shows MSDE was installed on the machine, somehow this causes an error in the registry.

    Review the below registry entry and ensure there is an "s" at the end.  According to Microsoft this happens sometimes with MSDE installations and upgrades.  You can see the picture of mine, that it was missing

    [HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\80\Tools\ClientSetup]
    "SQLPath"="C:\Program Files\Microsoft SQL Server\80\Tools"


    I added the "s" to the end.  Started and stop the instance (i did both services because I could, but may only need to have the agent service restarted.)

    Everything is working fine now.

    This was on a sql server 2000 installation.

    Monday, 31 March 2008 10:58:59 (Central Standard Time, UTC-06:00) | Comments [0] | SQL Server |  SQL Agent#
    Tuesday, 26 February 2008

    SQL Server does a great job of logging, but anyone who administers a sql server should adjust some of the "out of the box" defaults for an instance, and create some custom jobs in sql agent to control output of jobs.

    By default sql server logs information to ErrorLog.  Depending on your version of SQL Server (7.0, 2000 or 2005) and a default or named instance it is located along a path similiar to the following:

    C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\LOG

    C:\Program Files\Microsoft SQL Server\MSSQL\MSSQL${Instance}\LOG

    The ErrorLog is a great source of monitoring and should be scraped and reviewed regularly, but by default there are only 6 log files kept on disk, and a new log file is started evertime an instance is started or "cycled".  This is not adequate, an administrator applying a patch to a server seems to require 3 reboots and you don't want to find out that information needed in a log file was lost due to not having enough of them.  We run this to 30 log files kept on disk and cycle the error log daily at 12am via a sql agent job.  This is very easy and can be adjusted via gui in SQL Enterprise Manager or in SQL Server Managemetn Studio by right clicking on the SQL Server Logs and selecting "Configure", which will bring up the dialog below, adjust to any number, but definetly 6 is not enough !

    It can also be adjusted via TSQL, but requires xp_instance_regwrite and is highly dependent that you know the structure of the registry for your particular instance and version of sql server, this is one case where using the GUI is the best option, but it is possible to do via tsql:

    EXECUTE master.dbo.xp_instance_regwrite
    N'HKEY_LOCAL_MACHINE'
    ,N'SOFTWARE\Microsoft\MSSQLServer\MSSQLServer' --****!!!! VERIFY
    ,N'NumErrorlogs'
    ,REG_DWORD
    ,30 -- number of errorlogs you want to keep
    GO

    After configuring the instance to retain 30 errorLogs, now you need to implement a sql agent job to cycle the error logs daily at 12am.  This is very easily implemented with a job scheduled to run:
    Exec sp_cycle_ErrorLog.

    Now onto the good one, SQL Server Agent and the output from jobs.

    After managing SQL Server Agent jobs for many years it is apparent that the default options for logging of jobs, Overwrite and Append, is not adequate.  Most of the time we default the logs to append, but overtime and depending on the amount of output these log files, they can become very large, difficult to open, difficult to read and waste valuable time when you have to trouble shoot a job.  This applies to sql server 7.0, 2000 and 2005 and is highly dependent on having all your jobs and logging go to the same directory (if they are currently "scattered" you can correct them manually or through tsql in the sysjobsteps table, with a mass update).

    We have developed a jobs that we host on every sql agent instance, Log Archive.  It is a two step job consisting of TSQL and Active X Scripting.  One step moves all the log files to a sub-directory for that day.  The 2nd step of the job removes the sub-directories older than x days (we default to 30).  This makes it very easy to find a log file for a particular day, based on the naming convention of the sub-directories.  This job runs daily at 11:58pm.  Example below:

    The code for the two steps is attached, they will require editing for your specific directory structure and number of sub-directories to retain.

    Archive_Daily_Logs.txt (.7 KB) 

    VBScript_Delete_Old_Dirs.txt (.52 KB) 

    This creates a great directory tree, easy to maintain, easy to troubleshoot:

    Tuesday, 26 February 2008 13:07:45 (Central Standard Time, UTC-06:00) | Comments [0] | SQL Server |  SQL Agent#
    Wednesday, 06 February 2008

    Polyserve clusters SQL Server instances, and can then rehost them or fail them over to other servers.

    One of the issues we've run into with Polyserve is when we move a sql server instance from one machine to another where the binaries change location from mssql.a to mssql.b.

    SQL Agent (SQLAgent.exe, SQLAgent), utilizes a system table in msdb for starting and executing jobs, called syssubsystems;

    Execute: select * from msdb.dbo.syssubsystems and you will see each sub-system of sql agent and the path to the binaries, if a sql server instance is moved through some technology like Polyserve or through manually moving a msdb database from one server to another and the path to the binaries change, the system tables will need to be updated to the new path.

    We've experienced this on Polyserve and when we moved a sql server instance from one server to another and copied msdb.

    very easy to fix, but frustrating, below is  a snippet which can be run to correct the issue:

    --Get SQLBinRoot

    declare @ret sysname

    exec master..xp_instance_regread 'HKEY_LOCAL_MACHINE','Software\Microsoft\MSSQLServer\Setup','SQLBinRoot',@ret OUTPUT

    --Update subsystem_dll to current SQLBinRoot

    update sub

    set subsystem_dll=@ret+substring(subsystem_dll,charindex('\binn\',subsystem_dll)+5,30)

    from msdb..syssubsystems sub

    where charindex('\binn\',subsystem_dll)>0

    --Stop Sqlagent service if running

    --MxDB should restart it automatically

    declare @service sysname

    select @service = case when charindex('\',@@servername)>0

    then N'SQLAgent$'+@@servicename

    else N'SQLSERVERAGENT' end

    create table #stat(status sysname)

    insert #stat

    exec master..xp_servicecontrol N'QUERYSTATE', @service

    if exists(select * from #stat where status='Running.')

    begin

    exec master..xp_servicecontrol N'STOP', @service

    end

    --Clean up tmp table

    if object_id('tempdb..#stat') is not null

    drop table #stat

     

    Wednesday, 06 February 2008 09:49:41 (Central Standard Time, UTC-06:00) | Comments [0] | Polyserve |  SQL Agent#

    This is crap, SQLAgent not starting because a user database is in recovery mode, after restarting sql server.  Usually recovery / checkpointing of user databases on start-up is really quick, but if one runs into an issue it can take a long long time, this particular user database has been recovering for 4+ hours.  Now I have issues with maintenance jobs and reporting services jobs not running because sql agent will not start.

    I can understand if msdb or some other system database is in recovery or suspect, but not a damn user database !

    SQL Server Error Log:

    2008-02-06 09:38:44.07 spid10s     Recovery of database 'ClientAssistant_CY' (20) is 27% complete (approximately 38597 seconds remain). Phase 3 of 3. This is an informational message only. No user action is required.

    READ THAT, 38000+ seconds, over 10 HOURS, we can't wait that long ! 

    SQL Agent Error Log:

    2008-02-06 08:44:55 - ? [393] Waiting for SQL Server to recover databases...

    This is unreal, i'm going to have to de-attach / interrupt the recovery process; i think, but to be sure i placed a support call to Microsoft.

    According to Microsoft, this is a new feature of SQL Server 2005.  If SQL Server Agent starts and not all the user databases on-line / recovered than sql server agent stays in a starting mode and no jobs run.

    The way around this is to "interrupt" the services, so you must stop sql server, which requires finding the process id (pid) and killing it via task manager or kill.exe; rename the mdf and ldf of the database taking the extremely long recovery time and then restart the services, everything should come on-line.  Now you can either restore from a backup or re-attach the mdf / ldf and let it start recovering again....

    very very bad solution as interrupting sqlservr.exe with a kill of the pid is a high risk operation that could corrupt other databases, so proceed with caution and get a critical support case opened with Microsoft, you do not want to be doing this alone, have them on the phone with you, severity 1.

    RESPONSE FROM MICROSOFT (Names removed)>

    From: [nameremoved@microsoft.com]
    Sent: Friday, February 08, 2008 9:02 AM
    To: Horkay, Robert
    Cc: 
    Subject: RE: Case number SRX080206600293 - SQLAgent will not start on a cluser - Open

    Hi Bob,

     We have decided to file a design change request to see if our development team will build in functionality to SQL Agent so that a registry key, trace flag or some startup parameter can be added to have SQL Agent go ahead and start up while databases are in recovery.  If this is approved then a knowledge base article would be published describing how to do this and the caveats involved.  Instead of taking SQL Server down to stop recovery we would only have to take SQL Agent offline to add the required parameters to change the code path in SQL Agent and let it start up immediately or at least 5 minutes after starting the service.  I will continue to work on the design change request and hopefully it will be approved.

     Thank you,

    {removed}MCDBA

    Microsoft SQL Server Support Escalation Engineer

    SQL Server Escalation Services Team

    Product Support Services (PSS) - Charlotte, NC 

    Hours:  M-F 9:00am - 6:00pm Eastern | Phone:  {Removed} | Email: {Removed}

    Wednesday, 06 February 2008 09:42:45 (Central Standard Time, UTC-06:00) | Comments [1] |  SQL Agent#
    Search
    Popular Posts
    Unpatched Vulnerabiltiy discovered ...
    Spring Fornicator brewed...
    SQL Server Security, not where it n...
    DTA - Failed to initialize MSDB dat...
    Check the Uptime of a Windows Serve...
    Recent Posts
    Archive
    June, 2017 (2)
    May, 2017 (2)
    April, 2017 (2)
    March, 2017 (1)
    February, 2017 (1)
    December, 2016 (2)
    October, 2016 (2)
    September, 2016 (1)
    August, 2016 (1)
    July, 2016 (1)
    March, 2016 (2)
    February, 2016 (3)
    December, 2015 (4)
    November, 2015 (6)
    September, 2015 (1)
    August, 2015 (2)
    July, 2015 (1)
    March, 2015 (2)
    January, 2015 (1)
    December, 2014 (3)
    November, 2014 (1)
    July, 2014 (2)
    June, 2014 (2)
    May, 2014 (3)
    April, 2014 (3)
    March, 2014 (1)
    December, 2013 (1)
    October, 2013 (1)
    August, 2013 (1)
    July, 2013 (1)
    June, 2013 (2)
    May, 2013 (1)
    March, 2013 (3)
    February, 2013 (3)
    January, 2013 (1)
    December, 2012 (3)
    November, 2012 (1)
    October, 2012 (1)
    September, 2012 (1)
    August, 2012 (1)
    July, 2012 (4)
    June, 2012 (3)
    April, 2012 (1)
    March, 2012 (3)
    February, 2012 (3)
    January, 2012 (4)
    December, 2011 (3)
    October, 2011 (2)
    September, 2011 (2)
    August, 2011 (8)
    July, 2011 (4)
    June, 2011 (3)
    May, 2011 (3)
    April, 2011 (1)
    March, 2011 (2)
    February, 2011 (3)
    January, 2011 (1)
    September, 2010 (1)
    August, 2010 (2)
    May, 2010 (2)
    April, 2010 (3)
    March, 2010 (1)
    February, 2010 (4)
    January, 2010 (1)
    December, 2009 (3)
    November, 2009 (2)
    October, 2009 (2)
    September, 2009 (5)
    August, 2009 (4)
    July, 2009 (8)
    June, 2009 (2)
    May, 2009 (3)
    April, 2009 (9)
    March, 2009 (6)
    February, 2009 (3)
    January, 2009 (8)
    December, 2008 (8)
    November, 2008 (4)
    October, 2008 (14)
    September, 2008 (10)
    August, 2008 (7)
    July, 2008 (7)
    June, 2008 (11)
    May, 2008 (14)
    April, 2008 (12)
    March, 2008 (17)
    February, 2008 (10)
    January, 2008 (13)
    December, 2007 (7)
    November, 2007 (8)
    Links
    Categories
    Admin Login
    Sign In
    Blogroll