From SQLMail to SMTP in the SQL
Agent
By
Bill Wunder
In an earlier
article, “Take my
SQLMail, Please!”, I presented a case for an alternative for
SQL mail. Then I described a stepwise strategy for moving away
from xp_sendmail by first decoupling application stored
procedures from the action of sending an email in the article
“From xp_sendmail to safe_sendmail”
and then I suggested a way to replace xp_sendmail with an
Simple Mail Transport Protocol (SMTP) relay
based extended stored procedure in the backend of the now
asynchronous call to send emails in the article
“From xp_sendmail to
xp_smtp_sendmail”.
Now, to wrap up this discussion aimed at completely replacing
SQLMail, I turn the attention
toward the SQL Agent. Is easier than most would imagine to
remove any SQL Agent dependencies on MAPI and the Outlook Client
in SQL Server 2000. As an added bonus I’ll also show you a way
to provide much better messages that those that SQLMail provides
through the SQLAgent Notification option.
As I replace SQLMail, I will
preserve the key benefits of SQLAgent Notifications. The
features that the SQLAgent’s SQLMail provides that I want to
continue to support include job failure email notification, job
completion email notification, and
SQL Agent
Alerts email notification. On the other hand, in our shop
anyway, SQLMail’s reliance on msdb.dbo.sysoperators to
predetermine who can - and by their exclusion who cannot -
receive an email from the SQLAgent is more trouble than it’s
worth so we will not perpetuate this well hidden layer of
maintenance and any associated pre-configuration requirement. We
will provide an alternative to the Agent’s “Fail-Safe Operator”
by supporting a default recipient for all safe_sendmail based
emails. In reality, the default recipient is broader in scope
than the “Fail-Safe Operator” because it applies to all emails,
not just SQLAgent originated emails. Bu sure to look back at the
“From xp_sendmail to xp_smtp_sendmail”
article and
the
script for the SendAsyncEmail stored procedure to see how
this mechanism works.
I will also add a handful of useful
features to my SMTP based replacement that you don’t get with
the out of the box SQLMail. First, all
messages sent
by the SQL Agent will include the actual error messages that
caused an SQLAgent job to fail in the failure notification
message rather than the generic failure message that SQLMail
emits. Additionally, the new solution will provide an option to
archive job failures to support the tracking of failures over
time. Finally, because it uses the safe_sendmail T-SQL interface
to send it’s email, the replacement will include an archive of
all messages sent in a table within the SQL Server and is
adaptable to any SQLMail alternative technology you might decide
to use.
Here’s where I get to back off a
little and explain that I think SQL Server is a great product
even though it only talks MAPI in the email community. If it
weren’t for broad selection of tools available within SQL
Server, it would not be possible to come up with something
better than SQLMail.
Red Alert
One welcome tool is the fine
assortment of tokens available in the SQLAgent. In case you
aren’t familiar with the tokens, check out the Books on Line
documentation for sp_add_jobstep. Even though the token
documentation is unfortunately weak, one thing you’ll notice
right away is that tokens and alerts were meant for each other.
If you look over the script I'll use in the command of the only
step in this job you'll notice that simply by the use of tokens,
i can produce a message that will tells me everything about the
alert that the under the covers Alert notification does. The
tokens are going to provide the error inf and even the database
where the error occurred.
exec sysmon.dbo.safe_sendmail
@recipients = 'DBA@MyCompany.com'
, @subject = 'SQL Server Alert System: alert condition
occurred on [SRVR]'
, @message = 'Error: [A-ERR]
Severity: [A-SEV]
Date: [STRTDT]
Time: [STRTTM]
Database: [A-DBN]
Message: [A-MSG]
Check the [SRVR] SQL
Server ErrorLog and the Application event log on the server for
additional details'
This makes the task of capturing the
characteristics of an alert very straight forward in a SQL Agent
job step. With that in mind, I’m going to turn my alerts over to
an agent job that I’ll name “w Alert Notification”. This job is
going be an easy interface to the Agent Alerts because it will
take advantage of one of those underused abilities that most of
us have looked at many times and managed to easily ignore. You
can tell any alert to execute a job. Simply open an Alert from
Enterprise Managers left pane, go to the Response Tab and select
the “Execute job:” check box. This will allow you can browse
for any job that already exists on the server.
This means that in order to use the
GUI it is necessary to create the job before you can call it
from an Alert. It’s worth noting that once the job is created
you can also add Alerts to the system by going to Job Schedule
Tab of the job and selecting the “Add Schedule” button. This
takes you to the same dialog as adding an alert directly from
the Alerts folder of the Enterprise Manager left pane.
Naturally, I prefer a script to the
GUI. If you have multiple servers to convert I'm betting you’ll
understand why. You'd have to do a lot of clicking to move all
the existing Alerts into the “w Alert Notification” job. Take a
look at the
script to add the “w Alert
Notification Job” and then modify all existing Alerts to use the
job. The script doesn’t
include statements to remove any existing Operators from the
jobs because it is necessary to review which Operators need to
be notified of which Alerts and add this information to the
script. I removed the Operators on my systems as I collected the
information so I didn’t need to do it again in this script. (My
fall back plan here was to use the scripts I had previously
stored in SourceSafe using “Bill Wunder’s DDL Archive Utility”.
Fortunately I didn’t need to go there either, but it's always
best to have a fall back plan.) If you find that you require
more than one distribution list you may want to consider
creating enough different notification jobs to cover all use
cases. It’s even conceivable that you may want to continue to
use Agent Operators in NET SEND mode in your alerts in which
case you wouldn’t want to remove them though I find that NET
SEND's value is dubious except in very rare cases. I would
recommend that you always include the DBA (that’s you!) as a
recipient for all Alerts. This will be your best indication of
how much the system is actually getting used.
Getting the Message
OK, now we have alerts using the
SMTP solution. All that’s left is to get the SQLAgent Job
Notifications feeding the AsyncEmailQueue and the need to
install Outlook on any SQL Server is Gone!
Since I’m not using Agent Operators,
the built-in Notification ability won’t do. I’ll need to add a
job step that will handle both job failure notification and
completion with success notification to all jobs that require
notification. Truth is if you want any kind of reliable failure
notification you need to use the built-in Notification else a
separate job step. If you rely on a mail to be sent from within
a T-SQL job step or from within a procedure called from the job
step and an error occurs that is “fatal to the batch” you’ll
never know that a failure occurred. In an extension of that
reality, if you rely on the built-in Notification you’ll stand a
better chance of getting notified (if SQL Mail doesn’t let you
down just when you needed it) but you’ll have to go the extra
mile to actually find out why the job step failed. The easiest
way to see the actual error is to view the Job History in
Enterprise Manager, remembering of course to check the box that
gives you all the details. Many times this method is not
desirable. For one, you need to be a user that has the necessary
permission to view the history. Always nice, for example, to get
called at 2 AM by someone that wants to know why their Job
failed so they can correct the problem and move on. Won’t it be
so much nicer when they get he actual error in the email that
tells them that the Job failed?
To get the error all you do is query
msdb.dbo.sysjobhistory. Of course you have to join it to
msdb.dbo.sysjobs by guid to find a row by job name: an excellent
3AM sobriety test. What I found to work better is to have a job
step that all job steps are sent to on failure. In this job
step, I query msdb.dbo.sysjobhistory and I insert the error
message into a table in the admin database. That way, no matter
how many jobs I have writing to msdb.dbo.sysjobhistory and
irregardless of the history settings for the SQL Agent (and I
always find the 1000 rows total, 100 rows per job default
setting woefully inadequate) I can make sure I can go back to
the actual error in the morning or on Monday or after I get back
from Vacation. Even if the msdb.dbo.sysjobhistory has been
overwritten. Even if a crafty developer has cleared the job
history to cover their tracks- but that never really happens
right? Have a look at the
script to create the table and the
two stored procedures necessary to handle all of your Job
Notifications. These
components can be used without implementing the SMTP mail
facility to provide enhanced error reporting and error tracking
even if you choose to stick it out with Outlook. The components
are intended for use in the admin database so you can keep
everyone from mucking around.
There are two procedures:
log_job_failure is intended to replace the built-in “When the
job fails” mechanism and log_job_completion that replaces both
the “When the job succeeds” and “When the job completes”
mechanisms. Basically, log_job_failure will always cause the job
to end in failure and always sends a message with “JOB FAILED”
in the subject. log_job_completion can send either a “JOB
SUCCEEDED” message and leave the job in a happy state or it can
duplicate the log_job_failure behavior if the job is in a failed
condition.
It was slightly tricky for everyone
to get used to the job flow requirements of these jobs, but not
a big deal for anyone. Here’s what the job step flow would look
like for a failure only condition.
And this for an always notify
condition. Everything is the same except the last step before
the notification step continues into the notification step
rather than “Quitting with success” as it does for failure only
notification.
The job step for both is identical
and very simple. Just provide the correct procedure name and the
recipients that you want to be notified: admin.dbo.log_job_error if all you care about are failures and
admin.dbo.log_job_completion if the notification step needs to
send an email each time the job finishes. All the smarts is in
the T-SQL so you can modify the logic without having to revisit
every job if you need to tweak either procedure to suit your
environment.
If you didn’t want to capture the
error messages to a table it would be a simple matter to modify
the procedures and omit the table. I have to warn you though,
I’ve found that table useful on more than one occasion and
having a history of failures over several weeks has also been
repeatedly useful.
So there you
have it. A complete replacement for xp_sendmail and SQLMail. We’
have explored a solution that is asynchronous, keeps a history
of sent messages, maintains the kept history, uses a standard
user interface that mimics xp_sendmail, enhances the messages
sent by the SQL Agent by including actual error messages rather
than the generic failure message that the SQL Agent built-in
notification facility puts out, provides full support for SQL
Agent Alerts and is adaptable to any email technology you want
to use. This solution makes it very easy to deploy and change
the underlying email technology easily because the interface is
centralizing in a single stored procedure: safe_sendmail.
Try it out.
I’m sure there are lots of ways to make it better and many other
ways to solve the general problem. Of one thing I am certain. My
SQL Servers run better and longer between the reboots necessary
to replenish depleted memory with this alternative than they
ever did with SQLMail.
Bill
|