Are They Really Using All Those
Stored Procedures?
By
Bill Wunder
Follow me on a few twists and turns
(OK, even a few grumbles) through SQL Server’s memory pool and
I’ll take you to a little "set it up and go" tool that will give
you a remarkably accurate indication of which of those piles and
piles of stored procedures on your system are every really
getting used and the degree to which each is really called.
SQL Server memory space has evolved
over time into a virtual black box. In the early generations of
the product memory configuration and management was quite the
voodoo science. It was necessary to tell the server how much
memory went to procedure cache and it was possible to define how
much of the reaming would be distributed among the various types
of data cache buffers and pools. The number of configuration
options was a bit daunting, the measurement techniques were
obscure and/or expensive, and the documentation was woefully
inadequate. Interestingly, manifestations of memory leaks were
an every day occurrence. The more work the SQL Server did – and
for the most part irregardless of which memory tweaks were in
place – the more often the SQL Server either crashed or needed
to be started or needed to be rebooted because memory
management wasn’t doing it’s job.
Fortunately, the recent releases of
SQL Server are much better at memory management. The
improvements have led to an almost complete elimination of
manually configurable memory options and to a much more stable
primary storage environment. Under the covers, though, the
memory pool retains much the same architectural layout as the
legacy releases. There is still a fixed amount of memory
allocated for SQL Server executable code address space with the
remaining majority going to cache. SQL Server executable code
address space includes the kernel and the server’s internal road
map(s) to the protocols, .dlls and providers currently in use by
the SQL Server. This piece of memory still seems a bit
susceptible to leaks that can bring a server down. (See my
earlier article
“Automation and T-SQL”
for more on this topic). The SQL Server –g start up
option is an offshoot of Microsoft’s attempts to address these
problems. Personally, I’d rather avoid memory leaks when
possible and reboot regularly than resort to intentionally
stealing chunks of the memory pool. Don’t really think Microsoft
has helped us much with the –g command line option, Must
be one of those business risks driving it? (i.e. CYA
technology). The memory pool – or cache - is the more
interesting segment of a SQL Server’s working set. In this area
reside the procedure, data and log caches as well as per
connection dynamic data structures and the system data
structures. Basically, these types of cache are left to talk
amongst themselves to decide who gets what amount of the
available pool.
It’s no surprise that data cache
gets most of available pool. At least it better if you want
decent performance from the SQL Server. Obviously, connection
information and system information (including sysprocesses,
syslocks, sysperfinfo, syscacheobjects and the other virtual
tables must be able to claim the space they need so their memory
requirements directly impact the amount available for data,
though the expectation would be that they do not to exact a
terribly heavy cost. Procedure and log cache I would expect to
remain a pretty constant fraction of data cache space size once
the server has acquiesced after a start-up. Log data is probably
going to disk as quickly as possible and likely gets a top
priority thus holding down the space needs of that memory queue.
Procedure cache is being managed through an LRU aging algorithm.
Here’s where it gets interesting.
SQL Server uses the virtual table syscacheobjects to track LRU
statistics and age the procedure cache. If you query
syscacheobjects you will find a sort of index of the current
state of the cache. The data objects in cache can change
dramatically so trying to see what tables are in cache on a busy
server is not practical. You can, however, easily look in
syscacheobjects and see which embedded or ad-hoc query
statements have recently been executed if you are trying to
track down a problem where that information is useful. You can
also look in this table to see which stored procedures are
currently in use by checking for an “execution context”
cacheobjtype or see which have recently been used by checking
for “compiled object” cacheobjtype. I have found it useful to
use that latter bit of information as a tool to identify which
procedures are actually getting called on my SQL Servers.
If you examine the code in the
stored procedure
GetProceduresFoundInCache
you’ll notice that it is a self contained utility - intended for
creation in the admin database if course. The procedure creates
the historical tracking table it uses and even a mini-stored
procedure that it uses to track when the compiled objects in
cache have been changed. It infers that cache has changed by
regularly looking for this mini-proc in syscacheobjects every
minute as it is executed from the SQL Agent job that you set up
to call it. The mini-proc is effective because it should be at
the lowest rung in the LRU algorithm at all times. It always has
a use count of 1 which is the minimum and does not reference any
other object in cache. If GetProceduresFoundInCache finds the
mini-proc it does nothing. If the mini-proc is gone,
GetProceduresFoundInCache captures a snapshot list stored
procedures currently in cache and updates the table
ProceduresFoundInCache from that snapshot.
Querying ProceduresFoundInCache can
tell you when a procedure entered the table, when it was last
called, and the number of times that it was found in cache when
GetProceduresFoundInCache determined a sample was necessary. By
Left Joining ProceduresFoundInCache to sysobjects in other user
databases you can easily list those procedures that were never
found in cache.
There is no written guarantee that
every stored procedure will be cached, so after you have
identified those that don't appear in syscacheobjects you
can do a couple of things to see if a procedure just isn't ever
cached. You can execute it and see if gets cached or you can add
an insert into a tracking table to each identified unused
procedure and leave it in the database for a comfortable period
of time. The advantage to this latter approach is that you could
even log the application name and host machine that referenced a
procedure. Could be you'll uncover a complete application
process that is unneeded or unknown.
The biggest problem I have with this
tool is the high level of developer denial surrounding the
amount of their hard work that isn’t being used.
Bill
|