Heavy Duty LiteSpeed Log Shipping
Part 4: Playing Nice
By Bill Wunder
Occasionally, there are activities that will cause the rate of transaction log writes to increase dramatically in what I’ll call spikes or surges. Spikes are brief periods when a lot of write activity hits the log device and then, just as suddenly, the work completes and log activity quickly settles back to a normal rate. Spikes of log activity can occur during data imports to large tables in a batch process or, more likely, during processing of data into a permanent table just after an import into a staging database. Surges of log activity begin as quickly as spikes, but the high logging rate is sustained for a longer period of time. Surges can occur for the same reasons as spikes but usually involve larger tables or datasets.
There are a number of approaches used to reduce contention and log surging when large tables are loaded or maintained in a batch mode process. It is possible to iterate through the changes, affecting only one or a limited number of rows in each iteration. This is most effective, in terms of managing log growth, for a database operating in the “simple” recovery model. In the “simple” recovery model, the log is truncated on checkpoint back to the beginning of the oldest active transaction. Keeping each transaction small through a looping algorithm ensures that the complete table doesn’t need to be processed as a single transaction. This technique is generally not useful for controlling log growth in the log shipping scenario because you cannot back up logs if the “simple” recovery model has been in effect since the last full backup. Still, it may be a valid approach, even for a database under log shipping, to reduce the duration of blocking in a highly contentious database.
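To make the idea concrete, here is a minimal sketch of the looping approach for an archive/delete pass. The table dbo.StagingOrders, the 90 day cutoff, and the batch size are all hypothetical; the point is only that each iteration commits a small transaction so the log never has to hold the whole operation at once.

-- Hypothetical example: trim a large table in small batches so no
-- single transaction pins a large portion of the transaction log.
declare @BatchSize int, @RowsDeleted int
set @BatchSize = 1000
set @RowsDeleted = @BatchSize

set rowcount @BatchSize           -- limit each DELETE to @BatchSize rows

while @RowsDeleted > 0
begin
    delete dbo.StagingOrders
    where OrderDate < dateadd(dd, -90, getdate())

    set @RowsDeleted = @@rowcount -- loop ends when a pass deletes nothing
end

set rowcount 0                    -- restore normal row limiting

In the “simple” recovery model each checkpoint can then truncate the log behind the loop. Under log shipping the same loop still helps with contention, but it is the log backups, not truncation on checkpoint, that keep log growth in check.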
When importing data or creating indexes, the “bulk-logged” recovery model is useful to greatly reduce the amount of logging. If you have a database with a significant maintenance window, this recovery model can easily be used to keep log shipping backup files to a reasonable size. Unlike the “simple” recovery model, the “bulk-logged” recovery model can be used for a database under log shipping, though it does add some risk and eliminates the point-in-time log recovery that is available with a database in the “full” recovery model. In general I’d recommend avoiding the “bulk-logged” recovery model for a database under log shipping in 24x7 operating conditions. Even though you’ll use significantly less log space during an import, you’ll still face the contention and data integrity issues associated with atomic mass changes to large tables. If a bcp or BULK INSERT is active against a large table, the table’s data will be continually changing and/or unavailable during the import. Often you will have some tables that can’t be locked for a long duration and some tables that can’t be in an inconsistent state during the import processing, as mandated by your particular 24x7 business model. In most cases a staging database is needed to import the data before it is more elegantly integrated into the live dataset. Sometimes you may even need to switch between two tables when an import occurs to move from the old data to the new data in an instant. In such cases the “bulk-logged” recovery model doesn’t offer much help. Conversely, a CREATE INDEX on a large table may be the most valid time to use the “bulk-logged” recovery model while under log shipping, though only if the CREATE INDEX is a rarely occurring activity. The CREATE INDEX will still block other access to the table regardless of the recovery model used, but the “bulk-logged” recovery model will greatly reduce the logging generated by that type of operation.
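If you do elect to use the “bulk-logged” model for an occasional CREATE INDEX, the switch can be scripted around the operation itself. The sketch below is illustrative only: the database and index names are hypothetical, and I show a native BACKUP LOG for brevity where, in this log shipping setup, the regular LiteSpeed log backup job would otherwise pick things up on its next run. Keep in mind that any log backup containing the minimally logged work can only be restored in its entirety, so the point-in-time exposure lasts until that backup completes.

-- Hypothetical example: minimize logging for a rarely run CREATE INDEX,
-- then return to full recovery and close the exposure window quickly.
alter database MyAppDB set recovery bulk_logged

create index IX_OrderDetail_ProductId
    on dbo.OrderDetail (ProductId)

alter database MyAppDB set recovery full

-- back up the log promptly so subsequent log backups regain
-- point-in-time restore capability
backup log MyAppDB to disk = 'G:\LogShip\MyAppDB_post_index.trn'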
For maintenance of existing indices, archive/delete processing, looping algorithms that (even though they don’t create long periods of blocking) can generate a huge amount of transaction log activity, and all the other conditions that result in log growth surges, a method to manage log growth in a log shipping environment is necessary. It is important to manage this log growth to avoid running out of disk space on the log device and to avoid an extended backlog of log backups creating an uncomfortable latency between the state of the source database and the state of the destination database. Sure, you can configure an 8GB database with a 32GB log device, but if your SLA says you can fail over to your warm spare with less than 10 minutes of lost data and you’re in the midst of shipping and restoring a 30GB log backup when you need to fail over, it’s going to cost you.
A technique that is working well for
me is to exploit the SQLAgent Job scheduler using the msdb based
system stored procedures sp_start_job and sp_stop_job. I’ll
describe how I use this technique with particular reference
to index maintenance. As a working example I have modified one
of the earlier scripts I placed in the
sswug.com Script Library
to show the changes necessary to achieve log space management
with a long running log intensive process. The original script
is still posted as
“admin db – Index (fragmentation)
Maintenance”. I have
modified that earlier version to include the technique I’m about
to describe and posted the updated version as
“admin db - index (fragmentation)
maintenance for log shipping”.
Please review both scripts to get a full understanding of how
this technique works.
The script jumps from about 300 lines to about 600 lines with the additions, so there is quite a lot that has to happen. Basically, the procedure dbo.IndexMaintenanceForDB should be set up in the SQLAgent job scheduler to run at an appropriate time (i.e. when the server is not working its hardest). When a database in the full recovery model is defragmented, the maintenance job will create a second stored procedure named dbo.DefragLargeTables in the admin database that is basically a clone of the defrag logic of dbo.IndexMaintenanceForDB (it is created, of course, only if it doesn’t already exist), and also a second SQLAgent job that calls dbo.DefragLargeTables, named “w Defrag large <the DB Name> tables”. The new job will not get a schedule. Instead, when dbo.IndexMaintenanceForDB begins defragging the large tables in the database it will not directly execute a DBCC. Rather, it will call sp_start_job to execute “w Defrag large <the DB Name> tables”. sp_start_job is effectively an asynchronous operation, so once the second job is started, dbo.IndexMaintenanceForDB will sleep for a while, then periodically wake up and check the current used space in the log file for the database being maintained by looking in master.dbo.sysperfinfo. If the current log used size exceeds a threshold you have set as a parameter to dbo.IndexMaintenanceForDB, the “w Defrag large <the DB Name> tables” job will be stopped by a call to sp_stop_job, and dbo.IndexMaintenanceForDB will wait until the log size drops below the threshold before restarting the job. This works in this case because DBCC INDEXDEFRAG is an online operation and it can be stopped at any time with no loss of work already done and no resulting inconsistencies in the data. To use the technique for other operations you’ll want to be sure you write each operation to meet these same standards: there will be no loss of work and no inconsistencies in the data. In most cases that will mean your process will take longer to run, with the advantages that it can be stopped and started at any time and that it will not cause long running contention or blocking for other consumers of a particular table’s data.
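To make the flow easier to follow, here is a stripped-down sketch of the throttling loop, not the posted procedure itself. The job name, database name, threshold, polling interval, and eight hour cap are all hypothetical, and I read the 'Log File(s) Used Size (KB)' counter from master.dbo.sysperfinfo where 'Percent Log Used' would serve just as well.

-- Simplified sketch of the start/monitor/stop pattern used by
-- dbo.IndexMaintenanceForDB (names and threshold are hypothetical).
declare @JobName sysname, @DbName sysname
declare @LogUsedKB int, @ThresholdKB int, @Minutes int

set @DbName = 'MyAppDB'
set @JobName = 'w Defrag large ' + @DbName + ' tables'
set @ThresholdKB = 2097152      -- pause the defrag job when log used space passes ~2GB
set @Minutes = 0

-- kick off the unscheduled defrag job; sp_start_job returns immediately
exec msdb.dbo.sp_start_job @job_name = @JobName

-- poll once a minute for up to 8 hours (the posted procedure instead
-- exits as soon as it detects that the defrag job has completed)
while @Minutes < 480
begin
    waitfor delay '00:01:00'
    set @Minutes = @Minutes + 1

    select @LogUsedKB = cntr_value
      from master.dbo.sysperfinfo
     where object_name like '%:Databases%'
       and counter_name = 'Log File(s) Used Size (KB)'
       and instance_name = @DbName

    if @LogUsedKB > @ThresholdKB
    begin
        -- pause the defrag; DBCC INDEXDEFRAG loses no completed work
        exec msdb.dbo.sp_stop_job @job_name = @JobName

        -- wait for the log shipping backups to bring used space back down
        while @LogUsedKB > @ThresholdKB
        begin
            waitfor delay '00:01:00'
            select @LogUsedKB = cntr_value
              from master.dbo.sysperfinfo
             where object_name like '%:Databases%'
               and counter_name = 'Log File(s) Used Size (KB)'
               and instance_name = @DbName
        end

        -- resume; the defrag job picks up where it left off
        exec msdb.dbo.sp_start_job @job_name = @JobName
    end
end

The inner wait depends on the regular log shipping backups continuing to run on their own schedule; as each log backup completes, used log space is released, the counter falls back under the threshold, and the defrag job can be restarted.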
If you haven’t already, you can
download the Heavy Duty LiteSpeed
Log Shipping stored procedures
and save them all to a location that you have permission to
access from the network, using the file names indicated in the
download for each procedure’s script. Then follow the
steps outlined in
Heavy Duty LiteSpeed Log Shipping
Part 2 for deploying the
log shipping job on a test server.
Discover the usage and processing
actions of a deployed log shipping installation in the article
Heavy Duty LiteSpeed Log Shipping
Part 3. Then if you have
surging log growth conditions you can use the information
presented here to help you manage log growth. There is one more
topic I’d like to cover in the next article before we leave this
discussion of Heavy Duty SQL LiteSpeed Log Shipping: fail over
strategies. Hope you check it out.
Bill