Transactional
Meditation: T-SQL and
Commitment Control
By Bill Wunder
“By retaining the integrity
of the inner and external worlds,
true selfhood is maintained,
and the inner world made fertile.”
Without question, using
transactions in a stored procedure is not
inherently BAD. It is, however, fragile. If an
error is encountered that is fatal to the batch (a missing table
or column is a common example) the transaction can be left in an
open state until the database connection is closed or the error
is handled and the transaction is disposed of by the calling
application. This is evidenced by those infrequent but nonetheless
catastrophic outages caused by an orphaned transaction
that occur in any successful and changing SQL Server database
application that relies on persistent database connections for
improved performance.
The more relevant questions are
when and at what scope transactions are appropriate. In my mind
the answer is fairly long…
There are a number of discussions
to be found on how to use transactions within T-SQL and as many
on how to write transactions in application code. Please seek
them out for a fundamental understanding of both. Here I will
assume you understand that material and instead focus on how the
two must work together to assure data consistency and optimize
data concurrency. To dive right in, consider the shop that has
been using T-SQL transactions and is now finding batch-fatal
orphaned transactions and long running blocking conditions
unacceptable in an OLTP or web application.
One pragmatic proposal is for the
data access layer to re-query the @@trancount
global variable any time a database error is incurred and, if
that value is greater than zero, to close the connection, thereby
assuring that the transaction is ended. This goes a long way
toward resolving the contention and concurrency issues that can
arise in a successful web application, though it does present a
few lingering vulnerabilities.
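To make the test concrete, the batch below is roughly what a data
access layer might send after trapping an error; issuing the rollback
in T-SQL rather than closing the connection is one variation on the
proposal, shown here only to illustrate the @@trancount check:

-- hypothetical post-error check sent by the data access layer
IF @@TRANCOUNT > 0
BEGIN
    -- the failed batch left a transaction open; ending it here
    -- releases its locks without dropping the connection
    ROLLBACK TRANSACTION
END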
Testing for @@trancount
at each error fundamentally addresses the issue of transaction
disposition in the event of an error while a database
transaction is open. However, the approach carries the risk of
plunging the requestor into a potential connection creation race
condition by needing to create a new connection to effectively
close a transaction. It also adds the small overhead of checking
for an open transaction even if no transaction was opened by the
procedure that returned in error. I suggest that if an application
process is promoted to production that gets
even a modest amount of activity relative to an active
environment – in our shop modest might be quantifiable as 25
calls per minute, in an environment where we see some
procedure call rates in excess of 2,500 per minute – then over a
period of a few minutes the application would not be able to
close and re-establish connections quickly enough to keep up
with the overall number of query requests at the database, given
the few tenths of a second needed to dispose of and re-establish
a connection. It is likely that a batch-abort error within a
transaction under such a load would manifest as a long slow process
of the entire application becoming unavailable and an eventual –
even if short lived – application outage.
Obviously, if the transactional
procedure were called infrequently this approach seems more
acceptable. The problem is we cannot guarantee that only
infrequently called procedures will include transactional
requirements. For this reason alone a different approach seems
prudent.
It seems that many of the BEGIN
TRANSACTION/COMMIT TRANSACTION blocks found in application queries
occur in a specific subset of queries that perform a related
type of work task: for example, processing of import data between
staging and permanent table sets, and ‘administrative’ tools or
web pages intended to allow a small subset of users to easily
manage the application at some level. It is not uncommon for such
tools to need to act upon multiple rows of data in more than one
table atomically. In these cases there is a tendency to
implement transactions in the consumed stored procedures. These
operations act on sets of rows rather than a single row, unlike
the short and concise transactional operations used in stored
procedures that update or add data specific to the current user.
The result is that, in the former, range query based operation,
more database locking will occur more quickly and, subsequently,
small problems can mushroom into systemic outages in a heartbeat.
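To illustrate the contrast, here is a hypothetical sketch (the
procedure and table names are invented) of the kind of set-based,
range-scanning transaction that tends to show up in import and
administrative procedures:

CREATE PROCEDURE dbo.usp_PromoteStagedOrders
AS
BEGIN TRANSACTION
    -- range-based DML: every affected row in both tables stays locked
    -- until the COMMIT, not just the current user's row
    DELETE o
    FROM dbo.Orders o
    JOIN dbo.OrdersStaging s ON s.OrderId = o.OrderId

    INSERT dbo.Orders (OrderId, CustomerId, Amount)
    SELECT OrderId, CustomerId, Amount
    FROM dbo.OrdersStaging
COMMIT TRANSACTION
GO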
One reasonable alternative is to
make the data access layer the transaction owner. This may
require some adjustments to the stored procedure code base in
environments where only T-SQL transactions had previously been
in use, and it necessitates a regression test cycle to verify the
behavior of all changes. For the most part, though, existing
transactional control in the stored procedures should continue
to work as expected and coded if ownership is pushed out to the
application data access layer. However, the duration of
locks/blocks and the rollback behavior changes once the BEGIN
TRANSACTION/COMMIT TRANSACTION control block occurs outside the
scope of the local database instance. In particular, if an OLEDB
connection – including a COM or .NET ADO connection – owns the
transaction, the behavior in the event of a run time error is that
the entire transaction may not roll back unless the owner
process says ROLLBACK. This would leave us susceptible to data
inconsistencies whereby one or more DML operations in a
transaction would be committed to the dataset even though
another DML operation in the same transaction had failed and
rolled back. We can explicitly solve this by adding the
directive SET XACT_ABORT ON to the beginning of every
transactional stored procedure. The risk here is that the
development process is equally as fragile as the initial
presenting problem: a change to the procedure plus a mechanism
in the application must both occur to attain the desired result.
If the application opens a transaction and the stored procedure
does not include SET XACT_ABORT ON we could experience the
problems we started with. If the application does not start the
transaction and the stored procedure processes mindlessly,
expecting the application to manage a commit or rollback, we
could as easily end up with data corruption in the event of a
run time error. It is necessary to guarantee both that SET XACT_ABORT
ON is added to the procedure and that the transaction enabling
configuration is properly set in the data access configuration
control table.
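As a sketch of that pattern, with invented names, a transactional
procedure might set the directive and leave the BEGIN TRANSACTION/COMMIT
TRANSACTION to the data access layer that called it:

CREATE PROCEDURE dbo.usp_SaveUserPreference
    @UserId int,
    @PrefName varchar(50),
    @PrefValue varchar(255)
AS
-- any run-time statement error now aborts the batch and rolls back
-- the transaction opened by the calling data access layer
SET XACT_ABORT ON

UPDATE dbo.UserPreference
SET PrefValue = @PrefValue
WHERE UserId = @UserId AND PrefName = @PrefName

IF @@ROWCOUNT = 0
    INSERT dbo.UserPreference (UserId, PrefName, PrefValue)
    VALUES (@UserId, @PrefName, @PrefValue)
-- no COMMIT or ROLLBACK here: the application owns the transaction
GO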
SQL Server will assure that the
transaction is ACID1 regardless of whether the transaction
originated at the application or within the procedure. The risk
is that SQL Server may not know when the transaction has
ended. In the interest of best practices, each database
transaction ultimately needs to be told, even with SET
XACT_ABORT ON, that the transaction has ended in order to assure
consistent and predictable behavior. As it says in the January
2004 Books Online update:
If a run-time statement error
(such as a constraint violation) occurs in a batch, the default
behavior in SQL Server is to roll back only the statement that
generated the error. You can change this behavior using the SET
XACT_ABORT statement. After SET XACT_ABORT ON is executed, any
run-time statement error causes an automatic rollback of the
current transaction. Compile errors, such as syntax errors, are
not affected by SET XACT_ABORT.
It is the responsibility of the
programmer to code the application to specify the correct action
(COMMIT or ROLLBACK) if a run-time or compile error occurs.
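The documented behavior is easy to verify with a throwaway test along
these lines (the tables are invented for the demonstration):

CREATE TABLE t1 (a int PRIMARY KEY)
CREATE TABLE t2 (a int REFERENCES t1(a))
GO
-- default behavior: only the violating statement is rolled back
BEGIN TRANSACTION
    INSERT t1 VALUES (1)
    INSERT t2 VALUES (99)  -- FK violation: this statement fails...
    INSERT t2 VALUES (1)   -- ...but the batch and the transaction continue
COMMIT TRANSACTION         -- the t1 row and the second t2 row are committed
GO
-- with XACT_ABORT ON the same violation rolls back the entire transaction
SET XACT_ABORT ON
BEGIN TRANSACTION
    INSERT t1 VALUES (2)
    INSERT t2 VALUES (99)  -- FK violation: batch aborts, nothing is committed
    INSERT t2 VALUES (2)
COMMIT TRANSACTION
GO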
Another risk of using SET
XACT_ABORT ON and pushing transactional control back to the data
access layer is the possible impact on the longer standing
transaction operations. Here we have to consider at least the
transaction blocks in currently consumed queries, population
scripts, and even triggers if they are used. Some existing
procedures could be stronger if we could reduce the risk of
unhandled errors, as we would with a data access layer
owned transaction. Others, and this is important, may not even
really need to be transactional. For example, if two procedures
do similar work yet one is “loose” or not transactional while
another is consistent by using a transaction, it is worthwhile,
in the evaluation of the application’s transactional control,
to consider which provides the correct benefit. Perhaps even
population procedures are candidates for Snapshot Isolation (see
below) processing, and such a change would allow much stronger
indexing designs than are currently used and, as a result,
quicker overall query response times for consumer queries.
Triggers may employ transactions that occur beneath the data. We
cannot know until we test, or try under production load, what
the performance impact of changing the scope of the transactions
from a behind-the-data operation to an application initiated
atomic operation will be. I suspect in at least some cases we won’t like
what we find.
Returning to the original
question of when transactions are necessary in a stored
procedure, the staging to production table process case is
interesting. Frequently it is necessary to cleanse or otherwise
manipulate an incoming data set between the time it is imported
to the staging table and the time it is loaded into the primary
storage table. Similarly, at times somewhat intense processing
of the data from the primary storage table into tables more
appropriate for presentation purposes is done. To require that
the entire import process be “wrapped” in an
application owned transaction for the duration of such
processing could mean that a significant amount of production
data would be unavailable for consumption due to the persistent
concurrency control locking implicit in the transactional
processing.
Usually in such cases the
objective is to minimize consumer blocking and at the same time
assure data consistency for the duration of any and all consumer
queries. A commonly used construct is:
BEGIN TRANSACTION
    TRUNCATE TABLE ...        -- no one can use the table until the commit
    -- populate the table (e.g., INSERT ... SELECT from the staging table)
COMMIT TRANSACTION
For this requirement we can
eliminate the overhead, risk, and imposed period of data
unavailability through the use of a Snapshot Isolation mechanism.
SQL Server Yukon will hopefully provide intrinsic Snapshot
Isolation in 15 months or so. Until then, consider a home grown
Snapshot Isolation:
1. Start with two table images, with only one in use by production
through a view.
2. Truncate and populate the unused table image with the new data from
the staging table. It doesn't matter how long this takes or how
granular the locks are, because consumers are using the other table.
3. Alter the view to use the newly populated table image and
discontinue using the other.
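A bare-bones sketch of that rollover, with invented object names (two
table images dbo.ProductA and dbo.ProductB behind a consumer view
dbo.Product), might look like this:

-- assume dbo.ProductA is currently live behind the view and dbo.ProductB is idle
TRUNCATE TABLE dbo.ProductB

INSERT dbo.ProductB (ProductId, Name, Price)
SELECT ProductId, Name, Price
FROM dbo.ProductStaging   -- take as long as needed; no consumer reads ProductB
GO
-- the swap itself is brief and predictable regardless of data volume
ALTER VIEW dbo.Product
AS
SELECT ProductId, Name, Price
FROM dbo.ProductB
GO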
You can read more details about
this process in my recent article,
Retrofitting
Table Level Delete/Archive Strategies - Short Term Storage
Models. This homegrown Snapshot Isolation technique requires
only a brief and predictable period to complete regardless of
the data set size. There is also a brief and predictable time
necessary for each consumer procedure to recompile one time
after each change of the view. As a secondary benefit, the
previous data remains near line should it be necessary. The
downside is that this approach does require more storage, since
two images of the table must be in existence at the same moment.
Snapshot Isolation is a desirable
alternative to the transaction based rollover in the case where
the data is completely replaced, whether in a primary storage
table or in a secondary or presentation tuned table. In the data
load scenario where data is updated in place rather than wholesale
replaced, it is not useful for relieving the transaction induced
contention and the unavailability of data for consumption during
the data import processing. In this case a database transaction
may be the most useful construct to assure data consistency. The
real question when updating a table is: do we care more about the
data consistency of an individual consumer query that might occur
while the data is changing, or do we care more about making sure
the consumers are fed even at the risk that the result set is not
100% consistent? I suggest that in some cases where we do in
fact use a transaction to protect the consumers from changing
data, we may have unfortunately chosen to use transactional
control that is already causing us consumer outages during the
import processing. It is a common misconception that tables are
automatically locked simply by using a BEGIN TRANSACTION/COMMIT
TRANSACTION block. A little testing will quickly reveal the
inaccuracy of such an assumption.
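A quick way to see this for yourself, using a hypothetical table and
the default READ COMMITTED isolation level, is to open two query
windows:

-- connection 1: open a transaction and read the table
BEGIN TRANSACTION
SELECT COUNT(*) FROM dbo.Orders
-- ...leave the transaction open here, with no COMMIT...

-- connection 2 (separate window): this update is NOT blocked; the open
-- transaction above holds no table lock, and the shared locks taken by
-- its SELECT were released as soon as that statement completed
UPDATE dbo.Orders SET Amount = Amount WHERE OrderId = 1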
With certainty I have not covered
every case. That disclaimer helps to support my conclusion that
a single solution is probably not going to serve us well,
regardless of which one is chosen in regard to transactional
processing. In my opinion we ought to use a layered approach:
1. Allow range query DML processing to use connections isolated
from the high volume singleton connections when persistent
connections are used.
2. Seriously reconsider the necessity of transactional processing
in general. At some level we need to say ‘It is more important
that everything keeps running – and quickly – than it is that this
particular operation is ACID.’ Again, I’m not saying
transactions are BAD. They are critical when needed but can
also be misused. Obviously if you are a bank or billing
application, transactions are everywhere. If you are a web
application storing user preferences or page tracking data,
it may be better for your application to avoid the
transaction.
3. Seriously reconsider the use of transaction based processing
techniques that are aimed not at the ACID principles but rather
at imposing single threaded operations. Snapshot Isolation will
provide better availability than the all too common
lock-block-replace T-SQL algorithms.
4. When possible, make the connection the owner of transactions.
5. Keep transactions as short and concise as possible.
6. Always, Always, Always: the application must make sure a
transaction is not left open.
“He is without virtue of his
own,
who asks of others that they fulfil
his obligations on his behalf.”
1 ACID (Atomicity, Consistency, Isolation, and Durability); see
SQL Server Books Online, keyword ACID.
Bill