Transactional
Meditation: T-SQL and
Commitment Control
By Bill Wunder
“By retaining the integrity
of the inner and external worlds,
true selfhood is maintained,
and the inner world made fertile.”
Without question, using
transactions in a stored procedure is not
inherently BAD. It is, however, fragile. If an
error is encountered that is fatal to the batch (a missing table
or column is a common example) the transaction can be left in an
open state until the database connection is closed or the error
is handled and the transaction is disposed of by the calling
application. This is evidenced by those infrequent but nonetheless
catastrophic outages caused by an orphaned transaction
that occur in any successful and changing SQL Server database
application that relies on persistent database connections for
improved performance.
The more relevant questions are
when and at what scope transactions are appropriate. In my mind
the answer is fairly long…
There are a number of discussions
to be found on how to use transactions within T-SQL and as many
on how to write transactions in application code. Please seek
them out for a fundamental understanding of both. Here I will
assume you understand that material and instead focus on how the
two must work together to assure data consistency and optimize
data concurrency. To dive right in, consider the shop that has
been using T-SQL transactions and is now finding batch-fatal
orphaned transactions and long running blocking conditions
unacceptable in an OLTP or web application.
One pragmatic proposal is for the
data access layer to re-query the @@trancount
global variable any time a database error is incurred and, if
that value is greater than zero, to close the connection, thereby
assuring that the transaction is ended. This goes a long way
toward resolving the contention and concurrency issues that can
arise in a successful web application, though it does present a
few lingering vulnerabilities.
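To make the test concrete, the batch below is roughly what a data
access layer might send after trapping an error; issuing the rollback
in T-SQL rather than closing the connection is one variation on the
proposal, shown here only to illustrate the @@trancount check:

-- hypothetical post-error check sent by the data access layer
IF @@TRANCOUNT > 0
BEGIN
    -- the failed batch left a transaction open; ending it here
    -- releases its locks without dropping the connection
    ROLLBACK TRANSACTION
END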
Testing for @@trancount
at each error fundamentally addresses the issue of transaction
disposition in the event of an error while a database
transaction is open. However, the approach carries the risk of
plunging the requestor into a potential connection creation race
condition by needing to create a new connection to effectively
close a transaction. It also adds the small overhead of checking
for an open transaction even if no transaction was opened by the
procedure that returned in error. I suggest that if an application
process is promoted to production that gets
even a modest amount of activity relative to an active
environment – in our shop modest might be quantifiable as 25
calls per minute, in an environment where we see some
procedure call rates in excess of 2,500 per minute – then over a
period of a few minutes the application would not be able to
close and re-establish connections quickly enough to keep up
with the overall number of query requests at the database, given
the few tenths of a second needed to dispose of and re-establish
a connection. It is likely that a batch-abort error within a
transaction under such a load would manifest as a long slow process
of the entire application becoming unavailable and an eventual –
even if short lived – application outage.
Obviously, if the transactional
procedure were called infrequently this approach seems more
acceptable. The problem is we cannot guarantee that only
infrequently called procedures will include transactional
requirements. For this reason alone a different approach seems
prudent.
It seems that many of the BEGIN
TRANSACTION/COMMIT TRANSACTION blocks found in application queries
occur in a specific subset of queries that perform a related
type of work task: for example, processing of import data between
staging and permanent table sets, and ‘administrative’ tools or
web pages intended to allow a small subset of users to easily
manage the application at some level. It is not uncommon for such
tools to need to act upon multiple rows of data in more than one
table atomically. In these cases there is a tendency to
implement transactions in the consumed stored procedures. These
operations act on sets of rows rather than a single row, unlike
the short and concise transactional operations used in stored
procedures that update or add data specific to the current user.
The result is that, in the former, range query based operation,
more database locking will occur more quickly and, subsequently,
small problems can mushroom into systemic outages in a heartbeat.
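To illustrate the contrast, here is a hypothetical sketch (the
procedure and table names are invented) of the kind of set-based,
range-scanning transaction that tends to show up in import and
administrative procedures:

CREATE PROCEDURE dbo.usp_PromoteStagedOrders
AS
BEGIN TRANSACTION
    -- range-based DML: every affected row in both tables stays locked
    -- until the COMMIT, not just the current user's row
    DELETE o
    FROM dbo.Orders o
    JOIN dbo.OrdersStaging s ON s.OrderId = o.OrderId

    INSERT dbo.Orders (OrderId, CustomerId, Amount)
    SELECT OrderId, CustomerId, Amount
    FROM dbo.OrdersStaging
COMMIT TRANSACTION
GO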
One reasonable alternative is to
make the data access layer the transaction owner. This may
require some adjustments to the stored procedure code base in
environments where only T-SQL transactions had previously been
in use, and it necessitates a regression test cycle to verify the
behavior of all changes. For the most part, though, existing
transactional control in the stored procedures should continue
to work as expected and coded if ownership is pushed out to the
application data access layer. However, the duration of
locks/blocks and the rollback behavior changes once the BEGIN
TRANSACTION/COMMIT TRANSACTION control block occurs outside the
scope of the local database instance. In particular, if an OLEDB
connection – including a COM or .NET ADO connection – owns the
transaction, the behavior in the event of a run time error is that
the entire transaction may not roll back unless the owner
process says ROLLBACK. This would leave us susceptible to data
inconsistencies whereby one or more DML operations in a
transaction would be committed to the dataset even though
another DML operation in the same transaction had failed and
rolled back. We can explicitly solve this by adding the
directive SET XACT_ABORT ON to the beginning of every
transactional stored procedure. The risk here is that the
development process is equally as fragile as the initial
presenting problem: a change to the procedure plus a mechanism
in the application must both occur to attain the desired result.
If the application opens a transaction and the stored procedure
does not include SET XACT_ABORT ON we could experience the
problems we started with. If the application does not start the
transaction and the stored procedure processes mindlessly,
expecting the application to manage a commit or rollback, we
could as easily end up with data corruption in the event of a
run time error. It is necessary to guarantee both that SET XACT_ABORT
ON is added to the procedure and that the transaction enabling
configuration is properly set in the data access configuration
control table.
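As a sketch of that pattern, with invented names, a transactional
procedure might set the directive and leave the BEGIN TRANSACTION/COMMIT
TRANSACTION to the data access layer that called it:

CREATE PROCEDURE dbo.usp_SaveUserPreference
    @UserId int,
    @PrefName varchar(50),
    @PrefValue varchar(255)
AS
-- any run-time statement error now aborts the batch and rolls back
-- the transaction opened by the calling data access layer
SET XACT_ABORT ON

UPDATE dbo.UserPreference
SET PrefValue = @PrefValue
WHERE UserId = @UserId AND PrefName = @PrefName

IF @@ROWCOUNT = 0
    INSERT dbo.UserPreference (UserId, PrefName, PrefValue)
    VALUES (@UserId, @PrefName, @PrefValue)
-- no COMMIT or ROLLBACK here: the application owns the transaction
GO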
SQL Server will assure that the
transaction is ACID1 regardless of whether the transaction
originated at the application or within the procedure. The risk
is that SQL Server may not know when the transaction has
ended. In the interest of best practices, each database
transaction ultimately needs to be told, even with SET
XACT_ABORT ON, that the transaction has ended in order to assure
consistent and predictable behavior. As it says in the January
2004 Books Online update:
If a run-time statement error
(such as a constraint violation) occurs in a batch, the default
behavior in SQL Server is to roll back only the statement that
generated the error. You can change this behavior using the SET
XACT_ABORT statement. After SET XACT_ABORT ON is executed, any
run-time statement error causes an automatic rollback of the
current transaction. Compile errors, such as syntax errors, are
not affected by SET XACT_ABORT.
It is the responsibility of the
programmer to code the application to specify the correct action
(COMMIT or ROLLBACK) if a run-time or compile error occurs.
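The documented behavior is easy to verify with a throwaway test along
these lines (the tables are invented for the demonstration):

CREATE TABLE t1 (a int PRIMARY KEY)
CREATE TABLE t2 (a int REFERENCES t1(a))
GO
-- default behavior: only the violating statement is rolled back
BEGIN TRANSACTION
    INSERT t1 VALUES (1)
    INSERT t2 VALUES (99)  -- FK violation: this statement fails...
    INSERT t2 VALUES (1)   -- ...but the batch and the transaction continue
COMMIT TRANSACTION         -- the t1 row and the second t2 row are committed
GO
-- with XACT_ABORT ON the same violation rolls back the entire transaction
SET XACT_ABORT ON
BEGIN TRANSACTION
    INSERT t1 VALUES (2)
    INSERT t2 VALUES (99)  -- FK violation: batch aborts, nothing is committed
    INSERT t2 VALUES (2)
COMMIT TRANSACTION
GO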
Another risk of using SET
XACT_ABORT ON and pushing transactional control back to the data
access layer is the possible impact on the longer standing
transaction operations. Here we have to consider at least the
transaction blocks in currently consumed queries, population
scripts, and even triggers if they are used. Some existing
procedures could be stronger if we could reduce the risk of
unhandled errors, as we would with a data access layer
owned transaction. Others, and this is important, may not even
really need to be transactional. For example, if two procedures
do similar work yet one is “loose” or not transactional while
another is consistent by using a transaction, it is worthwhile,
in the evaluation of the application’s transactional control,
to consider which provides the correct benefit. Perhaps even
population procedures are candidates for Snapshot Isolation (see
below) processing, and such a change would allow much stronger
indexing designs than are currently used and, as a result,
quicker overall query response times for consumer queries.
Triggers may employ transactions that occur beneath the data. We
cannot know until we test, or try under production load, what
the performance impact of changing the scope of the transactions
from a behind-the-data operation to an application initiated
atomic operation will be. I suspect in at least some cases we won’t like
what we find.
Returning to the original
question of when transactions are necessary in a stored
procedure, the staging to production table process case is
interesting. Frequently it is necessary to cleanse or otherwise
manipulate an incoming data set between the time it is imported
to the staging table and the time it is loaded into the primary
storage table. Similarly, at times somewhat intense processing
of the data from the primary storage table into tables more
appropriate for presentation purposes is done. To require that
the entire import process be “wrapped” in an
application owned transaction for the duration of such
processing could mean that a significant amount of production
data would be unavailable for consumption due to the persistent
concurrency control locking implicit in the transactional
processing.
Usually in such cases the
objective is to minimize consumer blocking and at the same time
assure data consistency for the duration of any and all consumer
queries. A commonly used construct is:
BEGIN TRANSACTION
    TRUNCATE TABLE ...        -- no one can use the table until the commit
    -- populate the table (e.g., INSERT ... SELECT from the staging table)
COMMIT TRANSACTION
For this requirement we can
eliminate the overhead, risk, and imposed period of data
unavailability through the use of a Snapshot Isolation mechanism.
SQL Server Yukon will hopefully provide intrinsic Snapshot
Isolation in 15 months or so. Until then, consider a home grown
Snapshot Isolation:
1. Start with two table images, with only one in use by production
through a view.
2. Truncate and populate the unused table image with the new data from
the staging table. It doesn't matter how long this takes or how
granular the locks are, because consumers are using the other table.
3. Alter the view to use the newly populated table image and
discontinue using the other.
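A bare-bones sketch of that rollover, with invented object names (two
table images dbo.ProductA and dbo.ProductB behind a consumer view
dbo.Product), might look like this:

-- assume dbo.ProductA is currently live behind the view and dbo.ProductB is idle
TRUNCATE TABLE dbo.ProductB

INSERT dbo.ProductB (ProductId, Name, Price)
SELECT ProductId, Name, Price
FROM dbo.ProductStaging   -- take as long as needed; no consumer reads ProductB
GO
-- the swap itself is brief and predictable regardless of data volume
ALTER VIEW dbo.Product
AS
SELECT ProductId, Name, Price
FROM dbo.ProductB
GO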
You can read more details about
this process in my recent article,
Retrofitting
Table Level Delete/Archive Strategies - Short Term Storage
Models. This homegrown Snapshot Isolation technique requires
only a brief and predictable period to complete regardless of
the data set size. There is also a brief and predictable time
necessary for each consumer procedure to recompile one time
after each change of the view. As a secondary benefit, the
previous data remains near line should it be necessary. The
downside is that this approach does require more storage, since
two images of the table must be in existence at the same moment.
Snapshot Isolation is a desirable
alternative to the transaction based rollover in the case where
the data is completely replaced, whether in a primary storage
table or in a secondary or presentation tuned table. In the data
load scenario where data is updated in place rather than wholesale
replaced, it is not useful for relieving the transaction induced
contention and the unavailability of data for consumption during
the data import processing. In this case a database transaction
may be the most useful construct to assure data consistency. The
real question when updating a table is: do we care more about the
data consistency of an individual consumer query that might occur
while the data is changing, or do we care more about making sure
the consumers are fed even at the risk that the result set is not
100% consistent? I suggest that in some cases where we do in
fact use a transaction to protect the consumers from changing
data, we may have unfortunately chosen to use transactional
control that is already causing us consumer outages during the
import processing. It is a common misconception that tables are
automatically locked simply by using a BEGIN TRANSACTION/COMMIT
TRANSACTION block. A little testing will quickly reveal the
inaccuracy of such an assumption.
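A quick way to see this for yourself, using a hypothetical table and
the default READ COMMITTED isolation level, is to open two query
windows:

-- connection 1: open a transaction and read the table
BEGIN TRANSACTION
SELECT COUNT(*) FROM dbo.Orders
-- ...leave the transaction open here, with no COMMIT...

-- connection 2 (separate window): this update is NOT blocked; the open
-- transaction above holds no table lock, and the shared locks taken by
-- its SELECT were released as soon as that statement completed
UPDATE dbo.Orders SET Amount = Amount WHERE OrderId = 1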
With certainty I have not covered
every case. That disclaimer helps to support my conclusion that
a single solution is probably not going to serve us well,
regardless of which one is chosen in regard to transactional
processing. In my opinion we ought to use a layered approach:
1. Allow range query DML processing to use connections isolated
from the high volume singleton connections when persistent
connections are used.
2. Seriously reconsider the necessity of transactional processing
in general. At some level we need to say ‘It is more important
that everything keeps running – and quickly – than it is that this
particular operation is ACID.’ Again, I’m not saying
transactions are BAD. They are critical when needed but can
also be misused. Obviously if you are a bank or billing
application, transactions are everywhere. If you are a web
application storing user preferences or page tracking data,
it may be better for your application to avoid the
transaction.
3. Seriously reconsider the use of transaction based processing
techniques that are aimed not at the ACID principles but rather
at imposing single threaded operations. Snapshot Isolation will
provide better availability than the all too common
lock-block-replace T-SQL algorithms.
4. When possible, make the connection the owner of transactions.
5. Keep transactions as short and concise as possible.
6. Always, Always, Always: the application must make sure a
transaction is not left open.
“He is without virtue of his
own,
who asks of others that they fulfil
his obligations on his behalf.”
1 ACID (Atomicity, Consistency, Isolation, and Durability); see
SQL Server Books Online, keyword ACID.
Bill