• Quick note - the problem with Youtube videos not embedding on the forum appears to have been fixed, thanks to ZiprHead. If you do still see problems let me know.

Which is a better SQL strategy?

aggle-rithm

Ardent Formulist
Joined
Jun 9, 2005
Messages
15,334
Location
Austin, TX
I have a reporting system that depends on triggers in the production database for a bunch of different tables that, when changed, drop a small change record in a table. This table is monitored by another process that looks up the records that have changed and saves them to the reporting database.

As the system grew bigger, I started getting deadlocks, so I divided the change table into a number of different tables for different categories of records. To avoid changing any of the triggers I had so meticulously created in the production database, I put a INSTEAD OF INSERT trigger on the change table which redirected the records to other tables.

Another strategy would have been to create a stored procedure that performs the same function as the trigger, and call that instead of inserting into the table. However, that would involve changing all the triggers.

Which is the better strategy? The re-direct trigger seems semantically clumsy to me, but I don't see the point of changing it to a stored procedure unless there is a net benefit to doing so.

Any insight would be appreciated.

(As I write this it occurs to me that the re-direct trigger may not actually gain me anything. If the insert command locks the table anyway while the redirect is occurring, I haven't solved the problem. Calling a stored procedure would only lock the table I'm re-directing to....I think.)
 
An example of how it works:

A record is updated in the customer orders table. The update trigger is called which creates a record with the unique record ID, table number, and change type (update). As far as this trigger knows, it's inserting the record into the "changes" table, but since there is an INSTEAD OF INSERT trigger on that table, the record is instead saved to the "orders" table in the reporting database. Within a minute, a separate process grabs the table name and row id, reads the record from the production database, and updates the corresponding record in the reporting database. The production database and reporting database are now synchronized.
 
An example of how it works:

A record is updated in the customer orders table. The update trigger is called which creates a record with the unique record ID, table number, and change type (update). As far as this trigger knows, it's inserting the record into the "changes" table, but since there is an INSTEAD OF INSERT trigger on that table, the record is instead saved to the "orders" table in the reporting database. Within a minute, a separate process grabs the table name and row id, reads the record from the production database, and updates the corresponding record in the reporting database. The production database and reporting database are now synchronized.


I'm sorry - as this is a completely useless suggestion, but why rely on triggers to record DB changes, when the controlling application could be recording the audit information instead?

Another useless suggestion --> Why not set up the prd DB to be replicated and then run your triggers on the replicated DB to log the change records. That way prd is not impacted by the creation of the change records.

You would then most likely ditch your reporting database in favour of using the replicated one.

ETA for Clarification

PRD: no triggers, no change tables. Replicates to RPT
RPT: triggers, change tables



IANASQLG

(I am not a SQL guru)
 
Last edited:
Just a stupid question, but did you try anotehr venue, like stack overflow or a similar forum for SQL ? That is a rather pointed expert question for a generic forum.
 
Just a stupid question, but did you try anotehr venue, like stack overflow or a similar forum for SQL ? That is a rather pointed expert question for a generic forum.


Because it is well known across teh intarwebs that JREF has the answers to everything :cool:

Also, maybe he has and didn't get particularly helpful responses?
 
An example of how it works:

A record is updated in the customer orders table. The update trigger is called which creates a record with the unique record ID, table number, and change type (update). As far as this trigger knows, it's inserting the record into the "changes" table, but since there is an INSTEAD OF INSERT trigger on that table, the record is instead saved to the "orders" table in the reporting database. Within a minute, a separate process grabs the table name and row id, reads the record from the production database, and updates the corresponding record in the reporting database. The production database and reporting database are now synchronized.

It's a long time since I worked with sql, and things have undoubtedly changed too, so forgive me if this is a stupid question -

Why the 2-stage insert/update process into the 'reporting' version of ORDERS? Why not just slap the updated 'production' record in there from the outset, plus the 'update' change type?
 
I have a reporting system that depends on triggers in the production database for a bunch of different tables that, when changed, drop a small change record in a table. This table is monitored by another process that looks up the records that have changed and saves them to the reporting database.

As the system grew bigger, I started getting deadlocks, so I divided the change table into a number of different tables for different categories of records. To avoid changing any of the triggers I had so meticulously created in the production database, I put a INSTEAD OF INSERT trigger on the change table which redirected the records to other tables.

I would not attempt a fix on the deadlock issue without first tracing and identifying why the deadlock occurred in the first place. If you are using SQL Server, you can enable traceflags 1204 and 1222 to detect deadlocks and print out a diagnostic output of the event.

Its hard to know what strategy is correct without understanding the deadlock occurred in the first place. It may be something simple, like a TABLOCK/TABLOCKX hint on a query where it isn't needed, or a complex trigger graph, but it won't be possible to debug without trace data of the deadlock.

For more context, can you post the trigger code which inserts the change record into your report tables?
 
I'm sorry - as this is a completely useless suggestion, but why rely on triggers to record DB changes, when the controlling application could be recording the audit information instead?

Good question. I'm hampered by the fact that I'm working with a third-party ERP system. That system uses the same database for production and reporting, and I have created a separate reporting system.

Another useless suggestion --> Why not set up the prd DB to be replicated and then run your triggers on the replicated DB to log the change records. That way prd is not impacted by the creation of the change records.

You would then most likely ditch your reporting database in favour of using the replicated one.

That's been suggested to me as well, but the reporting system does a number of other things that make this not such a great option.
 
I would not attempt a fix on the deadlock issue without first tracing and identifying why the deadlock occurred in the first place. If you are using SQL Server, you can enable traceflags 1204 and 1222 to detect deadlocks and print out a diagnostic output of the event.

Yep...I've done that. Got some useful information.

After the changes I've made, the deadlock situation has improved quite a bit. The deadlocks I see now are demonstrably NOT MY FAULT. (an awesome feeling!)
 
It's a long time since I worked with sql, and things have undoubtedly changed too, so forgive me if this is a stupid question -

Why the 2-stage insert/update process into the 'reporting' version of ORDERS? Why not just slap the updated 'production' record in there from the outset, plus the 'update' change type?

I just wanted the trigger to be as minimal as possible. I'm operating under the assumption (possibly incorrect) that saving a record of three or four columns is less resource-intense than one of fifty columns.

The beauty of the second step is that it is asynchronous, so it doesn't add to the duration of the transaction.
 
I just wanted the trigger to be as minimal as possible. I'm operating under the assumption (possibly incorrect) that saving a record of three or four columns is less resource-intense than one of fifty columns.

You mean in terms of the performance of the production d/b which is working harder under everyday demand than the reporting d/b? Maybe, though we seem to be looking at a single write in the case I suggest - as we already have the data cached - vs. a write+read+write (write to the reporting CHANGES table, then read from the original production d/b and then write that stuff into reporting) in the 2-stage scenario.

Usual caveats apply. Long time since I worked in this field. I quite miss it, in a funny way - used to really enjoy exercising my brain over this kind of stuff. Beats the hell out of arguing with CTers, though Scrabble can be fun :)
 
Usual caveats apply. Long time since I worked in this field. I quite miss it, in a funny way - used to really enjoy exercising my brain over this kind of stuff. Beats the hell out of arguing with CTers, though Scrabble can be fun :)


I'm sure after a few months, you'd be just dying to get into a good semantics war with a poe - grass always greener and all that :D :D
 
Usual caveats apply. Long time since I worked in this field. I quite miss it, in a funny way - used to really enjoy exercising my brain over this kind of stuff.

A lot of people are already considering it obsolete technology. No-SQL is all the rage these days, but I find it much simpler to crunch data using the tried-and-true method.
 

Back
Top Bottom