Pages

2011/05/20

Managing Data Change

Often DBAs receive requests to manually manipulate production data. Ideally, all data changes should be performed through the appropriate application or interface. Unfortunately, sometimes applications lack a feature or has a bug that does not allow the user to modify data. To fix a bug or add a feature may require another software release. Sometimes the business need is too great to wait for the next release.

Manually manipulating data can pose various problems for both the DBA who designed the database and knows everything about the schema, as well as the DBA who recently inherited a database and knows nothing about the schema. Some of these problems are:

  1. Data cannot simply be changed whenever we want, to any value we want. Most tables are related to other tables. Some data is designed to represent a record’s state at a certain point in time. Changing a value may break that relationship or state and, therefore, break the application and corrupt data.
  2. In most cases with a well designed and documented database that has all the appropriate keys and constraints, a DBA can quickly determine if modifying data will break the application. In the real world, this is often not the case. The schema does not necessary reveal the application logic, or how the data is used or modified. Sometimes only the original developer or experienced analyst knows the application well enough.
  3. Changing data manually is very labor intensive. Also, DBA and developer labor is not cheap. Because it is labor intensive, continuing to manually manipulating will always become more expensive at some point in the future than coding the application appropriately.
  4. Changing data manually is error prone - Despite their appearance, DBAs are human. Even a highly skilled DBA with years of experience can write an UPDATE statement incorrectly. Also, the DBA can be given the wrong record to update.

If requests for data modification are frequent, policy and procedures to manage data change are needed. These policies and procedures should have the following objectives:

  1. Retain old data - Retain data prior to the change so the changes can be tracked and reversed, if needed.
  2. Document Purpose - Documenting the purpose of the data change in order build a case for a software update. By doing this you may find that you are updating the same column on the same table several times a month. Now you will have the documentation to prove it and why.
  3. Notify Interested Parties - There may be other interested parties in your company that need to be aware of the change. Operations managers may not want you changing records during the day or after an order is shipped. Software Development and Product managers may want to know about a missing feature or bug in the software.
  4. Only certain data can be changed - Because the database does not necessarily reveal the application logic and table relationships, only approved tables and columns should be changed.
  5. Test all changes - All changes should first be done in a development environment.
  6. Verify changes - All changes are verified in a QA or testing environment.
  7. Use the best person to develop an update script - The DBA may not be the best person to develop an update script. Sure, he or she may be the best skilled, but as I mentioned above, the DBA may not completely understand the application logic or business implications. In such cases the developer, maybe with the help of a business analyst, can develop an appropriate update script.

How does your company manage data change requests? Besides my list above, what other problems do you see with manually manipulating data? Can you think of other objectives a data change request policy should have?

2011/05/02

Poor Man’s SQL Monitor

Here is quick tip that can help you track SQL Job and Alert notifications in Microsoft Outlook. You may already do this, but if you arrange your inbox by conversation, instead of date, emails will grouped by subject. This will enable you to easily “arrow down” through the group and quickly compare messages.

Look at the standard a SQL Server Alert notification as an example:

Subject: SQL Server Alert System: 'Batch Requests/sec
DATE/TIME: 5/2/2011 5:35:01 AM
DESCRIPTION: The SQL Server performance counter 'Batch Requests/sec' (instance 'N/A') of object 'SQLServer:SQL Statistics' is now above the threshold of 1000.00 (the current value is 1077.30).

By sorting through several of these “Batch Requests/sec“ messages you can quickly compare important information such as the date/time and the current Batch Request/sec value.

In my particular case, I received this email exactly every 30 minutes. I may have found this pattern anyway, but by arranging my inbox by conversation, it was quick and easy to spot this pattern. The exact time pattern suggested some type scheduled job or process, which is exactly what this problem turned out to be.