When Geodatabase Design becomes Rocket Science
May 12, 2008
Some time ago I’ve read two interesting articles by Dave Bouwman about several problems with designing/upgrading ESRI ArcSDE geodatabases and that the whole design process actually should not be as hard as rocket science. Though for some reasons lots of things involved in this actually are cumbersome and in some area really feels like rocket science. One big issue with ArcSDE is the lack of tools DBA’s usually love to do backups, imports, exports, upgrades with. I think ESRI noticed that and eventually invented the Visio UML modelling tool. If you ask me… improvement for the worse.
“In the beginning the Universe was created. This has made a lot of people very angry and been widely regarded as a bad move.” D.N.A.
The whole ArcCatalog schema wizard thing (probably been extremely costly to develop) is not practical for database change management; neither scriptable nor automatable. It might be a convenient way to set up a database from scratch but it’s rather useless for upgrading existing schemas. So finally it is nothing else then the very dark side of ArcCatalog.
If I want to roll out a set of DB schema changes, I’ll release a bunch of SQL files. But what if I want to roll out changes specific to the geodatabase – like adding topological network rules or modifying a feature class? What I need for ArcSDE is a dead easy way to deliver short, comprehensible scripts that contain all this DDL stuff.
Just because I’ve had very good experiences with Apache Ant to handle database change management, I started to develop some Ant task’s for ArcSDE. For example if I want to add a column to a feature class, now I can add a simple task along with my usual Ant build scripts.
<sde:addcolumn connection="sde.conn" tablename="BUILDINGS">
<columndef name="BUILD_DATE" type="date" nullable="false"/>
</sde:addcolumn>
Push all of the capability of ArcSDE down to the database level
Hopefully ESRI will continue the process of pushing all ArcSDE capabilities to the RDBMS, as it already been done with the implementation of the st_geometry type. And thus hopefully someday we’ll be able to do all the DBA tasks on the database level where it belongs.
Deprecation within Database Schemas
September 8, 2007
Java developers already know about deprecating API’s – the ability to mark specific classes or methods as “deprecated” thus informing API users that they should stop referencing those parts, because they might be removed in future releases. Introducing such a transition period allows code refactoring being less painful (mainly for the public API parts) and preserve backward compatibility.
Along with the rise of new approaches to more agile database techniques, came the idea of using similar practices also in DB development. That means for example, if you want to remove a column then you don’t drop it immediately but mark the column as deprecated. After a transition period of some weeks or months, the column will be dropped. This strategy is of particular interest to database refactoring, because usually database schemas are highly coupled to a wide variety of things. And unlike Java API’s where we can distinguish between protected and public API parts, items within a database schema are potentially always public.
What to deprecate?
The following database artifacts are items that might be deprecated:
- Tables
- Views
- Single Columns
- Stored Procedures, Functions, Sequences
When to deprecate database items?
- Item will be dropped in future
- Is inefficient or insecure to use
How to deprecate?
A deprecation mechanism should fulfill the following requirements
- Deprecation markers should be attached to their items in a manner that they are preserved even if the items will be moved (import/export etc)
- Markers should refer to an explanation why the item has been deprecated and suggest what to use instead
- Throw warning messages when deprecated items are used
Techniques to deprecate database items
In case of Java, the deprecation is carried out through @deprecation comments within the source code. The Java Compiler throws warning messages when it finds references to deprecated items. Obviously within a database we cannot issue compile warnings. The only way to inform about calls to deprecated items is at runtime. That means every time a deprecated table, column, view or so is accessed, a warning message should be written to a log table. But how can we know when those items are accessed?
Oracle databases are offering the following options:
1) Triggers: It’s the standard way to attach DML monitoring to database tables. But triggers cannot monitor on SELECT’s and they are not working on Stored Procedures, Sequences and things of this nature.
2) Database Auditing: With Oracle’s Regular Auditing one can monitor also the use of Views, Functions and Sequences plus one can even log the SQL statement that causes the audit event. But to monitor table access on column-level, Fine-Grained Auditing (FGA) is needed. FGA is only available with Oracle Enterprise Edition.
Issuing deprecation warnings
Another problem is about the deprecation warning messages content itself. The logged messages should help developers and admins to figure out from where deprecated items are accessed. That means at least the program name and SQL statement should be logged. Unfortunately Oracle doesn’t offer something like a stack trace (as Java).
Conclusion
Wouldn’t it be nice if databases offered a standard mechanism to deprecate all the variety of items within a database? I think that will be a huge help towards achieving a more agile way of database design.
In a previous article i wrote about a solution for passing values from the database via select statements to Ant properties inside Ant build files.
Now I’ve extended the Incanto <sqlplus> Task to do the other way round and pass Ant property sets into the SQL scripts as defined variables.
Within SQL*Plus it is possible to work with defined variables in the form of:
define myusername = SCOTT define usertable = ALL_USERS select * from &usertable where username='&myusername';
In case of running SQL scripts from Ant with Incanto one can pass single properties into SQL this way:
<property name="db.username" value="SCOTT" />
<ora:sqlplus silent="true" logon="${db.userid}" failonerror="yes">
<![CDATA[
define myusername = '${db.username}'
select * from all_users where username='&myusername';
]]>
</ora:sqlplus>
Ant will replace all ${} style constructions with the given value of the property. But this works only for text content (SQL scripts) inside the tag (CDATA section) not with SQL script files that are called using the start attribute. You have to call the SQL file from within the CDATA section to workaround this drawback:
<property name="db.username" value="SCOTT" />
<ora:sqlplus silent="true" logon="${db.userid}" failonerror="yes">
<![CDATA[
define myusername = '${db.username}'
@mysqlfile.sql
]]>
</ora:sqlplus>
I have extended Incanto’s <sqlplus> task to simplify this. It is now possible to include a <propertyset> element within the <sqlplus> task.
<property name="db.username" value="SCOTT" />
<ora:sqlplus silent="true" logon="${db.userid}" failonerror="yes">
<propertyset>
<propertyref prefix="db"/>
</propertyset>
<![CDATA[
select * from all_users where username='&db_username';
]]>
</ora:sqlplus>
All properties included by the <propertyset> element now can be used inside the SQL script. The <sqlplus> Task is passing these as defined variables into SQL*Plus. But note: since SQL*Plus has some rigid naming conventions for variable names, the <sqlplus> Task converts all dot (.) characters to underlines. Thus in the prior example I reference the property db.username by db_username.









