Wednesday, 29 August 2007

A free script to create CRUD procedures.

Recently, both at my job, as well as while working on the site, I found myself having to write countless CRUD stored procedures in order to facilitate retrieving and storing data in my tables. The majority of these reflected an exact replication of the database table model (i.e. updating all columns), and I found writing those stored procedures to be a real burden. It was boring, repetitive and error prone work (especially the UPDATE procedures, where one has to replicate columns three times (once as input variables (which includes data types), once as columnnames, and once again as variables (but this time without the columnnames).

Being the lazy guy I am, I decided I would automate the matter, and write a script to automatically create the procedures for me. Before I started, I decided I would make a few starting assumptions:

1) I wanted the SELECT procedure to be able to accept either an ID-number or a Name (assuming that like me, you called the column something involving '%Name') for an individual row, or neither, in which case the whole table should be returned.
2) I wanted to be able to have the DELETE procedures to be optional. I achieved this by using a bit-flag and IF statements to see if it ought to be created or not.
3) I wanted the user to be able to specify certain columns to exclude. In the database I'm working on for the site, there are numerous audit columns to keep track of when data is entered or modified, and by who. Naturally, this is not required in all of the procedures, so I wanted a mechanism for the end-user to include or exclude these columns.


The result of my work is the script below. I've briefly tested the generated procedures against my own database, and there, they work like a charm. Perhaps others who run more specialised or customised databases will be able to come up with unexpected errors etc. If so, please do let me know.

Of course, the script will not cover 100% of anybody's specific wishes. In fact, I'm quite sure that some of the procedures will need to be customised to suit your needs, but you will find this script will enable you to start with at least some foundation in place.

I'm releasing this script under the Creative Commons Attribution Non-Commercial No Derivatives license, which means you are free to download and share the script, provided you mention and link back to me. However, you are not allowed to change or commercially use the script. I might change this to a less restrictive license in the future, however for now, I want to be able to evaluate comments etc first. This is the first time I ever released anything under a license, so any comments regarding that are more than welcome.

A list of things I have figured I could enhance the script with:

1) Add error handling to the stored procedures.
2) Add support for schemas in SQL Server 2005.
3) Add paging support to the select procedures, by adding two additional parameters.

Also, I might later on decide to automatically generate .NET custom classes based on the exact database tables.

Click here to download the script.

One last comment, should you decide to alter the generated stored procedures to suit your specific needs, do remember to rename the new procedure. The script will automatically drop its standard procedures, in order to always reflect the latest table schemas.


Creative Commons License


This work is licensed under a
Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 License.


Read more!

Tuesday, 28 August 2007

Hierarchical data with Common Table Expressions

Today, I finally managed to crack a problem that's been bothering me for some time. Some weeks ago I started working on the database schema that the site will be using, and the Entropia auction system gave me some trouble.

You see, the auction in the game is built up hierarchically, and I eventually want to show individual items in the database under the auction section they belong to (as with a lot of items, it's unclear exacty where to find them).

That in itself was easy enough, but not if you want to store the data in third normal form, but at the same time want to use some form of a cookie-trail, so the user will know exactly where to go. (With cookie-trail I mean something along the lines of "Items \ Tools \ Misc Tools"). I chose to store my data in a table with an AuctionCategoryID, AuctionCategoryName, and an AuctionCategoryParentID, the latter of which represents a SELF-JOIN.





(The figure above doesn't actually represent my own table. I chose to give the very bottom nodes of the tree a ParentID of -1, and did not define -1 as an AuctionCategoryID. Hence, creating a Foreign Key is just not possible. The figure is just for clarification).

I first tried meddling with code a bit myself, but found that my code just didn't do the trick. Sure, I could hard-code a number of self-joins, but what if suddenly more categories were added? I'd have to alter my function. That was definitely not an option.

I tried a few more things, but found my code was not robust enough, and that there were sorting troubles (among other things).

I Googled a bit, and initially ended up with a solution like this (Joe Celko describes a similar method in his "SQL for Smarties"). Hardly elegant, and hardly dynamic (although it would be possible, of course, to ensure there would be a difference of a few numbers between each of the individual "nodes". However, that still potentially means I could run out of space, and would then have to manually alter stuff again. Not an option if you ask me).

Enter the recursive Common Table Expression. BOL doesn't seem very helpful, but hopefully the code below will make things easier to see.

The first query within the Common Table Expression defines the root level. As you see, the -1 value I mentioned above is hard-coded. The UNION ALL is the key to ensuring the self-join will work for an indefinte number of child nodes. The second query just fills in those child nodes.

Note that it is possible to potentially create an infinite loop, though CTEs also offer a way to cap this.

The code:

WITH AuctionPath -- The name of the CTE
(
[Name]
, [AuctionCategoryID]
, [Path]
, [Level]
)
AS
(
SELECT
AuctionCategoryName -- Name
, AuctionCategoryID -- AuctionCategoryID
, CONVERT(varchar(255), AuctionCategoryName) -- Path
, 1 -- Level
FROM
AuctionCategories
WHERE
AuctionCategoryParentID = -1 -- Only the parent nodes here
UNION ALL
SELECT
AuctionCategoryName -- Name
, AuctionCategories.AuctionCategoryID -- AuctionCategoryID
, CONVERT(varchar(255), Path + ' \ ' + AuctionCategoryName) -- Path
, [Level] + 1 -- Level
FROM
AuctionCategories
INNER JOIN AuctionPath ON AuctionCategories.AuctionCategoryParentID = AuctionPath.[AuctionCategoryID]
)
SELECT
[Name]
, [AuctionCategoryID]
, [Path]
, [Level]
FROM
AuctionPath
ORDER BY AuctionCategoryID
OPTION (MAXRECURSION 5); -- The maximum amount of recursions

Some things of note. There are certain conditions to recursive CTEs. From BOL:

  • The number of columns in the anchor and recursive members must be the same.
  • The data type of a column in the recursive member must be the same as the
    data type of the corresponding column in the anchor member.

In the quoted text, anchor is what I have refered to as root node, and recursive member is a child node.

I hope this helps a little if you're facing a situation where you need to do hierarchies. Once you manage to get the trick, there's nothing to it, really :)

Many thanks to both Kalman Toth and russellb over at the sqlmag forums.


Read more!

Sunday, 26 August 2007

Roman numerals

If you play Entropia Universe (formerly Project Entropia), you're probably aware that mining depositsizes have two distinctive identifiers. The first being the name of the claimsize (i.e. tiny, very poor, poor, all the way to Immense). However, those terms do not clearly indicate which of the two is larger.

In the case of very poor vs. poor, it's obvious, but is Significant bigger or smaller than Substantial? In order to avoid that kind of confusion, MindArk also decided to give claims a Roman numeral indication.

I order to have the database reflect these numerals, I decided to create a function that will translate our number system into Roman numerals. It works surprisingly easy, actually. The function takes an input parameter, which is the number to be translated.


It then determines the regular numbers, the tens, the hundreds, and the thousands, and translates those into their Roman counterparts using a case statement. When it encounters a 4 or a 9 (which, obviously, are the special numbers because they are not a simple repetitive sequence of characters), it will translate those as well.

CREATE FUNCTION [dbo].[fn_ToRoman]
(
@InputNumber INT
)
RETURNS VARCHAR(16)
AS
BEGIN
DECLARE
@Singles INT
, @Tens INT
, @Hundreds INT
, @Thousands INT

SELECT
@Singles = @InputNumber%10
, @Tens = @InputNumber%100/10
, @Hundreds = @InputNumber%1000/100
, @Thousands = @InputNumber%10000/1000

RETURN
(
SELECT
REPLICATE('M', @Thousands) +
CASE
WHEN @Hundreds THEN REPLICATE('C', @Hundreds) WHEN @Hundreds = 4 THEN 'CD' WHEN @Hundreds = 9 THEN 'CM' WHEN @Hundreds >= 5 THEN 'D' +
REPLICATE('C', @Hundreds - 5)
ELSE ''
END +
CASE
WHEN
@Tens THEN REPLICATE('X', @Tens) WHEN @Tens = 4 THEN 'XL' WHEN @Tens = 9 THEN 'XC' WHEN @Tens >= 5 THEN 'L' + REPLICATE('X', @Tens - 5)
ELSE ''
END +
CASE
WHEN
@Singles THEN REPLICATE('I',@Singles) WHEN @Singles = 4 THEN 'IV' WHEN @Singles = 9 THEN 'IX' WHEN @Singles >=5 THEN 'V' + REPLICATE('I',@Singles-5)
ELSE ''
END
)
END


Read more!

Sunday, 19 August 2007

Yes! Code regions in SQL :)

Currently I'm working on a long long long SQL script that automatically writes CRUD (Create, Read, Update, Delete) procedures based on a table's definition. If you are like me, you will find that you group certain code together.

In Visual Studio, you can conveniently wrap #region tags around the code, so you can just collapse the region when you know certain stuff works, and doesn't need to be edited anymore. Not so in SQL Query Analyzer (or SQL Server Management Studio for that matter).

SQL Visualizer does offer that option though. You start a region with a -- REGION comment, and end with -- END REGION.
Because it works with comment blocks, your code will work anywhere. Except within this tool, you will be able to now use regions:



I've used other SQL tools before, and quickly found myself abandoning them due to being, well frankly, crap. I'll have to work with this one a bit more before I can actually judge it properly, but regions in SQL code are definitely something I miss in the Microsoft environment.

Update 08/27/2007: I've decided to not use SQL Analyzer a lot, unless it's for developing large scripts like the CRUD script I mentioned before. Reason is that it's not possible to copy/paste output messages (which, if you're building dynamic SQL scripts, is kind of neccessary to test).


Read more!

Wednesday, 15 August 2007

SSIS - DataReader issue

Just a quick note, unrelated to anything to do with the actual website.

Today, at work, I was working on a task that involves moving data from a production database to a Historic database. The production environment has been misused at times in the past, and as a result, there's numerous backups of tables residing in the database.

Due to regulations, just dropping the tables or wiping data is not an option. Hence, we chose to move it to the historic database. Now to accomplish this, I selected SSIS as our prefered tool. Stupidly enough, I chose to use ADO.NET as a data connection. The fun started when I was creating the actual "Data Flow" objects. I must have created a dozen of them when suddenly I was confronted with an error that SSIS was unable to translate from unicode to non-unicode data.

Huh?


I checked my CREATE TABLE scripts, and sure enough, the data types in both source and destination were set to char. I tried altering the Source's External columns, which seemed to work from the colum's properties, but actually didn't if you would check the datatype coming into the Source. I tried fiddling with the Source's Output column, but got an error message that I was not allowed to. I tried explicitly casting the column as char in the Source's SELECT statement. No luck.

A colleague came over and suggested using a Dereived Column object. Again, that didn't work. I ended up Googling, and came to an MSDN link. There, it stated somewhere halfway the page:

"One issue when using ADO.NET is that there is no ADO.NET data type corresponding to VARCHAR (non-Unicode character) in relational databases. Therefore, ADO.NET data sources output all character data as Unicode. This can cause problems with simple components that use ADO.NET sources and destinations that expect single-byte character types. You can work around this by using a Data Conversion transformation."

I tested this, and using a Data Conversion did indeed work. But what a lot of extra work to do something relatively simple. I decided to just switch all connections to OLEDB.

Conclusion: When dealing with non-unicode char columns, OLEDB might be the better choice.


Read more!

Friday, 10 August 2007

Moan: Commenting while programming

Att: Microsoft, Oracle, Sun
CC: WWC, Google (they're in on anything, so why not CC them? At the very least they'll get a chuckle out of it)

Subject: Programming comments

Dear Sirs/Madams,

If you can find some time in your undoubtedly busy schedules, would it be possible for you to discuss a universal combination of characters to note a line or block of text within code as comment, please?


As a programmer, I love commenting my code. It makes it easy for me to identify what bits of code do, to indicate why I used a certain approach, and my peers will get the same benefits. All in all, comments are our friends.

However, I have never understood the reason for having numerous ways to define comments. For instance, see the following snippets:


Visual Basic:
' This is a comment. Start the line with an apostrophe.

C# / Java:
// This is a comment.

SQL / C# / Java / Oracle:
/* This is a comment block */

SQ: / Oracle:
-- This is a comment

HTML:
<% -- This is a comment -- %>

Etc.


Could it be arranged so that there is one standard approach that works in every language? It would make things so much easier for those of us who regularly switch between languages/flavours.

Kthanks.

My next moan will demand we get the exact same syntax, too ;)


Read more!

Preventing accidental data deletion in SQL Server

After a few hours of working on the proper table definition for your base table, and filling it with data, you sit back, and decide to do some minor maintenance on a few other tables. You truncate a table, and seconds later realise to your horror that you just purged the table you spent quite some time on filling.

Sound familiar?

It recently happened to me, and I decided to put some mechanisms in place to prevent accidental deletion of data. Of course the best way to prevent accidental deletion is by making sure you are always 100% awake, and you simply do not make errors. Unfortunately that is unrealistic, so I have a few tips to help out.


As a very basic and rather obvious tip, I'll advise you to just make a database backup every time you spent a lot of time on filling in basetables. Now for some added technical tricks:

1) To prevent accidental deletion of data, I decided to use INSTEAD OF triggers. BOL again:


INSTEAD OF
Specifies that the DML trigger is executed instead of the triggering SQL statement, therefore, overriding the actions of the triggering statements. INSTEAD OF cannot be specified for DDL triggers.

At most, one INSTEAD OF trigger per INSERT, UPDATE, or DELETE statement can be defined on a table or view. However, you can define views on views where each view has its own INSTEAD OF trigger.

INSTEAD OF triggers are not allowed on updatable views that use WITH CHECK OPTION. SQL Server raises an error when an INSTEAD OF trigger is added to an updatable view WITH CHECK OPTION specified. The user must remove that option by using ALTER VIEW before defining the INSTEAD OF trigger.

{ [ DELETE ] [ , ] [ INSERT ] [ , ] [ UPDATE ] }
Specifies the data modification statements that activate the DML trigger when it is tried against this table or view. At least one option must be specified. Any combination of these options in any order is allowed in the trigger definition.

For INSTEAD OF triggers, the DELETE option is not allowed on tables that have a referential relationship specifying a cascade action ON DELETE. Similarly, the UPDATE option is not allowed on tables that have a referential relationship specifying a cascade action ON UPDATE.


The definition of my trigger is similar to this:


CREATE TRIGGER DeleteTable_1 ON Table_1
INSTEAD OF DELETE
AS
BEGIN
IF
@@rowcount > 0
BEGIN
RAISERROR
( 'Rows in Table_1 cannot be deleted!', 16, 2 )
ROLLBACK
END
END

GO


When I now try to delete rows (using DELETE FROM Table_1), I will see the following errors in my Query result:


Msg 50000, Level 16, State 2, Procedure DeleteTable_1, Line 7
Rows in Table_1 cannot be deleted!
Msg 3609, Level 16, State 1, Line 1
The transaction ended in the trigger. The batch has been aborted.



Unfortunately, that does NOT prevent the table from being Truncated.

2) To avoid accidently dropping a table, I create views over them and use the option WITH SCHEMABINDING. From Bol (my emphasis):


SCHEMABINDING
Binds the view to the schema of the underlying table or tables. When SCHEMABINDING is specified, the base table or tables cannot be modified in a way that would affect the view definition. The view definition itself must first be modified or dropped to remove dependencies on the table that is to be modified. When you use SCHEMABINDING, the select_statement must include the two-part names (schema.object) of tables, views, or user-defined functions that are referenced. All referenced objects must be in the same database.

Views or tables that participate in a view created with the SCHEMABINDING clause cannot be dropped unless that view is dropped or changed so that it no longer has schema binding. Otherwise, the Microsoft SQL Server 2005 Database Engine raises an error. Also, executing ALTER TABLE statements on tables that participate in views that have schema binding fail when these statements affect the view definition.

SCHEMABINDING cannot be specified if the view contains alias data type columns."


The first bit I emphasised clearly shows that using views in this way is a neat little feature to prevent accidental drops of a table.

The second quote is IMHO also important, as it shows that clearly the table should be "final". If you later on decide that the table in fact needs extra (or less) columns, the VIEW will actually make your life harder.

As an extra tip, when you create a view to use WITH SCHEMABINDING, always use the fully qualified name of the table (i.e. dbo.Salaries instead of just Salaries). Also, you cannot use *, but instead you will have to define all the columns in the query statement.

Nice, but still no protection against a TRUNCATE.

3) In fact, the only way to prevent TRUNCATE from executing on your table is to ensure there is a Foreign Key relationship to the table. If you are (like me) still working on the tables that will eventually link to the base table, you can always create a dummy table that references your base table:


CREATE TABLE [dbo].[Table_2]
(
[TableID] [int] NULL,
[Table_1ID] [int] NULL
)
ON [PRIMARY]

GO
ALTER TABLE [dbo].[Table_2]
WITH CHECK
ADD CONSTRAINT
[FK_Table_2_Table_1] FOREIGN KEY([Table_1ID])
REFERENCES [dbo].[Table_1] ([TableID])
GO
ALTER TABLE [dbo].[Table_2] CHECK CONSTRAINT [FK_Table_2_Table_1]



Now when you try to TRUNCATE Table_1, you will receive the following error:


Msg 4712, Level 16, State 1, Line 1
Cannot truncate table 'TABLE_1' because it is being referenced by a FOREIGN KEY constraint.



Well, that concludes this post. I hope you will find it helpful in preventing you from accidentally destroying some of your hard work. Of course all comments are welcome :)


Read more!