In the last year or so I have moved more in to more consulting and contracting based work which means work working with a larger spread of developers with a an equally wide range of skills and abilities. There have been many generic mistakes I have scene (as well as some great code) but the majority of major mess ups I have seen is by far based around Database and Data Access. This is probably one of the major reasons why so many people are moving to ORM’s, because it removes the responsibility of creating the DA layer and the potential pitfalls that can come with it. This is fine, but it also means that a lot of people are becoming ignorant of the details of these layers. Generally these layers are the very foundations on which the application or applications are built on, I believe the blasé nature in which the technologies are approached is the reason why some run in to such major issues mid project. This late in the scene it is very hard to go back and make schema changes, especially if multiple teams are working off this schema.
I have decided to compile a list of issues that I have run in to in the last year, that I believe are potential short comings of the data related architecture. Often these points by themselves are not huge issues, but when I see one I usually can find another just around the corner.
Hopefully these come as some help to you and your design. Please remember this is coming from a developers perspective, a .Net and MS SQL developer at that, and I, in no way, proclaim myself to be a guru in any regard, however I feel these are basic points should be a general rule of thumb.
Normalisation to the 3rd normal is standard. If you don’t know what 3rd normal is then it is a good hint that maybe you may not be the best one to be designing the DB. Sure there are reasons to move away from this rule but it should be just that, a reason, not bad design. Perhaps you may even want to document these reasons so it is clear why.
There will always be a need for nullable columns, however when tables start to have a growing number of nullable columns, you should start questioning why. Perhaps these can be moved to a separate table.
If a table is starting to have a lot of columns and many of them are null then you may want to implement a Property table and a PropertyType table. This is relatively clean and easy to set up and if done properly, is a fast way of getting a potentially large albeit often unknown number of properties, and a lot cleaner.
Incorrect Use of the Primary Key
Recently I saw a table that had a multi column primary key over five columns. After spanning more than 2 columns alarm bells go off in my head. In this case this was just bad design. The table only needed one Primary Key with one index on another column. Often this set up I find is good especially when there are unique values such as “Product Codes” that could be the PK but are in a format that does not suit performance (e.g. varchar(1000)). Using a smaller data type for the primary key improves performance. I also generally prefer to use [int] IDENTITY(1,1) as the default primary key for tables. Using large columns as PK’s in my mind is a bit of a no-no. Some times GUID’s are used if there is movement across to other databases. If this is happening then there is probably a properly qualified DBA involved leaving us developers to move back up a layer J
Incorrect Use of the Foreign Key
I have actually seen foreign keys been “used” without the being specifically being set, meaning referential integrity can not be there! Why this was done, god only knows, as it was not documented.
Not Appling Basic Coding Standards
Commenting code is a given. I can’t stand working on projects where Classes, Methods, Properties or Parameter are not obviously named and well commented. Most developers share these views, especially one who have come into projects late. For some reason these principles are ignored when it comes to Database development. Columns get named things like “DscInd” and have no comments associated to them. As far as I am aware there are no real penalties for naming you columns something that is easily understandable. The benefits are however large. People now understanding what is actually being stored! These basic procedures also help spot flaws in any business logic. If the intention of a piece of code is explicitly commented then, if for some reason it is wrong, alarm bells go off a little easier.
Naming conventions are typically applied in managed code but again are often missed in DB Development. I could go on, but you get the general idea. Treat DB dev in the same way you would when it comes to basic procedure such as naming, commenting and using the correct types.