Tuesday, May 25, 2010

Database Standards Part III: Indexes, Constraints & Primary/Foreign Keys

This is Part 3 of 6 blogs about database standards we use at my current employer. Repeating myself, though we do not enforce these with an iron fist, they are advised for clarity and consistency. The important point, in my opinion, is to establish a set of standards and stick to them.


Indexes
The naming of indexes is not unanimous. The standard is for a prefix of idx plus the table name then the field names with an underscore between the table and field name – idxPatient_LocationCode. This will create some very long names. I prefer abbreviation of the table and fields names, or like another DBA a reference name of where the index will be used.

A reminder to developers and contractors is that a primary key clustered index is an index. We have seen duplicate indexes created because of this. I kid not. Also, clustered indexes are included in non-clustered indexes, so a multi-column clustered index has to be seriously reviewed before implemented.

Understanding covering indexes and how they are used is important to reduce the columns used in some indexes. Fields that are regular updated probably should not be in a clustered index.

A reindex maintenance plan needs to be implemented. I will try to cover the custom one we use in a later blog or webcast.

Constraints
A unique constraint has its own index, so do not create an index on a unique constraint.

Use prefix chk (chkDigits) for check constraint and uc (ucPatientNumber) for unique constraint naming.

Default definitions should be in line with the create table statement and default objects have been deprecated, so do not use. Do not use data type conversion in constraint definitions.

Primary Key and Foreign keys
All tables have to have a primary key. When an identity field is used, a unique constraint must be created from a separate field(s) to uniquely identify a row. Every table needs to be able to be queried based on the contents, not an identity field. The normalization talks I have been privileged to give, explains more about this with the 1st normal form portion.

All foreign keys have to be indexes and should be the primary or unique constraint key from the parent table.

Naming convention is prefixed with pk for primary key (pkPatientNumber). Foreign Key naming is prefix fk then parent table name underscore column(s) underscore child table name and last underscore column(s) – fkPatient_ID_Admit_PatientID

Repeating Again, and again…
Though my opinion is not the same as some of these standards (I say you can abbrev. anything), I do believe standards should be discussed and established, then followed. IF a deviation is needed, a group discussion should be voted on to change.

Next week I will talk about Stored Procedures.

Thomas LeBlanc

Thursday, May 13, 2010

Database Standards Part II: Schemas, Tables & View and Columns

This is Part 2 of 6 blogs about database standards we use at my current employer. Repeating myself, though we do not enforce these with an iron fist, they are advised for clarity and consistency. The important point, in my opinion, is establish a set of standards and stick to them.

Schema

The seperation of schmea name and logins was a great addition in SQL Server 2005. Schema names should be used in place of dbo. This makes viewing in SSMS great with filters. It also categorizes the objects so a new DBA or developer can get a good view of the organization in a database. We suggest a Common schema for objects that do not fit in a group, then names that are specific to functionality - Payroll, HR, Billing, etc.

Tables
Tables names should reflect its functionality and they cannot start with a number, contain spaces, underscores or be reserved words. ANSI_NULL (OFF is deprecated in a future version) should be SET ON. Pascal Casing should be used like ClientBill and singular like Admit not Admits.

There are 3 different kinds of Lookup tables we see - developer populated (workflow), small lists (state) and known list that are updattable (RevenueCode). Small lookup tables can use a character primary key. If the developer insists on ID (identity), then a character code/description column created with a unique contraint must be included. The code should be used in the transaction table. (I will clarify this is a seperate blog). Lookup tables use camel case - lkState.

The transaction table(s) should use a ID (indentity or sequential int) as the primary, but should have a unique contraint for uniquely identifying a row. Third Key Normal Form and higher is required. The ID field will be the foreign key in the child table, and named ID (PatientID in an Episode table).

Views
Camel casing should be used in naming views - vwPatientEpisode. Future thinking should be used if the view will eventually become an indexed view, then schema binging and ansi settings need to be looked at.

Columns
Column names should describe the data placed in them, using Pascal Casing - LastName and not start with a number/special character, no spaces, dashes or underscores. Beware of reserve words or key words as object names. Do not repeat table name - use LastName not PatientLastName unless a foreign key to another table - PatientID. No data type in the name - intPatientCount, but Date is acceptable in a date name - AdmitDate.

Be aware of new features in 2008 like a seperate DATE and TIME data types. Before, you had to use midnight as the time in a date column type in order to not specify a time.

Repeating
Though my opinion is not the same as some of these standards (I say you can abbrev. anything), I do believe standards should be discussed and established, then followed. IF a deviation is needed, a group discussion should be voted on to change.


Next week I will talk about Indexes, constraints and Primary/Foreigh Keys.


Thomas LeBlanc

Thursday, May 6, 2010

Database Standards Part I: Defs, Abbreviation & Data Types

This is Part 1 or 6 blogs about database standards we use at my current employer. Though we do not enforce these with an iron fist, they are advised for clarity and consistency. The important point, in my opinion, is establish a set of standards and stick to them.

Definitions

We list some definitions are the beginning of our document. Included are Camel casing (vwVisitOrders) and Pascal casing (MedicalData)

Abbreviations

Though most developers like to spell everything out because they can use IntelliSense, abbreviations are ok for some common and industry standard abbreviations that can be used. Examples are Id (identification), SSN, DOB or Rpt (Report).

Data Types

Now, the standards for these are good because of the consistency factor, comparisons (WHERE clause) and SQL Server features. Some features of SQL Server get deprecated, and others are improvements. A good example is using varchar (max) instead of text data type.
   Yes/No: Use bit data type. We also suggest naming the field like a question: Isdeleted
   Flag/Status: use tinyint (1-Active, 0-Inactive, other numbers can be used for future status)
   Use Unicode only if necessary: nvarchar(100)
   Integers: Be sure to properly size up front, if not sure use int or bigint
   Fixed field character lengths: State - char(2)
   Variable character length: LastName varchar(25)
   Money/Decimal – use over float/real. Price decimal(10,2)
   Null/Not Null – won’t even discuss
   Max – use with varchar, nvarchar, vbinary instead of text and binary.

Though my opinion is not the same as some of these standards (I say you can abbrev. anything), I do believe standards should be discussed and established, then followed. It a deviation is needed, a group discussion should be voted on to change.

God Bless,
Thomas LeBlanc