Thursday, December 23, 2010

CXPACKET – that mysterious wait type

At my current employer, we have experienced various levels of CXPACKET waits, sometimes causing problems and sometimes just being blamed for them.

Our first assumption that the CXPACKET wait was a problem came from a SQL Server error message that suggested using OPTION (MAXDOP 1). Googling the details of the error, which included the words resource semaphore, turned up nothing. Calls to Microsoft Premier Support were not helping either. So, we changed queries that had long CXPACKET waits (some queries would never finish) or the Resource Semaphore deadlocking to use OPTION (MAXDOP 1), and we just waited for the query to run long (at least it finished). This happened to about 5 long-running queries in our OLTP system when we converted to SQL Server 2005 and larger production machines.
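To give an idea of what we changed, here is a minimal sketch of that kind of hint; the table and column names are made up for illustration, not our real objects:

SELECT c.ClaimID, SUM(d.Amount) AS TotalAmount
  FROM dbo.Claim AS c
       INNER JOIN dbo.ClaimDetail AS d
         ON d.ClaimID = c.ClaimID
 GROUP BY c.ClaimID
OPTION (MAXDOP 1);  -- force a serial plan for this one statement only

The hint only masks the symptom; the rest of this post is about why tuning the query and indexes turned out to be the better fix.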

A good read about this situation can be found here, from Bart Duncan. Thanks Bart!!!

So, we thought every time we saw CXPACKET waits, we had this same problem. The SQL Server community would make comments like CXPACKET is not the problem, or look at your query plan, tune the query, etc. But I could not see the forest for the trees.

Now, after a couple of years of reading and using various tools for performance tuning, I have realized why some in the SQL community keep saying that CXPACKET is not the problem. My trouble with that statement was that they were either not communicating what the real problem was, or my limited understanding of performance tuning kept me from comprehending the solution. And I really believed I knew what performance tuning was all about. Man, I was wrong. And even today I know there is more to learn.

Our OLTP system has a 5-to-1 ratio of reads to writes; an analysis by EMC for a new VMAX gave us these statistics. This means that long-running queries fell victim to the perceived notion that the CXPACKET wait was our problem. This also included our OLAP system, which is Cognos, and I am now a member of that department. We just assumed it was a wait problem.

Adam Machanic has a great utility called sp_WhoIsActive. By default, it does not show CXPACKET waits. He also has a recorded session on Parallel Processing which I was blessed to host for Adam and the PASS Performance Virtual Chapter. He also has a series with SQL University, which I suggest reading. Also, Paul White ( Blog | Twitter ) has been trying to clue me in on CXPACKET via Twitter for some time now. If sp_WhoIsActive is hard to understand, first start with sp_who3, another great monitoring tool. Thanks Mr. Denny

I have 2 executions of sp_WhoIsActive in my shortcuts in SSMS:

Ctrl+7 - exec dbo.[sp_WhoIsActive] '', 'session', 'Replication%', 'program'


Ctrl+8 - exec dbo.[sp_WhoIsActive] @filter = '', @filter_type =  'session', @not_filter = 'Replication%', @not_filter_type = 'program', @get_task_info = 2


The @not_filter is there so transaction replication sessions do not show up in the view. The Ctrl+8 version adds @get_task_info = 2, which shows a combination of CXPACKET and the other waits associated with the SPID's execution. This is really cool stuff for a DBA geek.

After reading the above links and rifling through some execution plans, I came to an aha moment when I saw the Estimated Query Plan and the Actual Query Plan not be the same. The difference included a Nested Loop with a Lookup in the estimate, and a Hash Match with 2 Index Scans in the actual plan.

What I came to realize was that if I tune the query and tables (indexes), parallelism sometimes goes away. Did you get that? …it goes away. Why? Because now the best plan no longer needs to run in parallel.

Another one came when I realized a table (a Fact table in a Data Warehouse with 30+ million rows) did not have a clustered index, but it did have many non-clustered indexes. The Missing Index feature of SSMS suggested a new non-clustered index; when I applied it, it did not help 90+% like the missing index feature said it would. So, I went to the table, confirmed there was no clustered index, combined the surrogate keys into a clustered index, and the query plan cost went from 1000+ to less than 50.
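For anyone who wants to try the same thing, this is a rough sketch of the kind of change I made; the fact table and key names are hypothetical, not the real warehouse objects:

-- Combine the surrogate keys of the fact table into one clustered index
CREATE CLUSTERED INDEX cixFactVisit
    ON dbo.FactVisit (PatientKey, EpisodeKey, AdmitDateKey);
GO

Compare the estimated subtree cost of the query in SSMS before and after to see the difference.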

WOW!!! Now I get it. Thanks Adam, Paul and others for being patient with me.

One more thing. I do not remember where I saw this, but in sys.dm_os_waiting_tasks, when exec_context_id = 0 and the wait is CXPACKET, that means the coordinating thread is waiting for the parallel worker threads to finish, and it is a valid wait.
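A quick way to see this is a query like the sketch below against that DMV; no filters other than the wait type, so expect one row per waiting task:

SELECT wt.session_id, wt.exec_context_id, wt.wait_type,
       wt.wait_duration_ms, wt.blocking_session_id
  FROM sys.dm_os_waiting_tasks AS wt
 WHERE wt.wait_type = 'CXPACKET'
 ORDER BY wt.session_id, wt.exec_context_id;
-- exec_context_id = 0 is the coordinating task; the other rows are the parallel workers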

I think I will dig up some of these old query tuning experiences and try to show others through this blog what I have found. Maybe even a step by step process I go through nowadays to find problems with performance.

Now, back to the ETL on Data Marts.

God Bless,

Thomas

Wednesday, November 3, 2010

SQLSaturday #56 BI Edition in Dallas

Since I was missing PASS this year (not happy about that), I decided to drive to Dallas to attend the BI Edition of SQLSaturday on Oct 23rd, 2010.

This event was scaled back from a full SQLSaturday, with 2 sponsors, Microsoft and Artis. So, there was not a lot of vendor information, which meant the Dallas SQLSaturday crew did not have to attend to vendors. They were probably able to see some sessions instead.

Got to meet Ryan Adams along with Trevor, Tim Mitchell, Vic Prabhu and some new faces.

    

(The pictures here showed Vic and Tim, the crowd and Sean.)

Here are some more pictures.

         

    

     

More Pictures: http://picasaweb.google.com/dfwsqlsaturday/SQL_Saturday_Oct23#

I presented Transition from DBA to BI Architect, a new presentation. The main points were learning Dimensional Modeling, which is not denormalization but an actual design technique, and learning the design process, the SDLC, for a BI department. If you were like me, your DBA duties included more support than design. So, I went through the last couple of months of work I have been doing along with the books I have been reading, bringing my developer hat back into the mix.

SQL Server Deep Dives has a great section on BI. 2 of the authors were at this event, John Welch (Data Profiling) and Erin Welker (BI for the relational guy). It was amazing: as I was presenting and talking about the book, Erin walked into the main room, and then I met John in the speakers room.

CXPACKET series coming soon.

God Bless,

Thomas

Thursday, October 7, 2010

Houston Tech Fest and Vacation

I will be heading to Houston on Friday afternoon to visit with other Microsoft technology geeks and speak at and attend Houston Tech Fest. Patrick LeBlanc ( Blog | Twitter ) and William Assaf ( Blog | Twitter ) from Baton Rouge will be on the same track together, giving the tech event some SQL Server sessions. It looks like there are some other SQL Server sessions from Tim Mitchell, Trevor Barkhouse and Geoff Hiten.
It is great to be able to share the experience I have gained over the years, and to network with others that have similar interests in SQL Server. I always meet one or 2 people in the industry who help me learn more tricks and tips.

Then, starting Sunday, I will drive to Kanuga, North Carolina for the 3rd year in a row to See The Leaves. This is a week of vacation away from the busy days of work and life to read, pray and meditate.
I plan on reading Richard Foster’s book on Spiritual Disciplines and Phillip Yancey’s Soul Survivor. Wood carving has me hooked for arts and crafts for the week, and I plan on updating the walking cane and making a gift for Alex and Janet.

These past couple of weeks I have dived into Dimensional Modeling and using SSIS to ETL data from a supposedly relational database that looks more and more like a database created by object-oriented developers. Lots of inheritance-looking transaction tables as well as lookup tables.

I am still using the Kimball methods (and books) to create and populate these Data Marts. I seem to have a good feel for Dimensions, but with the Fact tables I struggle because I keep hanging on to the normalized structure, just because I know how to report off it. Other contractors seem to have done the same thing with the Facts – they are just transaction tables.

A little more practice and a few reports should get me out of the normalized-only mentality.

God Bless,
Thomas LeBlanc

Saturday, September 11, 2010

BI reality is Setting in

The new job in the Business Intelligence group has been going for about a month or two, and there is a lot to learn…and a lot to teach the group.

The group’s focus has been reporting to the business user, but there has not been a great deal of time focused on the structure of the databases and tables. This is definitely going to be a give and take job.

To help improve performance, we have purchased Ignite 8 from Confio. This has saved me hours, if not days, of finding the worst performing queries. The software looks at wait stats and organizes the queries by longest waits. You can even name the hash number it generates to label the queries. It even gives you starting points on where to look for performance improvements. We have used Idera's Diagnostic Manager for years, which gives great historical data for the instance, but nowhere near the same help for tuning queries. I have to say, though, I have not dug into Ignite enough to see if it has the alerts we need for the real-time instance/server issues which seemed to come up once a month.


One query had 200 minutes of CXPACKET waits and was running for about 10 hours. One clustered index and 4 non-clustered indexes improved the query to 10 seconds. Yeah, you heard me. Apparently, a query can start running, endlessly feed data, process, stall because of CXPACKET waits, and start all over again. Our data table was a heap with 7 million rows. Not good for a data warehouse table.
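If you want to check your own warehouse for the same situation, something like this sketch lists the heaps and their row counts using only the system catalog views:

SELECT s.name AS SchemaName, t.name AS TableName, SUM(p.rows) AS [Rows]
  FROM sys.tables AS t
       INNER JOIN sys.schemas AS s ON s.schema_id = t.schema_id
       INNER JOIN sys.partitions AS p ON p.object_id = t.object_id
 WHERE p.index_id = 0       -- index_id 0 means the table is a heap
 GROUP BY s.name, t.name
 ORDER BY SUM(p.rows) DESC;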

The other item on my list is to get a feel for the flow. Lots of meetings and lots of reading. The Kimball Group's Data Warehouse Lifecycle Toolkit is what I am reading right now. It dates back to 1989, but it is still the status quo for today's structures, just like normalization. I have had to re-read some chapters to get the idea, but I am getting it. The first inclination I had, from what I have heard, was to denormalize. Well, I threw that out the window and adopted the idea of Dimensional Modeling, not denormalization.

Speaking of normalization, I am blessed to present 2 sessions on normalization at Houston Tech Fest on Oct 9th at the University of Houston. 3rd Key Normal Form: That's Crazy Talk!!! is my bread and butter. I get 90 minutes for this presentation. I have presented this talk at the Baton Rouge PASS SQL Server User Group and at SQLSaturday in Baton Rouge and New York City. Whiteboard Normalization will be a continuation of the first talk. I will be able to catch up with Trevor Barkhouse from Dallas as well as William Assaf and Patrick LeBlanc. Patrick will be doing the CDC + SSIS = SCD session on data warehouse population that I will finally be able to see.

God bless,

Thomas LeBlanc

Monday, September 6, 2010

SQLSaturday #28 Baton Rouge, LA

 

A wonderful and informative SQLSaturday was held in Baton Rouge on the campus of LSU.

Steve Jones captured it on video:

Blogs about the event:

 

Tim Mitchell - http://www.sqlservercentral.com/blogs/tim_mitchell/archive/2010/08/21/sql-saturday-28-baton-rouge-recap.aspx

Wes Brown - http://sqlserverio.com/2010/08/16/what-a-great-sql-saturday-baton-rouge/

My new friend Eli Weinstock-Herman - http://blogs.lessthandot.com/index.php/DataMgmt/DBAdmin/sql-saturday-28-baton-rouge

Here are some pictures:

(Photos of William, Mike Huguet, Brian Rigsby, Trevor, Al and the .NET crowd from the event and the Friday banquet were posted here.)

Friday, August 13, 2010

SQLSaturday Baton Rouge #28

Come one, come all to the largest FREE technology event ever in Baton Rouge and probably in Louisiana.

http://sqlsaturday.com/28/eventhome.aspx

600+ registered attendees to network with and 58 sessions from beginners to advanced. Many MS MVPs from SQL Server and .Net will be there.

I will be presenting about database normalization and how important it is. http://sqlsaturday.com/viewsession.aspx?sat=28&sessionid=1323


Track | Starts | Session Title | Speaker
.Net 1 | 07:30 AM | .NET 3.5 Fundamentals | Mike Huguet
.Net 1 | 9:00 AM | Zen Coding | Brian Rigsby
.Net 1 | 10:15 AM | Getting Started with the Entity Framework 4.0 | Rob Vettor
.Net 1 | 11:30 AM | Zen Testing | Brian Rigsby
.Net 1 | 01:30 PM | C# Ninjitsu | Chris Eargle
.Net 1 | 2:45 PM | Advance Your Debugging Skills with VS 2010 | Rob Vettor
.Net 1 | 4:00 PM | 6 Months of putting VS2010 and TFS thru the Paces | Michael Moles
.Net 2 | 9:00 AM | Exploratory Testing with Microsoft Test Manager | Vaneshia Leachman
.Net 2 | 10:15 AM | Building Richer Web Applications with SP2010 | Kyle Kelin
.Net 2 | 11:30 AM | The Best of Visual Studio 2010 | Zain Naboulsi
.Net 2 | 01:30 PM | 3P's (Principles, patterns and performance) of SD | Chander Dhall
.Net 2 | 2:45 PM | Introduction to NHibernate and Fluent NHibernate | Brian Sullivan
.Net 2 | 4:00 PM | RESTful Data | Chris Eargle
Apps\Cloud\Personal Development | 9:00 AM | Building a Testable Data Access Layer | Todd Anglin
Apps\Cloud\Personal Development | 10:15 AM | The Modern Resume: Building Your Brand | Steve Jones
Apps\Cloud\Personal Development | 11:30 AM | Getting SQL Service Broker Up and Running | Denny Cherry
Apps\Cloud\Personal Development | 01:30 PM | Intro to Windows Azure | Ryan Duclos
Apps\Cloud\Personal Development | 2:45 PM | Azure - Best Practices | Chander Dhall
Apps\Cloud\Personal Development | 4:00 PM | SharePoint 2010 Management with PowerShell | Cody Gros
BI\SSRS | 07:30 AM | Breakfast Basics, SSAS Cube Creation | Barry Ralston
BI\SSRS | 9:00 AM | An Introduction to Power Pivot | Bryan Smith
BI\SSRS | 10:15 AM | Get your Mining Model Predictions out to all | Steve Simon
BI\SSRS | 11:30 AM | Can you control your reports? | Ryan Duclos
BI\SSRS | 01:30 PM | Data Mining.. Making $mart financial decisions | Steve Simon
BI\SSRS | 2:45 PM | College Football and MSFT Business Intelligence | Barry Ralston
BI\SSRS | 4:00 PM | My first SQL Report | Mark Verret
DB Design & App Dev | 9:00 AM | iPhone Development using .NET and Monotouch | Jason Awbrey
DB Design & App Dev | 10:15 AM | Efficient Data Warehouse Design | Suresh Rajappa
DB Design & App Dev | 11:30 AM | Introduction to Windows Phone 7 Development | Carlos Femmer
DB Design & App Dev | 01:30 PM | Conceptual Data Modeling: Defining Our Data | Eli Weinstock-Herman
DB Design & App Dev | 2:45 PM | 3rd Normal Form: That's crazy talk!!! | Thomas LeBlanc
DB Design & App Dev | 4:00 PM | Parallelism Options in .NET 4.0 | Al Manint
Microsoft .NET Framework from Scratch | 9:00 AM | Microsoft .NET Framework from Scratch | Keith Elder
SQL Admin I | 07:30 AM | Basic SQL Server | Steve Jones
SQL Admin I | 9:00 AM | Common TSQL Programming Mistakes | Kevin Boles
SQL Admin I | 10:15 AM | An Introduction to Profiler and SQL Trace | Trevor Barkhouse
SQL Admin I | 11:30 AM | Database Maintenance Essentials | Brad McGehee
SQL Admin I | 01:30 PM | Understanding Storage Systems and SQL Server | Wesley Brown
SQL Admin I | 2:45 PM | Running mixed workloads on SQL Server | Jason Massie
SQL Admin I | 4:00 PM | Beginning Powershell | Sean McCown
SQL Admin II | 9:00 AM | Best Practices Every SQL Server DBA Must Know | Brad McGehee
SQL Admin II | 10:15 AM | Introduction to DMVs in SQL 2005/2008 | William Assaf
SQL Admin II | 11:30 AM | SSIS and Powershell: A winning combination | Sean McCown
SQL Admin II | 01:30 PM | SQL Server Memory Deep Dive | Kevin Boles
SQL Admin II | 2:45 PM | Deciding if VMs are a good choice for your SQL Svr | Denny Cherry
SQL Admin II | 4:00 PM | A PowerShell Cookbook for DBAs | Trevor Barkhouse
SSIS\BI | 07:30 AM | Build Your First SSIS Package | Tim Mitchell
SSIS\BI | 9:00 AM | Data Mining in Action: A case study | Drew Minkin
SSIS\BI | 10:15 AM | Dynamic SSIS with Expressions and Configurations | Tim Mitchell
SSIS\BI | 11:30 AM | CDC + SSIS = SCD | Patrick LeBlanc
SSIS\BI | 01:30 PM | MDX 101 | Bryan Smith
SSIS\BI | 2:45 PM | ssis templates: the easy way to win. | Tim Costello
SSIS\BI | 4:00 PM | SQL Source Control: Poor man's data dude. | Tim Costello


Thomas LeBlanc

Friday, July 30, 2010

SQLSaturday, BI and What I am reading

SQLSaturday # 28 (http://www.sqlsaturday.com/28/eventhome.aspx) is coming to Baton Rouge on August 14th at LSU in the CEBA building. We have 58 sessions and 490+ registered attendees.

This is the second SQLSaturday for BR, thanks to SQL Server MVP Patrick LeBlanc's leadership and employees of Sparkhound and Antares as well as LSU. Patrick is my brother from a different mother.

I spoke last year on RML Utilities and SQLNexus, but this year the session I will be presenting is “Third Normal Form: That’s Crazy Talk” (http://www.sqlsaturday.com/viewsession.aspx?sat=28&sessionid=1323). This seems to be my calling as a presentation I have locked down. Volunteering has been interesting as we watch things change month to month under our fearless leader and great politician.


The other news is that I have started a new position at Amedisys, moving from Senior DBA to BI Data Integration Lead Developer. The last 3 weeks have been filled with meetings and documentation review. They have already assigned me tech specs to complete, asked me to assist with the latest Data Mart ETL, and asked me to improve performance on the End of Month process. My whole career has revolved around normalized databases, which I am very passionate about. Changing my thinking is difficult (I am 42), and that seems to be the biggest barrier. Please pray for me.

Louis Davidson's book Pro SQL Server 2008 Relational Database Design and Implementation (http://drsql.org/ProSQLServerDatabaseDesign.aspx) is a great and long read that I believe is very comprehensive when it comes to educational reading on database design. I am using this book to help me prepare for SQLSaturday #28 as well as Houston TechFest (http://www.houstontechfest.com/) on Oct. 9th. It looks like Houston has accepted 2 sessions, the one above and Whiteboarding Normalization (http://www.houstontechfest.com/dotnetnuke/HoustonTechFest/Sessions/tabid/56/CodecampId/3/SessionId/218/Default.aspx), for their session-packed event.

Wish me luck, and come back for some more talk about Normalization and Denormalization on this blog.

God Bless,
Thomas LeBlanc

Friday, July 16, 2010

Database Lookup Tables

Over the years, I have noticed a trend in building databases. A developer will get on a certain path when it comes to what I define as Lookup tables. There are many names or labels the community has placed on this Type/Code/Category of tables, but I want to give 3 examples of the types we ran into in the latest creation of a database.


The list and table structure below were standardized once so everybody (developer, DBA, BI, etc.) would be in agreement. After the CREATE TABLE below, please read on for an explanation of the columns and types.

1. Drop-down list - this table is really just a pick list for a transaction or category type.
    Example
        BillingClass: B - Billable, NB - NonBillable, etc.
        RequestType: P - Phone, L - Letter, etc.

2. Developer Control - developers usually like to have some tables that cannot be changed by end users because it would change the flow of the application code. This group of developers wanted to enumerate the IDs in .Net code. Too bad we (DBAs) cannot use enumeration in a Query window.
    Example
        WorkFlow - I - Initial, U - Updated, P - Processed & C - Close

3. End User Control - end users want to be able to add to the list for use in the application, but unlike the drop-down list, there are other properties to the Item.
    Example
        RevenueCode - 435 - Inventory stream, 321 - visit cost, etc.

The first 2 types have a consistent structure:

CREATE TABLE [Billing].[lkClass](
    [Code] [varchar](25) NOT NULL,
    [ID] [int] NOT NULL,
    [Name] [varchar](100) NOT NULL,
    [Description] [varchar](1000) NULL,
    [DisplayOrder] [int] NOT NULL,
    [RowState] [int] NOT NULL,
    [TimeStamp] [timestamp] NULL,
  CONSTRAINT [PK_lkClass] PRIMARY KEY CLUSTERED
    ([Code] ASC)
      WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
        IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
        ALLOW_PAGE_LOCKS = ON) ON [PRIMARY],
  CONSTRAINT [uc_lkClass_ID] UNIQUE NONCLUSTERED
    ( [ID] ASC )
    WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
       IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
       ALLOW_PAGE_LOCKS = ON) ON [DB_INDEX],
  CONSTRAINT [uc_lkClass_Name] UNIQUE NONCLUSTERED
      ( [Name] ASC )
     WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF,
        IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON,
        ALLOW_PAGE_LOCKS = ON) ON [DB_INDEX]
  ) ON [PRIMARY]

The Code is 25 characters because we voted to have a varchar() size that would satisfy all possibilities. Why still use a Code when there is a unique ID field? End users still like to be able to type known code/type values into an interface for fast data entry. Even with the IntelliSense possibilities, I personally do not see codes going away with end users. Also, reports need abbreviated versions of lookup values to help compact the display. The Code is the foreign key to related tables, not the ID.

The ID is used for .Net objects that want to reference a list with an integer, rather than using a Search/Find method to locate the entry in a list/collection. The Name is really the long description, while the Description is what you might call a memo field (notice MS Access reference).

DisplayOrder is self-explanatory, and RowState has values like 0 - active, 1 - inactive, with other numbers reserved for future use. This allows a row to be ‘deleted’ (made inactive) without being removed from the table. TimeStamp is really the rowversion data type from SQL Server. We have not changed this to rowversion yet.
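As a quick illustration of how the first two lookup types get populated, here is a hedged example with made-up billing class rows (two single-row INSERTs so it also works on SQL Server 2005):

INSERT INTO Billing.lkClass (Code, ID, Name, Description, DisplayOrder, RowState)
VALUES ('B', 1, 'Billable', 'Visits billable to a payer', 1, 0);

INSERT INTO Billing.lkClass (Code, ID, Name, Description, DisplayOrder, RowState)
VALUES ('NB', 2, 'NonBillable', 'Visits absorbed as a cost of care', 2, 0);

The TimeStamp column is left out of the column list because SQL Server populates a timestamp/rowversion column on its own.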

Even though a State abbreviation is known to be only 2 characters and you need only 2 columns in the table, we still use this structure because the object-oriented interface/object code using the same structure is global throughout the project. The end-user-control type of table will include its additional columns on the same table. We do not create a one-to-one relationship to another table just for more attributes, in most cases.

The primary key is in the PRIMARY file group because it is the clustered index, but the other unique constraints are in the DB_INDEX file group in order to place the clustered index (the table) on a different file group than the indexes. The unique constraints are used to make sure duplicate IDs and/or Names are never possible, even though the developers promise they will prevent that in the code.

Thursday, July 8, 2010

Database Standards Part VI: code and design


This is Part 6 of 6 blogs about database standards. This last post is about coding standards that should be followed.

Design
A data dictionary and diagram should be maintained. The diagram can be a bird's eye view. The dictionary contains tables with descriptions and columns with data type, size, default, constraints and descriptions. The foreign keys would be listed along with the primary keys.

Coding

- Avoid using SELECT *.
- Use CTEs instead of Cursors (see the sketch after this list)
- Dynamic SQL is difficult to read, confuses security and is not very maintainable.
- NOLOCK and READ UNCOMMITTED should only be used when absolutely necessary.
- INNER JOIN, not JOIN
- Prefix objects with the schema, even dbo.
- Use TRY/CATCH for error trapping
- Only use parentheses for AND/OR expressions
- Spaces between expressions and variables
   Type = 'S' NOT type='S'

- BEGIN/END should only be used when there is more than one execution line after a condition
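Here is a minimal sketch of the CTE-instead-of-cursor guideline, aggregating set-based instead of looping row by row; the Patient and Visit tables are hypothetical:

;WITH VisitCounts AS
(
    SELECT PatientID, COUNT(*) AS VisitCount
      FROM Patient.Visit
     GROUP BY PatientID
)
SELECT p.PatientID, p.LastName, vc.VisitCount
  FROM Patient.Patient AS p
       INNER JOIN VisitCounts AS vc
         ON vc.PatientID = p.PatientID;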


Formatting
    Bad:
Select Sample.*, Analysis.*,

Sample.DueDate, Sample.EnteredBy
FROM Sample JOIN
Analysis on Sample.SampleID = Analysis.SampleID

   Better:
SELECT s.SampleID, s.Description AS SampDesc,
    a.AnlCode AS AnalysisCode
  FROM Sample s
      INNER JOIN Analysis a
        ON a.SampleID = s.SampleID
  WHERE s.SampleID = @SampleID


Next set of Blogs will be a diary of the transition from production senior DBA to BI Data Integration Lead Developer.

Thomas LeBlanc
Outgoing Production Senior DBA

Thursday, June 17, 2010

Database Standards Part V: Triggers and User-defined Functions


 
This is Part 5 of 6 blogs about database standards. Again, though we do not enforce these with an iron fist, they are advised for clarity and consistency. The important point, in my opinion, is to establish a set of standards and stick to them.
 
Triggers
The naming convention for triggers is 'it' for insert, 'ut' for update and 'dt' for delete (itPatient). Triggers can slow CRUD statements, so we warn developers to be careful when creating them. They can also 'hide' logic from development. Over and over, we get confused people trying to find out how data gets added to a table because of a trigger on another table. Verify during a review that the trigger can handle multi-row updates when joining to the inserted or deleted recordsets. Most triggers I see are related to auditing insert/update/delete statements.
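For the multi-row point, here is a hedged sketch of an update trigger that audits every modified row by joining to the inserted pseudo-table; the Patient and PatientAudit tables are made up for the example:

CREATE TRIGGER Patient.utPatient
ON Patient.Patient
AFTER UPDATE
AS
BEGIN
    SET NOCOUNT ON;

    -- One INSERT...SELECT handles 1 row or 10,000 rows the same way
    INSERT INTO Patient.PatientAudit (PatientID, LastName, ModifiedDate)
    SELECT i.PatientID, i.LastName, GETDATE()
      FROM inserted AS i;
END
GO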
 
User-Defined Functions
User-defined functions are prefixed with udf. They can be helpful with string manipulation commonly used in stored procedures or SQL statements. UDFs do add processing time to the execution and can cause scans when used on fields in a WHERE clause. Looking at an execution plan might not show the cost of the code inside the UDF. I see table-valued UDFs used like the DMFs from Microsoft's dynamic management objects, mostly on small data sets.
 
Also, when profiling, you get a statement for each call to the UDF for each row in the SELECT statement if it is used in a query, so you might want to exclude that data from the profiled output.
 
Repeating again, and again…

Though my opinion is not the same as some of these standards, I do believe standards should be discussed and established, then followed. If a deviation is needed, a group discussion and vote should decide the change.

Next week I will talk about coding and design standards.

Thomas LeBlanc

Friday, June 4, 2010

Database Standards Part IV: Stored Procedures


This is Part 4 of 6 blogs about database standards. Again, though we do not enforce these with an iron fist, they are advised for clarity and consistency. The important point, in my opinion, is to establish a set of standards and stick to them.

Stored Procedure
Stored procedures should be required for all create, read, update and delete (CRUD) DML (Data Manipulation Language) statements. Using stored procedures benefits security, consistency, performance and manageability on the DBA side of applications.

Security can be managed by giving GRANT EXECUTE to the principals requiring access to DML statements, or in 2005 and greater, EXECUTE can be granted to a user or role on the entire database. Performance gains are realized in compiled query plans and SQL Server’s ability to re-compile only the statement in the SP that needs a new/better plan.
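A hedged example of the database-wide grant; the role name is made up, and any database user or role works the same way:

-- Grants EXECUTE on every current and future stored procedure in the database
GRANT EXECUTE TO AppUserRole;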

Naming conventions:
  Read - uspPatientGet or GetPatient
  Delete - uspPatientDel or DelPatient
  Report – rptSecurityAudit
  Process – prcProcessPatient

The use of Schemas can better organize the SPs. Do not use prefixes sp, sp_ or xp_. These are used by the SQL Server system or custom extended procedures.

Precompiled SQL can use less memory in the procedure cache and requires fewer lookups in the procedure cache for existing plans.

Remember not to repeat the schema name in the SP – Patient.GetDetails instead of Patient.GetPatientDetails.

SET ANSI_NULLS ON can be a requirement because the OFF setting will be deprecated in future SQL Server versions.

SET NOCOUNT ON can prevent extra round trips to the client - http://msdn.microsoft.com/en-us/library/ms189837.aspx

Using a Source Control application can reduce comments in an SP. The versioning in TFS or SourceSafe gives the developer the ability to associate comments with the check-in. Our SPs begin with the following structure:

IF NOT EXISTS (SELECT * FROM sys.objects
               WHERE object_id = OBJECT_ID(N'[Patient].[rptPatientReleased]')
                 AND type IN (N'P', N'PC'))
BEGIN
    -- Create a stub so the ALTER below always succeeds and existing permissions are kept
    EXEC ('CREATE PROCEDURE [Patient].[rptPatientReleased] AS RETURN');
END
GO

ALTER PROCEDURE [Patient].[rptPatientReleased] AS
    SELECT TOP 10 * FROM NewProcedureView
GO

Always end the SP with GO when creating or altering. The ALTER is preferred because it retains the permissions on the SP. If DROP/CREATE is used, remember to re-apply the permissions. RETURN should not be used to return data to the calling application; it should be reserved for returning the status of the SP. Use input and OUTPUT parameters in the SP.

Repeating again, and again…
Though my opinion is not the same as some of these standards, I do believe standards should be discussed and established, then followed. If a deviation is needed, a group discussion and vote should decide the change.

Next week I will talk about Triggers and User-Defined Functions.

Thomas LeBlanc

Tuesday, May 25, 2010

Database Standards Part III: Indexes, Constraints & Primary/Foreign Keys

This is Part 3 of 6 blogs about database standards we use at my current employer. Repeating myself, though we do not enforce these with an iron fist, they are advised for clarity and consistency. The important point, in my opinion, is to establish a set of standards and stick to them.


Indexes
The naming of indexes is not unanimous. The standard is a prefix of idx plus the table name, then the field names, with an underscore between the table and field names – idxPatient_LocationCode. This will create some very long names. I prefer abbreviating the table and field names, or, like another DBA, using a reference name for where the index will be used.

A reminder to developers and contractors: a primary key clustered index is an index. We have seen duplicate indexes created because of this. I kid you not. Also, the clustered index keys are included in non-clustered indexes, so a multi-column clustered index has to be seriously reviewed before being implemented.

Understanding covering indexes and how they are used is important to reduce the columns used in some indexes. Fields that are regularly updated probably should not be in a clustered index.

A reindex maintenance plan needs to be implemented. I will try to cover the custom one we use in a later blog or webcast.

Constraints
A unique constraint has its own index, so do not create an index on a unique constraint.

Use prefix chk (chkDigits) for check constraint and uc (ucPatientNumber) for unique constraint naming.

Default definitions should be inline with the CREATE TABLE statement, and default objects have been deprecated, so do not use them. Do not use data type conversions in constraint definitions.

Primary Key and Foreign keys
All tables have to have a primary key. When an identity field is used, a unique constraint must be created from a separate field(s) to uniquely identify a row. Every table needs to be able to be queried based on its contents, not an identity field. The normalization talks I have been privileged to give explain more about this in the 1st normal form portion.

All foreign keys have to be indexed and should reference the primary key or unique constraint key from the parent table.

The naming convention is a prefix of pk for primary keys (pkPatientNumber). Foreign key naming is the prefix fk, then the parent table name, underscore, column(s), underscore, child table name and, last, underscore, column(s) – fkPatient_ID_Admit_PatientID.
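To make the convention concrete, here is a hedged example with hypothetical Patient and Admit tables following the prefixes above:

ALTER TABLE Patient.Admit
    ADD CONSTRAINT fkPatient_ID_Admit_PatientID
        FOREIGN KEY (PatientID) REFERENCES Patient.Patient (ID);
GO

-- Foreign keys are not automatically indexed, so add the index as well
CREATE NONCLUSTERED INDEX idxAdmit_PatientID
    ON Patient.Admit (PatientID);
GO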

Repeating Again, and again…
Though my opinion is not the same as some of these standards (I say you can abbrev. anything), I do believe standards should be discussed and established, then followed. If a deviation is needed, a group discussion and vote should decide the change.

Next week I will talk about Stored Procedures.

Thomas LeBlanc

Thursday, May 13, 2010

Database Standards Part II: Schemas, Tables & Views and Columns

This is Part 2 of 6 blogs about database standards we use at my current employer. Repeating myself, though we do not enforce these with an iron fist, they are advised for clarity and consistency. The important point, in my opinion, is to establish a set of standards and stick to them.

Schema

The separation of schema names and logins was a great addition in SQL Server 2005. Schema names should be used in place of dbo. This makes viewing in SSMS great with filters. It also categorizes the objects so a new DBA or developer can get a good view of the organization in a database. We suggest a Common schema for objects that do not fit in a group, then names that are specific to functionality - Payroll, HR, Billing, etc.
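A minimal sketch of that suggestion, using the example schema names from above:

CREATE SCHEMA Common AUTHORIZATION dbo;
GO
CREATE SCHEMA Billing AUTHORIZATION dbo;
GO
CREATE SCHEMA HR AUTHORIZATION dbo;
GO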

Tables
Table names should reflect their functionality, and they cannot start with a number, contain spaces or underscores, or be reserved words. ANSI_NULLS should be SET ON (OFF is deprecated in a future version). Pascal casing should be used, like ClientBill, and names should be singular, like Admit not Admits.

There are 3 different kinds of lookup tables we see - developer populated (workflow), small lists (state) and known lists that are updatable (RevenueCode). Small lookup tables can use a character primary key. If the developer insists on an ID (identity), then a character code/description column created with a unique constraint must be included. The code should be used in the transaction table. (I will clarify this in a separate blog.) Lookup tables use camel case - lkState.

The transaction table(s) should use an ID (identity or sequential int) as the primary key, but should have a unique constraint for uniquely identifying a row. Third Key Normal Form and higher is required. The ID field will be the foreign key in the child table, and named ID (PatientID in an Episode table).

Views
Camel casing should be used in naming views - vwPatientEpisode. Future thinking should be used: if the view will eventually become an indexed view, then schema binding and ANSI settings need to be looked at.

Columns
Column names should describe the data placed in them, using Pascal casing - LastName - and should not start with a number or special character, or contain spaces, dashes or underscores. Beware of reserved words or key words as object names. Do not repeat the table name - use LastName, not PatientLastName, unless it is a foreign key to another table - PatientID. No data types in the name - not intPatientCount - but Date is acceptable in a date column name - AdmitDate.

Be aware of new features in 2008 like the separate DATE and TIME data types. Before, you had to use midnight as the time in a date column in order to not specify a time.

Repeating
Though my opinion is not the same as some of these standards (I say you can abbrev. anything), I do believe standards should be discussed and established, then followed. If a deviation is needed, a group discussion and vote should decide the change.


Next week I will talk about Indexes, Constraints and Primary/Foreign Keys.


Thomas LeBlanc

Thursday, May 6, 2010

Database Standards Part I: Defs, Abbreviation & Data Types

This is Part 1 of 6 blogs about database standards we use at my current employer. Though we do not enforce these with an iron fist, they are advised for clarity and consistency. The important point, in my opinion, is to establish a set of standards and stick to them.

Definitions

We list some definitions at the beginning of our document. Included are camel casing (vwVisitOrders) and Pascal casing (MedicalData).

Abbreviations

Though most developers like to spell everything out because they can use IntelliSense, abbreviations are OK for some common and industry-standard terms. Examples are Id (identification), SSN, DOB or Rpt (Report).

Data Types

Now, the standards for these are good because of the consistency factor, comparisons (WHERE clause) and SQL Server features. Some features of SQL Server get deprecated, and others are improvements. A good example is using varchar(max) instead of the text data type. A sample table using these rules follows the list below.
   Yes/No: Use the bit data type. We also suggest naming the field like a question: IsDeleted
   Flag/Status: use tinyint (1 - Active, 0 - Inactive, other numbers can be used for future statuses)
   Use Unicode only if necessary: nvarchar(100)
   Integers: Be sure to properly size up front; if not sure, use int or bigint
   Fixed character lengths: State - char(2)
   Variable character lengths: LastName varchar(25)
   Money/Decimal – use over float/real. Price decimal(10,2)
   Null/Not Null – won’t even discuss
   Max – use with varchar, nvarchar and varbinary instead of text and binary.
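Here is the sample table mentioned above, a hedged sketch pulling the data type choices together; the table and column names are made up:

CREATE TABLE Billing.Invoice
(
    InvoiceID   int            NOT NULL,
    IsDeleted   bit            NOT NULL,
    RowStatus   tinyint        NOT NULL,  -- 1 = Active, 0 = Inactive
    StateCode   char(2)        NOT NULL,  -- fixed length
    LastName    varchar(25)    NOT NULL,  -- variable length
    Price       decimal(10,2)  NOT NULL,  -- decimal over float/real
    Notes       varchar(max)   NULL,      -- instead of text
    InvoiceDate datetime       NOT NULL
);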

Though my opinion is not the same as some of these standards (I say you can abbrev. anything), I do believe standards should be discussed and established, then followed. If a deviation is needed, a group discussion and vote should decide the change.

God Bless,
Thomas LeBlanc

Thursday, April 29, 2010

SQLSaturday New York: Smiling all day

SQLSaturday is catching on all over the place. It is amazing how many people are thirsting for SQL Server knowledge. One of the best experiences I have when going to a session is the validation that what I am doing at work is on the right track.

First, I would like to thank the sponsors for funding the event. Without them the event is not possible.

Confio (Ignite) even invited the speakers and volunteers to a Friday evening welcome. It was just what was needed the night before the event. Thank you Matt Larson and David Waugh.

Other sponsors included Quest, Idera, Expressor, Redgate, Pearl, XLeratorDB, PASS, Pragmatic Works, TwentySix, SetFocus, Agile Technologies, Microsoft, CozyRoc and Apress.

Biggest thank you goes out to the NJ SQL Server User Group and volunteers, who did a top notch job. Thanks Alex for helping the speakers and SQLDiva for leading the group.

Breakfast was plentiful, coffee, orange juice, milk, donuts, bagels, etc.

The first session I attended was Andy Leonard's SSIS Design Patterns. Andy did a great job with 2-3 patterns involving variables and events in SSIS.

Next was Matt Larson (Confio), showing how to use sysprocesses and DMVs/DMFs to get performance information from SQL Server. A gentleman from the audience encouraged all in attendance to install the trial version of Ignite, saying we would be amazed with the results from this tool. I am trying to schedule a day or 2 for a trial run.

Lunch was in the speakers room, chatting with other speakers and volunteers. Next to me were Thomas LaRock, Andy Leonard, Grant Fritchey, Slava Kokaev, Adam Jorgensen and others. Lunch was box style and worked great.

Grant was up next with a great presentation on More Unnecessary Tuning. Basically, some do's and don'ts up front when designing data access.

Andy Leonard continued in the afternoon with Incremental Loads in SSIS, but I had to leave before the end to prepare for my session. The Hash compare for changes interested me.

To close the day, I presented 3rd Key Normal Form: That's Crazy Talk!!! The room was a conference room with a table in the middle and chairs all around, with somewhere between 20-30 attendees (1 or 2 had to sit on the floor). The session was directed at good and bad DB designs, but I probably only got through the first 10-12 slides before the discussion went to Visio diagrams and SSMS scripting.

The participation from the attendees was great; I just had to keep the focus on the topic - kind of like leading a meeting at work. It turned out better than I imagined, and 5-6 attendees stayed a little while afterwards to thank me and offer some great advice and comments.

One open question was about attributes on an Entity that vary based on a product type. I have never used XML as a data type, but this seemed to be a good case for it. Another person suggested looking into an EAV design. I displayed a design in Visio that got a couple of people saying, 'That is not normal form.' OUCH!!!

It seems like a good session topic for 4th and 5th key normal form, and some industry standards or examples.

Plenty of prizes were given away by the sponsors and the conference.

My next blog post will be a series on the DB Standards document from my current employer. I could not include it in my session because it belongs to Amedisys.




God Bless,
Thomas

Friday, April 16, 2010

Maintenance Weekend...

This Saturday is the third Saturday of the month, which means I will be working. We have a shutdown where the networking, DBA and security teams can update the systems while they are offline to end users.

These functions used to have to be done on unscheduled evenings whenever possible, but now it has been on Saturday for over a year. The time has gone from 6-10 hours to 2-3 if no problems occur. Also, because of failover systems, the work has been lighter.

LSU plays Alabama in baseball and I will be able to go Saturday and Sunday. They beat Tulane on Wednesday night 10-4. Always enjoyable to see us take down the Greenies.

God Bless,
ThomBeaux

Thursday, April 8, 2010

Passed 70-431, now MCITP DB 2005

I assumed, by reading the title of 70-447, that all I needed to upgrade from MCDBA to MCITP was to take this test. Wrong!!!

You need to pass 70-431 and become an MCTS first, then take 70-447. Well, I did it in reverse and now can say I am an MCITP. Next, the 2008 certification.

The test was 35 multiple choice questions, which did not seem too hard. What I did not realize was that 12 interactive questions were coming down the pipe.

You actually would be given a scenario, then have to use a stripped down version of SSMS (Management Studio) to solve the problem. I have to say, I was not prepared for this, but still finished and passed with an 880. Not bad.

Certifications might not be what they were years ago, but the experience, in my opinion, is good. There is a lot of studying required, plus it questioned some of the practices we DBAs do at our site.

Some questions about CLR, security and Service Broker were areas we do not use, and I had to learn them from reading. Others like database mirroring, replication, log shipping and network protocols were areas we use and were easy.

God Bless,
Thomas

Thursday, April 1, 2010

SQLSaturday Baton Rouge

Great lunch with the planning group. Looks like we will have 9 total tracks, 5 SQL Server and 4 development.

The submitted sessions have us about 75% full, with additional sessions and alternates available.

I spoke last year, but was not involved with the planning. A lot of good people are helping.

This week at work required some DBA reviews of developers' code. I am also reading a 2008 Analysis Services book which has been great. Microsoft gave us a copy and I am on chapter 12, with 4-5 to go.

It seems the tool is great, but the design has to follow some best practices in order to work as well as designed. There are many properties to change in a dimension or cube to get the output just right. Another case of: if you do not start using it, you will forget where to adjust the output.

God Bless,
ThomBeaux

Thursday, March 18, 2010

New York, New York

Very excited about going to New York April 24th and speaking at SQLSaturday. My girlfriend and I will go down on Thursday and watch some Broadway shows and visit a museum or 2.

The talk is '3rd Key Normal Form: That's crazy talk!!!' - a fellow employee has this on his whiteboard in his cube. In all my time as a developer and DBA, there is always some denormalization, or really no full normalization to begin with.

The session is going to show some of the experiences I have encountered over the years. One is in the Laboratory Information Management System arena. My first job out of college was to help normalize data from a file system (written in Professional Basic - DOS 3.3). We used FoxPro for DOS to create a report writer for our clients. Many years later MS Access came along, and so did Visual Basic. I had some great ideas that the developers (chemists by training) did not want to implement.

The second example is a FEMA program for funding distribution and Performance Indicator tracking. This was a great example of compound primary keys getting larger and larger. Two lessons here: not every table is part of a hierarchy, and parent-child relationships can benefit from ID (identity field) primary keys.

If I have time, I think I want to get the audience into a 4th-6th normal form discussion. Paul Nielsen will be there, and it would be great to visit with him about database normalization.

God Bless,
Thomas

Thursday, March 11, 2010

Wynton Marsalis and Fragmentation

With replication and mirroring on our large databases, there is a limit on the amount of reindexing that can be done each evening. We have a job that loops through a couple of tables to pick what to attempt to rebuild/reorganize on a nightly basis. Recently, it has been updated to a better structure of lookup tables and 3 levels of possible work:

1. Rebuild if AvgFrag > 60 or AvgPageSpaceUsed < …
2. Reorganize if AvgFrag > 30 or AvgPageSpaceUsed < 65
ELSE
3. Update Stats

I added a History table to record Start/End dates and the Before/After AvgFrag & SpaceUsed values.

We hope after a couple of months we can get some useful statistics to change FillFactor on some indexes.
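For anyone building something similar, here is a hedged sketch of the kind of query a nightly job could drive its decisions from; the thresholds follow the list above, and our actual job uses its own lookup tables, which are not shown:

SELECT OBJECT_NAME(ps.object_id) AS TableName,
       i.name AS IndexName,
       ps.avg_fragmentation_in_percent AS AvgFrag,
       ps.avg_page_space_used_in_percent AS AvgPageSpaceUsed,
       CASE
           WHEN ps.avg_fragmentation_in_percent > 60 THEN 'REBUILD'
           WHEN ps.avg_fragmentation_in_percent > 30
             OR ps.avg_page_space_used_in_percent < 65 THEN 'REORGANIZE'
           ELSE 'UPDATE STATISTICS'
       END AS SuggestedAction
  FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'SAMPLED') AS ps
       INNER JOIN sys.indexes AS i
         ON i.object_id = ps.object_id
        AND i.index_id = ps.index_id
 WHERE ps.index_id > 0;  -- skip heaps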

Tuesday night I brought my girlfriend to see Wynton Marsalis and the Jazz at Lincoln Center ensemble performing in Baton Rouge. The set started with some old and new swing, then finished with 7 works dedicated to 7 different artists. The composer explained how each song was written with the artist and his/her works of art in mind.

http://www.wyntonmarsalis.org/2009/12/23/marsalis-tour-march-2010/

It gave a unique view into how any type of work you do can be art...

God Bless and keep smiling,
Thomas :)

Friday, March 5, 2010

Updating large tables that are Replicated and Mirrored

I am working on an article for submission to explain the steps we use to update a transactionally replicated table going to a Data Warehouse, where the database is also mirrored to an off-site disaster recovery site.

The mirrored site connection and transfer rate range from 500k/sec to 2000k/sec, and adding columns to a large (20+ million row) table produces a lot of TLog to transfer to the mirrored site. The replicated transaction is delayed by the mirror.

In general, here are the steps:
1. Bring OLTP and OLAP systems down. This is already done because of the release.
2. Make sure all transactions have been committed to subscribers. Use Replication Monitor.
3. Generate scripts for DROP and CREATE of the Publication and Subscription(s)
4. DROP the Publication and Subscription(s)
5. Run the ALTER TABLE or any other scripts on all databases, publisher and subscribers
6. Create the Publication without running the add-subscription parts of the scripts
7. Use the system SP sp_scriptpublicationcustomprocs to generate the replication stored procedures (see the sketch after this list)
8. Execute the SPs on the Subscriber(s)
9. Run the Create Subscription script with @sync_type = 'none', not 'automatic'
10. Use Replication Validation from Replication Monitor to see if everything is OK
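The sketch for step 7: the publication name below is made up, and the procedure is run in the publication database, printing the CREATE PROCEDURE script for the insert/update/delete procedures:

-- Generate the custom replication procedures for one publication
EXEC sp_scriptpublicationcustomprocs @publication = N'DataWarehousePub';

The generated output is then run on each subscriber, which is step 8 above.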

The only issue has been if the mirror still has not finished the transaction. Sometimes we have to wait 1-2 hours for this to finish.

There is a new trace flag, -T1448, that is supposed to help with this, but the tests we have tried have not shown this to be so.

God Bless,
Thomas