Tuesday, December 27, 2016

Query Store–SQL Server 2016

SQL Server's new Query Store is a brilliant option if you are a DBA or have to performance tune T-SQL queries. This new feature of SQL Server 2016 is configured per database, with options controlling how much history and how many query plans are stored. It has many benefits you can take advantage of today, and it also helps with future upgrades to new SQL Server versions because the history is kept in the database rather than in the instance.


 

To enable the Query Store, you have to be db_owner of the database or a sysadmin on the instance. Right-click the database in SQL Server Management Studio (SSMS), select Properties, go to the Query Store page on the left, and change Operation Mode (Requested) to Read Write. There are other options to control how long data is retained as well as how often it is collected.
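The same setting can also be applied with T-SQL. A minimal sketch, assuming a database named AdventureWorks2016:

ALTER DATABASE AdventureWorks2016
SET QUERY_STORE = ON;

-- Optional: retention and collection settings comparable to the SSMS dialog.
ALTER DATABASE AdventureWorks2016
SET QUERY_STORE (
    OPERATION_MODE = READ_WRITE,
    CLEANUP_POLICY = (STALE_QUERY_THRESHOLD_DAYS = 30),
    DATA_FLUSH_INTERVAL_SECONDS = 900,
    INTERVAL_LENGTH_MINUTES = 60
);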


 

This will enable the database to start storing query execution plans as well as statistics like IO, reads/writes and execution times in the database itself. There are new catalog views and dynamic management objects (DMVs/DMFs) you can use to retrieve this data manually.
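For example, a minimal sketch of pulling average duration and logical reads per plan from the Query Store catalog views:

SELECT qt.query_sql_text,
       p.plan_id,
       rs.count_executions,
       rs.avg_duration,
       rs.avg_logical_io_reads
  FROM sys.query_store_query_text qt
    INNER JOIN sys.query_store_query q ON q.query_text_id = qt.query_text_id
    INNER JOIN sys.query_store_plan p ON p.query_id = q.query_id
    INNER JOIN sys.query_store_runtime_stats rs ON rs.plan_id = p.plan_id
  ORDER BY rs.avg_duration DESC;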


 

There are also reports available at the database level in SSMS to view information about query statistics. If you drill into the database in SSMS, you will see an option for Query Store.

 


Here, you will see the four default reports that come with this option.

  1. Regressed Queries – shows query history and changes in statistics
  2. Overall Resource Consumption – history of resources used in the database
  3. Top resource Consuming Queries – Top x of queries using the most resources
  4. Tracked Queries – enables you to see multiple query plans for a T-SQL statement and compare the plans or force a plan
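As an example of item 4, forcing a plan can also be done in T-SQL with the Query Store stored procedures. The query_id and plan_id values below are hypothetical; look them up in the reports or catalog views first.

EXEC sp_query_store_force_plan @query_id = 42, @plan_id = 7;

-- To remove the forced plan later:
EXEC sp_query_store_unforce_plan @query_id = 42, @plan_id = 7;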

The above is just a summary of what you can do. Start using this feature and you will be amazed at what it lets DBAs and others drill into, and that drilling is limited to the query history of one database rather than sorting through all the plans for an entire instance.

Thursday, November 10, 2016

SQL Server 2016–Live Query Statistics

WOW!!! The next great update to Execution Plans is a way to watch rows flow through the iterators while the query runs. Never saw that coming. And as a bonus, you can use SQL Server 2016 Management Studio (SSMS) to watch queries run on versions 2014, 2012 and 2008 R2.

SSMS 2016 can be installed by itself without the server installation. Go here for the installation. Once you install SSMS, you can start to view this addition in SSMS, like Figure 1.

 


Figure 1 – Live Query Statistics

After opening a New Query, go to the Query menu and select "Include Live Query Statistics." You will need a long-running query in order to see the processing happen at a slow pace.


Figure 2 – Include Live Query Statistics
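For example, a deliberately slow query gives the iterators time to show progress. This sketch assumes the AdventureWorksDW sample database is installed; any query that runs for a while will do.

-- A cross join just to make the query run long enough to watch.
SELECT COUNT_BIG(*)
  FROM dbo.FactInternetSales a
    CROSS JOIN dbo.FactInternetSales b;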

Once you start a query, a new tab will show the Live Query Statistics like Figure 3. A dotted line means the iterator has not finished or has not started.

 


Figure 3 – New Tab

The labels on the iterators have the usual properties of an Execution Plan, plus a running clock of the seconds each iterator has been using. In addition, you can see the "Actual Number of Rows" over the "Estimated Number of Rows" with a calculated percent done (Actual/Estimated). Figure 4 shows 144% for some iterators that have not even finished yet.

 


Figure 4 – Percent Greater Than 100

This is useful for seeing whether the estimated number of rows is the same as, or close to, the actual number of rows. The percentage rises above 100 when there are more actual rows than estimated. The query in this case could probably use some performance tuning to get the estimate closer to the actual.

WOW!!!

Friday, October 28, 2016

PASS Summit 2016 – Announcements

There have been several announcements of new and upcoming features in the Microsoft Data Technology ecosystem. The ones I am hearing about today are the Data Migration Assistant (DMA) v2.0 GA and the Tech Preview of the Database Experimentation Assistant (DEA). James Serra blogged about it here - http://www.jamesserra.com/archive/2016/10/pass-summit-announcements-dmadea/. The DEA helps you understand the performance improvements you would get by upgrading.


The next one is the Tech Preview of building Power BI reports in SQL Server Reporting Services. For more info, go to the Reporting Services team blog - https://blogs.msdn.microsoft.com/sqlrsteamblog/2016/10/28/create-power-bi-reports-in-the-sql-server-reporting-services-technical-preview/

The most exciting one for me is Azure Analysis Services. It is Tabular models only right now, but OLAP is said to be in the works. This is interesting because it once was in the cloud and then that cloud disappeared. It will be nice to create a DW in Azure SQL Data Warehouse, connect to that data with Analysis Services, and report on it in Power BI, all in Azure. Never thought I would say that. Smile

 


 

Go read about Azure SSAS here - http://sqlmag.com/sql-server/microsoft-announces-azure-analysis-services-preview-pass-summit-2016 and https://azure.microsoft.com/en-us/blog/introducing-azure-analysis-services-preview/

Chris Webb has already written his first thoughts about Azure Analysis Services - https://blog.crossjoin.co.uk/2016/10/25/first-thoughts-on-azure-analysis-services/

PASS New Logo


Devin Knight presenting Advanced PowerBI


Community Zone


PASS Summit 2016-Thursday Part II

The second half of the day was spent talking with various speakers and attendees. I got into a great conversation about trying to mentor developers in the workplace on how to write good T-SQL. It was interesting to hear how some developers (or DBAs) are not open to new ways of writing T-SQL, like using CTEs or proper join syntax.


I met Lance from the program committee and discussed the difficulty of having 800+ submissions and only about 140 slots for them. Making sure you get a wide range of topics without covering too little or too much of the same topic is not always easy. They do a great job with the volunteers to make sure Summit is a great learning opportunity for all.

I also saw a new posting on SQLServerCentral.Com about learning SQL Server 2016. Try this Link - http://www.sqlservercentral.com/articles/SQL+Server+2016/127518/.


The only session I attended was on learning how to price and scale an Azure SQL Data Warehouse. So, now I understand DWUs, and pricing is threefold: DWUs, storage and geo disaster recovery. It can be expensive, but you get Massively Parallel Processing in the cloud and easier scaling of usage up or down.

The last two session time slots were spent co-presenting with Bill Anton on Performance Tuning Analysis Services. It was well attended, though in a large room, and about 2/3 of the people returned after the break for the 3-hour session. Bill did a great job getting people started with Profiler, PerfMon and some free tools and scripts. There were lots of questions, and even a Microsoft developer helped us answer them. Thanks, Microsoft.

Last stop of the day was dinner with some great guys from Atlanta at Rock Bottom. They went to a BIML party, I went to bed. Smile

Thursday, October 27, 2016

PASS Summit 2016-Thursday Part 1

Today's keynote was by David DeWitt. He did an excellent job explaining the architecture behind various cloud data warehouse offerings. It was nice to get an explanation of the behind-the-scenes node and disk usage.


One announcement that caught me off guard was that next year's Summit is Oct 31 thru Nov 3. That is Halloween, and my kids come first on that day. Not sure that was a good idea. Better news shared was a 10% increase in revenue, plus a 100%+ increase in global membership in some areas. There is a new logo, and the updated website will be released early next year. Denise provided some slides on what the website will look like, including a mobile-friendly design.


Next, I am going to meet up with Bill and attend a session on Azure DW.

PASS Summit 2016-Wednesday


As with all Summits, the first regular session day in Seattle started with the keynote. Well, really it starts with a great breakfast at the Seattle Convention Center, but I digress. Microsoft released information about new features for its Microsoft Data Platform. The keynote provided customer-specific examples of some of the newer features in use, plus Program Managers demoing great examples of what is to come. For a summary, go watch the PASStv recap. A feature I am looking forward to seeing in action is Analysis Services in the cloud.

Congratulations to Malathi Mahadevan as PASS Outstanding Volunteer of the Year. She is a great representative of the PASS community and people love her service and passion.


The Summit is an amazing trip. Wednesday, I spent time with Bill Anton rehearsing the 3-hour Analysis Services Performance Tuning session he has put together for Thursday. His knowledge of this subject is top tier, so I learned a lot. I also spent time in the Speaker Ready room talking and practicing for a session on Excel Tips & Tricks with SSAS Cubes. This session takes all the technical work we IT people put into a cube and starts to have some fun with it. What amazed me was that half the people in the session (there were at least 100 attendees) were cube builders. They were in the session to find out how people use cubes in Excel so they can develop better cubes. That is a lot of people still building cubes.

Before the Vendor Reception, I spent time after the Excel session talking to an attendee. He was unique to me because he was not an IT person, but was hired as a financial/accounting analyst. He was using data feeds from their hosted applications in Power BI. It was difficult to explain to him the difference between a Cube, the Tabular Model, PowerPivot and the modeling in Power BI. Dimensional Modeling might be too advanced for him; he just wanted basic table design hints.

He was also concerned that there were not enough beginner T-SQL and database design sessions available for him at the Summit, so I spent time with him going through all the remaining sessions that were a fit for him, plus vendors he should visit. He had never heard of Pluralsight or other online training like that from Pragmatic Works. We exchanged business cards and wished each other a pleasant week as we walked past the Community Zone and into the Vendor Reception.

Tuesday, October 25, 2016

PASS Summit 2016–Tuesday

For the first day at the Summit this year, I spent half my time in a pre-conference session with Idera as an outgoing Idera ACE. The session included a 45-minute presentation from me on a customer use case. The second half of the day was spent in #SQLPASS community sessions.

The customer use case included screen captures of real-world diagnosis using Idera products. Diagnostic Manager provided most of the screens. I was able to show the forecasting reports for data file and table usage/growth. We were able to show a large table whose indexes were occupying lots of space and investigated the index usage on a historical level through Glenn Berry's scripts. Removing some large INCLUDE indexes helped alleviate the table's growth in storage size.


The other slides were about Alerts/EMails, User T-SQL History, BI Manager and Free Tools like SQL Check and SQL Fragmentation Analyzer.


The community sessions were for #SQLSaturday and User Groups/Virtual Chapters. The SQLSaturday session was a full house with many comments and questions. Money and sponsorship were the hot topics. Growth is still being encouraged, and overall the session had a lot of good points, with many thanks for such a great community event system. At the end, an emotional Karla talked about the end of her five-year reign as community leader for PASS. What a wonderful ending to the session!!! I won't show any pictures of her crying (or Grant crying).

The User Group and Virtual Chapter session was more about speakers than money or sponsorship.

Truly amazed at the turnout and the range of areas represented. I wonder how many User Groups use remote speakers, and how often? There was an interesting discussion about getting a Microsoft TAM or related MS employee to help sponsor or host monthly meetings. You do not pay them, but it helps them in their reviews/bonuses.

Ryan Adams announced new features in the PASS-hosted user group pages. There will be an option to send out event notifications to email, Twitter, Facebook, LinkedIn, Meetup and other social media from one place, instead of doing each manually. Cheers from the crowd!!!

The evening ended with a bite to eat at the Welcome Reception and visiting with new and old friends, too many to list.

Thursday, October 6, 2016

VOTE!!! PASS Board of Director Elections

It is voting season again, and there is a great group of individuals running for the Board of Directors of PASS. You can go to the Elections Page of the SQLPass.org site and review all the information about the elections. Logging into the site will enable you to vote.

As part of the Nomination Committee this year, I got to be a part of the process that helps PASS Connect, Share and Learn. Hearing the candidates talk about PASS and the future is very encouraging. Also, watching Headquarters (HQ) put everything together and keep the process rolling was something I never thought about before being on the NomCom. HQ does more than I ever imagined.

To find out more about the candidates, you can go to this page - http://www.sqlpass.org/Elections/Candidates.aspx. There is also still time to look at the Twitter chats that happened in the last 24 hours; just search Twitter for the #SQLPass hashtag. You can also send the candidates questions, and I am sure they will find time to answer you.

Also, remember the PASS Summit is in less than 3 weeks, and I am excited to share with others about the NomCom and what I have learned about the PASS organization.

PASS Summit 2016

Saturday, September 10, 2016

Live! 360–Speaking at SQL Server Live Orlando 2016

If you are looking for a great conference to attend, please look at Live! 360. This event combines 6 different areas of IT (including SQL Server), with some great experts presenting sessions over 5 days. The sessions run December 5th through 9th in Orlando, FL at the Royal Pacific Resort.

This is a link with a discount code for up to $500 off - http://bit.ly/LSPK47_REG

Here are the three sessions I am giving.

SQT07 New Performance Tuning and Security Features in SQL Server 2016

12/06/2016

2:00pm - 3:15pm

Level: Introductory to Intermediate

SQL Server 2016 has added and improved features in an already great product. To start, there is now a Query Store to retrieve the history of a query's execution plans and the statistics used for those plans. You can compare them to see the changes. You can also see Execution Plans "Live" to find where a long-running query is spending its time. And you can compare plans side-by-side, which should make DBAs, or anyone who performance tunes queries, very happy.

From there, you'll go into some of the database design aspects of 2016 that improve table implementation. In-Memory OLTP was half-baked in previous versions and is now enterprise ready. Columnstore indexes now include updateable clustered and nonclustered indexes. Temporal Tables remove the need for triggers or custom code to find a point-in-time version of the data. You can now mask columns to limit what end users see based on their permissions.
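For example, here is a minimal temporal table sketch (illustrative table and column names) showing the point-in-time querying mentioned above:

CREATE TABLE dbo.ProductPrice
(
    ProductID  int           NOT NULL PRIMARY KEY CLUSTERED,
    ListPrice  money         NOT NULL,
    ValidFrom  datetime2(2)  GENERATED ALWAYS AS ROW START NOT NULL,
    ValidTo    datetime2(2)  GENERATED ALWAYS AS ROW END NOT NULL,
    PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo)
)
WITH (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.ProductPriceHistory));

-- Point-in-time query, no triggers or custom history code required.
SELECT * FROM dbo.ProductPrice
  FOR SYSTEM_TIME AS OF '2016-06-01T00:00:00';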

The T-SQL enhancements will help with writing better set-based queries. The Stretch Database feature will assist in archiving data to the cloud while keeping it accessible to applications. Always Encrypted helps the database comply with regulations in health care and finance. The last feature will be row-level security, which has been a frequently requested option.

You will learn:

  • About T-SQL enhancements in SQL Server 2016
  • About Table design changes in SQL Server 2016
  • About Query Plan improvements in SQL Server 2016
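And here is a minimal sketch of the row-level security feature mentioned above, with illustrative table, column and user names:

CREATE FUNCTION dbo.fn_SalesFilter (@SalesRep sysname)
RETURNS TABLE
WITH SCHEMABINDING
AS
    RETURN SELECT 1 AS allowed
           WHERE @SalesRep = USER_NAME() OR USER_NAME() = 'SalesManager';
GO

CREATE SECURITY POLICY dbo.SalesPolicy
    ADD FILTER PREDICATE dbo.fn_SalesFilter(SalesRep) ON dbo.SalesOrders
    WITH (STATE = ON);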

 

SQW05 Master Data Management with Data Quality (DQS) and Master Data Services (MDS) in SQL Server 2016

12/07/2016

9:30am - 10:45am

Level: Intermediate

Is there data scattered all over your enterprise? Has your boss asked you to help your users create and manage one source of common data? Are there different spellings of values in the same column of a table? Can you use help from algorithms to find the best match? Is the Customer or Product table in multiple databases with different columns? Do you need help managing the Data Warehouse?

This session will jump into SQL Server 2016 with DQS and MDS. You use the Data Quality projects in Integration Services (SSIS) to rank values and help with misspelled data. You'll help your users manage the DQS projects giving them management access to data quality and moving that responsibility to the business where it should be.

Next, you'll use MDS to help consolidate dimensions scattered in multiple databases to form a true Conformed Dimension area. MDS will help your company move away from spreadsheets or flat files to manage critical information for analytics. These features were added to SQL Server in version 2012 and have been upgraded in 2014 and 2016 to help enterprises with the task of Master Data Management.

You will learn:

  • How to use Data Quality Services to cleanse data with DQS and SSIS
  • How to create a Master Data Management system with MDS in SQL Server 2016 to consolidate multiple data sources

 

SQW12 Improve Enterprise Reporting with SQL Server Analysis Services

12/07/2016

4:00pm - 5:15pm

Level: Intermediate

Have you been at a job where requests keep coming in for more and more reports? Does it seem like some are being used and others are not? Are some of the reports exactly the same except for a different grouping or sorting request? Has your boss asked you to investigate a Business Intelligence or Data Warehouse solution? Well, it might be time to start using SQL Server 2016 Analysis Services.

Dimensional Modeling is one of the best starts for designing a system with flexible reporting. The database model fits perfectly into Analysis Service (SSAS) databases from Microsoft. There will be a use case for 3 fact tables to create various data marts. We will demonstrate a Multidimensional Cube and the Tabular Model to help you make the decision of which installation to use. Excel and PowerBI will be the focus for reporting.

Thursday, July 21, 2016

Using a Lookup Component in SSIS for Surrogate Keys in a Fact table

There are many suggestions for loading a data warehouse using SQL Server Integration Services (SSIS). Once you get started, you find a pattern to repeat for facts and dimensions. This might be because of the consistency Dimensional Modeling suggests for the design of fact and dimension tables. The SSIS Lookup component provides options to divert the handling of non-conformed data, or you can use your T-SQL skills up front so a diversion is not necessary.
The most common use of a Lookup component is to find the dimension surrogate key for one or more rows in a fact table. The CustomerKey from the Customer dimension will be used in the Sales fact table in order to join to customer hierarchies and attributes, as in the following T-SQL.

SELECT c.CommuteDistance, SUM([OrderQuantity]) AS OrdQty, SUM([SalesAmount]) AS SalesAmt
  FROM [dbo].[FactInternetSales] fis
    INNER JOIN dbo.DimCustomer c ON c.CustomerKey = fis.CustomerKey
  GROUP BY c.CommuteDistance


CommuteDistance   OrdQty   SalesAmt
5-10 Miles         10615    4893148.0413
2-5 Miles          10084    4965514.4362
1-2 Miles          10170    4541608.1498
0-1 Miles          21307   11207592.2248
10+ Miles           8222    3750814.3686

Table 1: Sales Amount and Order Qty sums for Commute Distance

Once you get the source data for the Sales fact into an SSIS package, the Lookup component can be used to get the surrogate key from the customer dimension. In Figure 1, the object LU – Customer Key is the lookup in the flow that populates this fact table.

Figure 1: Loading the FactInternetSales table from the AdventureWorksDW Database

The options for Lookups vary based on some properties. In Figure 2, we can see some of these options.

Figure 2: General Lookup properties

Most lookups for dimensions can use the Full Cache mode because there are not a lot of dimension rows. But if you come across large dimensions in a data warehouse, some time should be spent seeing whether a Partial Cache or No Cache setting helps with loading speed, as well as using a Cache connection manager. The Full Cache option loads all the data from the dimension table into memory before any flow starts in the package. This is why it is good not to SELECT all the columns of the dimension table for a lookup.

Figure 3: Connection properties

Figure 3 shows the connection using a T-SQL statement to get only the CustomerKey and CustomerAlternateKey from the DimCustomer table. The CustomerAlternateKey is the business key that matches customers to rows in the source data for FactInternetSales. If you use the option "Use a table or a view", the generated query will select all columns from the source.
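The reduced query behind Figure 3 would look something like this (assuming the AdventureWorksDW DimCustomer table):

SELECT CustomerKey, CustomerAlternateKey
  FROM dbo.DimCustomer;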
Let’s go back to the General properties to look at “Specify how to handle rows with no matching entries”. Figure 4 shows the different options available.

Figure 4: General properties

The “Fail component” option will stop processing of the import if no match is found in the lookup table. This is not a good option for loading data into a fact table. Most systems would want the import to continue for the customer surrogate keys that are found.

  1. Ignore Failure – Null replaces the lookup values selected. Rows with no match are streamed to the normal flow in the package with Null in the selected match columns.
  2. Redirect rows to error output – the red line output will show a failure but can pipe the data to any component. Rows with no match are not streamed to the normal flow in the package.
  3. Fail component – the package stops with a failure if any row has no match; if all rows match, there is no failure.
  4. Redirect rows to no match output – the output can be piped to another component and processing continues. Rows with no match are not streamed to the normal flow in the package.

So, the two other options I use most are “Redirect rows to no match output” or “Ignore Failure”. The redirect rows… option can be used to stage the rows that have problems and get someone to look at them before the next import. The Ignore Failure option can be used to substitute a Null value in the CustomerKey column.

Now, this will not work if importing to a fact table with foreign keys and a Not Null property on the CustomerKey column, but what I do most often is change the Null value to -1 and have an entry in the DimCustomer table for the Unknown possibility like this example.

CustomerKey  GeographyKey  CustomerAlternateKey  FirstName  LastName  MaritalStatus  Gender
-1           -1            -1                    Unknown    Unknown   U              U
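A minimal sketch of seeding that unknown member is below. Only a handful of DimCustomer's columns are shown, and constraints in your dimension (an identity on the surrogate key, foreign keys, NOT NULL columns) may require SET IDENTITY_INSERT or additional values.

INSERT INTO dbo.DimCustomer
    (CustomerKey, GeographyKey, CustomerAlternateKey, FirstName, LastName, MaritalStatus, Gender)
VALUES
    (-1, -1, '-1', 'Unknown', 'Unknown', 'U', 'U');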


Figure 5: Derived Column component

Figure 5 shows what can be done to convert the Null value in the CustomerKey column to -1. This way, with the -1 row in the customer dimension table, we can insert the row into the fact table without having to process rows outside of this package.
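As mentioned at the start, the same lookup can also be handled up front with T-SQL instead of a diversion in the data flow. A minimal sketch, assuming the source rows have been staged in a table carrying the customer business key (the staging table name here is illustrative):

SELECT ISNULL(c.CustomerKey, -1) AS CustomerKey,   -- -1 = unknown member when no match
       s.OrderQuantity,
       s.SalesAmount
  FROM dbo.StageInternetSales s
    LEFT JOIN dbo.DimCustomer c ON c.CustomerAlternateKey = s.CustomerAlternateKey;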

Figure 6: Selected Lookup Columns and names

Figure 6 shows the Columns tab properties. Here is where we match the source fact column for customer to the business key in the Customer dimension. We also select the CustomerKey from the Available Lookup Columns list in order to get the surrogate key for the fact table. The selected CustomerKey is where the Null value will be located if no match is found.

This process also indicates that another process needs to be implemented to determine why incoming fact rows have customer business keys that have not been imported into the Customer dimension. This one tip can go a long way, and patterns can be developed to help with this situation. It also shows the power and flexibility SSIS provides data warehouse architects in implementing an ETL solution for the business.

Friday, June 24, 2016

PASS Summit 2016 – Connect, Share, Learn

Please join me and about 4000+ other technology professionals for the annual PASS Summit in Seattle, WA October 24th – 28th. But before I talk about this year’s event, join me in a trip down memory lane about my path to the PASS Summit. If not, skip to the end of this blog. Smile


My first trip to the PASS Summit was in 2007 or 2008 in Denver, CO. I saw the advertisement on SQLServerCentral.com with names of speakers that had written articles I was familiar with. To think I would get to learn from these speakers and get out of the office for a week seemed only a dream. But, it became a reality. My boss actually wanted to attend as well which started a training fund for the IT department. SQLServerCentral.com had a code that would give you a discount for the event as well as entry to the then famous Reception Party and Casino Night with prizes. Man that used to be fun.


Opportunities at work guided us to the sessions to attend. First, we wanted to know more about clusters since we had two different physical clusters that had some problems with failing over and restarting. There were many databases to manage, but two in particular were 500+ GB (on SQL Server 2005) and growing fast. An indexing session on rebuild versus reorg grabbed our attention. The last objective was to see what DataDude was all about; this was the first version of a database project in Visual Studio. I think Grant Fritchey did this talk. There was also a Database Normalization talk given by Louis Davidson and Paul Nielsen (I really miss Paul's talks at PASS).


Of course, since Microsoft was going to be present at the event, we had to spend time talking to the support team about problems with SQL Server. A vendor, Scalability Experts, had a booth, and two of the speakers had helped us with an upgrade from SQL Server 2000 to 2005. Does anyone even remember those wonderful times?


After our second journey to the Summit the following year, I started thinking that I could do some of these talks because I had learned just as much as they were teaching. This started me on speaking at Lunch and Learns at work followed by the Baton Rouge SQL Server User Group (Thanks Patrick LeBlanc). I think I actually spoke at the .Net User Group first. I also started going into LSU with AITP to teach subjects about databases as well as career sessions to show graduates the different technology companies in and around Baton Rouge.


Since then, I have had the privilege of speaking many times through the years on many topics I use day to day at work. Most tedious has been the volunteer work with the Abstract Review and PowerPoint Review committees over the past 6 years, as well as helping the Performance, Data Architect and Excel BI Virtual Chapters. This has enabled me to get out of my shell and interact with the larger Professional Association of SQL Server user community (oops! Now the Microsoft Data Technology community). Smile

This has led to me being selected this year to speak at the PASS Summit 2016 in Seattle, WA. I am really going to enjoy sharing the stage with Bill Anton for an SSAS Performance Improvement session. If you have not started reading Bill's blog, please start. He is a smart dude with some great experience and enthusiasm with Analysis Services.

So, all this was shared to invite you to the PASS Summit. It does not matter why you come: sessions, SQLCAT, networking, speaking (future speakers), vendor parties, SQLKaraoke or just a break from work. Make the trip, get involved and help change or mold the Microsoft Data Technology community into what we need to help others adopt the new features or tune the old skills.

Sunday, May 29, 2016

Why use a Date Dimension Table in a Data Warehouse

In the Data Mart, or Data Warehouse, world there is a date dimension table in all schemas if you are using the Kimball Dimensional Modeling method. In the beginning of Dimensional Modeling, it was called a Time dimension. Since then, the Time dimension has been separated from the date values. After listing all these terms - time, date, dimensional modeling, data mart, data warehouse, etc. - you might be more confused than ever about what we are talking about.

So, let's start with some definitions. A Data Mart is actually a small part of a Data Warehouse. The Data Warehouse is considered the entire set of tables in a database. A Data Mart is defined by a business process and is represented by the data that process contains. The structure of a Data Mart enables a simple way to do querying or reporting. The data population stage has a component that translates complex data into an actual description column rather than a code.

Dimensional Modeling is the process of taking the data from various sources and producing this Data Mart structure, usually with one Fact table and multiple Dimension tables. The Dimension tables are related to the Fact table by surrogate keys. The Fact contains the measures of the data the business process wants to consume. There can be multiple Data Marts in a Data Warehouse, so do not get hung up on the single Fact table in a Data Mart. Eventually, you will see the Dimension tables related to many Fact tables in the overall schema. These dimensions are termed Conformed Dimensions.

The Date dimension is one of these dimension tables related to the Fact. Here is a simple Data Diagram for a Data Mart of Internet Sales information for the Adventure Works DW database which can be obtained for free from CodePlex or other online sources.


As you can see in this diagram, the Date table (DimDate) is related to the fact table (FactInternetSales) by 3 different dates in the fact: Order, Ship and Due Date. This is called a Role-Playing dimension in the Dimensional Modeling world. The relationship is created by the surrogate key columns (integer data type) rather than the date data type. The DateKey column in the DimDate table is related to the OrderDateKey column in the FactInternetSales table, as well as to ShipDateKey and DueDateKey.
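A quick sketch of the role-playing idea in T-SQL: the same DimDate table joined twice, once as the Order Date and once as the Ship Date.

SELECT od.CalendarYear AS OrderYear, sd.CalendarYear AS ShipYear, SUM(fis.SalesAmount) AS SalesAmt
  FROM dbo.FactInternetSales fis
    INNER JOIN dbo.DimDate od ON od.DateKey = fis.OrderDateKey
    INNER JOIN dbo.DimDate sd ON sd.DateKey = fis.ShipDateKey
  GROUP BY od.CalendarYear, sd.CalendarYear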

The DimDate table has columns that normally would have to be extracted from a Date column with a function. These include CalendarYear, CalendarQuarter or DayNumberOfMonth. The data in this table has a row for every day possible in your Data Mart.

 

DateKey   FullDateAlternateKey  DayNumberOfMonth  CalendarQuarter  CalendarYear
20050101  2005-01-01            1                 1                2005
20050102  2005-01-02            2                 1                2005
20050103  2005-01-03            3                 1                2005
20050104  2005-01-04            4                 1                2005
20050105  2005-01-05            5                 1                2005
20050106  2005-01-06            6                 1                2005
20050107  2005-01-07            7                 1                2005
20050108  2005-01-08            8                 1                2005
20050109  2005-01-09            9                 1                2005
20050110  2005-01-10            10                1                2005

Think about the reason for every day to be in this table. If there are no Internet Sales on 12/28/2010, then we would have to do some fancy LEFT JOIN with a sub-query to create that row in an analysis of the data. But with a Date dimension table, we LEFT JOIN from the DimDate table to the Internet Sales fact table and can return $0 with an ISNULL around SalesAmount in the query below.

 

SELECT d.CalendarYear, d.FullDateAlternateKey, ISNULL(SUM([SalesAmount]), 0) AS SalesAmount
  FROM dbo.DimDate d
    LEFT JOIN [dbo].[FactInternetSales] fs ON fs.OrderDateKey = d.DateKey
  WHERE d.CalendarYear = 2010
    AND d.MonthNumberOfYear = 12
  GROUP BY d.CalendarYear, d.FullDateAlternateKey
  ORDER BY d.CalendarYear, d.FullDateAlternateKey


The queries users have to write against a Data Mart are much simpler than those against a transactional database. There are fewer joins because of the one-to-many relationships between the fact and dimension table(s). The dimension tables can be confusing to someone who has spent a career normalizing databases: a dimension is a flattened, de-normalized table. This creates cases of duplicate data, but the simpler queries outweigh the duplicate data in a dimensional model.

With this table, we can stop using functions on date columns like YEAR(), MONTH(), etc. A query from an OLTP system might look like this:

 

SELECT DATEPART(Year, SOH.OrderDate), SUM(DET.LineTotal) AS Sales
  FROM Sales.SalesOrderHeader SOH
    INNER JOIN Sales.SalesOrderDetail DET ON SOH.SalesOrderID = DET.SalesOrderID
  GROUP BY DATEPART(Year, SOH.OrderDate)

Whereas the dimensional model query would look like this:

 

SELECT d.CalendarYear, SUM(SalesAmount)
  FROM dbo.FactInternetSales fi
    INNER JOIN dbo.DimDate d ON d.DateKey = fi.OrderDateKey
  GROUP BY d.CalendarYear

The end user might have a better time understanding the structure of the Dimensional Model than the transactional system, especially if data is obtained from different databases, maybe on different servers. Isn't it fun trying to explain a linked server and how to use it? Consolidating the needed reporting data into a Data Mart, and then Data Marts together into a Data Warehouse, makes life much easier for report writers.

The date dimension can also contain columns for Weekend versus Weekday, Holiday and month markers like 2014-10 or by quarter like 2014-Q1. All these can be computed once in the dimension table and used at will by query writers. They now do not have to know how to use T-SQL functions or concatenate substrings of “CASTed” date columns.
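A minimal sketch of computing a few of these columns once, during the dimension load (column names here are illustrative, and FORMAT/CONCAT assume SQL Server 2012 or later):

SELECT CONVERT(int, FORMAT(d, 'yyyyMMdd')) AS DateKey,
       d AS FullDate,
       CASE WHEN DATENAME(WEEKDAY, d) IN ('Saturday', 'Sunday')
            THEN 'Weekend' ELSE 'Weekday' END AS WeekendOrWeekday,
       FORMAT(d, 'yyyy-MM') AS MonthMarker,
       CONCAT(YEAR(d), '-Q', DATEPART(QUARTER, d)) AS QuarterMarker
  FROM (VALUES (CAST('2014-10-01' AS date)), (CAST('2014-10-02' AS date))) AS x(d);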

Then, when the DimDate is related to various Fact tables and processed into an OLAP cube, the measures and aggregations are displayable side by side through the DimDate dimension which is now considered a Conformed Dimension. The slicing and dicing of data has just been made a whole lot easier.

Saturday, April 23, 2016

SQLSaturday Atlanta #521 May 20th & 21st 2016

SQLSaturday events are a pleasure to be a part of, whether that involves speaking or volunteering. Volunteering definitely takes more work and time, but seeing someone who is interested in SQL Server as a career take time to learn on a Saturday makes it all worth the effort. When speaking, I tend to gravitate toward the 3-5 in the front row that seem full of questions, and away from the 1-2 that seem to "know" more than me.


On Friday, May 20th 2016, I have been selected to present a full-day session on Reporting with Dimensional Modeling and Microsoft Business Intelligence. These are called pre-conference sessions, which reflects their purpose at larger conferences like the PASS Summit. What makes them more attractive to attendees (AND bosses) is the fact that you do not have to travel to the larger conference, plus they are usually less expensive. This translates to a higher ROI (Return On Investment) for you or the company you work for.


Atlanta's SQLSaturday this year is May 21st 2016 at Georgia State University in Alpharetta, just north of Atlanta. The speaker lineup is incredible, and I know half my day will be spent attending sessions to advance my knowledge of the Microsoft Data Platform, plus I will talk about Transitioning from a DBA to Data Warehouse Architect.


If you are interested in learning about dimensional modeling and MSBI, please register for the day-long pre-conference session I am giving, or you might be interested in the other 2 all-day sessions – Enterprise Scripting by the Midnight DBAs (Sean and Jen McCown) or SQL Sentry's Kevin Kline on 50 Things All SQL Server Developers Need to Know. There is also a half-day session on Friday presented by Linchpin People's Brian Moran talking about Secrets of Independent Consulting.


Of course, the Saturday event is free (you might have to pay for lunch). You could learn a lot of pertinent skills on Saturday as well as network with probably 500+ attendees and speakers along with sponsors and volunteers. Please come join us.


Wednesday, March 16, 2016

Excel BI VC presents Dustin Ryan: New Features of Power BI

Next Meeting

Thu, Mar 24 2016 12:00 Central Daylight Time


Dustin Ryan presents What's New in Power BI

RSVP: https://attendee.gotowebinar.com/register/8425155555275941377

WHAT'S NEW IN POWERBI

Between the weekly and monthly updates to Power BI, it's tough to stay caught up with the latest updates and improvements to Microsoft's newest business intelligence tool. In this session, we'll take a look at the latest improvements to Power BI and how you and your organization can begin using these new capabilities now.


Dustin Ryan is a senior BI consultant and trainer with Pragmatic Works in Jacksonville, FL. Dustin specializes in delivering quality enterprise-level business intelligence solutions to clients using SSRS, SSIS, SSAS, SharePoint, and Power BI. Dustin has authored and contributed to SQL Server books. You can find Dustin speaking at events such as SQLSaturday, Code Camp, and online webinars.

Analysis Services: Solving Hierarchy Errors of Uniqueness

Multidimensional Cubes provide speed when it comes to retrieving aggregations that are important to business decisions. Being able to slice or group the measures by dimension attributes helps with a quick analysis and flexible/interactive reporting. Configuring these attributes as hierarchies has some details that are not at first obvious. The error message when these problems exist is not extremely helpful for someone new to cubes.

Let’s look at creating a Date hierarchy with Year, Quarter, Month and Day. Our cube already has measures created from a sales fact table and the dimension for date has been created without any hierarchies.


Figure 1: Attribute List for Date Dimension

The measures can be displayed in a Pivot Table in Excel. Figure 2 below shows the Sales amount sliced by Sales Territory and Year/Quarter/Month.


Figure 2: Pivot Table in Excel

Figure 2 shows the Year, Quarter and Month as rows, while the Sales Territory is used for columns and Internet Sales for values in the Pivot Table. Users will get frustrated when they have to pick one attribute at a time when the hierarchy is logically known.

To create this hierarchy, you need to edit the date dimension in the cube.


Figure 3: Edit Date Dimension in SQL Server Data Tools

To create a hierarchy, you can drag and drop the first or top level of the hierarchy from the Attributes pane into the Hierarchies pane. We will do this with Year at the top level.


Figure 4: Drag Year to Create New Hierarchy

To finish this hierarchy, drag the Quarter under the Year followed by the Month and Dates attributes. To complete the Hierarchy, right-click on the name Hierarchy, and select Rename. We renamed the hierarchy to Y-Q-M-D.


Figure 5: Y-Q-M-D Hierarchy Created

We can deploy this project and preview in Excel to see the effects.


Figure 6: Preview Hierarchy Y-Q-M-D in Excel

So, what is the problem at this point? Well, for performance reasons, there is a blue squiggly line under the hierarchy name in the Cube project. The message tells us to create an attribute relationship for the hierarchy. This is done by editing the date dimension using the Attribute Relationships tab.


Figure 7: Attribute Relationship Does Not Exist.

Y-Q-M-D is a natural hierarchy because a Day is in a Month that is in a Quarter that is in a Year. So, we should be able to show that in the Attribute Relationship for this Hierarchy. You can drag and drop Quarter on Year, then drag and drop Month on Quarter to accomplish this. Dates is the root attribute or key to the dimension.


Figures 8 & 9: Before and After Y-Q-M-D Hierarchy

Now, when we deploy, we get an error. The error message with the red circle and white X does not tell us the problem. The problem is in the last warning, which indicates that Quarter has duplicates for the value 4. In order for an attribute relationship to exist, the values in each attribute have to be unique across all occurrences. Quarter 4 (as well as 3, 2 and 1) is duplicated for every year we have in the date dimension table.


Figure 10: Deployment Failed
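A quick way to see the uniqueness problem is to count how many years share each quarter value; a sketch against the AdventureWorksDW DimDate table:

SELECT CalendarQuarter, COUNT(DISTINCT CalendarYear) AS YearsWithThisQuarter
  FROM dbo.DimDate
  GROUP BY CalendarQuarter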

There are a couple of solutions for this problem, but we are only going to look at one. We are going to use multiple columns in the KeyColumn property of the Quarter and Month to create uniqueness. Then, we have to add a column to the NameColumn property in order to have something display for the multi-column KeyColumn property.


Figure 11: Changing the KeyColumn of Attribute Quarter

To do this, you have to highlight the Quarter attribute in the Attributes pane, then go to the properties. Find the KeyColumn and click on the ellipsis. When prompted, add CalendarYear to the Key Columns list and move the Year above the Quarter (Figure 11). Do the same thing for Month: add CalendarYear to the KeyColumn.


Figure 12: NameColumn for Month Attribute

The NameColumn needs to be changed from nothing to CalendarQuarter for the Quarter attribute and EnglishMonthName for the Month attribute (Figure 12). Re-deploy the project and the error should no longer exist; we get Deployment Completed Successfully.


Figure 13: Deployment Completed Successfully

The use of attribute relationships for natural hierarchies greatly improves the processing and retrieval of data from the cube. It also assists the aggregation builder in indexing the combinations of dimension attributes needed for analysis. In the end, the cube can retrieve an aggregation from the months to get a quarter, or from quarter values to get the year, which saves retrieving details to aggregate up the hierarchy.