Pages

Tuesday, August 2, 2011

Remote Query (Linked Server) Connection Time-Out Issue & Solution

Today, I ran into a time-out issue while executing a long-running MDX query. I went through many posts on this issue but didn't find an answer. Below are the details:

Error Message: 
"OLE DB provider "MSOLAP" for linked server "LINKED_SERVER_SSAS" returned message "XML for Analysis parser: The XML for Analysis request timed out before it was completed.".

Description: My SQL Server database has a linked server connection (named "LINKED_SERVER_SSAS") to a SQL Server Analysis Services 2008 OLAP database. This connection is used in stored procedures & T-SQL queries to fetch stats from the cube. I executed an MDX query using this linked server connection; the query ran for 10 minutes and then threw the error message above.
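For context, this is the kind of call involved; the MDX shown here is only an illustrative sample, not the actual long-running query:

```sql
-- Illustrative only: running MDX against the SSAS linked server via OPENQUERY.
-- The real query that timed out was far heavier than this sample.
SELECT *
FROM OPENQUERY(LINKED_SERVER_SSAS,
    'SELECT [Measures].[Internet Sales Amount] ON 0
     FROM [Adventure Works]')
```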

Then I tried running the MDX directly on Analysis Services; it ran successfully and took around 18 minutes. I checked multiple options to resolve the issue, such as:
  1. Linked Server Connection: Query Timeout property under Server Options
  2. SQL Server Analysis Services: Server Properties (including advanced server properties)
  3. SQL Server Management Studio: Query Execution Property
None of these attempts were successful in my case.
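For reference, the linked-server timeout from option 1 can also be set through T-SQL rather than the Server Options UI (though, as noted, this setting alone did not resolve the issue in my case):

```sql
-- Set the 'query timeout' server option for the linked server (value in seconds).
EXEC sp_serveroption @server   = 'LINKED_SERVER_SSAS',
                     @optname  = 'query timeout',
                     @optvalue = '1800'
```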

Solution:
  1. Open SQL Server Database Instance Property where linked server is created.
  2. Go to Connections.
  3. Check the "Remote query timeout" value under the "Allow remote connections to this server" property, and update it as per the query/business requirement. (In my case this value was set to 600 secs.)
As this value was 10 mins, my remote MDX query hit the timeout. After updating the value to 1800 secs, the remote MDX query executed successfully.
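The same setting can be changed through T-SQL instead of the instance properties dialog:

```sql
-- 'remote query timeout' is instance-wide; value is in seconds (0 = no timeout).
EXEC sp_configure 'remote query timeout', 1800;
RECONFIGURE;
```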

Readers: Kindly share your comments & suggestions.

Wednesday, May 18, 2011

MDX behavior in SQL Server 2008 R2 & Earlier Versions


I have been working on SQL Server Analysis Services & MDX for quite a long time, and interestingly, on a day-to-day basis I still come across new concepts, bugs, and some interesting features.

A few days back, I encountered an MDX query that behaved differently in SQL Server 2008 R2 and its earlier versions. After posting my query to the MSDN forum and reporting the behavior as a bug at http://connect.microsoft.com/, I got an explanation: the behavior was a design issue in SQL Server 2008 and earlier versions, and it was fixed in SQL Server 2008 R2.

Below is a sample MDX which behaves differently in SQL Server 2008 R2 and earlier versions. The behavior can be seen in the "Adventure Works" cube.

MDX 1:
        select [Measures].[Internet Sales Amount] on 0,
        non empty [Date].[Date].[Date] on 1
        from [Adventure Works]
        where ([Customer].[Country].&[United States],[Customer].[State-Province].&[NY]&[US])

MDX 2:
        select [Measures].[Internet Sales Amount] on 0,
        non empty [Date].[Date].[Date] on 1
        from [Adventure Works]
        where ([Customer].[Country].&[United States],{[Customer].[State-Province].&[NY]&[US]})

When I ran both queries on SQL Server 2008 R2 (Microsoft SQL Server 2008 R2 (RTM) - 10.50.1600.1 (X64)   Apr  2 2010 15:48:46   Copyright (c) Microsoft Corporation  Developer Edition (64-bit) on Windows NT 6.0 <X64> (Build 6002: Service Pack 2)), both returned the same result, i.e.

                                Internet Sales Amount
November 23, 2003      $1,735.98
May 7, 2004               $2,337.27
June 19, 2004             $50.94

Then I ran these queries on SQL Server 2005/SQL Server 2008 (Microsoft SQL Server 2008 (SP1) - 10.0.2531.0 (Intel X86)   Mar 29 2009 10:27:29   Copyright (c) 1988-2008 Microsoft Corporation  Developer Edition on Windows NT 5.1 <X86> (Build 2600: Service Pack 3)). They returned different result sets, i.e.

MDX 1 Output:
                                Internet Sales Amount
November 23, 2003     $1,735.98
May 7, 2004                $2,337.27
June 19, 2004             $50.94
MDX 2 Output:
                               Internet Sales Amount
November 23, 2003     $14,271.38
May 7, 2004                $17,992.14
June 19, 2004             $22,902.75

Reason: The Microsoft product team described this behavior as a bug in SQL Server 2005 & 2008 that was fixed in SQL Server 2008 R2. As per the Microsoft team:

"The behavior in SQL Server 2005 and 2008 was a bug. Adding curly braces around a single tuple should not impact the query results in any way. A set that contains a single tuple is equivalent to the single tuple. In SQL Server 2008 R2, we fixed the bug, and produce correct results which are identical with and without the curly braces.

The nature of the bug was to ignore the tuple inside the braces because a related attribute was present in the WHERE clause of the query."





Readers: Kindly share your comments & suggestions. 


Sunday, February 27, 2011

Calculated Member & Named Set Definition

Analysis Services provides the flexibility to add business logic by creating Calculated Members/Named Sets on measures or dimensions in an existing cube. These are custom MDX expressions defined and saved as part of the cube definition. There are multiple options for creating these calculations, depending on the scope of use.
  1. Create Members in Calculations tab of Cube designer.
  2. Create Calculated Members at run time using client tools, for example SQL Server Management Studio or ProClarity.
Generally, developers know the logic behind calculated members. Sometimes, as users, we want to know what expression was used for a calculated member, and there is no option to get the MDX logic behind these members other than the Calculations tab of the cube designer.

We can use DMVs (Dynamic Management Views) to get the MDX expressions of calculated members. DMVs are available only in SQL Server Analysis Services 2008, not in earlier versions.
Below are stored procedures to get calculated member expressions. In order to run these procedures, create a linked server to the Analysis Server; in these examples, the linked server name is 'CUBE'.
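To see what the procedures below generate, here is the equivalent ad-hoc statement, assuming the linked server name 'CUBE' used in these examples:

```sql
-- Ad-hoc version of the query GET_MEASURE_EXPRESSION builds dynamically:
-- fetch the MDX expression of a calculated measure from the measures DMV.
SELECT "MEASURE_NAME", "EXPRESSION"
FROM OPENQUERY(CUBE,
    'SELECT * FROM $system.mdschema_measures
     WHERE CUBE_NAME = ''Adventure Works''
     AND MEASURE_NAME = ''Internet Gross Profit''')
```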

1. Get Calculated Measure Expression:
Parameters detail:
  1. @linked_server: Name of Linked Server created on Analysis Server.
  2. @cube_name: Cube Name
  3. @measure_name: Calculated Measure Name
/*
EXEC GET_MEASURE_EXPRESSION @linked_server = 'CUBE'
                        , @cube_name = 'Adventure Works'
                        , @measure_name = 'Internet Gross Profit'
*/
CREATE PROCEDURE GET_MEASURE_EXPRESSION
    @linked_server VARCHAR(100)
    ,@cube_name VARCHAR(100)
    ,@measure_name VARCHAR(100)   
AS
BEGIN
    SET NOCOUNT ON
   
    DECLARE @cube_script VARCHAR(max)
    DECLARE @script VARCHAR(max)
    -- Build the inner DMV query; quotes are doubled so they survive
    -- being embedded in the OPENQUERY string literal below.
    SET @cube_script = 'SELECT * FROM $system.mdschema_measures
                        WHERE CUBE_NAME = ''''' + @cube_name + '''''
                        AND MEASURE_NAME = ''''' + @measure_name + ''''''
    SET @script = 'SELECT
                    "MEASURE_NAME" AS MEASURE_NAME
                    ,"MEASUREGROUP_NAME" AS DIMENSIONS
                    ,"EXPRESSION" AS EXPRESSION
                    ,"DEFAULT_FORMAT_STRING" AS FORMAT_STRING
                    FROM OPENQUERY('+@linked_server+', '''
                    +@cube_script+ ''')'
    --PRINT @script
    EXEC (@script)
   
    SET NOCOUNT OFF
END


2. Get Named Set Expression:
Parameters detail:

  1. @linked_server: Name of Linked Server created on Analysis Server.
  2. @cube_name: Cube Name
  3. @set_name: Set Name
/*
EXEC GET_SET_EXPRESSION @linked_server = 'CUBE'
                        , @cube_name = 'Adventure Works'
                        , @set_name = 'Core Product Group'
*/
CREATE PROCEDURE GET_SET_EXPRESSION
    @linked_server VARCHAR(100)
    ,@cube_name VARCHAR(100)
    ,@set_name VARCHAR(100)   
AS
BEGIN
    SET NOCOUNT ON
   
    DECLARE @cube_script VARCHAR(max)
    DECLARE @script VARCHAR(max)
    -- Build the inner DMV query; quotes are doubled so they survive
    -- being embedded in the OPENQUERY string literal below.
    SET @cube_script = 'SELECT * FROM $system.mdschema_sets
                    WHERE CUBE_NAME = ''''' + @cube_name + '''''
                    AND SET_NAME = ''''' + @set_name + ''''''
   
    SET @script = 'SELECT
                    "SET_NAME" AS SET_NAME
                    ,"DIMENSIONS" AS DIMENSIONS
                    ,"EXPRESSION" AS EXPRESSION
                    FROM OPENQUERY('+@linked_server+', '''
                    +@cube_script+ ''')'
    --PRINT @script
    EXEC (@script)
   
    SET NOCOUNT OFF
END

These procedures can be modified as required to get more detail about set/member expressions.

More on DMV queries: SSAS DMV (Dynamic Management View)


Readers: Kindly share your comments & suggestions. 

Saturday, February 26, 2011

MDX Extraction VB Tool

I had already posted about extracting MDX from an Excel pivot table by installing an Excel add-in. This post is again about extracting MDX, this time using a very small executable file, which can be downloaded from the link below.

Steps:

1. Run the ExtractMDX.exe file and browse to the Excel workbook from which you want to extract MDX.
 

2. Select the worksheet that has a pivot table created on an Analysis Services cube.

3. If the selected sheet has a pivot table, it will be listed in the "Select Pivot Table" drop-down. Select the appropriate pivot table.

4. Now click the "Extract MDX" button to extract the MDX of the selected pivot table. It may prompt with the option below:
5. Click 'OK'. The MDX will be displayed in the "Pivot Table MDX" text box.

6. This MDX can be saved/copied for future reference. The "Refresh" button can be used to reset all selected settings.

Readers: Kindly share your comments & suggestions.

    Saturday, February 19, 2011

    Cube Management for Complex Business/Data Requirements

    SQL Server Analysis Services provides OLAP functionality. To use most of its features and capabilities to build a powerful BI solution, we need to understand basic cube designing & processing concepts.

    It looks quite easy to design OLAP cubes using Business Intelligence Development Studio, because it comes with a cube design wizard that helps create cube objects. After the cube design is done with the wizard, the next step is to deploy it to the Analysis Server. If everything goes fine, the cube can then be browsed using different client applications.

    In real project scenarios, it is not as easy as that. Designing a BI solution with Analysis Services for a complex data requirement demands a more detailed understanding of the business problem and of how Analysis Services can be used to provide a better BI solution. There is plenty of online documentation (blogs/books) on designing and managing cubes; some of it is mentioned below:
    1. Books Online: SQL Server Analysis Services (SSAS):  http://msdn.microsoft.com/en-us/library/ms175609(v=SQL.90).aspx
    2. Basic cube concept videos: http://www.learnmicrosoftbi.com/
    3. Cube Concepts: http://training.inet.com/OLAP/Cubes.htm
    Here, I would like to explain basic practices for designing complex cubes. Let's try to generalize the problem statement by considering the points below:
    • Underlying data warehouse/mart has around 20 dimensions and 6 facts.
    • Out of the 20, 6 dimensions are incremental in nature, growing by 50,000-200,000 records per day.
    • Fact table size grows by 1,000,000 to 3,000,000 records per day, depending on traffic in the system.
    Business Requirements: Some of them include:
    • Business users want to do analysis on 4 main dimensions (for example: Time, Product, Entity, Location) on a frequent basis with very high query performance (i.e. queries should return results within a second).
    • The maximum acceptable latency between the logging system (i.e. OLTP) and the cube is 3-4 hours.
    • Data availability should be 24*7 regardless of cube processing failures.
    Challenges: With such requirements, the major challenges come with cube management:
    • Managing such huge dimensions & facts.
    • Providing sub-second query performance for the mentioned business requirements.
    • Making the cube available 24*7 irrespective of cube failures.
    • Processing the cube every hour to keep latency within 3-4 hours.
    Solution: Cube design requires an exact problem statement, so here I'll discuss how to manage such complex cubes, not how to design them.

    Cube Dimension Management: Processing large cube dimensions is an expensive operation, so managing such huge dimensions requires proper attribute & hierarchy design.
    1. Use Dimension Views instead of Tables for the source: Create views on dimension tables and use them while designing the data source view (DSV) for large dimensions. This gives the dimension processing model the flexibility to change the view definition at processing time so that it only reads new records. Dimension views also decouple the dimension tables from the DSV, which gives the added advantage that a change in a table's metadata has no direct impact on the DSV.
    2. Use the ProcessAdd processing type: Use ProcessAdd to process large dimensions, because it doesn't look for any data updates/deletes and is therefore faster than the other processing options. Using dimension views gives us the option to alter the view definition and make sure only new records are read while processing.
    3. Alter the View definition while processing: Every time before dimension processing, the views should be altered. There are different ways to do this automatically. One method is to maintain a master table at the date & hour level (if the cube is being processed hourly) with a process_status flag. Depending on the process_status value, alter the view definition; if the dimension processed successfully, update the process_status flag for that date & hour to 'processed'.
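    As a sketch of step 3, assuming a hypothetical dimension table DIM_PRODUCT with a load_date column and a master table PROCESS_MASTER holding the process_status flag (all object and column names here are illustrative, not from an actual schema), the view could be altered before each ProcessAdd like this:

```sql
-- Illustrative only: restrict the dimension view to rows whose date/hour
-- has not yet been marked 'processed', so ProcessAdd reads only new records.
ALTER VIEW dbo.VW_DIM_PRODUCT
AS
SELECT p.product_key, p.product_name, p.load_date
FROM dbo.DIM_PRODUCT p
JOIN dbo.PROCESS_MASTER m
    ON  m.process_date = CONVERT(DATE, p.load_date)
    AND m.process_hour = DATEPART(HOUR, p.load_date)
WHERE m.process_status <> 'processed'
```

    After a successful dimension process, the corresponding PROCESS_MASTER row is updated to 'processed', so the next ALTER/ProcessAdd cycle picks up only the following hour's records.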
    Fact/Partition Management: There can be many ways to manage large facts, but all of them require a very good design. One of them is discussed below:
    1. Create daily partitions: As mentioned above, each fact table grows by 1m-3m records daily, and the maximum acceptable latency is 3-4 hours. To reach this optimization level, measure group partitions should be created on a daily basis, so that when we process partitions hourly, only the current day's partition gets processed. This reduces overall processing time by a great extent and also improves MDX query performance.
    2. Merge Older Partitions: If we create partitions daily, the number of partitions in a measure group will keep growing and will require a management strategy. Analysis Services provides a utility to merge existing partitions into a single partition. One strategy could be to keep the last 2 weeks of data in daily partitions, the last 3 months in monthly partitions, and records older than 3 months in yearly partitions. This can be maintained through an AMO script written in an SSIS script task, which can in turn be triggered by SQL Server Agent on a daily basis.
    3. Add the Slice property: Analysis Services 2005 & 2008 internally assign attribute ranges to each partition so that MDX queries only hit the required partitions and avoid unnecessary partition scans; this is referred to as "auto slice". Refer to this link for more details: Slice property. Also see this discussion: Why MDX Queries use multiple partitions to return the result? As per the SSAS Performance Guide, auto slice works for MOLAP partitions, and data slices are only applicable to ROLAP partitions. But there are many cases where people have observed that auto slice doesn't work for MOLAP partitions; I have personally faced this issue 2-3 times. For more detail on partition slices, please read this article: Get most out of partition slices
    4. Use ProcessAdd & ProcessFull to process partitions: Use ProcessFull when new partitions are created, and use ProcessAdd after that to add records on an hourly basis to an existing partition (ProcessAdd is only applicable to existing partitions). ProcessAdd only processes new records and is therefore much faster.
    Design Aggregations: Aggregations should be designed as per requirements, so that the most-used queries hit aggregations instead of partitions, giving better query response. Example: for the requirement above, we know analysis will be done primarily on 4 dimensions, so we can design aggregations on particular hierarchy levels of those dimensions.

    Server Architecture for 24*7 data availability: Data availability is a crucial part of cube management. The MSDN best-practice article on load balancing & cube availability is Scale-Out Querying with Analysis Services.


    Readers: Kindly share your comments & suggestions. 

    Monday, February 14, 2011

    MDX Extraction of Excel Pivot Table (Excel Add-in)

    In today's BI world, Microsoft Excel is the most common & widely used BI tool for reporting and analysis, because of its rich and easy-to-use features, which most users know better than I do. So let's discuss one Excel feature used by many SQL Server Analysis Services users and other BI front-end users.

    Microsoft Excel provides connectivity to various data sources (i.e. SQL Server, SQL Server Analysis Services, Microsoft Query Wizard & ODBC ...). Of these, SQL Server Analysis Services, commonly known as SSAS, is a widely used data source for browsing OLAP cube data and performing the powerful analytics provided by SSAS and Excel.

    When we browse OLAP data, the Excel pivot table engine creates and sends an MDX query to the SSAS server. In many cases, users want to see what query is being used to perform this action, for a number of reasons:

    1. To debug & test data consistency issues with cube/reports.
    2. To understand query which returns complex data sets.
    I have created an Excel add-in which returns the MDX behind every pivot table result set. It can be installed on both Excel 2007 & 2010. The add-in is available at the link below:

    Download: MDX Extract Add-in

    Installation & Configuration Guide

     Table of Contents:
    • Installation Guide
      • Installation by using Excel add-in wizard
      • Installation by copying to Excel Installation directory  
    • Configuration/Working Guide 
    • Usage
    Section 1. Installation Guide:
    There are 2 ways to install this add-in in Excel.

    Installation by using Excel add-in wizard:
    • Open an existing Excel workbook or create a new one.
    • Go to the office button and select Excel Options. 
    • In the Excel Options dialog box, click "Add-Ins"; it will show all the existing add-ins. Under the "Manage" option, choose "Excel Add-ins" and click "Go".
    • The Add-Ins dialog box will open, showing all the add-ins available to the current Excel installation. If the required add-in is in the list, select its check box and press OK.
    • Otherwise, browse to the "MDX Extract Add-in.xla" file and press OK.
    • Now "MDX Extract Add-in.xla" will appear in the available Excel add-ins list. Select it and press OK to make it available in Excel workbooks.
    • You may get a security warning while adding it; select "Enable".
    • Now this add-in will be available for all existing Excel workbooks and for new ones as well. 
    Installation by copying to Excel Installation directory:
    • Copy "MDX Extract Add-in.xla" to the "C:\Program Files\Microsoft Office\Office12\XLSTART" folder for Excel 2007, or to the corresponding folder for Excel 2003.
    • Now create a new Excel workbook or open an existing one; the add-in will automatically be added.
    • To check whether the add-in is available in the current workbook, select the Developer tab and click on Visual Basic.
    Section 2. Configuration/Working Guide:
     After successful installation, you can work with the tool, which has a very simple interface but powerful usage.
    • Open an excel workbook, and create a Pivot table report on Analysis Services cube.
    • Right-click on any pivot table cell; you will see an added option, "MDX Query".
    • Click the "MDX Query" option; a new sheet named "MDX" will be added to the existing workbook, with the underlying pivot table MDX in the first row and first column. After the first run, whenever you extract the MDX of an Excel pivot table, it will be added to the next empty row of the existing "MDX" sheet.
    Section 3. Usage:
    • It can be used to extract underlying MDX of Excel Pivot Table which can further be used for testing and debugging the existing issues in reports/cubes.
    • It can extract MDX from protected pivot table Excel reports. (Excel has a limitation that the cells of a protected pivot table become read-only, so the user cannot make changes to it and normally cannot extract the MDX either.)


    Readers: Kindly share your comments & suggestions.