Retrieving SQL Server Product Key from the Registry (all versions)

In a previous post, I modified an existing script by Jakob Bindslet that retrieves the SQL Server product key stored in the registry.  The modified script worked for SQL Server 2012 only.  Since then, SQL Server 2014 and 2016 have been released, and inevitably people will misplace product keys. I’ve decided to clean up and revise the script so that it runs against all versions of SQL Server from 2005 onward.

The script scans through the possible registry locations for the different SQL Server versions, and outputs all keys it finds by version.  So far I’ve been able to test it for 2008R2/2012/2014.

## function to retrieve the license key of a SQL Server instance (2005 and later).
## by Jakob Bindslet (jakob@bindslet.dk)
## 2012/2014/2016 Modification by Xian Wang (daanno2@gmail.com)

function Get-SQLserverKey {

    param ($targets = ".")
    $hklm = 2147483650 #HK_LOCAL_MACHINE
    $regPath = $null
    $baseRegPath = "SOFTWARE\Microsoft\Microsoft SQL Server\"

    ##SQL2016 130
    ##SQL2014 120
    ##SQL2012 110
    ##SQL2008R2 105
    ##SQL2008 100
    ##SQL2005 90
    $sqlVersionArray = "90","100","105","110","120","130"


    $regValue1 = "DigitalProductId"
    $regValue2 = "PatchLevel"
    $regValue3 = "Edition"

    ##loop through all Hosts
    Foreach ($target in $targets) {

        ##loop through all potential SQL versions
        Foreach($sqlVersion in $sqlVersionArray) {
            $regPath = $baseRegPath + $sqlVersion + "\Tools\Setup"

            $productKey = $null
            $win32os = $null
            $wmi = [WMIClass]"\\$target\root\default:stdRegProv"
            $data = $wmi.GetBinaryValue($hklm,$regPath,$regValue1)

            if($data.uValue -ne $null) {
                [string]$SQLver = $wmi.GetstringValue($hklm,$regPath,$regValue2).svalue
                [string]$SQLedition = $wmi.GetstringValue($hklm,$regPath,$regValue3).svalue

                $binArray = $null

                #Array size is dependent on SQL Version
                if([convert]::ToInt32($sqlVersion,10) -gt 105) {
                    $binArray = ($data.uValue)[0..16]
                }
                else { 
                    $binArray = ($data.uValue)[52..66]
                }

                $charsArray = "BCDFGHJKMPQRTVWXY2346789".toCharArray()

                ## decrypt base24 encoded binary data
                For ($i = 24; $i -ge 0; $i--) {
                    $k = 0
                    For ($j = 14; $j -ge 0; $j--) {
                        $k = $k * 256 -bxor $binArray[$j]
                        $binArray[$j] = [math]::truncate($k / 24)
                        $k = $k % 24
                    }
                    $productKey = $charsArray[$k] + $productKey
                    If (($i % 5 -eq 0) -and ($i -ne 0)) {
                        $productKey = "-" + $productKey
                    }
                }
                $win32os = Get-WmiObject Win32_OperatingSystem -computer $target
                $obj = New-Object Object
                $obj | Add-Member Noteproperty Computer -value $target
                $obj | Add-Member Noteproperty OSCaption -value $win32os.Caption
                $obj | Add-Member Noteproperty OSArch -value $win32os.OSArchitecture
                $obj | Add-Member Noteproperty SQLver -value $SQLver
                $obj | Add-Member Noteproperty SQLedition -value $SQLedition
                $obj | Add-Member Noteproperty ProductKey -value $productkey
                $obj
            }
        }
    }
}
##Dummyproof local execution for people who don't PowerShell
Get-SQLserverKey
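To check remote servers, pass one or more host names (the names below are placeholders); the function reads each machine's registry remotely over WMI:

## e.g. against two hypothetical remote hosts reachable via WMI/DCOM
Get-SQLserverKey "SQLPROD01","SQLPROD02" | Format-Table -AutoSize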

SQL Server Query Optimizer – when joins are ignored

The query optimizer in SQL Server is very powerful and smart – provided you give it enough information and clues to do its job.  One recent discovery caught me by surprise – the fact that SQL Server can skip performing joins altogether in certain scenarios.

I came upon this while reviewing the definition of views in Dynamics CRM, most of which look like this:

from [AccountBase]
left join [CustomerAddressBase] XXaddress1 on ([AccountBase].[AccountId] = XXaddress1.ParentId and XXaddress1.AddressNumber = 1)
left join [CustomerAddressBase] XXaddress2 on ([AccountBase].[AccountId] = XXaddress2.ParentId and XXaddress2.AddressNumber = 2)
left join [AccountBase] [account_master_account] on ([AccountBase].[MasterId] = [account_master_account].[AccountId])
left join [LeadBase] [account_originating_lead] on ([AccountBase].[OriginatingLeadId] = [account_originating_lead].[LeadId])
left join [AccountBase] [account_parent_account] on ([AccountBase].[ParentAccountId] = [account_parent_account].[AccountId])
left join [ContactBase] [account_primary_contact] on ([AccountBase].[PrimaryContactId] = [account_primary_contact].[ContactId])
left join [EquipmentBase] [equipment_accounts] on ([AccountBase].[PreferredEquipmentId] = [equipment_accounts].[EquipmentId])
left join ...
left join ...

and so forth. Surprisingly, when viewing the execution plan, entire tables were skipped if no columns from those tables were referenced. Now, how is that possible? The result of a join affects the cardinality of the result set, so the criterion for skipping a join can't simply be that no columns from a given table are selected. The key, it turns out, is that the Query Optimizer must be able to guarantee a one-to-one relationship between the two joined tables.

One way SQL Server can guarantee this is if the right-hand table has a unique key on the join column(s). Take, for example, a table with this unique index:
[Image: uniqueIndex]

Running a query that specifies a left join on both columns of the unique key, you can see the query optimizer realizes there's no need to actually perform the left join:
[Image: join1]

But if the join condition on the second column is removed, the query optimizer sees that a row can now match more than one row in the joined table, and not performing the join would be semantically incorrect:
[Image: join2]
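If you want to reproduce this yourself, here's a minimal sketch (the table and column names are made up for illustration):

-- hypothetical parent/child tables; the unique index is what enables join elimination
CREATE TABLE dbo.Parent (Key1 int NOT NULL, Key2 int NOT NULL, SomeAttribute varchar(50));
CREATE UNIQUE INDEX UX_Parent ON dbo.Parent (Key1, Key2);
CREATE TABLE dbo.Child (ChildId int NOT NULL, Key1 int, Key2 int);

-- left join on both columns of the unique index, no Parent columns selected:
-- the plan only touches dbo.Child
SELECT c.ChildId
FROM dbo.Child c
LEFT JOIN dbo.Parent p ON c.Key1 = p.Key1 AND c.Key2 = p.Key2;

-- join on only one of the two columns: at most one match is no longer guaranteed,
-- so the join has to be performed
SELECT c.ChildId
FROM dbo.Child c
LEFT JOIN dbo.Parent p ON c.Key1 = p.Key1;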

Not surprisingly, having a foreign key relationship between the joined tables also satisfies SQL Server's criteria for skipping the join. In fact, further optimizations are possible when there's a trusted foreign key relationship. Playing with the AdventureWorksDW database, inner joins can be skipped:

[Image: sql_innerjoin]

As can WHERE EXISTS conditions:
[Image: sql_whereexists]

Of course, this is because SQL Server checks the existence of the value against the referenced table when the record is inserted; effectively, query performance is improved by shifting that work to the insert/update/delete phase.
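Here's roughly what that looks like against AdventureWorksDW, assuming the foreign key from FactInternetSales.OrderDateKey to DimDate.DateKey exists and is trusted (i.e. not created or re-enabled WITH NOCHECK):

-- no DimDate columns are selected, and the trusted FK guarantees exactly one match,
-- so DimDate disappears from the plan
SELECT SUM(fis.SalesAmount) AS TotalSales
FROM dbo.FactInternetSales fis
INNER JOIN dbo.DimDate dd ON fis.OrderDateKey = dd.DateKey;

-- the same reasoning lets the optimizer skip the EXISTS probe
SELECT SUM(fis.SalesAmount) AS TotalSales
FROM dbo.FactInternetSales fis
WHERE EXISTS (SELECT 1 FROM dbo.DimDate dd WHERE dd.DateKey = fis.OrderDateKey);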

Curiously, if I select the join column from the joined table, SQL Server still thinks it has to perform the join, even though FIS.OrderDateKey and dd.DateKey are identical in this scenario. I suppose the Query Optimizer isn't omniscient.

[Image: sql_boo]

The best use case for this technique is probably improving the performance of large views that join to a lot of tables, which is common when abstracting a complicated schema from end users. For joins to complicated subqueries or CTEs in a view, I've also used this technique successfully by creating an indexed (materialized) view with a unique index, thereby guaranteeing to the query optimizer that the join can only be one-to-one.
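As a rough sketch of that last idea (object names are hypothetical, and indexed views come with a long list of restrictions such as SCHEMABINDING and COUNT_BIG), materializing the subquery with a unique clustered index is what tells the optimizer that any join to it is at most one-to-one:

CREATE VIEW dbo.vAccountOrderTotals
WITH SCHEMABINDING
AS
SELECT o.AccountId,
       SUM(ISNULL(o.Amount, 0)) AS TotalAmount,
       COUNT_BIG(*) AS RowCnt -- required in an indexed view with GROUP BY
FROM dbo.OrderBase o
GROUP BY o.AccountId;
GO
-- the unique clustered index guarantees one row per AccountId
CREATE UNIQUE CLUSTERED INDEX UX_vAccountOrderTotals ON dbo.vAccountOrderTotals (AccountId);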

A better way to write MDX Period over Period calculations

Period over Period calculations, which compare the results of one time period against a previous time period, are extremely useful in the financial world. Everyone’s interested in growth since last month/quarter/year.  Most MDX implementations that I’ve seen use the ParallelPeriod function, which

Returns a member from a prior period in the same relative position as a specified member

The typical approach would be to create separate Year over Year, Quarter over Quarter, and Month over Month calculations. For example, to create a Year over Year calculation for the previous year:

[Image: PeriodOverPeriod1]

This script first calculates the amount for the current member of the hierarchy:
([Date].[Calendar].CurrentMember, [Measures].[Amount])

It then finds the member one period back at a specific level of the hierarchy using the ParallelPeriod() function, and subtracts that from the current member:
(ParallelPeriod(
[Date].[Calendar].[Calendar Year],
1,
[Date].[Calendar].CurrentMember
),
[Measures].[Amount])

Here’s how that looks in Excel, along with a demonstration of what happens if you try to use this calculation with other levels in the date dimension:

[Image: POP2]

In essence, using this approach means we’d have to create a calculation for every level of the date hierarchy; that's tedious and a ton of maintenance for what amounts to the same logic. Besides, it would be more user-friendly to have one calculation that works for ALL levels in a natural date hierarchy. But how can this be done if the first parameter of ParallelPeriod() must be a specific level? The trick here is to use the .Level property to dynamically return the level of the current member in the hierarchy:

([Date].[Calendar].CurrentMember, [Measures].[Amount])
-
(ParallelPeriod(
[Date].[Calendar].CurrentMember.Level,
1,
[Date].[Calendar].CurrentMember
),
[Measures].[Amount])

Here’s how that looks in Excel:
[Image: POP3]

Lastly, since this calculation exclusively looks at the previous member of the hierarchy (previous year, previous month, etc), we can write the MDX expression even more elegantly using the PrevMember property:

([Date].[Calendar].CURRENTMEMBER,[Measures].[Amount])-
([Date].[Calendar].CURRENTMEMBER.PREVMEMBER,[Measures].[Amount])
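Wrapped up as a calculated member in the cube's MDX script, the level-agnostic version might look something like this (the measure name and format string are illustrative):

CREATE MEMBER CURRENTCUBE.[Measures].[Amount Growth]
AS
    ([Date].[Calendar].CURRENTMEMBER, [Measures].[Amount])
    -
    ([Date].[Calendar].CURRENTMEMBER.PREVMEMBER, [Measures].[Amount]),
FORMAT_STRING = "#,##0.00;-#,##0.00",
VISIBLE = 1;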

SSAS Dynamic Security invalidates Aggregations

Analysis Services role-based security is great – it gives us the flexibility to define sets of data that users should or shouldn’t see.  And when the number of possible set combinations gets too unwieldy to handle, we can use data-driven dynamic security to meet all of our data security requirements in a single role.  To top it all off, in most cases there is very little perceived performance impact: upon connecting to the cube, SSAS creates a subcube of the data that you’re supposed to see.
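For context, the data-driven part is usually just an Allowed member set expression on the role's securing attribute, something along these lines (the dimension, attribute, and bridge measure names here are hypothetical):

NonEmpty(
    [Date].[Month].[Month].Members,
    ( StrToMember("[Security User].[Login].&[" + UserName() + "]"),
      [Measures].[Security Bridge Count] )
)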

That is, no performance impact until you check this little box:

[Image: visualtotals]

Visual Totals means the fact measure aggregates only over members that the user is allowed to see; this is the expected behavior in almost all cases.  However, it also means any aggregation designed at a higher granularity than the attribute used for security is no longer valid.  Here’s a contrived example: we have aggregations designed at the Year level, which pre-calculate the sum of January through December.  Now, if we use the Month attribute to secure our cube and disallow users from seeing September, that pre-calculated sum is no longer valid; SSAS now has to calculate the Year level from (at best) the Month level.

The moral of the story: be very careful about defining dynamic security on low-granularity attributes.  It WILL invalidate most aggregations.  I don’t see the performance aspect of dynamic security mentioned at all in most how-to articles.  In most cases, the performance hit is an acceptable compromise when it means less maintenance for the developer.  However, if performance is critical, then less elegant solutions (such as creating separate cubes) could come into play.

As an aside, I would love to see SSAS be smarter about using aggregations.  In the previous example, SSAS has to throw away the pre-calculated aggregation at the Year level because just one member is missing at the Month level; it then has to calculate the Year total from the 11 months the user can see.  It would be a lot more elegant to just subtract September from the pre-calculated Year aggregation.  This becomes especially important in large parent-child hierarchies, which can only aggregate at the lowest and highest levels.

TL;DR: Dynamic security with visual totals invalidates all aggregations designed at higher granularities.

Ragged Hierarchies and HideMemberIf: an in-depth look

One of the most poorly-documented features I’ve come across in Analysis Services is the handling of ragged/unbalanced hierarchies. I’m not going to write about parent-child hierarchies, which may be more intuitive to model but are terrible for performance. Instead, I’ll walk through the various HideMemberIf options and the data modeling scenarios each option works with when clients set MDX COMPATIBILITY=1. The end goal is to avoid having to drill through repeating/null intermediate levels in client tools, which almost always set MDX COMPATIBILITY=1.
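For reference, that compatibility level is just a property on the client's connection string to Analysis Services; an illustrative example (server and database names are placeholders):

Provider=MSOLAP;Data Source=localhost;Initial Catalog=Adventure Works DW;MDX Compatibility=1;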

Here’s the test query/data we’ll be working with:

SELECT 'Scenario 1' AS Scenario, 'Repeat lowest level' AS [Scenario Description], 'Joe CEO' AS Level1, 'Secretary' AS Level2, 'Secretary' AS Level3,  'Secretary' AS Level4
UNION ALL
SELECT 'Scenario 2' AS Scenario, 'Repeat lowest parent level' AS [Scenario Description], 'Joe CEO' AS Level1, 'Joe CEO' AS Level2, 'Joe CEO' AS Level3, 'Secretary' AS Level4
UNION ALL
SELECT 'Scenario 3' AS Scenario, 'Replace lowest levels with NULL' AS [Scenario Description], 'Joe CEO' AS Level1, 'Secretary' AS Level2, NULL AS Level3, NULL AS Level4
UNION ALL
SELECT 'Scenario 4' AS Scenario, 'Replace intermediate levels NULL' AS [Scenario Description], 'Joe CEO' AS Level1, NULL AS Level2, NULL AS Level3, 'Secretary' AS Level4

[Image: raggedhierarchydata]

Now, we’ll set up the attribute relationships in our test dimension like this:

[Image: RaggedHierarchyAttributeRelationship]

and create 4 separate hierarchies, each with a different HideMemberIf option:
[Image: RaggedHierarchies]

And here are the results in Excel (I’ve highlighted the scenarios that work):
[Image: raggedhierarchyExcel]

A couple of things stand out here:

  • In this simple dataset, {ParentName & OnlyChildWithParentName} produce the same results, as do {NoName & OnlyChildWithNoName}.
  • We can see a general pattern in the combinations that work: we need to shift all levels “up”, and then either duplicate the lowest level or replace it with NULLs.

One disadvantage of modeling your data this way is that the levels in your hierarchies may no longer make sense; e.g. a Country -> State -> City hierarchy will have “Washington DC” at the State level.  I’d argue that in hierarchies where each level has a distinct semantic meaning, we should not hide levels; after all, Washington DC is a city, not a state.  In these cases, users may very well expect a NULL at the State level.  However, in hierarchies that DO represent a generic structure, such as an organizational reporting chain where level names are abstract (e.g. Level 1 > Level 2 > Level 3), hiding levels makes sense.  It alleviates the nuisance of users drilling down through 10 levels of “Suzy Secretary”.

Note that using HideMemberIf also incurs a performance hit, though nowhere near as much as parent-child hierarchies.  I may explore the query performance implications in a future post.

SSRS Tip: Use variables to store formatting expressions

SSRS provides a rich set of formatting options that lets you control virtually every aspect of a report. Often you’ll find that the same formatting expression is repeated throughout the report, and you end up manually coding it for each affected column.

As a good practice, if I find that the same formatting expression is used more than once, I define a report variable for the format expression and use it to drive the formatting of multiple columns. Let’s say we want to round a percentage figure to a whole number. Instead of hard-coding it like this:

[Image: ssrsformatbefore]

You can create a report variable:

[Image: reportproperties]

[Image: reportvariable]

and then set the format like this:

[Image: setformat]
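As a rough sketch (the variable name is up to you), the report variable just holds a .NET format string, and the Format property of each affected textbox references the variable:

' value of a report variable named e.g. PercentFormat - a .NET format string that rounds to a whole percent
="0%"

' Format property of every affected textbox
=Variables!PercentFormat.Value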

 

 

Besides not repeating the same expression in multiple places, we can now also change the formatting for all affected columns by altering one variable.   More innovative uses might entail populating formatting variables with run-time parameters.

Retrieving SQL Server 2012 Product Key from the Registry

UPDATE: Please see the revised script that should work for all modern SQL Server versions here

If you’ve misplaced your SQL Server installation key, you can retrieve it from the Windows Registry using a PowerShell script posted here.  However, that script works only for SQL Server 2008; in 2012, both the location of the key in the Registry and its format have changed. So I’ve posted my modification of the PowerShell script below:

function Get-SQLserverKey {
    ## function to retrieve the license key of a SQL 2012 Server.
    ## by Jakob Bindslet (jakob@bindslet.dk)
    ## 2012 Modification by Xian Wang (daanno2@gmail.com)
    param ($targets = ".")
    $hklm = 2147483650
    $regPath = "SOFTWARE\Microsoft\Microsoft SQL Server\110\Tools\Setup"
    $regValue1 = "DigitalProductId"
    $regValue2 = "PatchLevel"
    $regValue3 = "Edition"
    Foreach ($target in $targets) {
        $productKey = $null
        $win32os = $null
        $wmi = [WMIClass]"\\$target\root\default:stdRegProv"
        $data = $wmi.GetBinaryValue($hklm,$regPath,$regValue1)
        [string]$SQLver = $wmi.GetstringValue($hklm,$regPath,$regValue2).svalue
        [string]$SQLedition = $wmi.GetstringValue($hklm,$regPath,$regValue3).svalue
        $binArray = ($data.uValue)[0..16]
        $charsArray = "B","C","D","F","G","H","J","K","M","P","Q","R","T","V","W","X","Y","2","3","4","6","7","8","9"
        ## decrypt base24 encoded binary data
        For ($i = 24; $i -ge 0; $i--) {
            $k = 0
            For ($j = 14; $j -ge 0; $j--) {
                $k = $k * 256 -bxor $binArray[$j]
                $binArray[$j] = [math]::truncate($k / 24)
                $k = $k % 24
            }
            $productKey = $charsArray[$k] + $productKey
            If (($i % 5 -eq 0) -and ($i -ne 0)) {
                $productKey = "-" + $productKey
            }
        }
        $win32os = Get-WmiObject Win32_OperatingSystem -computer $target
        $obj = New-Object Object
        $obj | Add-Member Noteproperty Computer -value $target
        $obj | Add-Member Noteproperty OSCaption -value $win32os.Caption
        $obj | Add-Member Noteproperty OSArch -value $win32os.OSArchitecture
        $obj | Add-Member Noteproperty SQLver -value $SQLver
        $obj | Add-Member Noteproperty SQLedition -value $SQLedition
        $obj | Add-Member Noteproperty ProductKey -value $productkey
        $obj
    }
}

Output:

[Image: output]

Oracle OLE DB Performance with fetchsize

I’ve been working on some SSIS packages that move data from an Oracle source to a SQL Server destination. Unfortunately, the source and destination servers are in data centers located geographically far apart, so throughput was never going to be optimal. Even so, I was seeing some exceptionally bad throughput – something like 5 minutes to transfer ~500k rows of not-very-wide data using the OLE DB source.

Some googling revealed that the Microsoft connectors for Oracle by Attunity can give a 20 to 30 percent performance boost when using Oracle as a source. To my great surprise, after installing them and rewriting the packages to use the Attunity connectors, I was seeing more along the lines of a 300% boost in performance!

However, there are downsides to using the Attunity connectors:

1) They are not included in an out-of-the-box SSIS installation; they require a separate download and installation. As such, they may or may not be easy to deploy in your server environment.

2) The Attunity connectors use a different connection type that is incompatible with OLE DB source connections. Additionally, you can’t use Data Sources defined for the entire SSIS project, and must instead create a new connection in each SSIS package. This unnecessarily complicates management and deployment of your SSIS solution.

What’s interesting about the performance boost with the Attunity connectors was the corresponding increase in network saturation. With the OLE DB source, network utilization was around ~1%, but the Attunity connectors were getting 3%. How could there be such a disparity in network throughput when the same query was issued to Oracle?

One clue I got was that the Attunity Connector source exposes a new property called Batch Size:
[Image: batchsize]

Batch Size controls how many rows SSIS retrieves with each “batch”.  After retrieving all the rows in a batch, SSIS signals to Oracle that it is ready for another one.  Of course, network latency is incurred during this signaling, so decreasing the total number of batches also decreases the number of round trips SSIS has to make.  The default batch size is only 100; increasing it to a reasonable number should boost performance.  In our environment, a value of 5000 seemed to be optimal.  But there was no equivalent setting in the OLE DB source.

After more googling and trying performance ‘tricks’ from bloggers of various repute, I came upon an obscure blog post suggesting altering the FetchSize property of the connection string.  The author was originally diagnosing poor processing performance of an SSAS cube against an Oracle data source, but I decided it was worth a shot trying the same for SSIS.

The default FetchSize is 100 rows, but anything from 100 to 1,000,000 is feasible, depending on your network and server configuration.  Adjusting this setting proved to be exactly what was needed, boosting performance by about 270% when I set it to 5000 rows.  Unfortunately, the FetchSize property is not exposed anywhere in the GUI when creating an OLE DB source; instead, you’ll have to edit the connection string manually.  In our environment, we use DTEXEC to call packages and point them at a run-time dtsConfig file containing the connection string (which can be edited manually). YMMV.
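For illustration, here's roughly what the edited connection string might look like (the TNS alias, credentials, and value are placeholders; FetchSize is a provider-specific attribute of the Oracle OLE DB provider):

Provider=OraOLEDB.Oracle.1;Data Source=MyOracleTNSAlias;User ID=my_user;Password=********;FetchSize=5000;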

I’m frankly surprised that more people aren’t running into poor performance with the Oracle OLE DB source in SSIS, and especially that googling yielded so few helpful results.  Hopefully this post can be of some help to anyone having this issue.

 

TL;DR: Increase fetchsize property for OLE DB connections to Oracle.

Validating SSAS cubes against relational data source

It’s a pretty common requirement to validate the data in an OLAP cube against the underlying relational data source. This could be something as simple as manually spot checking numbers, but I prefer something more comprehensive and automated.

The SQL Server relational database engine can access other OLE DB data sources via OPENROWSET, which is similar to Linked Servers; the main difference is that OPENROWSET lets you make an ad-hoc connection to a data source without setting up a Linked Server in advance. This led me to the idea of comparing a set of data from a relational data source with an equivalent set from an OLAP source by writing two equivalent queries. The basic query structure goes something like this:

(select * from [olap source]
except
select * from [relational source])
union
(select * from [relational source]
except
select * from [olap source])

The EXCEPT set operator returns all rows in Set A that are not in Set B. Doing the operation in reverse returns all rows in Set B that are not in Set A. Needless to say, our goal is for this query to return 0 rows.

The relational SQL query is pretty straightforward. Using the AdventureWorksDW sample database, let’s say we want to validate Internet Sales for all customers (as differentiated by email address). The SQL query would be:

SELECT EmailAddress,
sum(SalesAmount) as SalesAmount
FROM [dbo].[FactInternetSales] t1,
[dbo].[DimCustomer] t2
where t1.CustomerKey = t2.CustomerKey
group by EmailAddress

Now, for the OLAP part. The openrowset syntax to connect to an SSAS cube is:

select *
from openrowset('[Provider Name]',
'DATASOURCE=[Server Name];Initial Catalog=[Database Name]',
'[MDX QUERY]')

Here is the query we need to validate against the equivalent SQL statement:

select cast("[Customer].[Customer].[Customer].[Email Address]" as nvarchar(50)) EmailAddress,
"[Measures].[Internet Sales Amount]" SalesAmount
from openrowset('MSOLAP', 'datasource=localhost;Initial Catalog=AdventureWorksDW2012Multidimensional-EE;',
' SELECT NON EMPTY { [Measures].[Internet Sales Amount] } ON COLUMNS,
NON EMPTY { ([Customer].[Customer].[Customer].ALLMEMBERS ) }
DIMENSION PROPERTIES [Customer].[Customer].[Email Address] ON ROWS FROM [Adventure Works] ')

(Note that the provider for SSAS is MSOLAP).

Running the above query will likely result in this error message:

Msg 15281, Level 16, State 1, Line 2
SQL Server blocked access to STATEMENT 'OpenRowset/OpenDatasource' of component 'Ad Hoc Distributed Queries' because this component is turned off as part of the security configuration for this server. A system administrator can enable the use of 'Ad Hoc Distributed Queries' by using sp_configure. For more information about enabling 'Ad Hoc Distributed Queries', search for 'Ad Hoc Distributed Queries' in SQL Server Books Online.

This means you will have to enable ad-hoc access to other data sources using OPENROWSET. This can be done by running the following query:

sp_configure 'show advanced options', 1;
RECONFIGURE;
GO
sp_configure 'Ad Hoc Distributed Queries', 1;
RECONFIGURE;
GO

Now with all the server settings out of the way, we can rewrite this query more elegantly:

with datamart as
(SELECT EmailAddress,
sum(SalesAmount) as SalesAmount
FROM [dbo].[FactInternetSales] t1,
[dbo].[DimCustomer] t2
where t1.CustomerKey = t2.CustomerKey
group by EmailAddress),

SSAS as (select cast("[Customer].[Customer].[Customer].[Email Address]" as nvarchar(50)) EmailAddress,
"[Measures].[Internet Sales Amount]" SalesAmount
from openrowset('MSOLAP', 'datasource=localhost;Initial Catalog=AdventureWorksDW2012Multidimensional-EE;',
' SELECT NON EMPTY { [Measures].[Internet Sales Amount] } ON COLUMNS,
NON EMPTY { ([Customer].[Customer].[Customer].ALLMEMBERS ) }
DIMENSION PROPERTIES [Customer].[Customer].[Email Address] ON ROWS FROM [Adventure Works] '))

(select * from datamart
except
select * from SSAS)
union
(select * from SSAS
except
select * from datamart)

And voila! You should have no rows returned.

[Image: No rows returned]

This query could be used in a report, or even incorporated at the end of the ETL process to check data consistency with cubes. The beauty of this approach is that the relational data source isn’t restricted to just SQL Server… it could be ANY supported OLE DB provider.

Changing TFS Login ID in SSMS

If you are using the TFS MSSCCI Provider for SSMS, it may not be obvious how you can change the Login ID for TFS.

In the plugin settings, the Login ID box is greyed out:

[Image]

To change the ID, first go to Plug-in Selection and set Current source control plug-in to None:

[Image]

Click OK. Reopen the options screen and re-select TFS MSSCCI Provider as the source control plug-in. Under the Plug-in Settings tab, you can now enter your new Login ID:

[Image]