Schott SQL

Monday, January 17, 2011

SQL 2008 Merge and PK/FK Constraints

We ran into this issue a while back. SQL 2008 has a “feature” that will not allow an INSERT whose FROM clause contains a nested MERGE (or other INSERT, UPDATE, or DELETE) statement to target a table on either side of a Foreign Key relationship. If you attempt to capture the MERGE output this way in a table involved in an FK relationship, you get an error something like:

The target table 'TableName' of the INSERT statement cannot be on either side of a (primary key, foreign key) relationship when the FROM clause contains a nested INSERT, UPDATE, DELETE, or MERGE statement. Found reference constraint 'ConstraintName'.

This is documented in Connect 435031. There is one workaround documented in the ticket – drop the FK constraint, run the MERGE query, then re-create the constraint. Happily, this bug is still open, as the requirement to drop a constraint just to run a MERGE statement kind of defeats the purpose of putting a constraint on the table in the first place. While dropping the constraints will work, there is another possible workaround. I was able to run it without any issues while leaving my FK constraint in place.

Create a temp table that matches the definition of the table into which you want to perform your insert.
Instead of inserting into your base table like you normally would in a MERGE, write your code to INSERT INTO #Tablename at the beginning of the MERGE process. You’ll still use your main table and staging table inside the MERGE INTO … USING query.
After the MERGE, insert into the main table using the values in your temp table.

Here’s a shell of an example. The INSERT INTO #MyFactTable statement that wraps the MERGE (step 2) is the main difference from a “normal” MERGE.

--1: Create Temp table
CREATE TABLE #MyFactTable(
    ID INT NULL
    , CustomerName VARCHAR(100) NULL
    , SourceID INT NULL
    , OutputAction VARCHAR(100) NULL
);

--2: INSERT into the temp table instead of your normal target table
--   Merge query will be the same otherwise
INSERT INTO #MyFactTable (ID, CustomerName, SourceID, OutputAction)
SELECT so.ID, so.CustomerName, so.SourceID, so.OutputAction
FROM (
    MERGE INTO dbo.MyFactTable AS t
    USING Staging.MyFactTable AS s
    ON ( s.ID = t.ID
        AND s.NewLoad = 0 )

    WHEN MATCHED AND ( s.SourceID <> t.SourceID )
            AND s.NewLoad = 0
    THEN UPDATE
        SET RecordState = 0
        , UpdatedDate = GETDATE()

    WHEN NOT MATCHED BY TARGET AND s.NewLoad = 0 THEN
        INSERT (ID, CustomerName, SourceID)
        VALUES (s.ID, s.CustomerName, s.SourceID)

    -- OUTPUT columns must be qualified; s.* returns the incoming staging values
    OUTPUT $action AS OutputAction
        , s.ID
        , s.CustomerName
        , s.SourceID
    ) AS so (OutputAction, ID, CustomerName, SourceID)
WHERE so.OutputAction = 'UPDATE' ;

--3: Perform the final insert into your target table
INSERT INTO dbo.MyFactTable (ID, CustomerName, SourceID)
SELECT DISTINCT ID, CustomerName, SourceID
FROM #MyFactTable ;

--4: Clean up your temp objects.
DROP TABLE #MyFactTable ;
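
For comparison, the drop-and-recreate workaround from the Connect ticket could look roughly like the sketch below. The constraint name FK_MyFactTable_Source, the referenced table dbo.Source, and the choice of SourceID as the FK column are all made up for illustration; substitute your own definitions, and repeat the drop/re-create for every FK the table participates in (including FKs in other tables that reference it). GO is the SSMS/sqlcmd batch separator and keeps each step in its own batch, so the INSERT is not compiled while the constraint still exists.

```sql
-- Hypothetical constraint: dbo.MyFactTable.SourceID references dbo.Source.
-- Step 1: drop the FK so the table is no longer part of a PK/FK relationship.
ALTER TABLE dbo.MyFactTable
    DROP CONSTRAINT FK_MyFactTable_Source;
GO

-- Step 2: with the constraint gone, the INSERT can target the table directly.
INSERT INTO dbo.MyFactTable (ID, CustomerName, SourceID)
SELECT so.ID, so.CustomerName, so.SourceID
FROM (
    MERGE INTO dbo.MyFactTable AS t
    USING Staging.MyFactTable AS s
    ON ( s.ID = t.ID AND s.NewLoad = 0 )
    WHEN MATCHED AND s.SourceID <> t.SourceID THEN
        UPDATE SET RecordState = 0, UpdatedDate = GETDATE()
    WHEN NOT MATCHED BY TARGET AND s.NewLoad = 0 THEN
        INSERT (ID, CustomerName, SourceID)
        VALUES (s.ID, s.CustomerName, s.SourceID)
    OUTPUT $action, s.ID, s.CustomerName, s.SourceID
    ) AS so (OutputAction, ID, CustomerName, SourceID)
WHERE so.OutputAction = 'UPDATE';
GO

-- Step 3: put the constraint back. WITH CHECK revalidates the existing rows
-- so the constraint stays trusted.
ALTER TABLE dbo.MyFactTable WITH CHECK
    ADD CONSTRAINT FK_MyFactTable_Source
    FOREIGN KEY (SourceID) REFERENCES dbo.Source (SourceID);
GO
```

The obvious downside is that nothing enforces the relationship between steps 1 and 3, which is why the temp table approach is appealing.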

I’d love to hear if anyone else has a workaround for this that doesn’t involve dropping and recreating the FK constraint. If this is an issue for you, please vote for the Connect Ticket.

Monday, December 13, 2010

Book Review: Windows Phone 7 Plain & Simple

Overall, Windows Phone 7 Plain & Simple is a good book for people new to Windows Phone 7 and especially for people new to smartphones. Michael does a great job illustrating how to use the phone and get up to speed quickly. He walks quickly through the general operation, setting up the phone, and entering text in the first couple chapters. The next several chapters discuss using the most popular features of WP7: calling, e-mail, text messages, calendar, browsing the web, and using the maps/navigation. Next we learn more about music, video, taking pictures and videos, and using the Marketplace. We end with a quick session on using WP7's Office Hub for documents, spreadsheets, and OneNote. This is followed by a discussion on using the Zune software to synchronize files between your phone and your PC. (Michael doesn't mention the Mac software to do this. It's capable, but more limited.)

Having followed the progress of WP7, I found that I knew a lot of the information in this book. There were still a couple interesting bits of information that were news to me, but not many. Michael didn’t write this for people like me who have followed WP7 for some time. He wrote for people who aren’t really familiar with smartphones and especially with the changes that WP7 brings. For those people, this will be a useful read and a quick reference while they get used to their device. I plan to show this to my family members who have WP7 devices and aren't quite sure what to do with them.

The positives:

The information is very well presented, easy to follow, and broken out into logical sections.
Michael's casual writing is well-suited for this book.
Readers will not feel intimidated by their new phone and are encouraged to try things out.
Michael presents some information that you may not easily find by reading articles online.

The negatives:

I don't see this as a book that will be a long-lasting reference book. Once the reader has used the phone for a month or so, the lessons from this book should be second-nature.
WP7 already has two planned updates at the time of this review: a minor update in January 2011 that will reportedly enable Copy/Paste functionality and a larger one in February 2011. The errata for this book will need to be updated to take these changes into account as some information may be outdated.

Conclusion:
If you can find Windows Phone 7 Plain & Simple for a reasonable price and are new to the platform, pick it up. It's a short, but informative, read. If you've followed WP7 for a while, this is not likely the book for you as you'll know most of the information here already. You may still want to pick up a copy to share with family or friends who are new to the phone. It could save you "support" calls.

Please visit the Windows Phone 7 Plain & Simple page at O’Reilly for more information about this book.

Disclaimer: I received a free copy of this book in electronic format in return for providing an honest review. I was not compensated in any other way.

Tuesday, November 30, 2010

Book Review: Cooking For Geeks

(I know this isn’t SQL-related, but I think this may be a great read for people with similar interests.)

This is not a cookbook. If you’re looking for a new collection of recipes, this is not the book for you. If you like shows like Good Eats or enjoy seeing how food is prepared and served, you’ll almost certainly love Cooking for Geeks. If you like experimenting in the kitchen and knowing why food turns out the way it does, pick up this book!

As a geek, I loved Jeff’s analogy: Recipes are code. Follow a recipe as written and you generally get good results. Forget the where clause and you could have unrecoverable errors. Introduce your own changes and you could get something great or you could get something horrible that requires a lot of cleanup. Recipes may have bugs or need corrections. Perhaps there’s more than one way to the same result. Oh, and don’t forget to comment your recipe. Otherwise you might not be able to recreate something fantastic.

Each chapter of Cooking for Geeks deals with different concepts, each with its own scientific background. Common utensils, ingredients, time/temperature, baking, additives (chemicals), and even some geeky fun with hardware or unusual cooking techniques – all are included in a way that not only gives some neat recipes, but also the science behind them.

To me, the most interesting parts were on baking and the chemical reactions that take place as heat is applied. It was great reading exactly why food turns out with all of its various nuances. That science got me thinking about ways to tweak the outcome of various recipes I follow and was just fun to read.

You can get your own copy of Cooking for Geeks direct from O’Reilly.

Disclaimer: I received an electronic review copy of this book, though I’d likely have wanted to read and review this anyway.

Thursday, October 28, 2010

SSIS, OLEDB Data Sources, and Table Variables

I ran across an interesting problem when trying to tune some SSIS data source performance issues the other day. I had noticed that we were doing a pretty large select out of a table filtered on a handful of date-related columns. We then pulled in most of the columns from the table. Well, SQL Server saw this query and apparently figured that since we needed most of the columns, a table scan would be the most efficient way to get that data. Since in real life we only need a very small percentage of those rows, I decided to rewrite the select to pull in the PK values first, then pull in the data based on that subset.

My first attempt was to use a Temp Table. Sadly, using a temp table does not work in SSIS Data Sources, even if you put that inside a stored procedure. Disappointed, but not deterred, I decided to use a Table Variable. I quickly made the changes and tested them with some values I knew would return quite a few results. Great – no worries. I was still using a stored procedure at the time and used some of our standard code to start it off, so I had no issues.

Ultimately, I didn’t want to write a bunch of stored procedures that I would need to maintain outside of SSIS just for this particular load. We had the code already inside of the OLE DB Sources, just not optimized as much as I’d like. I started pulling out the old queries, adding a table variable, populating it with the necessary PK values, then using that to filter the main data set. Save, deploy, run, see major speed increase.

Then, I looked at the actual insert/update counts. Yes, it ran much faster, but in the process, it didn’t pick up any results. I was pretty confused because the step was successful, the code ran as expected through SSMS, there were no errors or warnings. I re-verified that running that same code got results in SSMS; it did. I started doing some searching to see what might be happening and found this post by Dustin Ryan.

In short, it seems that if you have row counts being returned (default behavior), the insert into the table variable returns a “rows affected” count, which results in a “Done in Proc” type message, which in turn tells SSIS that the data flow is done, so it moves on to the next step. The workaround? Add SET NOCOUNT ON; to the beginning of your OLEDB Source query (or make sure this is in your stored proc – we tend to put that in our procs by default).
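
As a rough illustration of the pattern described above, an OLE DB Source query might look like the following. The table dbo.Orders and its columns are hypothetical stand-ins, not the actual objects from this load; the parts that matter are the SET NOCOUNT ON at the top and the two-step key lookup.

```sql
-- Suppress "n rows affected" messages so the INSERT into the table variable
-- is not mistaken by SSIS for the end of the data flow.
SET NOCOUNT ON;

-- Hypothetical table and column names, for illustration only.
DECLARE @Keys TABLE (OrderID INT NOT NULL PRIMARY KEY);

-- Step 1: collect only the primary key values that match the date filter.
INSERT INTO @Keys (OrderID)
SELECT o.OrderID
FROM dbo.Orders AS o
WHERE o.ModifiedDate >= '20101001'
  AND o.ModifiedDate <  '20101101';

-- Step 2: pull the wide rows for just those keys.
SELECT o.OrderID, o.CustomerID, o.OrderDate, o.ModifiedDate, o.TotalDue
FROM dbo.Orders AS o
INNER JOIN @Keys AS k
    ON k.OrderID = o.OrderID;
```

Without the first line, the same batch runs fine in SSMS but can come back empty through the OLE DB Source.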

Once again, thanks to Dustin Ryan for his post on this topic. There’s a good possibility I’d still be struggling with the root cause of this without that assistance.

Saturday, July 24, 2010

Review: Windows PowerShell 2.0 Best Practices - Introduction

A while back I received a review copy of Windows PowerShell 2.0 Best Practices by Ed Wilson (blog | twitter) and the Windows PowerShell Teams at Microsoft. Having finally found the time to sit down and really digest the information within, I thought I’d do a multi-part review of the book. Ed Wilson has compiled an excellent resource not only to show off what PowerShell 2.0 can do, but also to provide practical examples and several useful tips for best practices not only in using PowerShell, but in writing reusable and efficient code.

This book is broken down into 5 major sections:

Introduction
Planning
Designing
Testing and Deploying
Optimizing

I want to give a quick overview of what’s contained in each chapter, though I’m not going to include the code from the book. You may be able to get it from the main page for the book, but ideally I think this book is well worth buying. You get a PDF file of the book and all of the code included in the book as well. If you plan to do a lot of PowerShell scripting, this book is well worth the money. I will say that this book is not designed to teach PowerShell. Ed Wilson really does expect that readers have a working knowledge of PowerShell. He builds on that knowledge to show how to write PowerShell scripts with these best practices in mind.

Chapter 1 contains an overview of PowerShell 2.0. Ed explains some of the differences between PowerShell 1.0 and 2.0, focusing on the new remoting capabilities, WMI enhancements, changes to the way some cmdlets work, and why they have changed the original behavior. He touches on some of the differences between VBScript and PowerShell and why PowerShell might be a better choice than VBScript, despite people’s familiarity with the latter. One of the key advantages of PowerShell over VBScript is that PowerShell was designed to be run from a prompt or as a script. VBScript must be run through something that calls the script. This command-line interaction allows people to work with the script in a much more interactive way and even use native Windows commands with which people are already familiar. Ed closes the chapter by reminding readers of the minimum requirements for PowerShell 2.0, informing readers from where they may obtain the latest version of PowerShell and suggesting where PowerShell 2.0 should be installed (pretty much everywhere).

Chapter 2 is where readers become more familiar with PowerShell 2.0 capabilities. We’ve already read about the promise of PowerShell 2.0, but now we can start to see it in action. Ed starts by introducing readers to the PowerShell 2.0 interactive command line. The first command shows users running processes inside the PowerShell environment. Ed then shows some of the easiest cmdlets to run and remember, followed by the “most important” cmdlets: Get-Help, Get-Command, and Get-Member. I agree that these three cmdlets are key. You can always find out more about what commands are available, how to run them, and what properties are available. Ed then lists several useful cmdlets to enhance filtering, grouping, sorting, and such. Chapter 2 ends with a good introduction to WMI, Remoting, Services, and the Event Log from a PowerShell perspective.

Chapter 3 concentrates on Active Directory. Ed reminds us that what we typically call “AD” is really a legacy term now and is actually “Active Directory Service Interfaces” or ADSI. We tend to think of AD in terms of users and domains, but it contains much more now than it used to. Ed jumps right in by showing readers how to reference ADSI Users, Groups, and Computers with some quick examples of how to reference each within PowerShell. There’s then a quick “Inside Track” by James Turner showing how PowerShell can be used to automate some common administrative tasks around user accounts. James tells us to use the right tool for the job as well. While the LDAP functionality allows more authentication methods, the WinNT functionality allows more object types and more familiar syntax. The bulk of Chapter 3 really concentrates on building a functional script to create and update Users within ADSI. The reader is guided from simpler scripts that query data through to reading a file to automate the creation and updating of user accounts. We also get to see several ways to accomplish this task, including reading from a connection to Excel.

Chapter 4 builds on what we’ve already learned from Chapter 3 to work within ADSI. Ed starts by helping us become familiar with the AD schema through PowerShell. As always, Ed actually puts to use the “Best Practices” that he wants his readers to learn. His first script includes a full Help example and several functions, followed by a breakdown of each function. I found it helpful to look through the original script while Ed explains the details so I could keep track of how each function fit into the bigger picture.

Ed then demonstrates how to query AD for information, going beyond merely listing the results – this is PowerShell after all and the results can be full objects. He gives a pretty good list of ADSI-related properties, shows how to display results that list results, then moves on to true queries – ADO and OLEDB queries. Of course, his overall goal is to really use PowerShell effectively. ADO and OLEDB are useful, but we also want to be efficient. Therefore, the reader is introduced to the DirectorySearcher class and PowerShell 2.0’s [ADSISearcher] to simplify calling DirectorySearcher. Along the way, we see the “FindOne()” method to limit the results to a single row. That is useful to see what data we can expect. Some familiarity with ADSI can really help when going through these chapters.

We’re led from these queries to actually managing user accounts. First, the reader needs to understand ADSI User Account Control values. He then shows using the Bitwise AND operation to compare the various bits set and determine properties of the various user accounts. Ed shows how to find and display disabled user accounts in a short script using bitwise AND and color coding. Readers are then shown how to move objects within AD without too much code. The final example details how to find missing values within AD. These are the values that are expected, but not always completed. Once again, I found it very useful to see the whole script while Ed explains each step of the script. Be sure to include the backtick if you want to type these scripts yourself and try them.

Conclusion

The Introductory section of Windows PowerShell 2.0 Best Practices is well-written and serves its purpose of getting the readers ready to move through the rest of the book. If I have any complaints about this section it’s that it uses ADSI pretty heavily. For people who don’t work directly with ADSI on a regular basis, a lot of the examples are harder to follow. While it is still possible to see the best practices for PowerShell, it’s easy to get lost in trying to figure out what Ed is doing with ADSI. The Planning section in chapters 5-8 seems to be less focused on a specific set of functionality and much more on how to plan and use PowerShell effectively. I plan to post a review of those chapters in the near future.

I’d appreciate feedback on this review. Too lengthy? Too short? Let me know what’s good and what’s not good. I’d like to write a useful and honest review for those considering this book.

Tuesday, April 13, 2010

Powershell for Database Developers

I recently presented for the SQLPASS AppDev Virtual Chapter on "Powershell for Database Developers". I promised that I'd get the files used in the demos up to my blog shortly so wanted to make good on that promise. First, some of these depend on having either SQLPSX or the SQL 2008 Snap-in loaded in your Powershell environment. Some of those demos won't work unless that's set up. I think I've got some posts on that already, but if you need help with it, leave me a comment or shoot me a message.

I know that the SQL ISE portions of my demo require SQLPSX. Once again, I highly recommend that you download and install these modules for your use. You don't necessarily need them in your startup profile, but they are useful if you work with SQL Server.

I'd also mentioned Red-Gate in my presentation. Here's one link for more details on Doing Schema Compares.

You can access all of my files used for this presentation on Skydrive.

Tuesday, March 16, 2010

Powershell: Writing Text Files

I was recently trying to generate a bunch of data that I needed to pass through one of our in-house DLLs in order to use some custom encryption algorithms. Thanks to the wonderful Powershell community, I found several examples on using custom DLL methods. However, I ultimately wanted this data to end up in SQL Server. My first attempt used an “Invoke-SQLCmd” Cmdlet from the SQLPS snap-in. However, generating several million rows took a while to insert. I knew that I didn’t want to queue all of this data in memory, so was inserting one row at a time. I stopped this process after a while because it was too time-consuming.

My next attempt was to push these values to a text file using a variable pointing to a file and then using add-content to write each line to the file. This took longer than doing the SQL inserts one at a time.

I remembered coming across an article by Linchi Shea titled Add-Content and Out-File are not for performance. After finding the link to that article, I found that he recommends using System.IO.StreamWriter to write out files in the most performant manner. I definitely recommend that you check out his article for his summary and examples. For my part, I can say that it took me the better part of 6 hours to generate and populate 10 million rows. I just generated 20 million into a file in < 15 minutes!

I’m including a mockup of my code below in case anyone is interested. I’m definitely not a Powershell expert so as always, be careful if you copy any of this code. My first attempt at directly inserting into SQL Server worked, but threw an error at about 3.5 million rows. As the time was already pretty lengthy by then, I stopped the process and moved on to writing the text file, resuming where I’d left off. That ran for hours until I finally killed it and looked for a better way. The uncommented code is what finally ended up working in a timely manner.

I started by loading my DLL. I called its “EncryptValue” function (against just the integer – that’s a longer story) and populated both out to a table. There are several reasons for doing this, but this gave me a good working set for a test project and definitely lays some groundwork for the future. If anyone has any ideas on how I can speed this up, feel free to leave me some comments. If I find some, I’ll post an update.

# Load the in-house DLL; Out-Null hides the assembly details LoadFile prints
[Reflection.Assembly]::LoadFile("C:\MyDLL.dll") | Out-Null

# First attempt – insert directly into SQL Server, one row at a time.
#foreach ($i in 1..10000000) { Invoke-Sqlcmd -Query `
#("insert dbo.MyTable VALUES(" + "$i" + ",'" + [MyDLLNamespace]::EncryptValue("$i") + "')") -ServerInstance "localhost" }

#Attempt #2 – Write to file with Add-Content
#$file = New-Item -type File "C:\Values.txt"
#foreach ($i in 1..10000000) { Add-Content $file ("$i" + "," + [MyDLLNamespace]::EncryptValue("$i") )  }

# What finally worked: stream the rows out with System.IO.StreamWriter
$file = New-Object System.IO.StreamWriter "E:\Values.txt"
try {
    foreach ($i in 10000001..30000000) { $file.WriteLine("$i" + "," + [MyDLLNamespace]::EncryptValue("$i")) }
}
finally {
    # Close the writer even if the loop fails so buffered rows are flushed
    $file.Close()
}
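
Getting the finished file into SQL Server isn't shown above. One option, as a sketch only, is BULK INSERT. This assumes dbo.MyTable has exactly two columns in the same order as the file (the integer, then the encrypted value), that the encrypted values are plain ASCII and never contain a comma, and that the SQL Server service account can read E:\Values.txt (the path is resolved on the server, not on the client). The batch size is an arbitrary illustrative number.

```sql
-- Sketch: load the comma-separated file produced by the StreamWriter loop.
-- Assumes dbo.MyTable's columns match the file layout: integer, encrypted value.
BULK INSERT dbo.MyTable
FROM 'E:\Values.txt'
WITH (
      FIELDTERMINATOR = ','      -- the script writes "<integer>,<encrypted value>"
    , ROWTERMINATOR   = '\n'     -- BULK INSERT treats '\n' as CRLF, which WriteLine emits on Windows
    , BATCHSIZE       = 100000   -- commit in chunks rather than one huge transaction
    , TABLOCK                    -- take a table lock for the duration of the load
);
```

If the encrypted values can contain commas, pick a field terminator in both the PowerShell script and the BULK INSERT that cannot appear in the data.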