Temp Tables vs Table Varibles: The Great Debate

There seems to be a lot of confusion around the differences between  temp tables and table variables (in the classic sense; I'm not talking about in memory table types or anything here). Some people say you should only use table variables. Some people say you should only use temp tables. Most people caution a more nuanced approach. Here I want to cover what I think are the biggest issues in an attempt to shed some light on this surprisingly tricky topic.

Statistics This is usually the most important factor when considering which one to use. It boils down to two facts, and everything else flows from these: Temp Tables maintain statistics, Table Variables do not.

Table Variables Because table variables maintain no statistics, the Query Optimizer always assumes the contain exactly one row. As a result, any joins against a table variable, unless explicitly told to behave otherwise will (probably) alwaysbe a nested loop join.

As long as the number of rows in the table variable are small…

Beware the Errors of SSIS

SSIS is a bit of a touchy subject where I work, and the more I use it, the more I sort of understand why.

While SQL Server is my background and I consider myself an expert in TSQL (not like, MVP level or anything), I don't do too much in SSIS. We have a home grown architecture which, in many ways, mimics functionality SSIS can do, and, for our purposes, usually works better and comes with fewer headaches. That said, there are some operations SSIS is simply hands down the best tool for the job. But I didn't come here to talk to you to day about when to or not to use SSIS. No dear reader, I come with a warning/complaint about how it surfaces error.

The description of how it surfaces errors is "not very well". It's undoubtedly something you get better at teasing out over time, but I think largely because of the breadth of systems it needs to interact with, surfacing meaningful errors for specific stages can often seem to have absolutely nothing to do with the error at hand.

Take for instance the following error I got and subsequently spent a day and a half trying to figure out.

[Execute SQL Task] Error: Executing the query "exec alerts.dbo.SyncAddressesBatch @BatchId = ?" failed with the following error: " ". Possible failure reasons: Problems with the query, "ResultSet" property not set correctly, parameters not set correctly, or connection not established correctly

Here's the setup. Execute SQL Task, which calls a proc with a single parameter.

What the proc does isn't important. Sufhttps://msdn.microsoft.com/en-us/library/ms140355.aspxice it to say it accepts a @BatchId parameter and then does stuff. 

So how would you go about troubleshooting this issue? Well first, let's look at the error text. The first troubling thing is all the "or"s in the error. Could be A... or B... or C... or something. Googling the error you find responses related to pretty much all the cases there; none of which applied to my situation. There's also the empty string error message.

So with the internet not being able to help, what would you look at? first, you need a little background on Execute SQL Task Parameterization (a topic I'll let you read up on your own if you're interested). The syntax for parameters is a little funky, but basically you put question marks in your SQL where you want to substitute a value, and then on the Parameter Mapping page, you supply variables and name them by the ordinal position the question mark appears (so the first question mark in the script is Parameter Name: 0, the second question mark is Parameter Name: 1, etc). Everything looked on the up and up. I put in a script stage to verify that the parameter I was passing indeed had a value, and wasn't invalid or null or something. Finally I threw up my hands in frustration and went home for the day. 

When I came back I just started fiddling with things. I tried omitting the (required) @BatchId parameter. It correctly surfaced an error from SQL Server saying I didn't provide a required parameter. I tried replacing the @BatchId with a hard coded "1" (getting rid of any parameterization altogether), but still got the same error.

Finally, I think by accident, I put a return statement early on in the proc, and without knowing that, re-ran the package. To my confused delight, the package succeeded. Through a little more sleuthing, I figure out that what was happening was the underlying procedure was hitting a primary key error, and the empty string it was surfacing just happened to be the result of a null concatenation. Easy fix on both counts, but very frustrating to have to arrive at that conclusion in such a circuitous and laborious process

tl;dr, it's difficult to know when to take SSIS's errors seriously and when to assume it's a different component. Other than making these mistakes yourself and learning from them, there's not any deterministic way I can think of to decide which to do when. All I can say is take any errors you see with a healthy dose of skepticism and if you find yourself banging your head against a wall trying to figure out what a given, vague, error really means, take a step back, gather your thoughts, and perhaps approach it from a completely different angle. We often rely on informative error messages to guide our development (and rightfully so). But SSIS has a habit of trolling us developers, whether it knows it or not. Please, do not feed the trolls.


Popular posts from this blog

JOINs Using Playing Cards

Named Constraints on Temp Tables

Master Data Services on Windows 10