Two more powerful features in SQL Server 2005

SQL Server 2005 has many new and improved features. So many, in fact, that it would be very difficult to cover them all thoroughly. Therefore, I've been concentrating on those features that I feel will have the most impact on developers. Because of this, I want to present two more new T-SQL features. They were left out of my first T-SQL article because of space limitations. However, I believe that interest in these capabilities has grown, and that they deserve some coverage here.

I'm including the first one, the PIVOT clause, because it has generated a lot of interest on the Internet. (In one forum, someone declared that SQL Server is underpowered because it "doesn't even have a PIVOT statement." Well, now it does.) I'm also including it because I believe that code created with PIVOT is much easier to read than code created without it. As someone who has had to maintain another person's obtuse code, I appreciate clarity. The PIVOT clause helps provide that clarity, and it also performs a useful task at the same time. Combine this with the related UNPIVOT clause, and you've got yourself a powerful tool for massaging data.

The second feature is the OUTPUT clause. I'm including this one because I'm finding an increasing number of situations where this clause solves some specific problems. It provides SQL batches and stored procedures with some of the most useful features of triggers. Although you probably won't use it every day, this clause deserves a place in your programming arsenal.

Sometimes your data isn't in the format that you need. Sometimes this means that the data are available as a series of rows, but you want a series of columns. Previous versions of T-SQL didn't provide an easy way to convert one representation into the other. Developers had to write complex expressions involving SUM functions and CASE statements to reorganize their data. Those desiring a more generic solution had to use dynamic T-SQL and cursors. While this wasn't exceedingly difficult, it was more trouble that it should have been. Let's examine a typical SQL 2000 batch and compare it to an equivalent one from SQL 2005. That will give us a good idea of how much easier things can be.

Refer to Listing 1 for an example. The first bit of code declares a table-valued variable and inserts a couple of rows into it. You can see this data in Figure 1.

The next section of code performs a pivot operation using SQL 2000 syntax. As you can see, this involves some tricky T-SQL expressions. The idea is to use CASE statements to ensure that each column gets only the data that we want it to have. Then we use SUM functions to add all of this data together. Note that we have to include expressions for each column that we want to create. The result is shown in Figure 2. We've taken six rows of data and pivoted them into three columns.

Now let's do the same thing with SQL 2005. The last bit of code in Listing 1 uses the same table-valued variable and the same data as our first example. This time, however, we use the new PIVOT clause instead of all of those SUM functions and CASE statements. All we have to do is specify the aggregate function and value column, the pivot column, and the pivot values themselves. (The pivot values will become columns in the result.) In our example, we've chosen SUM(OrderCount), the MonthNumber column, and month numbers 1, 2, and 3 respectively. Compare the SQL 2000 SELECT statement with this one. You can see how much easier and more readable the SQL 2005 version is. Now imagine that we needed to pivot 15 columns instead of three. Which version would you rather use?

Of course, when you pivot table data, you don't always change rows into columns. You can go the other way too. You can convert columns of data into rows. Now, I've never gone to the trouble of figuring this out with SQL 2000, but in SQL 2005, it's easy. Listing 2 shows how. First we create a table-valued variable and fill it with data just like the pivoted data from our previous examples. We then use the UNPIVOT clause to create our rows. Once again, we specify the value column, the pivot column, and the pivot values. This time, however, the value and pivot columns can't exist in the source table. They represent columns that will be created for us. The result can be seen in Figure 3.

As you can see, we've created a copy of our original table, but with the OrderCount values summed together. The UNPIVOT clause can't un-sum values; it can only un-pivot them.

There are many situations where you need to create a copy of the data from an INSERT, UPDATE, or DELETE statement. You may want to keep a record of all changes made to a table, or maybe you need some information for debugging purposes. T-SQL expressions can become quite complex, so it's not always obvious exactly what the output from a particulate statement may be. In some cases, you may need this data immediately and in the same stored procedure or batch that created it. In other cases, you may want to store the data for later use.

There are two common approaches to this; implement triggers that fill audit tables with the resulting data, or add extra code to stored procedures and batches to do something similar. In both cases, the end result is the same; the desired data is put into a table for later use. There are drawbacks to each approach. Triggers operate after the fact, and they can't be used with table-valued variables. Extra code in a stored procedure or batch can quickly become hard to maintain and isn't guaranteed to produce the exact same results as the original statement. SQL Server 2005 offers an alternative.

The new OUTPUT clause provides us with some of the features of each approach. We have access to the actual data values via the special deleted and inserted tables, just like we do with the trigger approach. We can also place the resulting data in a real table or in a table-valued variable, just like we can with the extra-code approach. This new clause is by no means a replacement for the above techniques, but it does offer an interesting alternative in particular situations. Let's see how.

Examine the code in Listing 3. There are several sections of code there. The first section declares a table-valued variable and inserts some sample product information. The next three sections demonstrate how the OUTPUT clause can be used with INSERT, UPDATE, and DELETE statements respectively.

The INSERT statement section first declares a table-valued variable that will hold the information that we want to collect from the upcoming INSERT statement. The INSERT statement begins as normal with the target table and column list. Next, however, we use the new OUTPUT clause. The clause specifies the information that we want to insert into a separate table. Here, I've chosen to store the data in the @InsertDetailsTable variable. (We could have chosen a regular table, a temporary table, or no table at all. When no table is specified, the results are returned to the caller.) Note that some of the data that I'm collecting comes from the inserted table and that the rest comes from T-SQL functions. This way, I'm guaranteed to collect accurate data no matter how complex the data value list may be. After the OUTPUT clause, the regular data value list appears. This can be a VALUES clause or a SELECT query, just like in a regular INSERT statement. Finally, the contents of the product and insert details tables are returned to the caller. The result set is shown in Figure 4. We've inserted a new product and captured the data in a tracking table.

The next section of code performs similar operations with an UPDATE statement. The price of one product in the product table is increased by 10% and the details are captured in the @UpdateDetailsTable variable. Notice that this time the OUTPUT clause pulls data from both the inserted and deleted tables. The results are shown in Figure 5.

The final section of code demonstrates the OUTPUT clause when used with a DELETE statement. You can see the related output in Figure 6. I've deleted a specific product row and recorded that fact in the tracking table.

The PIVOT, UNPIVOT, and OUTPUT clauses provide capabilities that didn't exist in previous versions of T-SQL. In the past developers needing this functionality were forced to create stored procedures or batches that were more complicated than necessary. Now, with SQL Server 2005, they can write much simpler, more manageable, and much smaller code. This code produces the same results as before, but more efficiently. In my opinion, this is one of the major strengths of SQL Server 2005. Developers can do more, and write less code at the same time.

More Stories By Jerry Dixon

Jerry Dixon is a senior developer and architect for ACH Food Companies in Memphis, Tennessee. Over the past 16 years he has led development projects for a number of enterprise, mid-level, and small business organizations. While he has fulfilled multiple roles as an infrastructure designer, database administrator, and software developer, he specializes in XML, SQL and ASP.NET. He is a co-leader and frequent presenter at the Memphis .NET User Group. Jerry holds the following Microsoft certifications: MCSD (VB 6.0 and .NET), MCDBA (SQL 2000), MCSA (Windows 2000 and 2003), MCSE (Windows 2000), MCAD (.NET), MCT. He resides in Olive Branch, MS with his wife and son.

Comments (4)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.