T-SQL Grows Up

Understand the new language features in T-SQL 2005

Have you ever heard the expression "I want it so bad that I can taste it?" Well, that's the way I feel about SQL Server 2005. I've been looking forward to the official release for quite some time. When people ask me why, I usually say "SELECT TOP X." In addition to generating some funny looks, this response gets my point across rather nicely. The next version of SQL Server supports many enhancements to the T-SQL language, and the ability to pass a variable to the TOP clause is just one of them. While changes such as this may seem small and insignificant, they can have a large impact on SQL developers.

This month I will discuss a couple of new T-SQL features, focusing on some of those that will impact the way SQL developers write code. I'll describe the syntax of the new feature, and also point out how the feature might actually be used. In certain cases, I'll throw in some equivalent SQL 2000 code for comparison. The intent of these examples is to show some of the ways that the new T-SQL language will change a developer's coding habits. Keep in mind that there are many changes, both large and small, and that no single article can properly describe them all. You should consult the SQL 2005 Books Online or a future article for more information.

The TOP clause specifies that only the first set of rows of a result set should be returned to the caller. Although it is typically used in SELECT statements, the TOP clause can also be used in INSERT, UPDATE, and DELETE statements. Listing 1 shows the full syntax. If the PERCENT keyword is used, then the expression specified in the TOP clause represents the percentage of the total result set that should be returned. Without this keyword, the expression specified represents an actual row count. The WITH TIES clause, which can be used only when an ORDER BY clause is also used, indicates that additional rows may be returned if there are duplicate values in the columns specified in the ORDER BY clause.

In previous versions of T-SQL, the TOP clause could not accept an expression. The number of rows to return had to be stated explicitly. Sometimes, this could be a problem. The code in Listing 2 demonstrates how to select the first seven rows from the Employee table. The result set can be seen in Figure 1. In many situations, it is not possible to determine ahead of time exactly how many rows should be returned. Because the TOP clause could not accept an expression, developers had to resort to dynamic SQL statements such as the one shown in Listing 3. While dynamic SQL statements are not necessarily bad, using them just so that the TOP clause could return different numbers of rows was not optimal.

Because the TOP clause can now accept an expression, dynamic SQL is no longer necessary. A developer can use code similar to that in Listing 4 to return a variable number of rows. The code in both Listing 3 and Listing 4 returns the same results (Figure 1), but the code in Listing 4 is more efficient and less prone to errors. Wrap the code in a stored procedure, pass in the number of required rows via a parameter, and you can save the dynamic SQL for other uses.

Structured Error Handling
One of the absolute worst aspects of SQL 2000 development is error handling. In a large batch, especially one that implements explicit transactions, this can become pretty complex. The only real option is to check the value of the @@ERROR system function after each and every SQL statement. In many cases there would be more code for error detection and correction than code for the actual procedure logic. In large batches, this could lead to code that is not only hard to read, but difficult to maintain as well.

The new TRY-CATCH construct, shown in Listing 5, addresses this. Any number of SQL statements can be placed between matching BEGIN TRY and END TRY statements. This is followed by matching BEGIN CATCH and END CATCH statements, which contain error-handling code. If an error is generated by the code inside the TRY block, the code in the CATCH block is executed. If no error is generated, the code in the CATCH block is bypassed. As you might expect, TRY-CATCH constructs can be nested. The TRY or the CATCH blocks, or both, can contain additional TRY-CATCH constructs. In addition, there are a few T-SQL functions that work only from within a CATCH block. These functions are summarized in Table 1.

In Listing 6, we see an example taken from the SQL 2005 Books Online. The code attempts to delete a row from the Product table, which belongs to the Production schema. This will not succeed, and instead causes a constraint violation. When this error occurs, the code in the CATCH block will execute, which returns useful error information to the caller. The result set is shown in Figure 2. This programming paradigm is vastly superior to that allowed by previous versions of SQL Server.

Common Table Expressions
A Common Table Expression (CTE) is a temporary, named result set. Once created, a CTE can then be used in subsequent statements in the same SQL batch. This provides two benefits. First, it allows queries to be written in a more clear and concise manner. Second, because CTEs can refer to themselves, they allow the creation of recursive queries. In previous versions of SQL Server, stored procedures could call themselves, but SQL statements could not. CTEs therefore represent a new and powerful addition to the T-SQL language.

To create a CTE, you must use the new WITH statement (see Listing 7 for the syntax). In Listing 8, a simple, nonrecursive CTE has been created. It consists of a single SELECT statement that returns three columns from the Employee table. Because the WITH statement does not specify any column names, the names defined in the SELECT statement will be used. Therefore, this CTE creates a result set that contains three columns from the Employee table, and assigns it the name "Employee." (If a CTE uses the same name as a table, the CTE name takes precedence in subsequent queries.) To use the CTE, all you have to do is refer to it in a query. The SELECT statement below this CTE returns the entire contents of the CTE's result set, which is shown in Figure 3.

As useful as this is, we can do better. In Listing 9, the WITH statement includes names for the columns in the result set. Notice that the number of names matches the number of columns in the embedded SELECT statement. The query also includes a GROUP BY clause and the COUNT aggregate function. The result set, shown in Figure 4, lists the occurrences of Married versus Unmarried employees broken down by Gender. In Listing 10, a single CTE is used twice in the subsequent query. The results of this query, shown in Figure 5, compare each employee's total sales figures to that of their managers. As you can see, these capabilities can be used to make complex SQL batches more readable. You should remember, however, that they typically exhibit comparable performance to queries written without CTEs.

Now comes the good part. Remember that CTEs can refer to themselves, thus allowing recursive SQL statements to be created. This capability is very useful when you store hierarchical data in an SQL table. Retrieving this data is most easily accomplished via a recursive SELECT statement. However, this was not possible in earlier versions of SQL Server. Developers were reduced to creating recursive stored procedures or more complex WHILE loops. Now with CTEs, we can easily use recursion to simply the process.

Suppose that we create a table to store employee information. Each row of the table contains an Employee ID, pertinent employee information, and a Manager ID. (This is actually the case in the SQL Server 2005 AdventureWorks database.) The Manager ID contains the Employee ID of the employee's manager. This hierarchy can be arbitrarily deep. In Listing 11, we see a recursive CTE that retrieves data from this table. The first thing that you will notice is that this CTE contains two SELECT statements joined together with a UNION ALL clause. Although multiple statements can be used in a nonrecursive query, they are required in a recursive one. The first SELECT statement handles the "anchor" or terminating condition. This statement returns those employees at the top of the hierarchy and who therefore have no managers. The second statement handles the recurring condition, returning those employees that do have managers. Notice that this second statement contains an INNER JOIN between the Employee table and the CTE itself. This is what makes the CTE recursive. Figure 6 shows the result set, which lists the manager ID for each employee.

In this article I've shown a few of the new T-SQL features in SQL Server 2005, and how they can change the way that developers write SQL batches and stored procedures. There are, of course, many more for us to discover and use. Some changes allow cleaner code, while others boost performance. Still others give us capabilities that did not exist in previous versions. (I am particularly fond of the new Ranking Functions and the OUTPUT clause, for example.) Nothing beats a few passes through the SQL Server Books Online for a good overview of the new language. I highly recommend that you take some time to read it. Future articles will explore more of these new T-SQL enhancements. However, next month we will visit another of the SQL Server 2005 "biggies": native XML and XQuery support.

More Stories By Jerry Dixon

Jerry Dixon is a senior developer and architect for ACH Food Companies in Memphis, Tennessee. Over the past 16 years he has led development projects for a number of enterprise, mid-level, and small business organizations. While he has fulfilled multiple roles as an infrastructure designer, database administrator, and software developer, he specializes in XML, SQL and ASP.NET. He is a co-leader and frequent presenter at the Memphis .NET User Group. Jerry holds the following Microsoft certifications: MCSD (VB 6.0 and .NET), MCDBA (SQL 2000), MCSA (Windows 2000 and 2003), MCSE (Windows 2000), MCAD (.NET), MCT. He resides in Olive Branch, MS with his wife and son.

Comments (1)

Share your thoughts on this story.

Add your comment
You must be signed in to add a comment. Sign-in | Register

In accordance with our Comment Policy, we encourage comments that are on topic, relevant and to-the-point. We will remove comments that include profanity, personal attacks, racial slurs, threats of violence, or other inappropriate material that violates our Terms and Conditions, and will block users who make repeated violations. We ask all readers to expect diversity of opinion and to treat one another with dignity and respect.