Wednesday, November 9, 2011

The Short-Curcuit WHERE Conditional

I challenge myself often to optimize all of my code when possible. One of my challenges was to write stored procedures that execute as part of "ad-hoc" filtering required by applications. For example, say the user CAN filter a grid by 0-6 different fields, such as first name, last name, phone number, zip code, age, and marital status. A stored procedure could execute a select statement that included all fields as parameters:

CREATE PROCEDURE ReadPerson
    @FirstName varchar,
    @LastName varchar,
    @PhoneNumber varchar,
    @ZipCode varchar,
    @Age int,
    @IsMarried bit
    AS
    BEGIN
        SELECT *
        FROM Person p
        WHERE p.FirstName = @FirstName
        AND   p.LastName = @LastName
        AND   p.PhoneNumber = @PhoneNumber
        AND   p.ZipCode = @ZipCode
        AND   p.Age = @Age
        AND   p.IsMarried = @IsMarried
    END
    GO
Which could be executed like this:
Exec ReadPerson 'John', 'Doe', '8005551234', '90210', 50, 1
But now, let's say the user doesn't know the phone number. We now have a dilema. Should we write another stored procedure that eliminates the phone number from the WHERE clause. Not a good idea, since we will probably have to write more stored procedures that provide ALL of the combinations of the filter. We could re-write this stored procedure to interrogate each parameter and "build" a SELECT statement with Dynamic SQL. Not a good idea either, since Dynamic SQL is (1) hard to read, and (2) cannot validate or intellisense table names or columns within the single quotes (''). There are other solutions that are equally bad, but there is one that works just great!

Enter Short-Circuit evaluation! T-SQL (SQL 2008 and above) now perfoms short-circuit evaluations of expressions. But what is "Short-Circuiting"? A brief explanation.

Most high-level programming languages are able to evaluate boolean expressions using an optimization called short-circuiting, which can stop evaluating an expression as soon as the result can be determined.

For example, an "AND" operator can stop evaluating a combined expression as soon as it finds the first expression that evaluates to false.
IF (1 = 1 AND 1 = 5 AND 5 = 5)
In this evaluation, the order of the combined expression is key. The result of the combined expression will obviously be false. However, that last expression { 5 = 5 } is NEVER evaluated! The reason is that an AND operator requires all expressions to evaluate to true. Since the second expression { 1 = 5 } evaluates to false, the combined expression can immediately evalutate to false.

The same short-circuit evaluation occurs with an "OR" operator, that stops evaluating a combined expression as soon as it finds the first expression that evaluates to true.
IF (1 = 1 OR 1 = 5 OR 5 = 5)
Again, the order of the combined expression is key. The result of the combined expression will obviously be true. However, that second and third expressions { 1 = 5 OR 5 = 5 } are NEVER evaluated! The reason is that an OR operator requires only one expressions to evaluate to true. Since the first expression { 1 = 1 } evaluates to true, the combined expression can immediately evalutate to true.

Since Short-Circuit evaluation is now available to us, we can tackle our challenge with a VERY optimized stored procedure. Let's take a look at a re-write of the above stored procedure that uses Short-Circuit evaluation.
CREATE PROCEDURE ReadPerson
    @FirstName varchar,
    @LastName varchar,
    @PhoneNumber varchar,
    @ZipCode varchar,
    @Age int,
    @IsMarried bit
    AS
    BEGIN
        SELECT *
        FROM Person p
        WHERE ((@FirstName = NULL) OR (p.FirstName = @FirstName))
        AND   ((@LastName = NULL) OR (p.LastName = @LastName))
        AND   ((@PhoneNumber = NULL) OR (p.PhoneNumber = @PhoneNumber))
        AND   ((@ZipCode = NULL) OR (p.ZipCode = @ZipCode))
        AND   ((@Age = NULL) OR (p.Age = @Age))
        AND   ((@IsMarried = NULL) OR (p.IsMarried = @IsMarried))
    END
    GO
Which could be executed the same as before:
Exec ReadPerson 'John', 'Doe', '8005551234', '90210', 50, 1
But now adds all different kinds of combinations
-- Returns any record with FirstName is John
    Exec ReadPerson 'John', NULL, NULL, NULL, NULL, NULL
    
    -- Returns any record with FirstName is John and age 50
    Exec ReadPerson 'John', NULL, NULL, NULL, 50, NULL

    -- Returns any record with LastName is Doe and ZipCode is 90210
    Exec ReadPerson NULL, 'Doe', NULL, '90201', NULL, NULL

    
Let's look at why. Check out the first condition in the WHERE clause
WHERE ((@FirstName = NULL) OR (p.FirstName = @FirstName))
Because it is an OR condition, if the first expression (@FirstName = NULL), then the rest of the combined expression (p.FirstName = @FirstName) won't be evaluated. If FirstName is NULL, the results are NOT Filtered by FirstName. However, if the first expression is false (such as when @FirstName is 'John'), then the rest of the combined expression will still be evaulated. Then, the results WILL BE Filtered by all records where FirstName field is 'John'. Since this combined expression is all encapsulated in it's own parenthesis, it is evaluated completely separate from all of the Other conbined expressions that are separated by AND.

Cool, huh???

No comments:

Post a Comment