Database: 2006

11 December 2006

Legacy OUTER JOINs (=, =)

Maybe you've seen the *= and =* LEFT and RIGHT OUTER JOIN operators floating around third party data access code or even (gasp!) Microsoft system generated code. I've never used them but, I have seen them lurking around, on occasion. Although these operators have been valid and supported in SQL Server, they are going away for SQL 2005.

For backwards compatibility, they will still work in databases with a compatibility level = 80 but, for databases with compatibility level = 90, they no longer work. Not that this is very exciting but, to play around with these JOIN operators, we'll need a couple tables:

CREATE TABLE tbl_tst1 (
tst1_pk int NOT NULL identity(1, 1),
tst1_data varchar(50));

CREATE TABLE tbl_tst2 (
tst2_pk int not null identity(1, 1),
tst1_pk int,
tst2_data varchar(50));

We'll also need to load them up with a bit of data:

INSERT tbl_tst1 VALUES ('tst1_data01');
INSERT tbl_tst1 VALUES ('tst1_data02');
INSERT tbl_tst1 VALUES ('tst1_data03');
INSERT tbl_tst1 VALUES ('tst1_data04');
INSERT tbl_tst1 VALUES ('tst1_data05');

INSERT tbl_tst2 VALUES (1, 'tst2_data01');
INSERT tbl_tst2 VALUES (1, 'tst2_data05');
INSERT tbl_tst2 VALUES (2, 'tst2_data01');
INSERT tbl_tst2 VALUES (2, 'tst2_data05');

Now, that we have a bit of useless data, we'll want to select all rows from tbl_tst1 and, any rows from tbl_tst2 that have tst2_data = 'tst2_data05' AND that match our rows from tbl_tst1:

SELECT * FROM tbl_tst1 AS one, tbl_tst2 AS two
WHERE one.tst1_pk *= two.tst1_pk
AND two.tst2_data = 'tst2_data05';

Assuming your compatibility level = 90, the first thing you'll probably notice when you run this query is that you get an error:

Msg 4147, Level 15, State 1, Line 28

The query uses non-ANSI outer join operators ("*=" or "=*"). To run this query without modification, please set the compatibility level for current database to 80 or lower, using stored procedure sp_dbcmptlevel. It is strongly recommended to rewrite the query using ANSI outer join operators (LEFT OUTER JOIN, RIGHT OUTER JOIN). In the future versions of SQL Server, non-ANSI join operators will not be supported even in backward-compatibility modes.

Since that is highly inconvenient for our sample code (and for that 3^rd party monitoring application you just installed), we'll need to put our database into compatibility level 80:

EXEC dbo.sp_dbcmptlevel @dbname=N'db_name', @new_cmptlevel=80;

Now, we get the five rows from tst_tbl1 we were expecting. I was thinking of beating this dead horse with a =* RIGHT OUTER JOIN but, we all get the point. Let's get rid of the carcass:

DROP TABLE tbl_tst1;
DROP TABLE tbl_tst2;

And, since we don't wan't our 3^rd party monitoring tool cluttering up our shiny new SQL Server 2005 database with legacy JOINs, let's shut the door on them:

EXEC dbo.sp_dbcmptlevel @dbname=N'db_name', @new_cmptlevel=90;

Since these OUTER JOIN operators have been deprecated, you can count on them to not work in some future version of SQL Server, regardless of whether you have set your compatibility level below 90. Now, if we could only get rid of the legacy INNER JOIN syntax...

07 December 2006

Database Mail

I have a confession to make. Beyond sending a couple of test emails, I've never used xp_sendmail to send email from my database servers. Because I was never comfortable with the idea of a MAPI client on the server, and hated the bother of setting it up, I've always used sp_OA* to send email via CDO (hold the derision, please).

The new mail procedure, sp_send_dbmail, however, works a bit differently. Although we no longer need a MAPI profile on the server, we still have a bit of setup to do. Database mail is disabled by default. The first thing we'll need to do is to enable it. Since it's an advanced option, we'll need to turn on advanced options, first:

EXEC sp_configure 'show advanced options', '1';
RECONFIGURE WITH OVERRIDE;

Even though sp_send_dbmail isn't an extended procedure like its predecessor, it still uses extended procedures to handle the mail. This is a bit of a case of “do as we say, not as we do,” since extended procedures are deprecated and Microsoft says we should no longer use them for new development (email might make a good CLR procedure candidate, assuming you don't have to run UNSAFE to access the email server). At any rate, let's enable mail extended procedures:

EXEC sp_configure 'Database Mail XPs', '1';
RECONFIGURE WITH OVERRIDE;

Despite being liberated from the tyranny of the MAPI profile, we still need a mail profile in the database. Even though we have to create a profile, it doesn't require any extra software; it's all in the database and is a convenient way to specify our email account settings as the sender of a message. Books-Online was kind enough to provide a code sample:

EXEC msdb.dbo.sysmail_add_profile_sp
@profile_name = 'Database Mail Profile',
@description = 'Profile with which to send database mail';

Of course, with any legitimate email message, you need an mail account from which to send it. We get to specify several properties for our account. The account name, email address, and email server are some of the more pertinent properties. Don't forget to set your mail server to allow your database server to relay mail:

EXEC msdb.dbo.sysmail_add_account_sp
@account_name = 'Database Mail Account',
@description = 'EMail account for sending database mail',
@email_address = 'mymail@domain',
@replyto_address = 'myemail@domain',
@display_name = 'Inbox Assassin',
@mailserver_name = 'smtp.mymailserver.domain';

Now it's getting a bit tedious. Just one more step and we can actually send an email message. I mentioned before that the profile was a convenient way to specify our email setting for a message. This next procedure associates an email account with a profile:

EXECUTE msdb.dbo.sysmail_add_profileaccount_sp
@profile_name = 'Database Mail Profile',
@account_name = 'Database Mail Account',
@sequence_number = 1;

Assuming we specified a valid email account and smtp server, we should be all set to go. All we have to do to send an email now is to provide our profile, a recipient, a subject, and a message:

EXEC msdb.dbo.sp_send_dbmail
@profile_name = 'Database Mail Profile'
, @recipients = 'myemail@domain'
, @subject = 'My first database mail'
, @body = 'Look, ma, No hands!';

Okay... that was fun. Now we have database mail set up on our server, what's next? Well, we could send out thousands of email messages to our close personal friends whose email addresses we store on the corporate database server, but, let's cleanup, instead! To remove our database mail settings, we pretty much just do our setup steps in revers. First we need to disassociate our email account from our profile:

EXEC msdb.dbo.sysmail_delete_profileaccount_sp
@profile_name = 'Database Mail Profile',
@account_name = 'Database Mail Account';

Now, we'll get rid of the email account:

EXEC msdb.dbo.sysmail_delete_account_sp
@account_name = 'Database Mail Account';

and get rid of our profile:

EXEC msdb.dbo.sysmail_delete_profile_sp
@profile_name = 'Database Mail Profile';

Finally, since we want to leave our server just the way we found it, we'll disable the mail extended procedures and turn off advanced options:

EXEC sp_configure 'Database Mail XPs', '0';
RECONFIGURE WITH OVERRIDE;

EXEC sp_configure 'show advanced options', '0';
RECONFIGURE WITH OVERRIDE;

With a relatively straightforward setup, and no messing with MAPI or sp_OA*, I like database mail enough that I'll be using it for all of my database email messaging needs (all two of them); despite its reliance on extended procedures.

13 November 2006

COLLATE

Well, not strictly SQL 2005 related but, as some of our developers have recently noticed, our database servers are set to be case insensitive. SQL Server, by default, is case insensitive. For example, SELECT * FROM myTable WHERE column = 'six' gives the same results as SELECT * FROM myTable WHERE column = 'SiX'.

So, how do we do a case sensitive SELECT statement for something like a password comparison, when you don't want 'six' = 'SiX'?

SQL Server provides a COLLATE clause to do this. You may have seen this clause when generating creation scripts for tables. Here is a quick example:

CREATE TABLE [dbo].[cd_LandCategory] (
[LandCategory_SV] [int] NOT NULL ,
[Desc_TX] [varchar] (50) COLLATE SQL_Latin1_General_CP1_CI_AS NOT NULL ,
etc...

The second to last character group in the collation specifies the case sensitivity (CI = Case Insensitive).

It turns out you can also include the COLLATE clause in a query to specify the collation you want your query to use at runtime, completely independent of the DDL used to create the table and store the data (this is similar in function to the CAST function). Using the above table snippet, you may want to write a case sensitive query on the [Desc_TX] column:

SELECT [Desc_TX]
FROM [dbo].[cd_LandCategory]
WHERE [Desc_TX] = 'Old' COLLATE SQL_Latin1_General_CP1_CS_AS

I have changed the CI character group in the collation to CS, providing a case sensitive comparison. One thing to note is that, if I am comparing two columns, I will need to use COLLATE on both sides of the comparison or, I could get a collation error or other unexpected results:

SELECT [Desc_TX]
FROM [dbo].[cd_LandCategory]
WHERE [Desc_TX] COLLATE SQL_Latin1_General_CP1_CS_AS = [SomeOtherColumn_TX] COLLATE SQL_Latin1_General_CP1_CS_AS

The COLLATE clause seems to work the same on both SQL 2000 and SQL 2005.

A readily apparent use for this is comparing passwords in a case-sensitive manner (assuming you didn't create your password column with a case sensitive collation to start with; a common oversight).

Below is a quick code sample you can run on your favorite development server to see how collation, and overriding it with the COLLATE clause, works. It covers case sensitive and insensitive column collations with no COLLATE clause in the query. It also demonstrates overriding the column's collation using the COLLATE clause.

/******** COLLATION EXAMPLE 1.
Case-insensitive column and no COLLATE in where clause ************/

CREATE TABLE #tmp (pwd varchar(100) COLLATE SQL_Latin1_General_CP1_CI_AS not null)
INSERT #tmp values ('@ssWord')
INSERT #tmp values ('@ssword')
INSERT #tmp values ('bit3mE')
INSERT #tmp values ('bit3me')

SELECT * FROM #tmp
WHERE pwd = 'bit3me'

DROP TABLE #tmp
GO

/******** COLLATION EXAMPLE 2.
Case-sensitive column and no COLLATE in where clause ************/

CREATE TABLE #tmp (pwd varchar(100) COLLATE SQL_Latin1_General_CP1_CS_AS not null)
INSERT #tmp values ('@ssWord')
INSERT #tmp values ('@ssword')
INSERT #tmp values ('bit3mE')
INSERT #tmp values ('bit3me')

SELECT * FROM #tmp
WHERE pwd = 'bit3me'

DROP TABLE #tmp
GO

/******** COLLATION EXAMPLE 3.
Case-insensitive column and case-sensitive COLLATE in where clause ************/

SELECT * FROM #tmp
WHERE pwd = 'bit3me' COLLATE SQL_Latin1_General_CP1_CS_AS

DROP TABLE #tmp
GO

/******** COLLATION EXAMPLE 4.
Case-sensitive column and case-INsensitive COLLATE in where clause ************/

SELECT * FROM #tmp
WHERE pwd = 'bit3me' COLLATE SQL_Latin1_General_CP1_CI_AS

DROP TABLE #tmp
GO

17 February 2006

Crosstab / Pivot queries

I heard that SQL Server 2005 had this really cool new feature: PIVOT

Since I just finished working with one of our lead developers (name omitted to protect the innocent) on a SQL Server 2000 pivot query, I thought I’d take the chance to implement the query using SQL Server 2005’s new PIVOT feature.

First, we need to convert some of our rows into columns (21 in this case). With SQL 2000, we did this using a CASE statement to generate a count for each row that was to be converted to a column.

(SELECT
'AA' = Count( CASE Action_ID WHEN 'AA' THEN ap_id END),
'AS' = Count( CASE Action_ID WHEN 'AS' THEN ap_id END),
...
'RZ' = Count( CASE Action_ID WHEN 'RZ' THEN ap_id END),
'XX' = Count( CASE Action_ID WHEN 'XX' THEN ap_id END)
) AS pivotTable

With SQL 2005, we still need to convert the rows to columns. However, the syntax is somewhat different. Rather than performing a CASE and count on each row, we get to specify the data values we are looking for and it will count them all for us.

PIVOT
(
count(action_id)
FOR Action_ID IN(
[AA], [AS], [CA], [CB], [CE], [CF], [CP], [CW], [CX], [CY], [CZ], [EX], [IC], [KC], [KT], [KV], [LR], [OE], [PR], [RZ], [XX] )
) AS pivotTable

Now that our basic pivot is put together (notice that both queries have an alias of “pivotTable”), we want to count how many occurrences of each unique row we have. We’ll start by selecting ALL of our converted columns and our count. This portion of the query looks the same in both versions of SQL Server.

SELECT [AA], [AS], [CA], [CB], [CE], [CF], [CP], [CW], [CX], [CY], [CZ], [EX], [IC], [KC], [KT], [KV], [LR], [OE], [PR], [RZ], [XX], count(*) AS Frequency

Since we’re doing a count, and we want distinct values, we also have to group by all of our columns. This is another portion of our query that will look the same in SQL 2000 and 2005. We’ll also sort them so we get our most frequently occurring row counts at the top.

GROUP BY [AA], [AS], [CA], [CB], [CE], [CF], [CP], [CW], [CX], [CY], [CZ], [EX], [IC], [KC], [KT], [KV], [LR], [OE], [PR], [RZ], [XX]
ORDER BY Frequency DESC

Now that we have all the pieces of our query put together, let’s tie them up into something useful. In SQL Server 2000, we just need to select our newly created columns from the sub-query containing all the CASE count statements (I’ll add in the base table name and the remainder of our 21 columns, at this point).

SELECT
AA, [AS], CA, CB, CE, CF, CP, CW, CX, CY, CZ, EX, IC, KC, KT, KV, LR, OE, PR, RZ, XX
, count(*) as Frequency
FROM
(
SELECT
'AA' = Count( CASE Action_ID WHEN 'AA' THEN ap_id END),
'AS' = Count( CASE Action_ID WHEN 'AS' THEN ap_id END),
'CA' = Count( CASE Action_ID WHEN 'CA' THEN ap_id END),
'CB' = Count( CASE Action_ID WHEN 'CB' THEN ap_id END),
'CE' = Count( CASE Action_ID WHEN 'CE' THEN ap_id END),
'CF' = Count( CASE Action_ID WHEN 'CF' THEN ap_id END),
'CP' = Count( CASE Action_ID WHEN 'CP' THEN ap_id END),
'CW' = Count( CASE Action_ID WHEN 'CW' THEN ap_id END),
'CX' = Count( CASE Action_ID WHEN 'CX' THEN ap_id END),
'CY' = Count( CASE Action_ID WHEN 'CY' THEN ap_id END),
'CZ' = Count( CASE Action_ID WHEN 'CZ' THEN ap_id END),
'EX' = Count( CASE Action_ID WHEN 'EX' THEN ap_id END),
'IC' = Count( CASE Action_ID WHEN 'IC' THEN ap_id END),
'KC' = Count( CASE Action_ID WHEN 'KC' THEN ap_id END),
'KT' = Count( CASE Action_ID WHEN 'KT' THEN ap_id END),
'KV' = Count( CASE Action_ID WHEN 'KV' THEN ap_id END),
'LR' = Count( CASE Action_ID WHEN 'LR' THEN ap_id END),
'OE' = Count( CASE Action_ID WHEN 'OE' THEN ap_id END),
'PR' = Count( CASE Action_ID WHEN 'PR' THEN ap_id END),
'RZ' = Count( CASE Action_ID WHEN 'RZ' THEN ap_id END),
'XX' = Count( CASE Action_ID WHEN 'XX' THEN ap_id END)
FROM
dbo.los_LoanServicingActions
GROUP BY
ap_id
) AS pivotTable

GROUP BY
AA, [AS], CA, CB, CE, CF, CP, CW, CX, CY, CZ, EX, IC, KC, KT, KV, LR, OE, PR, RZ, XX
ORDER BY Frequency DESC

We’ll also add the base table to our SQL 2005 query. Rather than going inside the pivot statement, however, it will be placed inside a sub-query right before the PIVOT clause.

SELECT [AA], [AS], [CA], [CB], [CE], [CF], [CP], [CW], [CX], [CY], [CZ], [EX], [IC], [KC], [KT], [KV], [LR], [OE], [PR], [RZ], [XX], count(*) AS Frequency
FROM(SELECT Action_ID, ap_id FROM los_loanServicingActions) AS p

PIVOT
(
count(action_id)
FOR Action_ID IN(
[AA], [AS], [CA], [CB], [CE], [CF], [CP], [CW], [CX], [CY], [CZ], [EX], [IC], [KC], [KT], [KV], [LR], [OE], [PR], [RZ], [XX] )
) AS pivotTable
GROUP BY [AA], [AS], [CA], [CB], [CE], [CF], [CP], [CW], [CX], [CY], [CZ], [EX], [IC], [KC], [KT], [KV], [LR], [OE], [PR], [RZ], [XX]
ORDER BY Frequency DESC

As you can see, our final queries have the same opening and closing clauses. Be alleviating the CASE count line for each column, SQL 2005 allows us to express our pivot much more concisely. The one awkward thing about the 2005 pivot is enclosing the actual data values inside the IN clause in square brackets as if they were object names.

Well, form and brevity are all very nice but, how do they perform? It turns out, oddly enough (based on all the SQL Server 2005 release hype about performance), that they perform equally well (I pinned the source table in memory to remove disk IO as a factor).

SQL Server 2005:
(143 row(s) affected)
0.280000000 seconds

SQL Server 2000:
(143 row(s) affected)
0.280000000 seconds

So, which style should I use? Performance-wise, it’s a wash. It seems that the new PIVOT query may win in ease of use and reduced typing. One caveat to this is that the database must have the compatibility level set to 90 in order to use the new syntax. If you are developing a query that may have to be run against a database with a compatibility level lower than 90, or on SQL Server 2000 or older, you must continue to use the CASE count syntax.

08 February 2006

SSMS Source Safe integration

Bit of a publishing break but, hey, what's eleven months in the grand scheme of things? I'm now playing with several SQL Server 2005 RTM instances in DEV and one starring in a TEST role until new hardware is in place. We're getting ready to run all of the SQL 2K5 DEV/TEST on a single cluster instead of several separate machines.

At any rate, my current project is still being developed with SQL2K. However, I've begun using SSMS to develop all the non-DTS portions. The source safe integration is VERY NICE. No more check-out/check-in from VSS while editing in Query Analyzer and building the database from the command prompt. I've added my DOS commands for building the database into the SSMS tools menu and I can manage source control, development, and build all from a single interface.

Although the IDE isn't as mature (read "flexible") as VS2005, it looks like it will meet my DB needs quite well.

11 December 2006

Legacy OUTER JOINs (*=, =*)

07 December 2006

Database Mail

13 November 2006

COLLATE

17 February 2006

Crosstab / Pivot queries

08 February 2006

SSMS Source Safe integration

Legacy OUTER JOINs (=, =)