10 September 2010

Alter Ego

So the other day I'm doing some light reading from Paul Randal about DBCC CHECKDB when I learn that, beginning with SQL 2005, CHECKDB actually performs its checks on an internal database snapshot and not the "real" database.  That was a totally unexpected bit of knowledge, so it really caught my eye.  It's great to have somebody that loves to share these things with the world.  Thanks, Paul!

A little more reading revealed that the internal database snapshot is implemented as an alternate stream of the live database file. I also learned that I have no control over where it's created, so it can fill up a nearly full drive during a CHECKDB (the problem that prompted his article in the first place).
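If the hidden snapshot's location is a concern, one commonly suggested workaround is to create your own database snapshot on a drive with room to spare and run CHECKDB against that instead. The following is only a sketch under a few assumptions: the snapshot name and the D:\Snapshots\ path are hypothetical, the logical file name AdventureWorks_Data is assumed (check sys.database_files), the database is assumed to have a single data file, and user-created snapshots require an edition of SQL Server that supports them.

```sql
-- Hypothetical snapshot name and path; pick a drive with free space.
-- NAME must match the logical name of the source data file (assumed here).
CREATE DATABASE AdventureWorks_Check
ON ( NAME = AdventureWorks_Data,
     FILENAME = 'D:\Snapshots\AdventureWorks_Check.ss' )
AS SNAPSHOT OF AdventureWorks;
GO

-- A database snapshot is read-only, so CHECKDB should check it directly
-- rather than creating another internal snapshot.
DBCC CHECKDB ('AdventureWorks_Check') WITH NO_INFOMSGS;
GO

-- Clean up when the check is finished.
DROP DATABASE AdventureWorks_Check;
GO
```

The snapshot file is sparse, so it only grows as pages change in the source database while the snapshot exists.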

I had never heard of an alternate stream before, but I did find a couple of pretty good explanations. I won't cover them here, other than to mention that alternate streams are a feature of NTFS that allows multiple data streams to be stored within a single file (the second link has downloadable source code for utilities that manipulate alternate data streams, if you really want to get down in the weeds).

I wanted to see one of these alternate streams in action myself, and discovered that you can view a file's alternate streams by using the /R switch of the DIR command.  Here is DBCC CHECKDB in action, complete with a peek at the alternate stream in the file system.

I'm using the following command to provide directory listings for the AdventureWorks database:

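-- xp_cmdshell is disabled by default on SQL Server 2005 and later; enable it with sp_configure if needed
declare @cmd varchar(8000)
-- /R makes DIR list each file's alternate data streams
set @cmd = 'dir /R "C:\AdventureWorks_Data*"'
exec master..xp_cmdshell @cmd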

This shows the AdventureWorks_Data.mdf file:

Now run dbcc checkdb('AdventureWorks') and, while that is still running, kick off the DIR command from a separate SSMS window.  This shows the AdventureWorks_Data.mdf file and the alternate stream that is used for the internal database snapshot:

Notice that the DIR command only reported a single file in the summary.  The alternate data stream is completely invisible to most DOS-style commands, and you'll never see one in Windows Explorer.

I didn't really expect to do so much digging after a little reading about DBCC CHECKDB, but it turns out there was a lot to learn, and I couldn't resist diving into the mysterious internal database snapshot and its alternate data stream.


07 September 2010

Feeling the Love

DBAs @ Midnight just kicked off its second season. If, like me, you missed the live season opener, you'll have to settle for the preview until the recording is available.

In the meantime, they are forming a zombie gang and soliciting public displays of affection. I don't know if I'd make a good zombie gangster but I sure do enjoy the show, so here's my PDA:

Have a great second season, Sean and Jen!

06 September 2010

That's Virtually Twice as Fast

We do much of our software testing and validation on virtual machines. We happen to be using VMware ESX as the host and are testing with Windows XP SP3 as the guest.

We were getting pretty poor launch performance from our application on the virtual machines as compared to a physical machine. I'll measure launch time in Task Manager Grid Squares (TMGS). Think of them like story points for performance monitoring. They just represent some arbitrary amount of time (what is it... 9 seconds per square?).

We have a main application, a supporting GIS application and a 3rd application for moving things around the network, and it was taking as many as 10 Task Manager grid squares for the three applications to launch. Here is what the launch sequence looks like in Task Manager:

Here is the key for the image:
1. Main application launch
2. GIS application launch -- almost 1 minute of 100% CPU -- OUCH!
3. Network search
4. Opening the first search result

To try to help out with the launch speed, we added a virtual processor and a virtual GB of RAM to the virtual machines. This reduced our launch time from 10 Task Manager grid squares to just over 6.

You can see that opening a search result wasn't really helped out because it is rather network intensive. Main application launch is a mix of network and processor, and didn't fare much better with two processors than it did before the upgrade.

The real benefit we saw was the large gain in both the launch of the GIS application and the processing of search results, both of which are processor intensive. We see the times here cut by more than half.

03 September 2010

Chew Once, Swallow Twice

I've been working on a legacy system that's been sitting on the shelf for about a year. Most of the system is in VB.Net, so I've been writing a fair amount of "VB#", as habit dictates that I end each line with a semicolon -- it works in T-SQL and C#.

A workmate did find an import/export routine written in C#, and it had a great catch statement I thought I'd share. First it catches any error and swallows it, a la ON ERROR RESUME NEXT. But, just for good measure, it also checks the error message so it can execute... a comment!

catch (Exception ex) {
    if (ex.Message.Contains("Violation of PRIMARY KEY")) { /* only a comment here; its text was not preserved */ }
}


I guess any data integrity error worth suppressing is worth over-suppressing. I've probably done the same type of thing a dozen times, so it's nice to have a little fun with somebody else's forgotten bit of error control.

01 September 2010

What's in a name?

The software team I've been working with had a “string or binary data would be truncated” error where we were trying to copy a 100 character “buyer name” column from one table into a 50 character column in another table. We also had a "seller name" column with the potential for the same error and fixed that one up, too.
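For anyone who hasn't hit this error before, here is a minimal sketch that reproduces it. The table variables, the column name, and the LEFT() guard are assumptions made up for illustration; they are not the actual tables involved, and not necessarily the fix the team chose.

```sql
-- Hypothetical stand-ins for the real source and target tables.
DECLARE @Source TABLE (BuyerName varchar(100));
DECLARE @Target TABLE (BuyerName varchar(50));

-- One 60-character name is enough to trigger the problem.
INSERT INTO @Source (BuyerName) VALUES (REPLICATE('x', 60));

-- Fails with "String or binary data would be truncated"
-- (Msg 8152 on SQL Server versions of this era).
INSERT INTO @Target (BuyerName)
SELECT BuyerName FROM @Source;

-- One possible guard: trim explicitly to the target width.
-- Widening the target column is the other obvious option.
INSERT INTO @Target (BuyerName)
SELECT LEFT(BuyerName, 50) FROM @Source;
```

Trimming silently discards data, so it only makes sense when losing the tail of a long name is acceptable.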

I took a quick peek at the data to see how the name lengths were distributed in each table. Both names had an average length in the mid- to upper-teens. Virtually none of the names were longer than 50 characters (probably why the bug went unnoticed for years).
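A quick peek like this can be done with a couple of aggregate queries. This is only a sketch: dbo.Sales and BuyerName are hypothetical names, not the real schema.

```sql
-- Hypothetical table and column names, for illustration only.
-- Count of names at each length, for charting the distribution.
SELECT  LEN(BuyerName) AS NameLength,
        COUNT(*)       AS NameCount
FROM    dbo.Sales
WHERE   BuyerName IS NOT NULL
GROUP BY LEN(BuyerName)
ORDER BY NameLength;

-- Summary: average and maximum length, plus how many names exceed 50 characters.
-- The CAST avoids integer averaging.
SELECT  AVG(CAST(LEN(BuyerName) AS decimal(9, 2)))           AS AvgLength,
        MAX(LEN(BuyerName))                                  AS MaxLength,
        SUM(CASE WHEN LEN(BuyerName) > 50 THEN 1 ELSE 0 END) AS LongerThan50
FROM    dbo.Sales
WHERE   BuyerName IS NOT NULL;
```

Note that LEN() ignores trailing spaces, so padded values will look shorter than the space they actually occupy.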

Here are a couple visuals of the distribution:

Since this was a reasonably large sample (more than 100,000 names in each column), it probably makes a good guideline for full name column sizing.

As is usual with software defects, the standard “we inherited it” rule applies in this case. That brings up a quick, unrelated observation: I rarely meet developers who claim to write bugs, just folks who inherit them. I'll have to write more on that later.

At any rate, it seems that bug writers like myself are so prolific that it doesn't take very many of us to keep all the real developers busy.