13 November 2009

Finding Something Somewhere in a Database

I threw this ugly query together (at the bottom of this post) a few years ago (hence the ancient system table names) and recently stumbled across it in an old email I sent to myself. I figured I would post it here, since it’s a handy way to find a string in a database when you have no idea where to begin looking.

I wrote it to find a string within a database I was unfamiliar with. You provide a search term to the script -- @searchString -- such as ‘%acre%’ (use the standard LIKE wildcards) in the example below, and it searches all of the char, varchar, nchar, and nvarchar columns in all of the user tables for your term. For my initial needs, all of the tables were in the dbo schema, so the script doesn’t handle table owner/schema. You can tweak it to your needs on that front.

The result set gives you your initial search term, how many instances were found in each table, and the query used to generate the count for that table, so you have a good starting place for further analysis.

DECLARE @searchString nvarchar(100)
SET @searchString = '%acre%'

SET NOCOUNT ON

CREATE TABLE ##search_results (result_table sysname, search_term nvarchar(100), result_count bigint)
CREATE TABLE ##search_queries (result_table sysname, query_text nvarchar(MAX))

DECLARE @tbl sysname

-- start with the last user table (alphabetically) that has a string column
SET @tbl = (SELECT TOP 1 so.name
            FROM sysobjects AS so
            INNER JOIN syscolumns AS sc ON so.id = sc.id
            INNER JOIN systypes AS st ON sc.xtype = st.xtype
            WHERE so.type = 'u'
              AND st.name IN ('char', 'varchar', 'nchar', 'nvarchar')
            ORDER BY so.name DESC)

DECLARE @sql nvarchar(MAX), @where nvarchar(MAX)

WHILE @tbl IS NOT NULL
BEGIN
    -- build a WHERE clause covering every string column in the current table
    SET @where = ''
    SELECT @where = @where + ' OR ' + QUOTENAME(sc.name) + ' LIKE ''' + @searchString + ''''
    FROM sysobjects AS so
    INNER JOIN syscolumns AS sc ON so.id = sc.id
    INNER JOIN systypes AS st ON sc.xtype = st.xtype
    WHERE so.type = 'u'
      AND st.name IN ('char', 'varchar', 'nchar', 'nvarchar')
      AND so.name = @tbl

    -- get rid of the initial ' OR '
    SET @where = SUBSTRING(@where, 4, LEN(@where))

    SET @sql = 'SELECT ' + QUOTENAME(@tbl, '''') + ' AS resultsTable, ' + QUOTENAME(@searchString, '''') + ' AS searchTerm, count(*) AS resultsCount FROM ' + QUOTENAME(@tbl) + ' WHERE '

    INSERT ##search_queries (result_table, query_text) VALUES (@tbl, @sql + @where)

    SET @sql = 'INSERT ##search_results ' + @sql + @where
    EXEC sp_executesql @statement = @sql

    -- move on to the next table, walking the names in descending order
    SET @tbl = (SELECT TOP 1 so.name
                FROM sysobjects AS so
                INNER JOIN syscolumns AS sc ON so.id = sc.id
                INNER JOIN systypes AS st ON sc.xtype = st.xtype
                WHERE so.type = 'u'
                  AND st.name IN ('char', 'varchar', 'nchar', 'nvarchar')
                  AND so.name < @tbl
                ORDER BY so.name DESC)
END

SELECT sr.search_term, sr.result_count, sr.result_table, sq.query_text
FROM ##search_results AS sr
JOIN ##search_queries AS sq ON sr.result_table = sq.result_table
ORDER BY sr.result_count DESC, sr.result_table DESC

DROP TABLE ##search_results
DROP TABLE ##search_queries
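
To give a sense of what ends up in ##search_queries, a generated per-table query looks roughly like this (the table and column names here are hypothetical):

-- Hypothetical table and columns, for illustration only.
SELECT 'Parcels' AS resultsTable, '%acre%' AS searchTerm, count(*) AS resultsCount
FROM [Parcels]
WHERE [LegalDescription] LIKE '%acre%' OR [Notes] LIKE '%acre%'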

Searching Cached Query Plans

A quick way to search cached query plans (for missing indexes, table scans, etc.)

-- EXEC dbo.dba_SearchCachedPlans '%MissingIndexes%'
-- EXEC dbo.dba_SearchCachedPlans '%ColumnsWithNoStatistics%'
-- EXEC dbo.dba_SearchCachedPlans '%TableScan%'
-- EXEC dbo.dba_SearchCachedPlans '%CREATE PROC%MessageWrite%'

DECLARE @StringToSearchFor VARCHAR(100)
SET @StringToSearchFor = '%%'

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

SELECT TOP 20
      st.text AS [SQL]
    , cp.cacheobjtype
    , cp.objtype
    , DB_NAME(st.dbid) AS [DatabaseName]
    , cp.usecounts AS [Plan usage]
    , qp.query_plan
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) qp
WHERE CAST(qp.query_plan AS NVARCHAR(MAX)) LIKE @StringToSearchFor
ORDER BY cp.usecounts DESC

Found on http://www.sqlservercentral.com/articles/Performance/66729/
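
The commented-out examples at the top assume the query has been wrapped in a stored procedure named dbo.dba_SearchCachedPlans, as in the linked article. A minimal wrapper sketch (not necessarily identical to the article's version) would look like this:

-- Minimal sketch of a dba_SearchCachedPlans wrapper around the query above;
-- the article's version may differ.
CREATE PROCEDURE dbo.dba_SearchCachedPlans
    @StringToSearchFor VARCHAR(255)
AS
BEGIN
    SET NOCOUNT ON
    SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

    SELECT TOP 20
          st.text AS [SQL]
        , cp.cacheobjtype
        , cp.objtype
        , DB_NAME(st.dbid) AS [DatabaseName]
        , cp.usecounts AS [Plan usage]
        , qp.query_plan
    FROM sys.dm_exec_cached_plans cp
    CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
    CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) qp
    WHERE CAST(qp.query_plan AS NVARCHAR(MAX)) LIKE @StringToSearchFor
    ORDER BY cp.usecounts DESC
END

After that, EXEC dbo.dba_SearchCachedPlans '%TableScan%' works as shown in the comments.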

04 November 2009

Full Recovery or Stealth Simple

This query lists all databases on an instance that are using the FULL recovery model but do not have a full or differential backup to anchor the log chain – that way you don’t have to inspect each database on the server to find the offender.

SELECT
    db.name,
    db.recovery_model_desc,
    dbrs.last_log_backup_lsn
FROM sys.databases AS db
INNER JOIN sys.database_recovery_status AS dbrs
    ON db.database_id = dbrs.database_id
WHERE db.recovery_model = 1          -- FULL
  AND dbrs.last_log_backup_lsn IS NULL

You can test it out by placing a fully backed-up database into SIMPLE recovery and then placing it back into FULL recovery. The query will now return the modified database. Back up the database and the query will once again return an empty set.
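
Here is a minimal sketch of that test, assuming a database named YourDatabase and a local backup path (both hypothetical):

-- Hypothetical database name and backup path; adjust for your environment.
ALTER DATABASE YourDatabase SET RECOVERY SIMPLE
ALTER DATABASE YourDatabase SET RECOVERY FULL
-- The query above should now list YourDatabase.

BACKUP DATABASE YourDatabase TO DISK = 'C:\Backups\YourDatabase.bak'
-- The log chain is anchored again, so the query should return an empty set.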

This query is based on the concepts in an article by Paul Randal, in case you want to do some more digging: http://www.sqlskills.com/BLOGS/PAUL/post/New-script-is-that-database-REALLY-in-the-FULL-recovery-mode.aspx

19 August 2009

ANSI SQL Compliance Validator

Of course, most of us don't write "pure" ANSI SQL queries because vendors don't fully implement the standard. But, it might be interesting to see how well your queries comply with the ANSI SQL standards.

Here is a web-based validation tool: paste in a query and it will tell you the query's level of compliance and give corrective suggestions...

SQL-2003:
http://developer.mimer.com/validator/parser200x/index.tml

SQL-99:
http://developer.mimer.com/validator/parser99/index.tml

SQL-92:
http://developer.mimer.com/validator/parser92/index.tml

12 August 2009

Replicating Stored Procedures

For occasionally connected client replication, we are publishing stored procedures so all the clients have what they need to interact with their local data. So, what happens when you need to modify a procedure? Let's just alter one and see what happens. Ready? ALTER PROCEDURE usp… Now, let's wait and see what happens at the client... okay, nothing yet. Let's check again; still nothing. Okay, we'll wait a few minutes. Now? Still nothing. What's going on? Let's ask BOL.


Me: Hey, BOL, what happened to my stored procedure definition?


BOL: Well, the ability of merge replication to replicate stored procedure definitions (or views or user-defined functions), is merely a convenient mechanism for deploying these components to subscribers.


Me: What, exactly does that mean?


BOL: Subsequent changes to the definitions of these objects are not copied automatically to Subscribers.


Me: Thanks, BOL. You sure know your stuff.


So, there you have it. If you use replication to send procedures to your clients, it only works once. After that, you're on your own. Good thing we learned how to use sp_addScriptExec last time!

Replicating Database Security

When setting up an occasionally connected database to work off-line, I had a great thrill the first time the replication snapshot actually made it to the client with all tables, data, and stored procedures intact. In fact, after days of frustration, I was so happy I went home for the day determined to put the entire replication experience (unpleasant) out of my mind for the evening!

Fast-forward to our iteration review demo the next day: "Hey, let's see if the application works off-line, now that we have replication working." "Okay, here goes…" Of course, we immediately got an error telling us that the user could not open the database specified in the login: our new off-line client was broken.

After spending the rest of the morning sobbing in the corner, I thought I'd look into the problem a little bit. It turns out that the DisconnectedClient_Role database role and its associated permissions had not been replicated to the client, so Domain Users did not have access to the database. Only tables, procedures, views, and functions can be replicated.

My first resource, Google, failed to return anything useful, so I went to my second resort, the newsgroup aggregator at groups.google.com. A quick search on microsoft.public.sqlserver.replication turned up the solution: sp_addScriptExec. This is basically a carte blanche post-initialization script deployment tool. I took my role creation script and stored procedure permission script and saved them in a file that both the publisher and the subscribers could see and access. Then I executed sp_addScriptExec against the publisher database like this:

sp_addScriptExec
@publication='DisconnetedClient_replicated'
, @scriptFile = '\\snaphot_storage_location\DisconnectedClient_role.sql'
, @skipError = 1

It immediately sent the role and permission changes down to all active subscribers and any subscriptions that were created subsequently received the permissions as part of the snapshot -- we now had security. This time, I actually logged into a client and verified that the off-line client could access the local copy of the database. Success!

Easily Create Database Snapshots

Creating database snapshots can be a bit tedious, especially on databases with multiple files.

A workmate put the following script together to take away the drudgery. Just give your database name and desired snapshot name to the following script and it creates a database snapshot for you using the same data directories as the live database.

If you want the snapshot data files somewhere else, you can comment out the EXECUTE statement then modify the generated script with your preferred file location.

Notice the $CURSOR$ placeholder; it makes this script a handy SQLPrompt snippet!



/*
This script will create a snapshot of the source DB, and it will handle multiple
files in the DB. It also supports multiple snapshots of the same source by
changing @SSName.

   Set @SourceDBName to the name of the DB you want a snapshot of.
   Set @SSName to what you want to call your snapshot.

You can disable the Execute statement at the end if you just want to get
the TSQL code for later use.
*/
DECLARE @SourceDBName AS SYSNAME,
        @SSName       AS SYSNAME,
        @SQL          AS VARCHAR(4000)

SELECT
    @SourceDBName = '$CURSOR$',
    @SSName       = 'snapshot_name'

SELECT @SQL = 'CREATE DATABASE '
            + @SSName
            + ' ON '
            + LEFT(script, LEN(script) - 1)
            + ' AS SNAPSHOT OF '
            + @SourceDBName
FROM
    (SELECT
        (SELECT col + ', ' AS [text()]
         FROM
            (SELECT TOP 100
                    CASE
                        WHEN RIGHT(mf.physical_name, 4) = '.ndf'
                        THEN '(NAME = ' + mf.name + ' ,FILENAME = '''
                             + REPLACE(mf.physical_name, '.ndf', '_' + @SSName + '.ss') + ''')'
                        ELSE '(NAME = ' + mf.name + ' ,FILENAME = '''
                             + REPLACE(mf.physical_name, '.mdf', '_' + @SSName + '.ss') + ''')'
                    END AS col
             FROM sys.master_files AS mf
             INNER JOIN sys.databases AS db
                 ON mf.database_id = db.database_id
             WHERE db.name = @SourceDBName
               AND mf.type_desc = 'ROWS'
             ORDER BY [file_id]) AS a
         ORDER BY col ASC
         FOR XML PATH('')) AS script) b

EXECUTE (@SQL)

PRINT @SQL
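
For example, with a hypothetical two-file database named Sales (logical files Sales and Sales_2, data files under D:\Data) and @SSName set to 'Sales_snap', the printed command would look something like this (line breaks added for readability):

-- Hypothetical output; the NAME and FILENAME values come from sys.master_files.
CREATE DATABASE Sales_snap ON
(NAME = Sales ,FILENAME = 'D:\Data\Sales_Sales_snap.ss'),
(NAME = Sales_2 ,FILENAME = 'D:\Data\Sales_2_Sales_snap.ss')
AS SNAPSHOT OF Sales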

10 July 2009

Searching SQL Jobs for Package References

Did you ever find a lone package and wonder whether it's used by a SQL Agent job?

This query will show which, if any, job steps use the package. It can also be used to search job text for any particular string you wish to find (table name, sqlcmd usage, etc.)


USE msdb

DECLARE @filter VARCHAR(1024)

-- include the dtsx extension to search for a package
SET @filter = '%DailyLoad%.dtsx%'

-- we can also look for a particular table or any
-- other value we might expect to find in a job step
-- SET @filter = '%DatabaseQueue%'

-- what might we be using for SQLCMD?
-- SET @filter = '%sqlcmd -S%'

SELECT
    sj.name,
    ss.step_name,
    ss.subsystem,
    ss.command
FROM sysjobsteps AS ss
INNER JOIN sysjobs AS sj
    ON ss.job_id = sj.job_id
WHERE ss.command LIKE @filter


I did a quick search for '%DatabaseBackup%' to get the following result:



09 July 2009

Anatomy of a Missing Index: What the Performance Dashboard doesn't tell you

Most of us have seen the performance dashboard and associated reports that were supplied by the CSS SQL Server engineers at Microsoft. In this discussion, we’ll be taking a closer look at the missing indexes report.




This report provides a fair amount of information, including the missing index itself; the database the missing index would belong to; the object id of the table or view that needs the index; and some other usage, cost, and impact data. Two of the key columns we’ll be looking at are “Avg Total User Cost” and “Avg User Impact”.

There isn’t a lot we can do with the report right in place within SSMS (no copy/paste, etc.), so we’re going to have to jump through a couple of hoops to get the data into a format we can work with. We’ll start with a right-click export to Excel.

This looks pretty good, and it also gives us the ability to copy the missing indexes so we can paste them into SSMS.

If we want, we can just apply all the indexes as suggested and enjoy the satisfaction of a job well done as our newly optimized databases scream off into the sunset. That’s all nice, but not being the kind of database administrator who likes to blindly apply whatever the software tells me, I’d like to apply only the indexes that will have the greatest performance impact so my boss will be really impressed.

The missing index report does have an “overall impact” column, but providing a 99.9% impact on a query with a cost of 4.52 is somewhat less impressive than providing a 6.14% impact on a query with a cost of 105.95 – admittedly not much more impressive, but you get the idea: We need a way to find the queries that will provide the largest combination of overall impact and cost.

To do this, we want to get a weighted impact by multiplying the Avg Total User Cost by the Overall Impact, then sort the report by that. Again, our report leaves us a bit short, even after we placed it into Excel. Even after we create a new column (Avg Total User Cost * Overall Impact), we cannot sort the spreadsheet because of the overuse of merged cells.

So, let’s copy our data from the table within Excel, paste it into a new tab, export the new tab to a .csv file, open the .csv file in Excel, remove the empty columns left over from the “unmerge”, add our “weighted impact” column (“Avg Total User Cost” * “Overall Impact”), then sort the file by that column.
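
If you would rather skip the spreadsheet gymnastics, the missing index DMVs can produce a similar weighted ranking directly. The query below is a rough sketch rather than the dashboard's exact arithmetic: it multiplies avg_total_user_cost by avg_user_impact and sorts by the result.

-- Rough DMV-based equivalent of the "weighted impact" spreadsheet column;
-- an approximation, not the dashboard report's exact formula.
SELECT TOP 20
      DB_NAME(mid.database_id) AS database_name
    , mid.statement AS table_name
    , mid.equality_columns
    , mid.inequality_columns
    , mid.included_columns
    , migs.user_seeks + migs.user_scans AS uses
    , migs.avg_total_user_cost
    , migs.avg_user_impact
    , migs.avg_total_user_cost * migs.avg_user_impact AS weighted_impact
FROM sys.dm_db_missing_index_group_stats AS migs
INNER JOIN sys.dm_db_missing_index_groups AS mig
    ON migs.group_handle = mig.index_group_handle
INNER JOIN sys.dm_db_missing_index_details AS mid
    ON mig.index_handle = mid.index_handle
ORDER BY weighted_impact DESC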

In our sample report, this immediately shows five indexes that will have large impacts on high cost queries. Now we can apply these indexes and go home secure in the knowledge that our boss is impressed. But…

If we take a moment to think about it, we’ll recall that applying an index to support a query often reveals the need for yet another index. We’re going to need to do a bit more work.

It would be nice if we could find out which stored procedure each of these indexes would support. Then we could focus our optimization efforts on those stored procedures and catch any additional indexes they will need. Unfortunately, we hit our recurring theme – the missing index report isn’t that comprehensive. The dynamic management views (DMVs), however, are that comprehensive.

We can see in the text of the proposed index – CREATE INDEX missing_index_1793 ON [Transact].[dbo].[ta_Archive_TranCustAccountDtls] ([Amount_CUR]) – for our top ranked missing index (95.08 Avg User Impact, 95.59 Avg Total User Cost) that it is proposed for the [ta_Archive_TranCustAccountDtls] table. Let’s ask the DMVs to show us the stored procedure that goes with that index (note the table name in the LIKE filter of the query below). Thanks, Bri!

DECLARE @StringToSearchFor VARCHAR(100)
SET @StringToSearchFor = '%ta_Archive_TranCustAccountDtls%'

SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED

SELECT
      st.text AS [SQL]
    , cp.cacheobjtype
    , cp.objtype
    , DB_NAME(st.dbid) AS [DatabaseName]
    , cp.usecounts AS [Plan usage]
    , qp.query_plan
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) qp
WHERE CAST(qp.query_plan AS NVARCHAR(MAX)) LIKE @StringToSearchFor
ORDER BY cp.usecounts DESC

This will show us two queries that have indexes missing from this table (the third row is the first procedure with a different Plan Usage value).

Fortunately, the first two rows returned by the query happen to correspond to the first two rows in our missing indexes report. We can verify the cost, impact, and missing index columns by clicking the ShowPlanXML in the query_plan column within SSMS. The impact is listed right with the missing index (95.2267 in the plan vs 95.08 in the spreadsheet/report).

The cost can be viewed in the Properties window: highlight the “Select cost: 0 %” operator in the top left of the query plan to get the cost of the entire plan (99.0924 in the plan vs 95.59 in the spreadsheet/report).

The CREATE INDEX statement can be viewed by right-clicking within the query plan.

We also see that our spreadsheet create statement

CREATE INDEX missing_index_1793 ON [Transact].[dbo].[ta_Archive_TranCustAccountDtls] ([Amount_CUR])

matches our query plan create statement

CREATE NONCLUSTERED INDEX []

ON [dbo].[ta_Archive_TranCustAccountDtls] ([Amount_CUR])

Let’s take a closer look at the query plan while we have it open. We can see that 95% of the cost is a clustered index scan, with some parallelism thrown in for good measure.

Let’s apply our missing index and see what happens to our cost and query plan.
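
The suggestion comes through with an empty index name, so we have to supply one before running it. The name below is just a hypothetical convention; the table and column come straight from the suggestion:

-- Index name is hypothetical; table and key column are from the missing index suggestion.
CREATE NONCLUSTERED INDEX IX_ta_Archive_TranCustAccountDtls_Amount_CUR
ON [dbo].[ta_Archive_TranCustAccountDtls] ([Amount_CUR])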

Our cost is down to 4.9

and our clustered index scan has been converted to an index seek and a key lookup. Of course, our inner join has also gone from a merge to a hash match, but that’s okay at this point (our cost has gone down by 90 without applying any thought). We’ll look at the hash match inner join in a later step.

Now that the biggest index scan has been removed, one of our smaller index scans accounts for most of the cost. There is also a “new” missing index that addresses this index scan:

CREATE NONCLUSTERED INDEX []

ON [dbo].[ta_Archive_TransactionDisbursements] ([CheckNum_VAL])

You can see that when we apply this index, our last clustered index scan is converted to an index seek/key lookup and our cost for the entire query is now down to 0.16.

The relative cost of the key lookup portion of our original clustered index scan is back up to 68% of the total cost, but that isn’t a problem. These things always add up to 100% so something in the query has to take the largest share.

Now we’re going to take a quick look at the hash match inner join. It probably isn’t strictly necessary to fix this since our cost is now so low, but we did introduce this through our tuning efforts and getting rid of it isn’t too difficult.

If we take a closer look at the details of the join, we’ll see that the probe is on the Transaction_LK column of the ta_Archive_TranCustAccountDtls table.

Let’s throw an index on this column and see what happens…
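
Again, the index name below is hypothetical; only the table and the Transaction_LK column are dictated by the plan:

-- Hypothetical name; supports the hash join probe on Transaction_LK.
CREATE NONCLUSTERED INDEX IX_ta_Archive_TranCustAccountDtls_Transaction_LK
ON [dbo].[ta_Archive_TranCustAccountDtls] ([Transaction_LK])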

Not only was our cost cut in half again,

but the portion of the query that is taken up by our key lookup was reduced by almost 20%.


The additional cost ended up in the final sort – it had to go somewhere.

As you probably noticed, the bulk of our performance improvement came from blindly applying the “missing index” from the missing indexes report. Like the Database Tuning Advisor, these index recommendations get you a long way with very little thought. In some cases it may be more cost-effective to just go with what’s there.

However, we have just seen how easy it is to spend a few additional minutes (that’s literally all it takes) to correlate the missing index to a specific query/procedure and thoroughly tune that object. This will keep it off the missing index report so we only have to tune the query one time. If we had blindly applied the recommendations, we would have had an index supporting this query end up in the missing index report again, doubling the cost of our non-thinking tuning effort.





08 July 2009

The TRY... CATCH Penalty

Is there a performance penalty for using TRY... CATCH? The short answer is "not really."

The detailed answer involves a quick comparison test. I used set @variable = @variable + 1 within a TRY... CATCH and again without a TRY... CATCH. I did something trivial like this so I would only be measuring the impact of the TRY... CATCH block.

Each pass looped 1,000,000 times, so we need to divide our resultant times by one million to see the performance effect on each individual query. I did 5 runs of 1,000,000 with the TRY... CATCH and 5 without, so we'll do some averaging. Here are the raw times in milliseconds:

TRY       NON-TRY
8093      2483
7733      2470
8220      2376
7626      2483
7936      2530
-------   -------
7931.6    2468.4   (average)

So, for the TRY... CATCH, we averaged 7.932 seconds for one million executions. For the NON-TRY... CATCH version, we averaged 2.468 seconds for one million executions. If we divide both numbers by one million then take the difference, we get the performance impact of using TRY... CATCH. In this case, it works out to about a 0.000005464 second penalty per query (about 5 one-millionths of a second).

Those are the SQL Server 2005 numbers. The penalty and corresponding execution times are even smaller in SQL Server 2008.

Here are the test scripts I used:

------ WITH THE TRY-CATCH BLOCK
declare @i int
set @i = 0

declare @startTime datetime, @endTime datetime
set @startTime = getdate()

while @i < 1000000
begin
    begin try
        set @i = @i + 1
    end try
    begin catch
        -- nothing
    end catch
end

set @endTime = getdate()
print cast(datediff(ms, @startTime, @endTime) as varchar) + ' ms'
go 5

------ WITHOUT THE TRY-CATCH BLOCK
declare @i int
set @i = 0

declare @startTime datetime, @endTime datetime
set @startTime = getdate()

while @i < 1000000
begin
    set @i = @i + 1
end

set @endTime = getdate()
print cast(datediff(ms, @startTime, @endTime) as varchar) + ' ms'
go 5