T-SQL Tuesday #49 – Wait for it…

T-SQL Tuesday is a recurring blog party that was started by Adam Machanic (Blog | @AdamMachanic). Each month a blog hosts the party, and everyone who wants to can write a blog post about a specific subject.

This month the subject is “Wait for it…”. If you want to read the opening post, please click the image below to go to the party-starter: Robert Davis (Blog | @SQLSoldier).



 
Explaining to developers how SQL Server works is something we all do. Maybe not on a daily basis, but you’re asked questions like “why is my index not working”, or “what’s the best way to add multiple columns to a table”. And most of the time, these questions lead to a whole bunch of other questions you need to answer. And there’s one question we’re all asked more than once: “why is my query running slow?”. So where do you start explaining?

 
Wait Types
There are lots and lots of wait types to be found in SQL Server: 230 different wait types in SQL Server 2005, 475 in SQL Server 2008, and 491 in SQL Server 2008 R2. SQL Server 2012 added another 197 new ones to the list. The wait types can be found by running this query:

SELECT wait_type
FROM sys.dm_os_wait_stats
ORDER BY wait_type ASC

 
These wait types can tell you what SQL Server is doing to execute your statement, and what the possible delays are. I’m not going to sum up all the wait types, but here’s a short list of common wait types you’ll see on your SQL server:

 
SOS_SCHEDULER_YIELD – Yielding processor time
LCK_M_* – Waiting for a lock
OLEDB – Wait on the OLEDB provider (linked servers, Full-Text Search)
WRITELOG – Writing transaction log to disk
RESOURCE_SEMAPHORE – Waiting for a query memory grant
CXPACKET – Query parallelism
PAGEIOLATCH_* – Latch on a memory address while data is retrieved from disk
LAZYWRITER_SLEEP – System process waiting to start

 
All these different wait types can indicate a problem with your statement or your server. Some are merely informative, while others point at a real issue. But what I’d really like to show you is how you can find these wait types.
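For a quick first impression, you can aggregate the wait statistics SQL Server has accumulated since the last restart (or since the statistics were cleared). A minimal sketch, filtering out a couple of benign sleep waits:

SELECT TOP 10
	wait_type,
	waiting_tasks_count,
	wait_time_ms,
	signal_wait_time_ms -- time spent waiting for CPU after the resource became available
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN ('LAZYWRITER_SLEEP', 'SLEEP_TASK')
ORDER BY wait_time_ms DESC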

 
DIY or just ask for help…
One of the ways to find the wait types on your SQL Server is to dive into the seemingly endless list of DMVs. You could use the sys.dm_exec_requests and sys.dm_os_waiting_tasks DMVs to find what you want, or you could take the easy way out: sp_WhoIsActive by Adam Machanic (Blog | @AdamMachanic).
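If you go the DIY route, a minimal sketch could look something like this (it requires VIEW SERVER STATE permission, and only shows requests that are currently active):

SELECT
	r.session_id,
	r.status,
	wt.wait_type,
	wt.wait_duration_ms,
	wt.blocking_session_id,
	t.text AS running_statement
FROM sys.dm_exec_requests r
LEFT JOIN sys.dm_os_waiting_tasks wt
	ON wt.session_id = r.session_id
OUTER APPLY sys.dm_exec_sql_text(r.sql_handle) t
WHERE r.session_id <> @@SPID -- exclude this query itself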

Adam (also the party starter of T-SQL Tuesday) wrote a script that is nothing short of brilliant for finding problems on your server. But how does it work?

 
Installing sp_WhoIsActive
The “installation” of sp_WhoIsActive couldn’t be easier: you just download the script and run it. This creates a stored procedure in the database of your choice. Usually I just create it in the master database, but if you have a DBA database with useful scripts, it’s fine to create it there.

 
Running it for the first time
The stored procedure can be executed without any parameters; that way, you use the default options. Just run the statement shown below:

EXEC master.dbo.sp_WhoIsActive

 
If you need it, or just like to see more information, you can also configure the procedure with a lot of parameters. If you want to see all the options you can configure, just set the @help parameter to 1 (true):

EXEC master.dbo.sp_WhoIsActive
  @help = 1

 
The options
If you start using sp_WhoIsActive more and more, you’ll develop your own set of favorite options. It all depends on the purpose you’re using the procedure for. Most of the time, I use it to determine why queries run slow, or why the overall performance of the SQL Server is low.

The information sp_WhoIsActive retrieves gives you a good indication of what SQL Server is doing, or which queries are bugging each other. I’ll list my favorite options below:

First, I set @show_own_spid to 1, so I can see my own query in the resultset.

The second option I love is @get_plans. This shows you the execution plans of the running queries:

 
Another great parameter to set is @get_outer_command. That way, you won’t just see the statement that is running at the moment, but also the outer command (in the example below, the INSERT INTO the temp table is executed from within the stored procedure you see in the right column):

 
To see which transaction logs are used when running your query, set @get_transaction_info to 1:

 
Information regarding locks can be found by setting @get_locks to 1:

 
If you click the XML, you’ll see which locks are granted, pending or denied:

 
The last option I’d like to set is @get_additional_info. This will show you more information regarding the connection settings, session variables, etc.:

 
Clicking the XML shows you the properties I mentioned above:

 
So this is what the query looks like, the way I personally like to use it:

EXEC master.dbo.sp_WhoIsActive
  @show_own_spid = 1,
  @get_plans = 1,
  @get_outer_command = 1,
  @get_transaction_info = 1,
  @get_locks = 1,
  @get_additional_info = 1

 
Conclusion
Wait types are your key to unlocking the next level of SQL Server knowledge. Not all wait types are easy to read and understand, but there are plenty of resources to be found online. For example, just take a look at the rest of today’s posts. Most of the posts for T-SQL Tuesday can be found on Twitter when you search for #TSQL2sDay.


I want to say thanks to the employees at Coeo for the easy explanation of some of the wait types!

SQL Server Agent Job Schedules – A hidden evil…

One of the many beautiful features of SQL Server is the possibility to schedule a query or job within a few minutes, using SQL Server Agent Jobs. It’s especially useful for people who want to run a query at a quiet moment of the day, but don’t want to stay up or wake up at 4:00 in the morning.

 
Types of schedules
If you dig into SQL Server Agent Jobs, you can create a job-specific schedule, or a shared schedule. Both are essentially the same type of schedule, except that a shared schedule can be part of many jobs.

 
What type do you need?
If you create a new SQL Server Agent Job, you need to figure out which type of schedule you want to use. If you want to start a number of jobs at the same time, go for a shared schedule. But if you want to be able to easily move a single job in time (change its start date or start time), then a single (job-specific) schedule is what you want.

 
How it works
Once you’ve created a job, you can click on “Schedules”, and then either choose “New” (single schedule), or “Pick” (shared schedule):

 
New schedule
If you decide to create a new schedule, you’ll get a window like this:

In this window you can set the runtime of the schedule, the start and end date, frequency, etc.

 
Pick a schedule
If you go for the “Pick” option, you’ll see a window like this:

In this window you can select a previously created schedule, so you can, for example, run jobs in specifically created time windows.

 
Edit schedule
But what’s the catch? If you change a schedule, this might affect other jobs attached to the same schedule! This is where my issue comes in. A few weeks back a schedule was changed (we wanted to run a job 2 hours earlier every day), and suddenly several other jobs changed schedule as well. The schedule was changed without checking it first, and that resulted in a really strange situation with parallel running jobs that almost locked out all other processes.

So if you want to change the schedule of a job, click the “Jobs in schedule” button first. Double-check that your schedule wasn’t turned into a shared schedule by you, one of your colleagues, or a customer:

 
Manage schedules on server
But that’s not all. Last week I read an article about shared job schedules that was tweeted by Pieter Vanhove (Blog | @Pieter_Vanhove). In that article I discovered that you can also right-click “Jobs”, and click “Manage Schedules”:

This opens a window where you can see the shared schedule overview, the number of jobs per schedule, etc.:
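If you prefer a query over clicking through SSMS, a sketch like this against msdb shows which schedules are shared by more than one job:

SELECT
	S.name AS ScheduleName,
	S.enabled AS ScheduleEnabled,
	COUNT(JS.job_id) AS JobsUsingSchedule
FROM msdb.dbo.sysschedules S
JOIN msdb.dbo.sysjobschedules JS
	ON JS.schedule_id = S.schedule_id
GROUP BY S.name, S.enabled
HAVING COUNT(JS.job_id) > 1 -- shared schedules only
ORDER BY COUNT(JS.job_id) DESC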

 
Conclusion
As I’ve said in previous posts: always check, double check, and triple check before you make any changes, or you’ll run into more issues than you solve with your change!

T-SQL Tuesday #43 – Hello, Operator?

T-SQL Tuesday is a recurring blog party that was started by Adam Machanic (Blog | @AdamMachanic). Each month a blog hosts the party, and everyone who wants to can write a blog post about a specific subject.

This month the subject is “Hello, Operator?”. If you want to read the opening post, please click the image below to go to the party-starter: Rob Farley (Blog | @rob_farley).



 
Execution Plan
Everybody who ever got serious with SQL Server knows about execution plans. And like Rob said in his opening post: if you don’t, start looking into them! But when you start working with execution plans, it gets confusing really fast. All those weird “building blocks” that tell you what SQL Server did with your query?! It almost looks like magic…!

But when you dive into them, they become easier to read, and suddenly they’re somewhat understandable. But hey, even after all these years of looking at execution plans, some things still amaze me.

Last week I was asked to look at a query that was used in an SSIS package. This package was one of the slowest of the nightly ETL processes. So I started off by looking at the execution plan:

 
Start pinpointing the first issue
Then I started digging, and I noticed this:

LEFT JOIN DB.dbo.T1 m
	ON m.Col1 = ISNULL(LOWER(t.Col1) COLLATE Latin1_General_CS_AS, '(N/A)')
LEFT JOIN DB.dbo.T2 s
	ON s.Col2 = ISNULL(LOWER(t.Col2) COLLATE Latin1_General_CS_AS, '(N/A)')
LEFT JOIN DB.dbo.T3 k
	ON k.Col3 = ISNULL(LOWER(t.Col3) COLLATE Latin1_General_CS_AS, '(N/A)')
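Before rewriting anything, you can verify the collations involved. A quick sketch, using the same (anonymized) DB, T1 and Col1 names as the snippet above:

SELECT DATABASEPROPERTYEX('DB', 'Collation') AS DatabaseCollation

SELECT name, collation_name
FROM DB.sys.columns
WHERE object_id = OBJECT_ID('DB.dbo.T1')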

 
But wait a minute… The collations we use aren’t case sensitive, so why use the LOWER() function at all? And the collations of the 2 databases used in the query are practically the same (“SQL_Latin1_General_CP1_CI_AS” and “Latin1_General_CI_AS”), while the column collations are identical (“SQL_Latin1_General_CP1_CI_AS”). SQL Server doesn’t throw an error if I join both sets, and the results aren’t different whether I use COLLATE or not. So we learned another thing: it’s the column collation that is used in the ON clause of the JOIN. Both functions are unnecessary because the collations are equal, and thus they’re wasting our precious CPU cycles…
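Removing them leaves joins like this (a sketch of the modified version; only the LOWER() and COLLATE are dropped, the ISNULL() stays):

LEFT JOIN DB.dbo.T1 m
	ON m.Col1 = ISNULL(t.Col1, '(N/A)')
LEFT JOIN DB.dbo.T2 s
	ON s.Col2 = ISNULL(t.Col2, '(N/A)')
LEFT JOIN DB.dbo.T3 k
	ON k.Col3 = ISNULL(t.Col3, '(N/A)')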

Now here comes the moment you’ve all been waiting for. How does the execution plan differ? This is what the new execution plan looks like:

So what are we missing? You guessed right! We’re missing this bad boy:

So what does the “Compute Scalar” actually do? According to MSDN:

 

The Compute Scalar operator evaluates an expression to produce a computed scalar value

So that means it calculates a new value for every input row. This calculation can be as simple as a conversion of a value, or a concatenation of multiple values. Most of the time these “Compute Scalar” operators aren’t a major cost in your execution plan, but they can become an issue that uses a lot of CPU.

But what does this “Compute Scalar” do in this query? If I remove the ISNULL() from the query, it’s still there; it only disappears when I remove the COLLATE. So in this case, the “Compute Scalar” is “calculating” a new value for every record. This “calculation” is actually a conversion from one collation to the other.

 
But what’s the difference?
If we compare both versions of the query, the modified version has a lot less impact on the server:

And that’s just from removing the LOWER() and COLLATE functions. With a little bit more effort, you can rewrite a part of the query, maybe add an index, etc. So as you can see, small changes can have a huge impact!

 
Conclusion
A “Compute Scalar” isn’t that hard to understand, and you’ll find it very often in your execution plans. So even though it isn’t the worst performance blocker in the world, it’s an interesting one to get familiar with.

If you don’t use SQL Sentry Plan Explorer yet, download your copy now!

Get column headers without retrieving data

A few days ago Pinal Dave (Blog | @pinaldave) retweeted an old blog post about retrieving columns without data. Reading this blog post, I started thinking of other ways to achieve this.

One of the downsides of using SET FMTONLY ON is that it’s a session setting. This means that no query you run in that session returns results. This can end up in very confusing situations: you could be debugging a query that doesn’t return any results, only to eventually discover that this is caused by SET FMTONLY ON. So this may not be the preferred way to get the column headers of a specific query. Also, this option doesn’t return estimated or actual execution plans!

An example of such a query could be:

SET FMTONLY ON

SELECT *
FROM sys.tables

SET FMTONLY OFF

Another way of getting this information, without changing any session options, is a simple WHERE clause:

SELECT *
FROM sys.tables
WHERE 1 = 0

This query returns the same results as the query with SET FMTONLY ON. And looking at performance, there’s no real difference between the two. I ran statistics on both solutions, and the I/O generated is pretty much negligible.

Conclusion
My personal favorite is the WHERE clause. This is because it doesn’t involve session modifiers, and (if you want to) it still returns an execution plan. SET FMTONLY ON doesn’t return an estimated or actual execution plan.

On the other hand, SET FMTONLY is on the list of deprecated features, and will be removed in a future SQL Server version. So for all new functionality, I’d advise you not to use it.
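If you’re on SQL Server 2012 or later, the replacement Microsoft points to is sp_describe_first_result_set, which returns the column metadata of a query without executing it:

EXEC sys.sp_describe_first_result_set
	@tsql = N'SELECT * FROM sys.tables'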

But maybe another option suits you better. As always, it depends: on your own preferences, and on the specific situation. But if you have any other options, please let me know. Post a comment, send me an e-mail, tweet me, whatever you want :).

Using SQLCMD to your advantage

A few weeks ago, I came across something called SQLCMD (or SQLCommand). I had never heard of it, so the curious developer in me wanted to know what it was. The official definition according to MSDN is this:

The sqlcmd utility lets you enter Transact-SQL statements, system procedures, and script files at the command prompt, in Query Editor in SQLCMD mode, in a Windows script file or in an operating system (Cmd.exe) job step of a SQL Server Agent job. This utility uses ODBC to execute Transact-SQL batches.

You can either run a SQLCMD query via command-line or in SQL Server Management Studio (SSMS). If you want to run it via SSMS, open a new query and press the hotkey combination: ALT+Q, ALT+M. You could also use the menu Query -> SQLCMD Mode.

Once you’ve done that, you can run queries in SQLCMD mode. This means you can use your query window as a command-line tool. The advantage of this is that you can query multiple servers and instances in one query window.

To run queries in SQLCMD mode you need the commands listed below:

:!! <command> – Directly executes a cmd command from SQLCMD
:CONNECT <server>[\instance] [-l timeout] [-U user] [-P password] – Connects to the default or a specified instance of the specified server
:ERROR <destination> – Redirects error output to a file, stderr or stdout
:EXIT [(<query>)] – Quits SQLCMD immediately; with a query, executes it and returns a numeric result
GO [n] – Executes the statement cache (optionally n times)
:ON ERROR <exit / ignore> – Specifies which action to take if a query encounters an error
:OUT <filename> [stderr / stdout] – Redirects query output to a file, stderr or stdout
:PERFTRACE <filename> [stderr / stdout] – Redirects timing output to a file, stderr or stdout
:QUIT – Quits SQLCMD immediately
:R <filename> – Appends a file to the statement cache (ready to execute)
:RESET – Discards the statement cache (resets the session)
:SERVERLIST – Lists local and network SQL Servers
:SETVAR <varname> <"value"> – Sets a SQLCMD scripting variable

Remember, not all commands are listed above, just the ones I found useful at this time. Parameters listed in < > are mandatory, and parameters in [ ] are optional.

To write all queries out would be too much, so I created some scripts for you to download. Here’s a list of files you can download and try on your local SQL server:

Get database info with screen output, from multiple SQL servers
Execute script file, and output to file
Get directory content via commandline
Collection of commands in 1 script file
Script file used in the example above

The first script file is the most interesting one, if you ask me. In this script, I connect to multiple instances of SQL Server to retrieve data. You can use this instead of the multiserver query I blogged about earlier. Instead of registering a group of servers and running your query against that, you can now specify the servers you want by server name or IP address.
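To give you an idea, the core of such a script looks something like this sketch (the server and instance names are, of course, made up):

:CONNECT Server1
SELECT @@SERVERNAME AS ServerName, name AS DatabaseName
FROM sys.databases
GO

:CONNECT Server2\Instance1
SELECT @@SERVERNAME AS ServerName, name AS DatabaseName
FROM sys.databases
GO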

With this information you can start writing your own SQLCMD queries. If you want, you can do this via SSMS or the command line. The command-line utility can be found at this location:

C:\Program Files\Microsoft SQL Server\100\Tools\Binn\SQLCMD.exe

If you have any questions or need some help, just contact me. I’d love to help you out! :)

Reporting Services – Query database

After reading a tweet by Ray Martinez, I decided to share the scripts for Reporting Services that I often use.

One of the scripts I use a lot shows the Reports that were successfully executed in 2012. For these Reports you’ll see the last execution time, the number of rows and bytes returned, and how long it took to gather and show the results:

USE ReportServer

SELECT
	CL.Name						AS ReportName,
	CL.Description				AS ReportDescription,
	CL.Path						AS ReportPath,
	CL.CreationDate				AS ReportCreationDate,
	SUM(1)						AS TotalNumberOfTimesExecuted,
	MAX(EL.TimeStart)			AS LastTimeExecuted,
	AVG(EL.[RowCount])			AS AVG_NumberOfRows,
	AVG(EL.TimeDataRetrieval)	AS AVG_DataRetrievalTime,
	AVG(EL.TimeProcessing)		AS AVG_TimeProcessing,
	AVG(EL.TimeRendering)		AS AVG_TimeRendering
FROM ExecutionLog EL
JOIN Catalog CL
	ON CL.ItemID = EL.ReportID
WHERE 1 = 1
AND CL.Name IS NOT NULL
AND EL.Status ='rsSuccess'
GROUP BY
	CL.Name,
	CL.Path,
	CL.CreationDate,
	CL.Description
HAVING YEAR(MAX(EL.TimeStart)) = 2012
ORDER BY COUNT(EL.ReportID) DESC

The second script I want to share with you shows the first 1000 successfully executed Reports. The data that is returned includes the Report format, the parameters used to query the data, information about the returned resultset, the time needed to return and render the data and the Report, etc. I used a TOP 1000 because our Report server returned a lot of rows, where I only needed a few for my analysis.

USE ReportServer

SELECT TOP 1000
	EL.InstanceName				AS SQLInstanceName,
	EL.UserName					AS ExecuterUserName,
	EL.Format					AS ReportFormat,
	EL.Parameters				AS ReportParameters,
	EL.TimeStart				AS TimeStarted,
	EL.TimeEnd					AS TimeEnded,
	EL.TimeDataRetrieval		AS TimeDataRetrieval,
	EL.TimeProcessing			AS TimeProcessing,
	EL.TimeRendering			AS TimeRendering,
	EL2.Source					AS Source,
	EL.ByteCount				AS ReportInBytes,
	EL.[RowCount]				AS ReportRows,
	CL.Name						AS ReportName,
	CL.Path						AS ReportPath,
	CL.Hidden					AS ReportHidden,
	CL.CreationDate				AS CreationDate,
	CL.ModifiedDate				AS ModifiedDate,
	EL2.Format					AS RenderingFormat,
	EL2.ReportAction			AS ReportAction,
	EL2.Status							AS ExecutionResult,
	DS.Name						AS DataSourceName,
	DS.Extension				AS DataSourceExtension
FROM ExecutionLog EL
JOIN Catalog CL
	ON CL.ItemID = EL.ReportID
LEFT JOIN ExecutionLog2 EL2
	ON EL2.ReportPath = CL.Path
JOIN DataSource DS
	ON DS.ItemID = CL.ItemID
WHERE 1 = 1
AND EL.Status = 'rsSuccess'
ORDER BY EL.TimeStart DESC

The next script returns an overview of the folders and Reports on your Report server. For every object on your server, you can see the creation and modification date. The joined ExecutionLog table is used to get the number of times the Report was executed, by which user, how much data was returned, and how long that took:

USE ReportServer

SELECT
	CASE CL.Type
		WHEN 1 THEN 'Folder'
		WHEN 2 THEN 'Report'
		WHEN 3 THEN 'Resource'
		WHEN 4 THEN 'Linked Report'
		WHEN 5 THEN 'Data Source'
	END									AS ObjectType,
	CP.Name								AS ParentName,
	CL.Name								AS Name,
	CL.Path								AS Path,
	CU.UserName							AS CreatedBy,
	CL.CreationDate						AS CreationDate,
	UM.UserName							AS ModifiedBy,
	CL.ModifiedDate						AS ModifiedDate,
	CE.CountStart						AS TotalExecutions,
	EL.InstanceName						AS LastExecutedInstanceName,
	EL.UserName							AS LastExecuter,
	EL.Format							AS LastFormat,
	EL.TimeStart						AS LastTimeStarted,
	EL.TimeEnd							AS LastTimeEnded,
	EL.TimeDataRetrieval				AS LastTimeDataRetrieval,
	EL.TimeProcessing					AS LastTimeProcessing,
	EL.TimeRendering					AS LastTimeRendering,
	EL.Status							AS LastResult,
	EL.ByteCount						AS LastByteCount,
	EL.[RowCount]						AS LastRowCount,
	SO.UserName							AS SubscriptionOwner,
	SU.UserName							AS SubscriptionModifiedBy,
	SS.ModifiedDate						AS SubscriptionModifiedDate,
	SS.Description						AS SubscriptionDescription,
	SS.LastStatus						AS SubscriptionLastResult,
	SS.LastRunTime						AS SubscriptionLastRunTime
FROM Catalog CL
JOIN Catalog CP
	ON CP.ItemID = CL.ParentID
JOIN Users CU
	ON CU.UserID = CL.CreatedByID
JOIN Users UM
	ON UM.UserID = CL.ModifiedByID
LEFT JOIN ( SELECT
				ReportID,
				MAX(TimeStart) LastTimeStart
			FROM ExecutionLog
			GROUP BY ReportID) LE
	ON LE.ReportID = CL.ItemID
LEFT JOIN ( SELECT
				ReportID,
				COUNT(TimeStart) CountStart
			FROM ExecutionLog
			GROUP BY ReportID) CE
	ON CE.ReportID = CL.ItemID
LEFT JOIN ExecutionLog EL
	ON EL.ReportID = LE.ReportID
	AND EL.TimeStart = LE.LastTimeStart
LEFT JOIN Subscriptions SS
	ON SS.Report_OID = CL.ItemID
LEFT JOIN Users SO
	ON SO.UserID = SS.OwnerID
LEFT JOIN Users SU
	ON SU.UserID = SS.ModifiedByID
WHERE 1 = 1
ORDER BY CP.Name, CL.Name ASC

The last query is a short one. It returns the Reports that don’t inherit the permissions that are set in the Site Settings menu.

USE ReportServer


SELECT
	Path,
	Name
FROM Catalog
WHERE PolicyRoot = 1 

Hopefully there will be more where this came from. So follow my blog, or contact me by email or Twitter!

Data Type and Operator Precedence

In SQL Server you will encounter a lot of cases in which an operator combines two expressions of different data types. The rules that specify which value is converted to the other data type can be found on MSDN. But the precedence of data types differs between versions of SQL Server. Therefore I created the overview below, so you can compare the different versions:

#  | SQL Server 2005 | SQL Server 2008 / 2008 R2 | SQL Server 2012
1  | user-defined data types (highest) | user-defined data types (highest) | user-defined data types (highest)
2  | sql_variant | sql_variant | sql_variant
3  | xml | xml | xml
4  | datetime | datetimeoffset | datetimeoffset
5  | smalldatetime | datetime2 | datetime2
6  | float | datetime | datetime
7  | real | smalldatetime | smalldatetime
8  | decimal | date | date
9  | money | time | time
10 | smallmoney | float | float
11 | bigint | real | real
12 | int | decimal | decimal
13 | smallint | money | money
14 | tinyint | smallmoney | smallmoney
15 | bit | bigint | bigint
16 | ntext | int | int
17 | text | smallint | smallint
18 | image | tinyint | tinyint
19 | timestamp | bit | bit
20 | uniqueidentifier | ntext | ntext
21 | nvarchar (including nvarchar(max)) | text | text
22 | nchar | image | image
23 | varchar (including varchar(max)) | timestamp | timestamp
24 | char | uniqueidentifier | uniqueidentifier
25 | varbinary (including varbinary(max)) | nvarchar (including nvarchar(max)) | nvarchar (including nvarchar(max))
26 | binary (lowest) | nchar | nchar
27 | – | varchar (including varchar(max)) | varchar (including varchar(max))
28 | – | char | char
29 | – | varbinary (including varbinary(max)) | varbinary (including varbinary(max))
30 | – | binary (lowest) | binary (lowest)
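To see the precedence in action: int outranks varchar in every version above, so in a mixed expression the varchar value is implicitly converted to int:

SELECT 1 + '2' AS Result -- returns 3 (int), not the string '12'
SELECT SQL_VARIANT_PROPERTY(1 + '2', 'BaseType') AS BaseType -- returns int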

The same goes for operators: their precedence differs between SQL Server versions. I took the data from the different versions of SQL Server, and created the overview below:

Level 1 (all versions): ~ (Bitwise NOT)
Level 2 (all versions): * (Multiply), / (Division), % (Modulo)
Level 3 (SQL Server 2005): + (Positive), – (Negative), + (Add), + (Concatenate), – (Subtract), & (Bitwise AND)
Level 3 (SQL Server 2008 and later): the same, plus ^ (Bitwise Exclusive OR) and | (Bitwise OR)
Level 4 (all versions): =, >, <, >=, <=, <>, !=, !>, !< (Comparison operators)
Level 5 (SQL Server 2005 only): ^ (Bitwise Exclusive OR), | (Bitwise OR) – in SQL Server 2008 and later these moved up to level 3, and the levels below shift up one place
Level 6 (all versions): NOT
Level 7 (all versions): AND
Level 8 (all versions): ALL, ANY, BETWEEN, IN, LIKE, OR, SOME
Level 9 (all versions): = (Assignment)
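You can see operator precedence at work with AND and OR: AND sits one level above OR in every version, so it’s evaluated first:

SELECT CASE WHEN 1 = 1 OR 1 = 1 AND 0 = 1
	THEN 'AND evaluated first' ELSE 'OR evaluated first' END AS EvaluationOrder
-- Returns 'AND evaluated first': the condition is read as 1 = 1 OR (1 = 1 AND 0 = 1)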


This post was inspired by a SQL Server session by Bob Beauchemin (Blog | @bobbeauch) at the latest edition of TechDays NL. Bob, thank you for that! ;)

SQL Server system database – mssqlsystemresource

The mssqlsystemresource database is a read-only database that has shipped with SQL Server since SQL Server 2005. It contains all the system objects that are included with SQL Server. An example is sys.objects: these objects are physically stored in the Resource database, but appear in every other database. The sys.objects in a user database refer to the Resource database.

SQL Server cannot back up this database. You can create a backup yourself, but not in the way you’d back up a normal database: it’s a file copy of the .mdf. It also can’t be restored via SQL Server, but you can do that manually.

The Resource database makes upgrading to a new version easier and faster.

The mssqlsystemresource database is invisible to users, even to the System Administrator (sa). The database can be restored by copying the .ldf and .mdf files from the folder “[Drive]\Program Files\Microsoft SQL Server\MSSQL10_50.[Instance Name]\MSSQL\Binn” to a folder of your choice (in this case D:\Databases). After that, open a query window as the sa user, and run the following script:

USE [master]
GO

CREATE DATABASE [mssqlsystemresource_RESTORED] ON 
	(FILENAME = N'D:\Databases\mssqlsystemresource.mdf'),
	(FILENAME = N'D:\Databases\mssqlsystemresource.ldf')
FOR ATTACH
GO

Once you’ve done this, you can query the restored version of the Resource database.
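For example, this quick look at the system objects that normally show up in every database:

SELECT TOP 10 name, type_desc
FROM mssqlsystemresource_RESTORED.sys.objects
ORDER BY name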

Comparison: Ad-hoc vs Stored Procedure vs Dynamic SQL

Sometimes when you are creating a SQL query, you wonder if it’s more sensible to use a standard T-SQL query or one of the other possibilities. In this post I will try to compare the three different possibilities, so that you can make your own decision. If you read the comparison below, keep in mind that there is no right or wrong here. Each situation requires a different point of view, and may present its own difficulties.

Use it for
Ad-hoc query: Long, complex queries (OLAP; for example Reporting or Analysis)
Stored procedure: Short, simple queries (OLTP; for example Insert, Update, Delete, Select)
Dynamic SQL: Short, simple queries (OLTP; for example Insert, Update, Delete, Select)

Performance
Ad-hoc query: Compiled at runtime, execution plan stored in cache; changed data is no issue because of the re-compile
Stored procedure: Compiled once at first run, and stored in the procedure cache; changed data might be a performance bottleneck, which can be solved with a recompile
Dynamic SQL: Compiled at runtime, and the execution plan is not stored (unless you use the more efficient sp_executesql); changed data is no issue because of the re-compile

Security
Ad-hoc query: Permissions (read/write) on all objects (database(s)/table(s))
Stored procedure: Execute permissions on the SP are enough
Dynamic SQL: Permissions (read/write) on all objects (database(s)/table(s))

Flexibility
Ad-hoc query: If changed, your application needs to be recompiled
Stored procedure: If changed, you only need to change the SP in the database
Dynamic SQL: If changed, your application needs to be recompiled

Number of statements
Ad-hoc query: Only 1 statement possible
Stored procedure: Multiple statements possible
Dynamic SQL: Multiple statements possible

Memory usage
Ad-hoc query: Uses more memory than an SP
Stored procedure: Uses less memory than an ad-hoc query
Dynamic SQL: Uses more memory than an SP

Network traffic
Ad-hoc query: Query executes server side; the query and resultset are sent across the network/internet
Stored procedure: Query executes server side; only the execute statement and the resultset are sent across the network/internet
Dynamic SQL: Query executes server side; the statement and resultset are sent across the network/internet

Separation
Ad-hoc query: Database logic and business logic are mostly combined in the query
Stored procedure: Separates database logic from business logic
Dynamic SQL: Separates database logic from business logic

Troubleshooting
Ad-hoc query: Relatively easy to troubleshoot
Stored procedure: Relatively easy to troubleshoot
Dynamic SQL: Difficult to troubleshoot

Maintenance
Ad-hoc query: Difficult because of several locations in applications and database
Stored procedure: Easy because of a single location
Dynamic SQL: Difficult because of several locations in applications and database

Access
Ad-hoc query: Difficult to access multiple objects in different databases, or in dynamic databases
Stored procedure: Difficult to access multiple objects in different databases, or in dynamic databases
Dynamic SQL: Allows any object (database, table, column, etc.) to be referenced

WHERE clause
Ad-hoc query: Fairly static WHERE clause
Stored procedure: Fairly static WHERE clause
Dynamic SQL: Dynamic WHERE clause (add/remove), based on parameters

Versioning
Ad-hoc query: Only possible via source control of your application
Stored procedure: Possible via source control of your database, and by commenting your SP
Dynamic SQL: Only possible via source control of your application

CRUD
Ad-hoc query: A CRUD (Create, Read, Update, Delete) diagram can be created by getting all your queries together, and looking for specific keywords (Update, Delete, Select, etc.)
Stored procedure: Difficult to catch in a CRUD diagram
Dynamic SQL: Difficult to catch in a CRUD diagram

Structure updates
Ad-hoc query: Can be changed simultaneously with structure changes
Stored procedure: Needs to be altered when the underlying structure is changed
Dynamic SQL: Can be changed simultaneously with structure changes

Searching
Ad-hoc query: No standard way to search through
Stored procedure: Possible to use sys.procedures to search through SP contents; the dependency window in SSMS shows SP content
Dynamic SQL: No standard way to search through

Testing
Ad-hoc query: Can be compiled/tested in code
Stored procedure: Impossible to automatically compile without 3rd-party tools
Dynamic SQL: Difficult to test in code

Mapping
Ad-hoc query: ORM (object-relational mapping) is possible
Stored procedure: ORM (object-relational mapping) is impossible
Dynamic SQL: ORM (object-relational mapping) is impossible

Compiling
Ad-hoc query: Compiles the whole statement
Stored procedure: Compiles the whole statement
Dynamic SQL: Only the static elements can be compiled

For the design of this comparison chart, I need to thank my buddy and colleague Pascal (Blog | @pdejonge). For the record: I’m not a designer, and my “design” was what you guys might call Fugly.

This comparison chart covers the main reasons for me to use, or not use, a specific option. These are my personal beliefs. If you have any suggestions to add, please don’t hesitate to contact me.
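One item from the chart that deserves a small example is sp_executesql. A minimal sketch of parameterized dynamic SQL (the table and column names are made up):

DECLARE @sql nvarchar(max) = N'SELECT * FROM dbo.Orders WHERE CustomerID = @CustomerID'

EXEC sys.sp_executesql
	@stmt = @sql,
	@params = N'@CustomerID int',
	@CustomerID = 42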

Join vs Sub Query

If you work with SQL Server, you will eventually encounter a JOIN statement. This JOIN can be stated as an ordinary JOIN, or as a sub select. In most cases I try to use JOINs (even if it’s just for readability), but in some cases it’s quicker to do it in a sub select (for example with production code, in which you don’t want to change the output but only filter it).

Curious about the performance differences between the two, I set up a test. I created 2 tables with country information:
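The original CreateSources.sql can be downloaded at the bottom of this post; a hypothetical sketch of the two tables, consistent with the queries below, could look like this:

CREATE TABLE dbo.CountryAbbreviation
	(DefaultAbbreviation varchar(10),
	 UNAbbreviation varchar(10))

CREATE TABLE dbo.CountryInformation
	(DefaultAbbreviation varchar(10),
	 CountryProper varchar(100))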


 
 
In this case I will JOIN both tables based on DefaultAbbreviation. The queries I used look like this:

SELECT CA.UNAbbreviation
FROM CountryAbbreviation CA
WHERE 1 = 1
AND CA.DefaultAbbreviation IN
    (SELECT DefaultAbbreviation
     FROM CountryInformation) 
SELECT CA.DefaultAbbreviation
FROM CountryAbbreviation CA
INNER JOIN CountryInformation CI
    ON CI.DefaultAbbreviation = CA.DefaultAbbreviation
WHERE 1 = 1

 
Both queries are executed, and the Execution Plans look like this:

Query 1:

And for query 2:


 
As you can see, the Execution Plans are practically the same. But now look at the Execution Plans when you execute both statements simultaneously:
 

 
So in this case, the JOIN proves to be slower than the sub select. But what if you want a specific value…? I used the same statements, except that in both versions I ask for the specific “CountryProper” value “Netherlands”:
 

 
So the Execution Plans show that if you fetch a specific value, the costs of both queries are the same.
 
The conclusion is the same as always: check the Execution Plan if you’re not sure about the performance of your statements.
 

Sources:
CreateSources.sql
SelectList.sql
SelectSpecificValue.sql
