T-SQL Tuesday #37 – Join me in a Month of Joins

T-SQL Tuesday is a recurring blog party, that is started by Adam Machanic (Blog | @AdamMachanic). Each month a blog will host the party, and everyone that want’s to can write a blog about a specific subject.

This month the subject is “Join me in a month of joins”. If you want to read the opening post, please click the image below to go to the party-starter: Sebastian Meine (Blog | @sqlity).

When I read this months invitation by Sebastian, I thought about a situation last week. A developer asked me to review a bunch of T-SQL queries and procedures that one of the other colleagues wrote, because they didn’t work. The first thing that I noticed was the readability of the scripts. I’ve seen some bad ones in my life, but these were just monstrous!

Thinking about the situation I’m guessing it’s just a lack of knowledge about databases. They don’t see what the connection is between data sets. They don’t know what specific joins do, and so they used what they see everyone uses: just JOIN. But they don’t realize that this implicitly means INNER JOIN for the engine.

One of the biggest issues in the script was the fact that a MERGE was used instead of an UPDATE FROM. I’ll try to explain this by using an example of a car factory. The code is exactly the same as the code I was asked to debug, except the objects are renamed.

MERGE INTO Factory.dbo.Stock
USING #TMP_NewDelivery D
	ON D.Brand = Factory.dbo.Stock.Brand
	AND D.PartID = Factory.dbo.Stock.PartID
		UPDATE SET DeliveredAmount = D.DeliveredAmount
INTO @Logging

One of the issues with this code that I noticed first was the fact that it only updates values. So why didn’t they use an UPDATE FROM? This isn’t too bad, except the JOIN clause isn’t declared once, but twice: in the USING, and in the WHEN MATCHED part. The issue is that the ON clause is joined on Brand (let’s say Seat), and that the PartID is added at a later stage. In the end, the query started updating all rows with the same PartID (let’s say Engine). So instead of updating the stock of Seat Engines, it updated the stock for all Engine parts.

And unfortunately I couldn’t do anything with the logging data that is generated by the script. Instead of storing it in a table, the logging information wasn’t used at all. It was stored in a memory table (why they used a memory table, I don’t know), and this wasn’t saved into another object. So why use precious CPU cycles to “store” information you don’t use?

Looking at this reminded me of something I tend to forget: the database is often some side-track for developers. They quickly write a query that isn’t that good and isn’t that fast, but it does the trick. At least, that’s what they think!

Please start thinking in collections and sets if you work with SQL Server, and don’t use a cursor for everything. If you don’t know the difference between a LEFT, RIGHT and INNER JOIN , please ask for help. Your DBA or SQL developer won’t make fun of you for asking. And if you don’t ask for help, please don’t be mad if we use your code as an example! 😉

Are nested Inserts possible?

Two weeks ago I got an interesting questions. Two of my colleagues thought of a funny way to tease me, by asking me this question:

If we have an insert into table X, can we nest another insert into the query to fill table Y with the identity generated by insert X?

After thinking about this a few minutes, I told them I thought this was possible. Eventually I found a few spare minutes, and came up with this solution. In all honesty, this isn’t completely my solution. A few forums I came across mentioned this option, and I took those ideas and created my own solution.

So first, let’s create the 2 tables we need for this example:

	 String VARCHAR(100))

	 String VARCHAR(100),
	 Action VARCHAR(100))

So I’ve created T1 (table X in the question), and T2 (table Y). Let’s assume that T1 contains your product data, and for every product you insert, you want to insert a log-record into T2.

This can’t be done with actual nested inserts. If you want to do this, the easiest ways is to create a Stored Procedure (SP) that uses SCOPE_IDENTITY to retrieve the generated ID, and insert that into your second table. But because an SP isn’t always the best ways to do this, I wanted to see if I could transform it into a single query.

Since SQL Server 2008 we have the opportunity to use the MERGE statement. What the MERGE does is synchronize two tables by inserting, updating and deleting rows in the destination table, based on a set of rules and/or joins.

So, how would this statement look like? After some trial and error, I created this statement:

USING (SELECT -1 AS ID, 'This is a string' AS String) Q
	ON Q.ID = T1.ID
	THEN INSERT (String) VALUES ('This is a string...')
	inserted.ID, inserted.String, CONCAT('<<<<< Inserted in T1 @ ', CONVERT(DATE, GETDATE()) ,' >>>>>')

As you can see the destination table is T1. This is the first table the record is inserted into, and the table that generates the identity. Because we only want to insert the record, and not update or delete anything, I only created a “WHEN NOT MATCHED BY TARGET” clause.

Because the ID columns don’t match, the record gets inserted into T1. After that, I use the OUTPUT clause of the merge statement to insert the same record (but with the generated identity) into T2. As a reference, I also insert the action-description that contains a date.

So as you can see, you can use nested inserts in SQL Server, only via another statement. But remember, this is just written to prove it’s possible in a single statement, and not for production usage. So if you decide to use this in production or on your system, consider your other options first!

Code samples:
Are nested Inserts possible.sql