Calculating Running Totals

When you work with data and database systems, sooner or later you need to calculate running totals, for example on product sales or financial data. There are several ways to calculate them. In this post I’ll try to show the pros and cons of the different solutions.

Let’s start by creating the resources for the examples. The most basic example I could think of is one with only the necessary information: Date and Turnover.

CREATE TABLE Dough
	(Date DATE,
	 Turnover FLOAT)

And then insert some data:

INSERT INTO Dough
	(Date, Turnover)
VALUES
	('2011-01-01', 1000),
	('2011-02-01', 1250),
	('2011-03-01', 1500),
	('2011-04-01', 1750),
	('2011-05-01', 2000),
	('2011-06-01', 2250),
	('2011-07-01', 2250),
	('2011-08-01', 2000),
	('2011-09-01', 1750),
	('2011-10-01', 1500),
	('2011-11-01', 1250),
	('2011-12-01', 1000)

INSERT INTO Dough
	(Date, Turnover)
VALUES
	('2012-01-01', 100),
	('2012-02-01', 125),
	('2012-03-01', 150),
	('2012-04-01', 175),
	('2012-05-01', 200),
	('2012-06-01', 225),
	('2012-07-01', 225),
	('2012-08-01', 200),
	('2012-09-01', 175),
	('2012-10-01', 150),
	('2012-11-01', 125),
	('2012-12-01', 100)

With this resource, we can start on the examples.

Different solutions


When looking at this question, you’ll notice that there is more than one solution that returns the correct result. The following queries all return the same result, but each one is written with a specific version of SQL Server in mind.

SQL 2000
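If you’re using SQL Server 2000 (and I certainly hope you don’t have to anymore ;)), you can use the query with the INNER JOIN. The table is joined to itself: every row A is matched with all rows B from the same year up to and including the date of A, and the turnover of those rows is summed. This can be used on all SQL Server versions: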

SELECT
	A.Date,
	A.Turnover,
	SUM(B.Turnover)		AS RunningTotal
FROM Dough A
INNER JOIN Dough B
	ON YEAR(B.Date) = YEAR(A.Date)
	AND B.Date <= A.Date
GROUP BY A.Date, A.Turnover
ORDER BY A.Date ASC
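
A quick way to sanity-check any of the running-total queries in this post is to compare the December row of each year with the plain yearly total. A minimal sketch against the Dough table from above (the aliases YearNr and YearTotal are just names I picked for this example):

```sql
-- Yearly totals: the last RunningTotal of each year should match these values.
SELECT
	YEAR(Date)		AS YearNr,
	SUM(Turnover)	AS YearTotal
FROM Dough
GROUP BY YEAR(Date)
ORDER BY YearNr ASC
```

With the sample data this returns 19500 for 2011 and 1950 for 2012.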

SQL 2005
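The same query can also be written as a CROSS JOIN with the join condition moved to the WHERE clause. CROSS JOIN itself is not new in SQL Server 2005 (it already works on SQL Server 2000); what 2005 did add is the APPLY operator. The CROSS JOIN version looks like this: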

SELECT
	A.Date,
	A.Turnover,
	SUM(B.Turnover)		AS RunningTotal
FROM Dough A
CROSS JOIN Dough B
WHERE YEAR(B.Date) = YEAR(A.Date)
AND B.Date <= A.Date
GROUP BY A.Date, A.Turnover
ORDER BY A.Date ASC

The INNER JOIN and CROSS JOIN examples generate the same execution plan, because the optimizer treats a CROSS JOIN that is filtered in the WHERE clause as an inner join.
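
SQL Server 2005 did bring the APPLY operator, which gives yet another way to write the same query. A minimal sketch, assuming the same Dough table (the alias RT is just a name I picked; I haven't compared its execution plan with the two queries above):

```sql
SELECT
	A.Date,
	A.Turnover,
	RT.RunningTotal
FROM Dough A
CROSS APPLY
	-- Evaluated once per row of A: sum everything in the same year up to A.Date
	(SELECT SUM(B.Turnover)		AS RunningTotal
	 FROM Dough B
	 WHERE YEAR(B.Date) = YEAR(A.Date)
	 AND B.Date <= A.Date) RT
ORDER BY A.Date ASC
```

Because the subquery already returns exactly one row per row of A, the GROUP BY is no longer needed.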

SQL 2012
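With the release of SQL Server 2012 they handed us (SQL developers) a whole new “bag of tricks”. One of these “tricks” is the extended window function support: the OVER clause has been around since SQL Server 2005, but 2012 added ORDER BY and ROWS/RANGE framing for aggregate functions, which is exactly what a running total needs.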

The first time I saw the window function was at a Techdays NL 2012 session. This session was hosted by Bob Beauchemin (Blog | @bobbeauch). The session (T-SQL improvements in SQL Server 2012) is worth watching, even if you’re using SQL Server 2012 already!

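With a window function you compute a value for every row over a set of related rows (the “window”), without collapsing the rows like a GROUP BY does. You specify that window yourself: PARTITION BY splits the rows into groups, ORDER BY sorts them within each group, and the ROWS clause says which rows around the current one are included.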

SELECT
	Date,
	Turnover,
	SUM(Turnover) OVER (PARTITION BY YEAR(Date)
						ORDER BY Date ASC
						ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)		AS RunningTotal
FROM Dough
ORDER BY Date ASC
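
To see what each part of the OVER clause does, it can help to put a few variations next to each other. The sketch below assumes the same Dough table; the column aliases are only illustrative:

```sql
SELECT
	Date,
	Turnover,
	-- Running total that restarts every year (same window as the query above)
	SUM(Turnover) OVER (PARTITION BY YEAR(Date)
						ORDER BY Date ASC
						ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)		AS RunningTotalPerYear,
	-- No PARTITION BY: the running total continues across the years
	SUM(Turnover) OVER (ORDER BY Date ASC
						ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)		AS RunningTotalOverall,
	-- No ORDER BY and no frame: every row gets the total of its own year
	SUM(Turnover) OVER (PARTITION BY YEAR(Date))							AS YearTotal
FROM Dough
ORDER BY Date ASC
```

For the row of 2012-01-01, for example, RunningTotalPerYear is 100, RunningTotalOverall is 19600 and YearTotal is 1950.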

Performance


Seeing all these different solutions for the same question, I wonder (and you probably will too) about the performance of these queries. One very quick conclusion: they all return the same records 😉.

When you switch on SET STATISTICS IO ON, you can see the number of page reads generated by your statement. If you run this for the queries above, you will get the following results:

INNER JOIN:

Table ‘Dough’. Scan count 2, logical reads 25, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

CROSS JOIN:

Table ‘Dough’. Scan count 2, logical reads 25, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

OVER:

Table ‘Worktable’. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table ‘Dough’. Scan count 1, logical reads 1, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

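In the OVER query, you see a table called “Worktable”. This is a temporary work table that SQL Server creates internally because you use the OVER clause. With zero reads against it and a single scan of Dough (1 logical read, against 25 for the join queries), the window function is clearly the cheapest of the three in terms of I/O.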
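
To reproduce these numbers yourself, something like the following should work in a query window; in SQL Server Management Studio the output shows up on the Messages tab. The exact read counts may differ on your system:

```sql
-- Report the page reads per table for every statement that follows
SET STATISTICS IO ON;

SELECT
	Date,
	Turnover,
	SUM(Turnover) OVER (PARTITION BY YEAR(Date)
						ORDER BY Date ASC
						ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)		AS RunningTotal
FROM Dough
ORDER BY Date ASC;

-- Switch the reporting off again
SET STATISTICS IO OFF;
```

Swap in the INNER JOIN or CROSS JOIN query between the two SET statements to compare the other solutions.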

Conclusion


As shown above, there are several different ways to get to the same result. In this example I didn’t show you the cursor solution. This is because it’s bad practice, a bad performer, and a little bit to preserve my good name 😉. If you do want to see this, please leave me a comment, and I’ll add it to this post.
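
For reference, a cursor-based version could look roughly like the sketch below. It's only meant to show the idea, assuming the same Dough table; the variable, cursor and table variable names are made up for this example:

```sql
DECLARE @Date DATE, @Turnover FLOAT, @RunningTotal FLOAT, @Year INT;
DECLARE @Result TABLE (Date DATE, Turnover FLOAT, RunningTotal FLOAT);

DECLARE DoughCursor CURSOR LOCAL FAST_FORWARD FOR
	SELECT Date, Turnover
	FROM Dough
	ORDER BY Date ASC;

OPEN DoughCursor;
FETCH NEXT FROM DoughCursor INTO @Date, @Turnover;

WHILE @@FETCH_STATUS = 0
BEGIN
	-- Restart the total on the first row and whenever a new year begins
	IF @Year IS NULL OR @Year <> YEAR(@Date)
	BEGIN
		SET @Year = YEAR(@Date);
		SET @RunningTotal = 0;
	END

	SET @RunningTotal = @RunningTotal + @Turnover;

	INSERT INTO @Result (Date, Turnover, RunningTotal)
	VALUES (@Date, @Turnover, @RunningTotal);

	FETCH NEXT FROM DoughCursor INTO @Date, @Turnover;
END

CLOSE DoughCursor;
DEALLOCATE DoughCursor;

SELECT Date, Turnover, RunningTotal
FROM @Result
ORDER BY Date ASC;
```

It processes the table row by row, which is exactly why it doesn't scale as well as the set-based queries above.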

But with every solution you’ll see as many discussions about reasons to use it as discussions on why NOT to use it. And in this case, you might be bound to a specific SQL Server version, so you can’t use a specific approach.

But if you ask me for my opinion, I’ll go for the last option. Not only because I’ve got the privilege to work with SQL Server 2012 in my daily work, but also because it’s the best performer and you’ll end up with the most readable code.

I’m guessing you have a totally different opinion, so please leave a comment with your ideas and/or approaches to this challenge! Questions are greatly appreciated as well!
