The curious math problem in SQL Server
May 21, 2015 Leave a comment
Yesterday, I came across a curious problem in SQL Server. I tried to calculate something, and it threw a strange exception I’ve never seen before. After trying it several times (just to make sure I wasn’t going crazy) and after asking some colleagues to check it out (thanks again guys, for helping me!), I decided it was weird enough to post it on twitter with the hashtag #SQLHelp, hoping I would get some help:
— Jeffrey Verheul (@DevJef) May 19, 2015
The help I needed came from Pieter Vanhove (Blog | @Pieter_Vanhove) and Kenneth Fisher (Blog | @sqlstudent144), who were kind enough to try it out on their SQL Servers. Both couldn’t reproduce the issue. After trying some different things, I came to the conclusion that I couldn’t reproduce it myself on another instance! So it must be an instance- or database specific issue…
Taking the bull by the horns
After leaving the problem alone for an evening, I came into the office the next morning, and decided to take the bull by the horns. The first thing I did was dive into the problem again. And yes, it was still there. After trying to find some workarounds (explicitly casting all objects/properties to the same data type and length/precision, try implicit conversions, etc), I still faced this weird problem. Then I tried to reproduce it on another instance. The query executed without a problem… Okay, now it’s getting weird!
The first thing I did was try and create a script that could be used to reproduce it. The query I ended up with is this script:
DECLARE @temp TABLE (size INT) INSERT INTO @temp (size) VALUES (707072), (1024000) SELECT (100 / SUM((((size) * 8.00) / 1024))) FROM @temp AS T
Running that query showed me the exception every time I executed it on the first instance. Executing the same piece of code on another database on the same instance (yes, this small thing took me a while to figure out!) wasn’t a problem. So then I knew it’s was database problem.
Checking the settings
To compare database settings, I used the catalog view (or system view) sys.databases:
SELECT * FROM sys.databases WHERE name IN ('Sandbox', 'Sandbox2')
There were only 2 settings that could pose a problem: ARITHABORT and NUMERIC_ROUNDABORT. After a short search, I found out these could also be set in your query (for just that session). Let’s try to set these options ON and OFF in the query:
SET ARITHABORT ON; SET NUMERIC_ROUNDABORT OFF; DECLARE @temp TABLE (size INT) INSERT INTO @temp (size) VALUES (707072), (1024000) SELECT (100 / SUM((((size) * 8.00) / 1024))) FROM @temp AS T
The query still works, and the result is returned in the correct way. Let’s try and set the other option to ON:
SET ARITHABORT ON; SET NUMERIC_ROUNDABORT ON; DECLARE @temp TABLE (size INT) INSERT INTO @temp (size) VALUES (707072), (1024000) SELECT (100 / SUM((((size) * 8.00) / 1024))) FROM @temp AS T
BINGO!!! The query doesn’t work anymore, and you see the strange exception again. So why is this an issue?
According to MSDN, NUMERIC_ROUNDABORT “Specifies the level of error reporting generated when rounding in an expression causes a loss of precision.”. So this means that every time you calculate something (divide or multiply), and the precision might change, this throws an exception.
When trying to find out why you would use it, I found a possible explanation in the book “Pro SQL Server 2012 Relational Database Design and Implementation” by Louis Davidson (Blog | @drsql) and Jessica Moss (Blog | @jessicammoss):
There is a setting -SET NUMERIC_ROUNDABORT ON- that causes an error to be generated when a loss of precision would occur from an implicit data conversion. (…) SET NUMERIC_ROUNDABORT can be quite dangerous to use and might throw off applications using SQL Server if set to ON. However, if you need guaranteed prevention of implicit round-off due to system constraints, it’s there.
And in my case, this setting was set to ON on the database level, and only for a single database on the instance.
Using the MSDN “workaround”
When you read the Remarks section of the MSDN article, you get an exception when you set ARITHABORT and NUMERIC_ROUNDABORT both to ON. But if you set ARITHABORT to ON, and NUMERIC_ROUNDABORT to OFF, you should get a warning, and the resultset should be NULL. After trying that out, I noticed that doesn’t work. There’s no case possible where you get a warning and no exception. Even the code posted on that same page (in the Examples section) doesn’t work:
-- SET NOCOUNT to ON, -- SET NUMERIC_ROUNDABORT to ON, and SET ARITHABORT to ON. SET NOCOUNT ON; PRINT 'SET NUMERIC_ROUNDABORT ON'; PRINT 'SET ARITHABORT ON'; SET NUMERIC_ROUNDABORT ON; SET ARITHABORT ON; GO DECLARE @result DECIMAL(5, 2), @value_1 DECIMAL(5, 4), @value_2 DECIMAL(5, 4); SET @value_1 = 1.1234; SET @value_2 = 1.1234 ; SELECT @result = @value_1 + @value_2; SELECT @result; GO -- SET NUMERIC_ROUNDABORT to ON and SET ARITHABORT to OFF. PRINT 'SET NUMERIC_ROUNDABORT ON'; PRINT 'SET ARITHABORT OFF'; SET NUMERIC_ROUNDABORT ON; SET ARITHABORT OFF; GO DECLARE @result DECIMAL(5, 2), @value_1 DECIMAL(5, 4), @value_2 DECIMAL(5, 4); SET @value_1 = 1.1234; SET @value_2 = 1.1234 ; SELECT @result = @value_1 + @value_2; SELECT @result; GO
So to me, this seems like a really niche feature. And when I see the demo code on MSDN isn’t working as expected, and there are no comments posted about this, my careful conclusion is that this isn’t a heavily used feature.
It’s always the next morning…
To be honest, I’ve never seen this setting being used in practice, and I still don’t see the benefits of using it after I’ve encountered this issue. But in this case, it was a 3rd party vendor application, and I wasn’t aware of the rather “special” kind of configuration. But I’m glad I stepped away from the problem for a while, and looked at it the next morning again. It’s always the next morning that you solve issues like this, isn’t it?🙂