Time traveling with SQL Server 2016: Temporal tables

Since the release of the first publicly available SQL Server 2016 CTP2, we have a whole list of new and cool features we can start using. One of those features is called temporal tables (or system-versioned tables). So what are temporal tables, and how can you use them?

 
What is a temporal table?
To understand the functionality of temporal tables, we first need to agree on the definition of a table in SQL Server. If you run a query on a table that contains data, that query can only return the current version of a record. This means you can only see the current “truth”, and there’s no way to travel back in time to see older versions of this record.

Temporal tables give you the opportunity to time travel in SQL Server. For every data change (Update, Delete and Merge) in your table, a historical copy is stored in a history table. The table on which you enabled this data tracking is changed to a “System-Versioned” table.

 
How does it work?
For every update or delete on a table where system_versioning is enabled, a historical copy of the record is stored:

The original image can be found in this blog post, written by Manoj Pandey (Blog | @manub22).

 
But how does that work? Let’s start by creating a table we will use to test system-versioned tables:

CREATE TABLE dbo.BankAccount
	(AccountNumber INT CONSTRAINT PK_BankAccount_TransactionID PRIMARY KEY,
	 AccountBalance FLOAT);

 
In order to make a table a system-versioned table, we need to add 2 datetime2 columns to our newly created table. SQL Server will use these columns to store a from- and to-date that will indicate the valid period of time of the record. By telling SQL Server to use these 2 columns with the keywords “PERIOD FOR SYSTEM_TIME”, SQL Server knows that it needs to use these columns when querying data from the history table:

ALTER TABLE dbo.BankAccount
ADD ValidFrom datetime2 GENERATED ALWAYS AS ROW START HIDDEN NOT NULL,
	ValidTo datetime2 GENERATED ALWAYS AS ROW END HIDDEN NOT NULL,
	PERIOD FOR SYSTEM_TIME (ValidFrom, ValidTo);

 

In the above script, the “HIDDEN” keyword is used. This can only be used from CTP 2.1 and higher. What HIDDEN does, is hide the columns in your original table to make sure application- and database code isn’t going to fail because you added these 2 new columns. You can still query the columns, but if you run a “SELECT *” query these columns won’t be returned.

 
When that’s done we can enable system-versioning on the table:

ALTER TABLE dbo.BankAccount
SET (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.MyBankAccountHistory));

 
Data changes & history
Now we can take a look at how data changes are captured, and history is recorded. Before you execute any of the statements below, enable the execution plan in your SSMS. If we insert a new record, no history is created:

INSERT INTO dbo.BankAccount
	(AccountNumber, AccountBalance)
VALUES
	(2147483647, 10000)

 
We just see the expected Clustered Index Insert, nothing special:

 
Now, let’s update the bank account balance (the WAITFOR is added just to make sure we have a clear difference in change time for the next few steps):

UPDATE dbo.BankAccount
SET AccountBalance = AccountBalance - 2500

WAITFOR DELAY '00:01:00'

UPDATE dbo.BankAccount
SET AccountBalance = AccountBalance + 1000

 
If you look at your execution plan, you’ll see an extra DML command in the execution plan:

 
Besides the expected Clustered Index Update, there is an extra Clustered Index Insert. This is the insert statement that is generated by SQL Server, to insert a record in the history table. There’s no user interaction needed to store historical data. SQL Server will take care of that process for you.

 
Time travelling
Now that we store historical data, it’s time to start the actual time travelling. To paint a complete picture, this is a timeline of the data inserts and updates:

 
Now that we have a clear timeline, let’s start our time travel. Besides creating and maintaining the history table, you also get to use a bit of new syntax in your query. For system-versioned tables, you can use the “FOR SYSTEM_TIME” clause in your query. In order to show you the timeline in SQL Server, let’s use this new syntax to query our history per minute:

SELECT * FROM dbo.BankAccount FOR SYSTEM_TIME AS OF '2015-07-09 09:42:00' --No data available
SELECT * FROM dbo.BankAccount FOR SYSTEM_TIME AS OF '2015-07-09 09:43:00' --First insert
SELECT * FROM dbo.BankAccount FOR SYSTEM_TIME AS OF '2015-07-09 09:44:00' --First update
SELECT * FROM dbo.BankAccount FOR SYSTEM_TIME AS OF '2015-07-09 09:45:00' --Second update
SELECT * FROM dbo.BankAccount FOR SYSTEM_TIME AS OF '2015-07-09 09:46:00' --No other changes

 
Remember, this is based on my data. So if you want to query your timeline, change the datetime string based on your data!

So basically, you can start querying your data by using a specific point in time:

DECLARE @PointInHistory DATETIME2
SET @PointInHistory = '2015-07-09 09:45:00'

SELECT * FROM dbo.BankAccount FOR SYSTEM_TIME AS OF @PointInHistory

 
Besides the actual point in time, you can use 3 other ways to query your data:

SELECT AccountNumber, AccountBalance, ValidFrom, ValidTo
FROM dbo.BankAccount
FOR SYSTEM_TIME FROM '2015-07-09 09:43:00' TO '2015-07-09 09:45:00'


SELECT AccountNumber, AccountBalance, ValidFrom, ValidTo
FROM dbo.BankAccount
FOR SYSTEM_TIME BETWEEN '2015-07-09 09:43:00' AND '2015-07-09 09:45:00'


SELECT AccountNumber, AccountBalance, ValidFrom, ValidTo
FROM dbo.BankAccount
FOR SYSTEM_TIME CONTAINED IN ('2015-07-09 09:43:00', '2015-07-09 09:45:00')

 
Table changes
In practice you’ll not only see data changes happening on tables, but you’ll also have schema changes on your table once in a while. But how does a temporal table cope with schema changes? Let’s try to add a column to our table:

ALTER TABLE dbo.BankAccount ADD NewColumn VARCHAR(10)

This results in an error:

 
In order to add a column, we need to disable the system-versioning first, before we add the new column:

ALTER TABLE dbo.BankAccount SET (SYSTEM_VERSIONING = OFF)
GO

ALTER TABLE dbo.BankAccount ADD NewColumn VARCHAR(10)
GO

 
Now let’s enable system-versioning again:

ALTER TABLE dbo.BankAccount
SET (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.MyBankAccountHistory))
GO

 
This results in another error:

 
So we first need to add the same column to the history table, before enabling system-versioning again:

ALTER TABLE dbo.MyBankAccountHistory
ADD NewColumn VARCHAR(10)
GO

ALTER TABLE dbo.BankAccount
SET (SYSTEM_VERSIONING = ON (HISTORY_TABLE = dbo.MyBankAccountHistory))
GO

 

Disabling system-versioning means that for a short period of time, you could loose historical data. A possible way to get around this, is by setting your database in single user mode before you disable it. This prevents other user settings from changing data in that table. If you run this in your production environment, this might not be your best option though! By disabling system-versioning, historical data will not be removed.

 
Indexing historical data
Performance isn’t only important when you query your “normal” tables, but also if you want to query your newly created temporal tables. If you want to write a business process that uses your historical data, you need to make sure this doesn’t slow down your application. But how do you achieve that?

Just like on a normal table, you can create indexes no the history table. This table is basically a normal table, so there’s no reason why you can’t create indexes on it:

CREATE NONCLUSTERED INDEX IDX_MyHistoricalData
ON dbo.MyBankAccountHistory (AccountNumber, AccountBalance)

 
Historical data cleanup
It’s really great to store historical versions of your records in a separate table, but this also provides you with new challenges. You need more storage for your data, and you might want to remove data after a certain period of time. At this point in time, system-versioned tables don’t have a cleanup process. This might change in the future, but right now you need to cleanup these tables yourself. So if you want (or need) to cleanup old data, you need to write a small process yourself. This can be as easy as writing a query to run through all system-versioned tabled and delete old data, and schedule that in a SQL Server Agent job. But again, this is something to keep in mind when you start working with temporal tables.

I had a really good conversation about this with Borko Novakovic (@borko_novakovic), who is the Program Manager for this part of SQL Server. The code below is an example of such a process, that Borko provided:

BEGIN TRAN 

    /*Drop versioning to allow deletion of historical data*/
       ALTER TABLE dbo.BankAccount
              SET (SYSTEM_VERSIONING = OFF)
 
       /*DELETE MyBankAccountHistory table*/
       DELETE
       FROM dbo.MyBankAccountHistory
       WHERE ValidTo < '2015.01.01'
       
       /*Re-establish versioning*/
       ALTER TABLE dbo.BankAccount
              SET (SYSTEM_VERSIONING = ON
                  (HISTORY_TABLE  = dbo.MyBankAccountHistory,
                   DATA_CONSISTENCY_CHECK = OFF));

COMMIT

 
Deleting data without disabling system-versioning doesn’t work, because you’ll get an error like this:

 
I do want to thank Borko again for his time. He helped me out a lot, and I’m thankful he took the time to do that. Thanks Borko!!!

 
Remove system-versioning
Removing system-versioned tables from your systems takes one extra step, because just executing a “DROP TABLE” statement doesn’t work. SQL Server will prevent you from dropping temporal tables, without you first disabling it. Let’s start by disabling system-versioning:

ALTER TABLE dbo.BankAccount SET (SYSTEM_VERSIONING = OFF)
GO

 
This disables the system-versioning on your original table, and transforms the history table to a normal table. If you want to get rid of the tables, you can just drop them both like you would normally do:

DROP TABLE dbo.BankAccount
GO
DROP TABLE dbo.MyBankAccountHistory
GO

 
Resources
Besides this blog, there are 2 fantastic posts you need to read about this topic, written by Itzik Ben-Gan (Blog | @ItzikBenGan), that you can read at SQLMag.com (Part 1, Part 2).

One thing you definitely MUST SEE is a video with Borko and Scott Klein (Blog | @SQLScott) on Channel 9. In this video, Borko explains the idea behind temporal tables, and shares some insights on this new feature.

Other information on temporal tables can be found on MSDN:

Temporal Tables
sys.tables
SQL Server 2016 (MSDN root)

2 Responses to Time traveling with SQL Server 2016: Temporal tables

  1. Gustavo says:

    Hello Jef!

    Great post!

    When changing the schema of your table, instead of putting the database to single user only, you could make the changes within a transaction. This way you can keep your data consistent.

    Keep up the good work!

  2. Pingback: Tabelas Temporais – Parte 2: Alteração de schema e Indexação | Comunidade SQL Server

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: