SSISDB: Why bigger isn’t always better

Two weeks ago I encountered some strange issues with the SSISDB. Packages and processes started failing, and all we could find is this error message:

 
The image above is an extract of the standard SSMS “All Executions” report. You can find this by right-clicking your SSISDB, click on Reports, then Standard Reports, and choose the “All Executions” report.

While the packages started failing, we couldn’t find a cause. Deploying new packages wasn’t possible either. There were no events logged in the report, in the Windows event viewer, or in the SQL Server error log. So there was no starting point to further investigate this issue.

 
Finding the cause
Checking the size of the SSISDB showed some possible cause of the issue: it was 72GB big! This seemed odd, because we don’t store a lot of packages in the SSISDB. I knew SSISDB contained some logging information, but I didn’t knew how much. Checking the “Disk usage per table” report (another standard report in SSMS), the problem became a bit more obvious:

 

 
So basically there were over 67 million records in the database! This shouldn’t have happened, because the log retention should be set so 14 days. But we quickly found the problem:

 

 
The retention was set to 365 days. This could only mean the the SSISDB was re-created, and this setting wasn’t changed after the deploy. This resulted in a lot of extra logging data in the database. Because of the amount of record, the standard “SSISDB Maintenance job” ran for 40+ minutes, instead of a maximum of 8 minutes (which it normally needs to clean up logging). But setting the retention to 14 days and running the maintenance job would just result in an endless running job. So how could you prevent that?

 
Grabbing the bigger hammer
In most cases, grabbing a bigger hammer to solve the problem isn’t your best option. In this case, I didn’t see another way. After some searching, I found this blog post about truncating the log tables in SSISDB. Below you find the copy of the script. The credits for this script go to Ibrahim Naji (Blog | @thinknook).


CREATE PROCEDURE [internal].[cleanup_server_retention_window_truncateall]
AS  
     
SET NOCOUNT ON
     
DECLARE @enable_clean_operation bit
DECLARE @retention_window_length INT
     
DECLARE @caller_name nvarchar(256)
DECLARE @caller_sid  varbinary(85)
DECLARE @operation_id BIGINT
     
EXECUTE AS CALLER
    SET @caller_name =  SUSER_NAME()
    SET @caller_sid =   SUSER_SID()
REVERT
          
     
BEGIN TRY
    SELECT @enable_clean_operation = CONVERT(bit, property_value) 
        FROM [catalog].[catalog_properties]
        WHERE property_name = 'OPERATION_CLEANUP_ENABLED'
         
    IF @enable_clean_operation = 1
    BEGIN
        SELECT @retention_window_length = CONVERT(INT,property_value)  
            FROM [catalog].[catalog_properties]
            WHERE property_name = 'RETENTION_WINDOW'
                 
        IF @retention_window_length <= 0 
        BEGIN
            RAISERROR(27163    ,16,1,'RETENTION_WINDOW')
        END
             
        INSERT INTO [internal].[operations] (
            [operation_type],  
            [created_time], 
            [object_type],
            [object_id],
            [object_name],
            [STATUS], 
            [start_time],
            [caller_sid], 
            [caller_name]
            )
        VALUES (
            2,
            SYSDATETIMEOFFSET(),
            NULL,                     
            NULL,                     
            NULL,                     
            1,      
            SYSDATETIMEOFFSET(),
            @caller_sid,            
            @caller_name            
            ) 
        SET @operation_id = SCOPE_IDENTITY() 


        -- Remove all [internal].[executions] dependancies
        TRUNCATE TABLE [internal].[executable_statistics]
        TRUNCATE TABLE [internal].[execution_component_phases]
        TRUNCATE TABLE [internal].[execution_data_statistics]
        TRUNCATE TABLE [internal].[execution_data_taps]
        TRUNCATE TABLE [internal].[execution_parameter_values]
        TRUNCATE TABLE [internal].[execution_property_override_values]


        -- Remove all [internal].[event_message_context] dependancies
        TRUNCATE TABLE [internal].[event_message_context]

        -- Remove all non-dependant tables
        TRUNCATE TABLE [internal].[operation_os_sys_info]
        TRUNCATE TABLE [internal].[operation_permissions]
        TRUNCATE TABLE [internal].[validations]
        TRUNCATE TABLE [internal].[extended_operation_info]

        -- Deal with [internal].[event_messages] and [internal].[operation_messages]
        ALTER TABLE [internal].[event_message_context] DROP CONSTRAINT [FK_EventMessageContext_EventMessageId_EventMessages]
         
        TRUNCATE TABLE internal.event_messages
         
        ALTER TABLE [internal].[event_message_context]  WITH CHECK ADD  CONSTRAINT [FK_EventMessageContext_EventMessageId_EventMessages] FOREIGN KEY([event_message_id])
        REFERENCES [internal].[event_messages] ([event_message_id])
        ON DELETE CASCADE

        ALTER TABLE [internal].[event_messages] DROP CONSTRAINT [FK_EventMessages_OperationMessageId_OperationMessage]
         
        TRUNCATE TABLE [internal].[operation_messages]

        ALTER TABLE [internal].[event_messages]  WITH CHECK ADD  CONSTRAINT [FK_EventMessages_OperationMessageId_OperationMessage] FOREIGN KEY([event_message_id])
        REFERENCES [internal].[operation_messages] ([operation_message_id])
        ON DELETE CASCADE

        -- Deal with [internal].[executions]

        ALTER TABLE [internal].[executable_statistics] DROP CONSTRAINT [FK_ExecutableStatistics_ExecutionId_Executions]
        ALTER TABLE [internal].[execution_component_phases] DROP CONSTRAINT [FK_ExecCompPhases_ExecutionId_Executions]
        ALTER TABLE [internal].[execution_data_statistics] DROP CONSTRAINT [FK_ExecDataStat_ExecutionId_Executions]
        ALTER TABLE [internal].[execution_data_taps] DROP CONSTRAINT [FK_ExecDataTaps_ExecutionId_Executions]
        ALTER TABLE [internal].[execution_parameter_values] DROP CONSTRAINT [FK_ExecutionParameterValue_ExecutionId_Executions]
        ALTER TABLE [internal].[execution_property_override_values] DROP CONSTRAINT [FK_ExecutionPropertyOverrideValue_ExecutionId_Executions]

        TRUNCATE TABLE [internal].[executions]

        ALTER TABLE [internal].[execution_property_override_values]  WITH CHECK ADD  CONSTRAINT [FK_ExecutionPropertyOverrideValue_ExecutionId_Executions] FOREIGN KEY([execution_id])
        REFERENCES [internal].[executions] ([execution_id])
        ON DELETE CASCADE

        ALTER TABLE [internal].[execution_parameter_values]  WITH CHECK ADD  CONSTRAINT [FK_ExecutionParameterValue_ExecutionId_Executions] FOREIGN KEY([execution_id])
        REFERENCES [internal].[executions] ([execution_id])
        ON DELETE CASCADE

        ALTER TABLE [internal].[execution_data_taps]  WITH CHECK ADD  CONSTRAINT [FK_ExecDataTaps_ExecutionId_Executions] FOREIGN KEY([execution_id])
        REFERENCES [internal].[executions] ([execution_id])
        ON DELETE CASCADE

        ALTER TABLE [internal].[execution_data_statistics]  WITH CHECK ADD  CONSTRAINT [FK_ExecDataStat_ExecutionId_Executions] FOREIGN KEY([execution_id])
        REFERENCES [internal].[executions] ([execution_id])
        ON DELETE CASCADE
         
        ALTER TABLE [internal].[execution_component_phases]  WITH CHECK ADD  CONSTRAINT [FK_ExecCompPhases_ExecutionId_Executions] FOREIGN KEY([execution_id])
        REFERENCES [internal].[executions] ([execution_id])
        ON DELETE CASCADE
         
        ALTER TABLE [internal].[executable_statistics]  WITH CHECK ADD  CONSTRAINT [FK_ExecutableStatistics_ExecutionId_Executions] FOREIGN KEY([execution_id])
        REFERENCES [internal].[executions] ([execution_id])
        ON DELETE CASCADE
         

        -- Deal with [internal].[operations]
        DECLARE @deleted_ops TABLE(operation_id BIGINT, operation_type SMALLINT)

        DELETE --TOP (@delete_batch_size)
        FROM [internal].[operations] 
        OUTPUT DELETED.operation_id, DELETED.operation_type INTO @deleted_ops
        WHERE operation_id != @operation_id

             
             
        DECLARE @execution_id BIGINT
        DECLARE @sqlString              nvarchar(1024)
        DECLARE @key_name               [internal].[adt_name]
        DECLARE @certificate_name       [internal].[adt_name]
             
             
        DECLARE execution_cursor CURSOR LOCAL FOR 
            SELECT operation_id FROM @deleted_ops 
            WHERE operation_type = 200
             
        OPEN execution_cursor
        FETCH NEXT FROM execution_cursor INTO @execution_id
             
        WHILE @@FETCH_STATUS = 0
        BEGIN
            SET @key_name = 'MS_Enckey_Exec_'+CONVERT(VARCHAR,@execution_id)
            SET @certificate_name = 'MS_Cert_Exec_'+CONVERT(VARCHAR,@execution_id)
            SET @sqlString = 'IF EXISTS (SELECT name FROM sys.symmetric_keys WHERE name = ''' + @key_name +''') '
                +'DROP SYMMETRIC KEY '+ @key_name
                EXECUTE sp_executesql @sqlString
            SET @sqlString = 'IF EXISTS (select name from sys.certificates WHERE name = ''' + @certificate_name +''') '
                +'DROP CERTIFICATE '+ @certificate_name
                EXECUTE sp_executesql @sqlString
            FETCH NEXT FROM execution_cursor INTO @execution_id
        END
        CLOSE execution_cursor
        DEALLOCATE execution_cursor

        END
END TRY
BEGIN CATCH
         
         
    IF (CURSOR_STATUS('local', 'execution_cursor') = 1 
        OR CURSOR_STATUS('local', 'execution_cursor') = 0)
    BEGIN
        CLOSE execution_cursor
        DEALLOCATE execution_cursor            
    END
         
    UPDATE [internal].[operations]
        SET [STATUS] = 4,
        [end_time] = SYSDATETIMEOFFSET()
        WHERE [operation_id] = @operation_id;       
    THROW
END CATCH
     
RETURN 0

 
After running this on the test environment, I found out it worked as I expected. I started the script on production (where the problem was found in the first place), and when it finished I changed the retention from 365 to 14 days:

 

 
After doing that, I ran the “SSISDB Maintenance job” (which ran fine), and an SSIS package as a test (worked fine as well). And looking at the record counts, it worked fine:

 

 
I’m not saying you should run this on your system, but I’m just saying this works fine on my system, and in my specific case. If you find any code on the internet, test it on a non-critical system first, before you run it in production!!!

 
Conclusion
Having error logging is a good thing. It can help you determine problems, and you can see if and how process grow over time. But too much logging for SSIS causes performance issues, prevents you from deploying new packages, and can even cause unexpected termination of SSIS packages. One way to prevent that is by changing the history retention of your SSISDB.

4 Responses to SSISDB: Why bigger isn’t always better

  1. Pingback: Avoid a big SSISDB by logging less…? | SQL from the Trenches

  2. EC says:

    “The retention was set to 365 days. This could only mean the the SSISDB was re-created.” This is really a deployment issue then? The DBA forgot to change the retention period after deploying SSISDB? I just want to make sure you are not saying the retention days had changed by itself. Thanks!

    • DevJef says:

      Yes, you’re right. There’s no way the SSISDB retention changed by itself. The SSISDB was either recreated by a member of the BI team, or another DBA. Deploying an SSIS solution to SQL Server does NOT cause the recreation of the SSISDB.

      Another cause of the problem is that you can’t change the SSISDB retention on creation. You need to change it in the db properties AFTER you’ve created it. So there’s a really big chance you’re overlooking it when creating or recreating SSISDB.

  3. Pingback: BPOTW 2015-03-21 | SQL Notes From The Underground

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

%d bloggers like this: