SQL Server Transparent Data Encryption (TDE)


First of all, it’s important to understand the scope of TDE, as it’s not a complete end-to-end encryption solution. TDE encrypts the data files and transaction log files (.mdf, .ndf and .ldf files) and the backup files (.bak files). This means that so-called “data at rest” is encrypted; however, traffic between the database and the application is not encrypted (at least not by TDE, though you can use SSL/TLS for that), and data held within the application is also not encrypted. TDE is implemented at the database level and is an all-or-nothing solution: all data within the database will be encrypted, so you can’t just encrypt the sensitive columns.

Another point to watch is that if even one database on a server has TDE enabled, TempDB will be encrypted as well, so the performance of other non-encrypted databases on the same server may be affected. Although there’s inevitably a performance impact when using TDE on a database, Microsoft claims this is only 2 – 4% compared to a non-encrypted database.

Performance will be analysed using the following three metrics (a quick sketch for capturing them follows the list):

  • CPU time
  • Physical_io
  • Elapsed time
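
One simple way to capture these three metrics is with the built-in session statistics in SSMS. This is only a sketch of the measurement approach, not the exact harness used in the comparison; dbo.TestTable is a hypothetical table name.

-- Report CPU time and elapsed time for each statement in this session
SET STATISTICS TIME ON;
-- Report logical and physical reads for each statement in this session
SET STATISTICS IO ON;
GO

-- Run the statement being compared (hypothetical example)
SELECT COUNT(*) FROM dbo.TestTable;
GO

SET STATISTICS TIME OFF;
SET STATISTICS IO OFF;
GO

The CPU time, elapsed time and read counts are written to the Messages tab, so the same statement can be timed before and after TDE is enabled.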

I thought it would be useful to summarise some of the pros and cons of TDE:

Advantages of TDE

  • Fairly simple to implement.
  • No changes to the application tier required.
  • Invisible to the user.
  • Works with high availability features such as mirroring, AlwaysOn availability groups and log shipping.
  • Available in Standard Edition from SQL Server 2019 onwards.

Disadvantages of TDE

  • Only encrypts data at rest, so data in motion or held within an application is not encrypted.
  • All data in the database is encrypted – not just the sensitive data.
  • Prior to SQL Server 2019, requires the more expensive Enterprise Edition (or the Developer or Datacenter editions) of SQL Server.
  • The amount of compression achieved with compressed backups is significantly reduced.
  • There is a small performance impact.
  • FILESTREAM data is not encrypted.
  • Some DBA tasks become more complex, for instance restoring a backup onto another server.
  • As TempDB is encrypted, there is potentially an impact on non-encrypted databases on the same server.
  • The master database, which contains various metadata, user data and server-level information, is not encrypted.

Can Transparent Data Encryption Impact Performance?

After implementing TDE on any user database, be aware that tempdb is permanently encrypted.  Even if you remove TDE from the user database, you’ll still have the encryption overhead on tempdb.  This is a great reason why all testing should be done in development environments first rather than trying it in production.
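
To confirm this, you can check whether tempdb currently holds an encryption key. This is a quick check against the sys.dm_database_encryption_keys DMV:

-- encryption_state = 3 means the database is encrypted
SELECT DB_NAME(database_id) AS database_name,
       encryption_state,
       key_algorithm,
       key_length
FROM sys.dm_database_encryption_keys
WHERE database_id = DB_ID('tempdb');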

I ran a simple comparison of a few DML statements and a database backup to see whether TDE has any noticeable effect on database performance. In every execution tested, Transparent Data Encryption added overhead on CPU time and physical I/O. A line-by-line comparison for each execution is given in the table below; have a look and you will see that this encryption puts some extra burden on your database system.

How to enable TDE:

Step 1: Back up the database and create the Database Master Key

======

-- Take a full backup of the database before enabling TDE
BACKUP DATABASE test TO DISK = 'E:\Backup_Encryption_DoNotDelete\test_full.bak' WITH STATS = 10;
GO

-- Create the database master key in master
USE master;
GO
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'QI@#51!&rxu96';
GO

Step 2: Create a Certificate to support TDE

======

USE master;
GO
-- The subject is descriptive text only; replace it with your server name
CREATE CERTIFICATE UPENN_Cert_TDE
    WITH SUBJECT = 'Cert_TDE_CHASSHRDBPRW01 (replace with server name)',
         EXPIRY_DATE = '2099-12-31';
GO

Step 3: Create Database Encryption Key

======

USE test;
GO
-- Create the database encryption key, protected by the server certificate
CREATE DATABASE ENCRYPTION KEY
    WITH ALGORITHM = AES_128
    ENCRYPTION BY SERVER CERTIFICATE UPENN_Cert_TDE;
GO

Step 4: Enable TDE on Database

======

ALTER DATABASE test SET ENCRYPTION ON;

GO
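
Enabling encryption starts a background scan that encrypts the existing pages; on a large database this can take some time. You can watch its progress with a query like the following, a sketch against sys.dm_database_encryption_keys ('test' is the sample database used above):

-- encryption_state: 2 = encryption in progress, 3 = encrypted
SELECT DB_NAME(database_id) AS database_name,
       encryption_state,
       percent_complete
FROM sys.dm_database_encryption_keys
WHERE database_id = DB_ID('test');
GO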

Step 5: Back up the Certificate

=======

USE master;
GO
-- Back up the certificate and its private key; without these files the
-- encrypted database (or its backups) cannot be restored on another server
BACKUP CERTIFICATE UPENN_Cert_TDE
    TO FILE = 'E:\Backup_Encryption_DoNotDelete\UPENN_Cert_TDE.cer'
    WITH PRIVATE KEY (
        FILE = 'E:\Backup_Encryption_DoNotDelete\UPENN_Cert_TDE_Key.pvk',
        ENCRYPTION BY PASSWORD = 'TF@#83!&wpz28');
GO
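
As noted in the disadvantages above, restoring a TDE-encrypted backup on another server needs this certificate. Below is a minimal sketch of the restore-side steps, assuming the .cer and .pvk files have been copied to the same path on the target server and that the target server does not yet have a database master key:

-- On the target server
USE master;
GO
-- Create a master key if one does not already exist (choose your own password)
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'AnotherStrongPassword!1';
GO
-- Recreate the certificate from the backed-up files; the decryption password
-- is the one used when the private key was backed up in Step 5
CREATE CERTIFICATE UPENN_Cert_TDE
    FROM FILE = 'E:\Backup_Encryption_DoNotDelete\UPENN_Cert_TDE.cer'
    WITH PRIVATE KEY (
        FILE = 'E:\Backup_Encryption_DoNotDelete\UPENN_Cert_TDE_Key.pvk',
        DECRYPTION BY PASSWORD = 'TF@#83!&wpz28');
GO
-- The encrypted backup of the test database can now be restored as usual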

=========================================================================

-- Verify certificate details
USE master;
GO
SELECT * FROM sys.certificates WHERE pvt_key_encryption_type <> 'NA';
GO

-- Verify database encryption key details
USE master;
GO
SELECT encryptor_type, key_length, key_algorithm, encryption_state, create_date
FROM sys.dm_database_encryption_keys;
GO

-- Check that the certificate was created above
SELECT name,
       pvt_key_encryption_type_desc,
       issuer_name,
       subject,
       expiry_date,
       start_date
FROM sys.certificates
WHERE name = 'UPENN_Cert_TDE';
GO
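
A quicker way to see at a glance which databases on the instance have TDE enabled is the is_encrypted flag in sys.databases:

-- is_encrypted: 1 = TDE enabled, 0 = not encrypted
SELECT name, is_encrypted
FROM sys.databases;
GO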



How many rows have been processed while running operations like the below from SSMS?


I would like to share one more very useful technique; believe me, it is very much needed when you want to estimate how many rows have been processed while running operations like the below:

1) insert into select * from 

2) select * into 

3) NonClustered Indexes on a Heap

4) Clustered Index (no NonClustered Indexes exist)

5) NonClustered Indexes on the Clustered Index/Table

6) Clustered Index when NonClustered Indexes already exist

7) Unique NonClustered Indexes on the Clustered Index/Table

8) ALTER TABLE [schema_name].[table_name] REBUILD; (only Clustered Index shows up when using this method)

9) ALTER INDEX ALL ON [schema_name].[table_name] REBUILD;

10) ALTER INDEX [index_name] ON [schema_name].[table_name] REBUILD;


All you need to do is enable SET STATISTICS PROFILE ON in the session (query window) where you execute the above commands, for example as shown below.
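
A minimal sketch of the setup side; dbo.SourceTable and dbo.TargetTable are hypothetical names used only for illustration:

-- In the session that will run the long operation
SET STATISTICS PROFILE ON;
GO

-- Note this session's SPID so it can be plugged into the tracking query below
SELECT @@SPID AS session_id;
GO

-- Hypothetical long-running operation to be tracked
SELECT *
INTO dbo.TargetTable
FROM dbo.SourceTable;
GO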

Now, from a separate window, you can track the progress with the query below (credit to Solomon Rutzky):


DECLARE @SPID INT = 127; -- session_id (SPID) of the session running the operation

;WITH agg AS
(
     SELECT SUM(qp.[row_count]) AS [RowsProcessed],
            SUM(qp.[estimate_row_count]) AS [TotalRows],
            MAX(qp.last_active_time) - MIN(qp.first_active_time) AS [ElapsedMS],
            MAX(IIF(qp.[close_time] = 0 AND qp.[first_row_time] > 0,
                    [physical_operator_name],
                    N'')) AS [CurrentStep]
     FROM sys.dm_exec_query_profiles qp
     WHERE qp.[physical_operator_name] IN (N'Table Scan', N'Clustered Index Scan', N'Sort',
                                           N'Index Insert', N'Index Scan')
     AND   qp.[session_id] = @SPID
), comp AS
(
     SELECT *,
            ([TotalRows] - [RowsProcessed]) AS [RowsLeft],
            ([ElapsedMS] / 1000.0) AS [ElapsedSeconds]
     FROM   agg
)
SELECT [CurrentStep],
       [TotalRows],
       [RowsProcessed],
       [RowsLeft],
       CONVERT(DECIMAL(5, 2),
               (([RowsProcessed] * 1.0) / [TotalRows]) * 100) AS [PercentComplete],
       [ElapsedSeconds],
       (([ElapsedSeconds] / [RowsProcessed]) * [RowsLeft]) AS [EstimatedSecondsLeft],
       DATEADD(SECOND,
               (([ElapsedSeconds] / [RowsProcessed]) * [RowsLeft]),
               GETDATE()) AS [EstimatedCompletionTime]
FROM   comp;
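
A couple of usage notes: set @SPID to the session_id of the session running the operation (the @@SPID value noted above), and run the query repeatedly from a separate window while the operation executes; once the statement finishes, its rows disappear from sys.dm_exec_query_profiles. Also, this DMV only returns rows for sessions where query profiling is active, which is exactly why SET STATISTICS PROFILE ON is enabled first (on SQL Server 2019 and later, lightweight profiling is on by default, so it may return rows even without it).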



I hope you will start using the above query from now on.
