65.9K
CodeProject is changing. Read more.
Home

Encrypted columns and SQL Server performance

starIconstarIconstarIconstarIcon
emptyStarIcon
starIcon

4.36/5 (10 votes)

Jun 11, 2007

3 min read

viewsIcon

77093

Examining the effects of column encryption on SQL Server performance

Introduction

For legal reasons it is very important to encrypt table columns containing sensitive data, like SSNs. SQL Server 2005 allows you to encrypt data using different algorithms that employ symmetric and asymmetric keys. Alternatively, you can also use password-based encryption, where the password must be supplied by the client to encrypt/decrypt data. However, everything has a price, so we want to know how expensive this is.

With and without encryption

Let's prepare some test data, 100K "customers:"

SET NOCOUNT ON go create table Customers_Data ( name varchar(128) not null, SSN varchar(10) not null) go declare @cnt int set @cnt=100000 while @cnt>0 begin set @cnt=@cnt-1 insert into Customers_Data (name,SSN) select 'Cust_'+convert(varchar,@cnt), '010'+convert(varchar,@cnt) end go

Our artificially-generated fake SSNs are all unique. They have a format "010xxxxxx," but they are shorter in some cases. At first, let's perform our experiments without any encryption:

-- Not encrypted create table Customers ( id int identity primary key, name varchar(128) not null, SSN varchar(10) not null) go

In our first experiment, we copy raw Customers_Data into the target table and measure the elapsed time:

-- inserts declare @t1 datetime, @cnt int truncate table Customers set @t1=getdate() insert into Customers (name,SSN) select * from Customers_Data select datediff(ms,@t1,getdate()) go

In the second experiment, we find one record by the given SSN. This is done using a table scan comparing all SSNs. This test is repeated multiple times to make it more precise:

-- table scan declare @t1 datetime, @name varchar(128), @cnt int set @t1=getdate() set @cnt=30 while @cnt>0 begin set @cnt=@cnt-1 select @name=max(name) from Customers with(index(0)) – forced table scan where SSN='010'+convert(varchar,@cnt) end select datediff(ms,@t1,getdate())/30. go

Keys

To encrypt data, we must create encryption keys. Here is an example:

create master key encryption by password = 'p@sswOrd' go create asymmetric key AsymKey With Algorithm = RSA_1024 go

Now if a database is stolen, encrypted data cannot be recovered because the master key is not backed up within a user database. When data is encrypted, the result is stored as varbinary(128). Encrypted data cannot be too long.

create table encr_Customers ( id int identity primary key, name varchar(128) not null, SSN varbinary(128)) go

Transparency and decryption

We want to make our encryption transparent to other SQL code and the front end. Our view customers must behave exactly like old table customers, except that truncate table cannot be used.

create view Customers as select id, name, convert(varchar(10), DecryptByAsymKey(AsymKey_ID('AsymKey'), SSN)) as SSN from encr_Customers go

So, when we read data we decrypt column SSN on the fly. How can we encrypt data when it is inserted into this view? We can use an instead of trigger:

create trigger IT1 on Customers instead of insert as begin insert into encr_Customers (name,SSN) select name, EncryptByAsymKey(AsymKey_ID('AsymKey'),SSN) from inserted end GO

We should create a similar trigger for the updates if the encrypted column is involved in the updates. Our view works exactly like a table from before, except that you cannot say truncate table Customers; you should use truncate table encr_Customers instead. We cannot make a password-based encryption transparent, as we would have to hardcode a password into a view. This would, of course, ruin the very idea of encryption.

Comparing keys and algorithms

Now we can compare different keys and different algorithms of encryption:

Key type

Algorithm/length

Table scan

Test insert

Without Encryption

2

93

AsymKey

1024

102656

2623

AsymKey

512

21716

1486

AsymKey

2048

747766

6563

SymKey

DES

173

623

SymKey

TRIPLE_DES

200

580

SymKey

RC2

173

606

SymKey

RC4

250

576

SymKey

DESX

203

516

SymKey

AES_128

173

736

SymKey

AES_192

173

576

SymKey

AES_256

186

610

Note that asymmetric keys are much longer to encrypt. This is especially the case when decrypting data. Now comes the funny part. Say we want to find a customer by SSN:

select * from Customers where SSN=@value

Usually, SSN is indexed and the record is found immediately. However, if you tried it in our encrypted view, you would see that SQL Server uses a table scan even if an index is created on encr_Customers.SSN. Why? It appears as though the statement above is equivalent to (in simplified code):

select * from encr_Customers where decrypt(SSN)=@value

If we rewrite this as...

select * from encr_Customers where SSN=encrypt(@value)

.. then it would work much faster and use an index, but it would not find our record! This happens because all encryption functions are non-deterministic every time they generate another value (values truncated):

select EncryptByAsymKey(AsymKey_ID('AsymKey'), 'This is my example') select EncryptByAsymKey(AsymKey_ID('AsymKey'), 'This is my example') select EncryptByAsymKey(AsymKey_ID('AsymKey'), 'This is my example') (values truncated) 0x9FD1DDA8A6ACED67C0C24CDD566CAD655E7B7E3D76197896… 0xE18FFFB9EC75CD2093089A5DDB83220A244346F77AA548BF… 0x98D563BA855573A442A278B2565D9216192AD5BC7B664637…

For that reason, unfortunately, all indices on encrypted data are absolutely useless.

History

  • 11 June, 2007 -- Original version posted
close