Encrypted columns and SQL Server performance






4.36/5 (10 votes)
Jun 11, 2007
3 min read

77093
Examining the effects of column encryption on SQL Server performance
Introduction
For legal reasons it is very important to encrypt table columns containing sensitive data, like SSNs. SQL Server 2005 allows you to encrypt data using different algorithms that employ symmetric and asymmetric keys. Alternatively, you can also use password-based encryption, where the password must be supplied by the client to encrypt/decrypt data. However, everything has a price, so we want to know how expensive this is.
With and without encryption
Let's prepare some test data, 100K "customers:"
SET NOCOUNT ON go create table Customers_Data ( name varchar(128) not null, SSN varchar(10) not null) go declare @cnt int set @cnt=100000 while @cnt>0 begin set @cnt=@cnt-1 insert into Customers_Data (name,SSN) select 'Cust_'+convert(varchar,@cnt), '010'+convert(varchar,@cnt) end go
Our artificially-generated fake SSNs are all unique. They have a format "010xxxxxx," but they are shorter in some cases. At first, let's perform our experiments without any encryption:
-- Not encrypted create table Customers ( id int identity primary key, name varchar(128) not null, SSN varchar(10) not null) go
In our first experiment, we copy raw Customers_Data
into the target table and measure the elapsed time:
-- inserts declare @t1 datetime, @cnt int truncate table Customers set @t1=getdate() insert into Customers (name,SSN) select * from Customers_Data select datediff(ms,@t1,getdate()) go
In the second experiment, we find one record by the given SSN. This is done using a table scan comparing all SSNs. This test is repeated multiple times to make it more precise:
-- table scan declare @t1 datetime, @name varchar(128), @cnt int set @t1=getdate() set @cnt=30 while @cnt>0 begin set @cnt=@cnt-1 select @name=max(name) from Customers with(index(0)) – forced table scan where SSN='010'+convert(varchar,@cnt) end select datediff(ms,@t1,getdate())/30. go
Keys
To encrypt data, we must create encryption keys. Here is an example:
create master key encryption by password = 'p@sswOrd' go create asymmetric key AsymKey With Algorithm = RSA_1024 go
Now if a database is stolen, encrypted data cannot be recovered because the master key is not backed up within a user database. When data is encrypted, the result is stored as varbinary(128)
. Encrypted data cannot be too long.
create table encr_Customers ( id int identity primary key, name varchar(128) not null, SSN varbinary(128)) go
Transparency and decryption
We want to make our encryption transparent to other SQL code and the front end. Our view customers must behave exactly like old table customers, except that truncate table
cannot be used.
create view Customers as select id, name, convert(varchar(10), DecryptByAsymKey(AsymKey_ID('AsymKey'), SSN)) as SSN from encr_Customers go
So, when we read data we decrypt column SSN on the fly. How can we encrypt data when it is inserted into this view? We can use an instead of
trigger:
create trigger IT1 on Customers instead of insert as begin insert into encr_Customers (name,SSN) select name, EncryptByAsymKey(AsymKey_ID('AsymKey'),SSN) from inserted end GO
We should create a similar trigger for the updates if the encrypted column is involved in the updates. Our view works exactly like a table from before, except that you cannot say truncate table Customers
; you should use truncate table encr_Customers
instead. We cannot make a password-based encryption transparent, as we would have to hardcode a password into a view. This would, of course, ruin the very idea of encryption.
Comparing keys and algorithms
Now we can compare different keys and different algorithms of encryption:
Key type | Algorithm/length | Table scan | Test insert |
Without Encryption | 2 | 93 | |
AsymKey | 1024 | 102656 | 2623 |
AsymKey | 512 | 21716 | 1486 |
AsymKey | 2048 | 747766 | 6563 |
SymKey | DES | 173 | 623 |
SymKey | TRIPLE_DES | 200 | 580 |
SymKey | RC2 | 173 | 606 |
SymKey | RC4 | 250 | 576 |
SymKey | DESX | 203 | 516 |
SymKey | AES_128 | 173 | 736 |
SymKey | AES_192 | 173 | 576 |
SymKey | AES_256 | 186 | 610 |
Note that asymmetric keys are much longer to encrypt. This is especially the case when decrypting data. Now comes the funny part. Say we want to find a customer by SSN:
select * from Customers where SSN=@value
Usually, SSN is indexed and the record is found immediately. However, if you tried it in our encrypted view, you would see that SQL Server uses a table scan even if an index is created on encr_Customers.SSN. Why? It appears as though the statement above is equivalent to (in simplified code):
select * from encr_Customers where decrypt(SSN)=@value
If we rewrite this as...
select * from encr_Customers where SSN=encrypt(@value)
.. then it would work much faster and use an index, but it would not find our record! This happens because all encryption functions are non-deterministic every time they generate another value (values truncated):
select EncryptByAsymKey(AsymKey_ID('AsymKey'), 'This is my example') select EncryptByAsymKey(AsymKey_ID('AsymKey'), 'This is my example') select EncryptByAsymKey(AsymKey_ID('AsymKey'), 'This is my example') (values truncated) 0x9FD1DDA8A6ACED67C0C24CDD566CAD655E7B7E3D76197896… 0xE18FFFB9EC75CD2093089A5DDB83220A244346F77AA548BF… 0x98D563BA855573A442A278B2565D9216192AD5BC7B664637…
For that reason, unfortunately, all indices on encrypted data are absolutely useless.
History
- 11 June, 2007 -- Original version posted