Jmp Start

…where the Geek shall Inherit the Word

Proper use of Rfc2898DeriveBytes

Posted by CKret on September 29, 2009

When you need encryption in your application, a commonly recommended way to derive the key and Initialization Vector (IV) from a password is to use Rfc2898DeriveBytes. It lives in the System.Security.Cryptography namespace. To encrypt some data you would do something like:

using System.IO;
using System.Security.Cryptography;

public static byte[] EncryptRfc(byte[] plainText, string password, byte[] salt)
{
  // Derive a 256-bit key and a 128-bit IV from the password and salt.
  var keyGen = new Rfc2898DeriveBytes(password, salt);
  var key = keyGen.GetBytes(32);
  var iv = keyGen.GetBytes(16);

  byte[] cipherText;
  // Dispose the cipher, the transform and the streams when done.
  using (var cipher = new RijndaelManaged { Key = key, IV = iv })
  using (var encryptor = cipher.CreateEncryptor())
  using (var ms = new MemoryStream())
  using (var cs = new CryptoStream(ms, encryptor, CryptoStreamMode.Write))
  {
    cs.Write(plainText, 0, plainText.Length);
    cs.FlushFinalBlock(); // writes the final padded block
    cipherText = ms.ToArray();
  }
  return cipherText;
}

This is straightforward enough:

  • derive the encryption key and IV from the password and salt.
  • create a new instance of the encryptor with the key and IV.
  • encrypt the plaintext.

If you run this it seems to be quite fast. A profiler tells a different story.

Analysis of Rfc2898DeriveBytes

I created a console application with the above method and called it from Main.

Then I profiled the application and analyzed the result.

One method call stood out like a sore thumb:

Function Name                                                     Inclusive Samples   Exclusive Samples   Inclusive Samples %   Exclusive Samples %
System.Security.Cryptography.Rfc2898DeriveBytes.GetBytes(int32)   944                 0                   91.12                 0.00

The call to GetBytes took up 91.12% of the execution time!!

“So what!?”, you say.

When encrypting a single large batch of data this isn’t really an issue, since the encryption itself usually takes longer than generating the key and IV. If, however, you need to encrypt many separate pieces of data and each piece is small, the overhead dominates, as it does here. In the above example encrypting the data took at most about 9% of the execution time, while generating the key and IV took over 91%.
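To see this split on your own machine without a profiler, a rough Stopwatch sketch along these lines may help. The names TimingSketch and Measure are made up for illustration, and the numbers will vary with hardware and .NET version. It uses the same types as the example above, which newer .NET versions may report as obsolete.

```csharp
using System;
using System.Diagnostics;
using System.Security.Cryptography;

public static class TimingSketch
{
    // Times key/IV derivation and encryption separately.
    // Returns { derivation time in ms, encryption time in ms }.
    public static double[] Measure(byte[] plainText, string password, byte[] salt)
    {
        var watch = Stopwatch.StartNew();
        byte[] key, iv;
        using (var keyGen = new Rfc2898DeriveBytes(password, salt))
        {
            key = keyGen.GetBytes(32);
            iv = keyGen.GetBytes(16);
        }
        double deriveMs = watch.Elapsed.TotalMilliseconds;

        watch.Restart();
        using (var cipher = new RijndaelManaged { Key = key, IV = iv })
        using (var encryptor = cipher.CreateEncryptor())
        {
            // One-shot encryption of the whole buffer, including padding.
            encryptor.TransformFinalBlock(plainText, 0, plainText.Length);
        }
        double encryptMs = watch.Elapsed.TotalMilliseconds;

        return new[] { deriveMs, encryptMs };
    }
}
```

Call it from Main with a small message, a password and a salt of at least 8 bytes, then print both values. A single run includes one-time start-up costs, so treat the result as a rough indication only.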

How does it concern me?

This example takes about 80 ms on my computer. That’s not bad, but it is a single encryption. If I encrypted 1000 messages this way, it would take 80 seconds, which of course is unacceptable.

Why does it perform so badly?

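Rfc2898DeriveBytes uses a pseudo-random number generator based on HMACSHA1. The first call to GetBytes initializes a new HMAC instance, which takes some time (more than 50% of the execution time in the above example). Subsequent calls to GetBytes on the same instance do not need to repeat this initialization. The class is also slow by design: it implements PBKDF2, and with the default settings it runs 1000 HMAC iterations for every block of output.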

What can we do to fix this?

There is a fundamental flaw in the above code example. There’s no need to create a new instance of Rfc2898DeriveBytes for each call to encrypt. If you have to send 1000 separate messages to a recipient, you’ll want to use the same key but change the IV for each message. Lift Rfc2898DeriveBytes out of the Encrypt method and pass along the key and IV instead of the password and salt. For each new message, call GetBytes(16) on the same instance to generate a new IV.
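A minimal sketch of that restructuring might look like the following. The names SessionCrypto, Encrypt, Decrypt and EncryptAll are hypothetical and only illustrate the idea: derive the key once, then draw a fresh 16-byte IV from the same generator for every message.

```csharp
using System;
using System.Collections.Generic;
using System.Security.Cryptography;

public static class SessionCrypto
{
    // Encrypts one message with an already derived key and a per-message IV.
    public static byte[] Encrypt(byte[] plainText, byte[] key, byte[] iv)
    {
        using (var cipher = new RijndaelManaged { Key = key, IV = iv })
        using (var encryptor = cipher.CreateEncryptor())
        {
            return encryptor.TransformFinalBlock(plainText, 0, plainText.Length);
        }
    }

    // Decryption mirrors encryption and needs the same key and IV.
    public static byte[] Decrypt(byte[] cipherText, byte[] key, byte[] iv)
    {
        using (var cipher = new RijndaelManaged { Key = key, IV = iv })
        using (var decryptor = cipher.CreateDecryptor())
        {
            return decryptor.TransformFinalBlock(cipherText, 0, cipherText.Length);
        }
    }

    // Derives the key once, then encrypts every message with its own IV.
    // Each result entry is { iv, cipherText }.
    public static List<byte[][]> EncryptAll(IEnumerable<byte[]> messages, string password, byte[] salt)
    {
        var results = new List<byte[][]>();
        using (var keyGen = new Rfc2898DeriveBytes(password, salt))
        {
            byte[] key = keyGen.GetBytes(32); // derived once for the whole batch
            foreach (byte[] message in messages)
            {
                byte[] iv = keyGen.GetBytes(16); // next 16 bytes: a new IV per message
                results.Add(new[] { iv, Encrypt(message, key, iv) });
            }
        }
        return results;
    }
}
```

The recipient needs the same key and the matching IV for each message, so in this sketch the IV is returned alongside the ciphertext. Whether you transmit the IV with each message or have both sides derive the IVs in the same order is a design choice the article leaves open.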

Conclusion

Using Rfc2898DeriveBytes incorrectly can have a severe performance impact on your application. Taking proper steps to avoid this is easy enough, but you should always make sure you understand why things work the way they do.

5 Responses to “Proper use of Rfc2898DeriveBytes”

  1. Kayo said

    How do you properly Decrypt?

    • CKret said

      The same way. There is no need to have Rfc2898DeriveBytes inside the decryption method as long as the key stays the same. If the key changes then you’ll need to create a new instance of Rfc2898DeriveBytes. However, when communicating you’ll want to use the same key for a session or for a certain time period before changing it.

  2. In your example I presume that the salt is different for each record/string that you’re encrypting.

    Why not use a constant salt but pass in a randomized IV? What is the point of having ANOTHER field (the salt). Are they just different means to the same end?

    • CKret said

      That was the whole point of this blog article.

      From the article:
      “…Lift out Rfc2898DeriveBytes from the Encrypt method and pass along the key and IV instead of the password and salt…”.

      I’ve seen variations of the above code many times and it is a common mistake to make.
      You’re absolutely correct about the salt and IV, and that is exactly what this article demonstrates, although the focus is more on performance.
      In my experience this type of error occurs when developers oversimplify things.

    • If the salt is constant then Rfc2898DeriveBytes will always return the same key, so you might as well just use a constant key and a randomized IV!

      But then you will have to append the IV to your encrypted message, because without it you won’t be able to decrypt, which means the attacker will have access to it. The IV is almost always considered to be public information, so that’s okay, but with your constant key, that’d be a problem! This defeats the whole purpose of Rfc2898DeriveBytes, which is to have a unique key and IV for every message!

      Security always comes with a performance cost.
