Friday, May 16, 2014

Dev basics: Using cryptography

Cryptography is different from other fields of programming. It is very tricky, and you cannot check whether your code is correct just by running the program (as you can with, for example, graphics). Security holes are pretty much invisible, yet you need to be the first person to discover them. If someone beats you to it and misuses the discovery, you're in trouble.

And because cryptographic errors cannot be tested in any simple way, developers need to focus on correct implementation in the first place. What does this mean?

1. Don't invent your own algorithms

We have AES, RSA, DSA, ECDSA and so on. If you are making a real-world application that is going to protect sensitive data, don't even think about making your own encryption algorithm. Doing that is fine if you're just playing with encryption for fun or are a seasoned cryptographic researcher who eats algebra for breakfast. In production, you might as well go juggling grenades.

2. Assume the attacker knows what algorithms you use

Relying on the attacker not knowing which cryptographic primitives you are using is about as safe as jumping off a cliff. Security is ensured by using secret keys and correct algorithms. If you make the mistake of violating this principle and an attacker discovers how your app works (any kid can disassemble or decompile Python, Java or .NET code nowadays), what are you going to do about it? Try to change the implementation in 10 hours and ship a critical update? Building the security of your system on hiding implementation details ensures nothing but headaches.

3. Use crypto primitives in exactly the way they are designed to be used

To stay secure (and enjoy the feeling of extra protection, dryness and safety!), you need to use cryptographic primitives in exactly the way they were designed to be used and in no other way. There are a number of tools in the cryptographer's toolbox and each of them has a different purpose. Let's list some of them:
  • symmetric encryption (AES, Twofish, ...)
    purpose: hiding data from an attacker using a key known to both parties
    incorrect use: trying to use encryption to prove who authored a message
  • asymmetric encryption (RSA, ...)
    purpose: hiding data from an attacker using a key pair: anyone can encrypt with the public key, only the holder of the private key can decrypt
  • hash functions (SHA)
    purpose: computing a fixed-length digest of a message of any length, from which the original message cannot be recovered
    incorrect use: trying to ensure that an attacker hasn't changed the message along the way (a bare hash can simply be recomputed by the attacker; use a MAC or a signature)
    incorrect use: checking if the entered encryption password is correct
  • password based key derivation functions (PBKDF2, scrypt...)
    purpose: generating a key for encryption algorithm from a password
    incorrect use: not using PBKDF at all and putting the password directly into the encryption algorithm
  • digital signatures (DSA, ECDSA, RSA, ...)
    purpose: ensuring that the holder of the private key really authored the message
    incorrect use: out of ideas how to abuse this
  • message authentication codes (MAC)
    purpose: making sure that the message has not been changed on the way when both parties share a secret key
    incorrect use: not using it when data integrity is important
  • key agreement protocols (Diffie-Hellman, ...)
    purpose: two parties who know each other's public keys each combine their own private key with the other party's public key and arrive at the same shared secret, without ever sending that secret over the wire
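
Two of the items above (key derivation and MACs) can be sketched with nothing but the Python standard library. This is a minimal illustration, not a vetted design: the function names, the example password and message, and the iteration count are all assumptions made for the example, and the iteration count in particular should follow current guidance for your hardware.

```python
import hashlib
import hmac
import secrets


def derive_key(password: str, salt: bytes, iterations: int = 200_000) -> bytes:
    """Stretch a password into a 32-byte key with PBKDF2-HMAC-SHA256.

    The iteration count is an illustrative assumption, not a recommendation.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


def make_tag(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the message using the shared key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify_tag(key: bytes, message: bytes, tag: bytes) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(make_tag(key, message), tag)


# The salt is random, not secret, and is stored or sent next to the data.
salt = secrets.token_bytes(16)
key = derive_key("correct horse battery staple", salt)

message = b"transfer 100 to alice"
tag = make_tag(key, message)

print(verify_tag(key, message, tag))                      # untouched message
print(verify_tag(key, b"transfer 900 to mallory", tag))   # tampered message
```

Unlike a bare SHA digest, the tag cannot be recomputed by someone who changes the message but does not hold the key, which is the whole difference between a hash and a MAC.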

4. Handle parameters correctly

For example, block ciphers encrypt one fixed-size block at a time, and it's up to you to choose the block mode of operation according to your needs. If you use ECB, I'm sorry, but you can say "bye bye" to your privacy (unless your data fits in a single block, which is 128 bits for AES).
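
The ECB problem can be shown without a real cipher. The sketch below is an assumption-laden illustration: `toy_block_transform` is a hypothetical stand-in built from truncated HMAC-SHA256, it is not AES and cannot even be decrypted. It only shares the one property that matters here, namely that each block is transformed independently and deterministically under the key, which is what ECB does.

```python
import hashlib
import hmac

BLOCK_SIZE = 16  # bytes, i.e. 128 bits, the AES block size


def toy_block_transform(key: bytes, block: bytes) -> bytes:
    """Deterministic keyed stand-in for a block cipher (NOT a real cipher)."""
    return hmac.new(key, block, hashlib.sha256).digest()[:BLOCK_SIZE]


def ecb_like(key: bytes, data: bytes) -> list:
    """Transform every block on its own, with no IV and no chaining."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    return [toy_block_transform(key, block) for block in blocks]


key = b"k" * 32  # fixed example key, for illustration only
# Three 16-byte blocks; the first two are identical.
plaintext = b"ATTACK AT DAWN!!" * 2 + b"RETREAT AT DUSK!"
out = ecb_like(key, plaintext)

# Equal plaintext blocks produce equal output blocks, so an observer learns
# where the data repeats without knowing the key.
print(out[0] == out[1])
print(out[0] == out[2])
```

With a real cipher in ECB mode the same thing happens: repeated plaintext blocks show up as repeated ciphertext blocks.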

Block modes other than ECB also need an IV (a random one for modes such as CBC), and it must be transmitted together with the ciphertext. It's a hassle, but if you come up with a "solution" such as generating the IV from the password, you're violating rule number 3 (do what you are told to do and don't make your own solutions) and defeating the entire purpose of the IV (which is to ensure that you always generate different ciphertexts even when reusing parts of the data or the password).
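
A small sketch of the difference, using only the Python standard library. The helper names and the "IV first, then ciphertext" layout are assumptions chosen for the example, and the ciphertext is a placeholder byte string because no actual encryption is performed here.

```python
import hashlib
import secrets

IV_SIZE = 16  # bytes; matches a 128-bit block size


def iv_from_password(password: bytes) -> bytes:
    """Anti-pattern: a 'derived IV' is the same for every message."""
    return hashlib.sha256(password).digest()[:IV_SIZE]


def fresh_iv() -> bytes:
    """Correct approach: a new random IV for every message."""
    return secrets.token_bytes(IV_SIZE)


def pack(iv: bytes, ciphertext: bytes) -> bytes:
    """Send the IV in the clear, in front of the ciphertext."""
    return iv + ciphertext


def unpack(blob: bytes) -> tuple:
    """Split a received blob back into (iv, ciphertext)."""
    return blob[:IV_SIZE], blob[IV_SIZE:]


# The derived IV never changes, so equal messages give equal ciphertexts.
print(iv_from_password(b"hunter2") == iv_from_password(b"hunter2"))

# Random IVs differ from message to message.
iv = fresh_iv()
blob = pack(iv, b"<ciphertext placeholder>")
print(unpack(blob) == (iv, b"<ciphertext placeholder>"))
```

The IV does not need to be secret, which is why shipping it next to the ciphertext is fine; it only needs to be unpredictable and never reused with the same key.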

More complicated encryption schemes (such as ECIES) need additional data to be transmitted with the message (in the case of ECIES, the sender's ephemeral public key). Again, you have to implement this exactly according to the spec and refrain from custom inventions such as reusing the ephemeral key, otherwise you are compromising your own data security.


This article is merely a short summary; it doesn't explain the why behind the rules. Usually, the reason is that incorrect use gives the attacker some unexpected way to get your data, to tamper with it or to calculate statistical information about it, and that last one is more serious than most people would expect. The answers are out there on the internet, in the specifications of the individual primitives.

A very useful article is "Cryptographic Right Answers".
