Adam Luptak

AES in Python & Obj-C

Encryption
Python
Objective-C
Aug. 13, 2012

Despite having a bunch of new toys, I haven't been tinkering with Arduino because I've been working on a more involved (as-yet undisclosed) side project.

I've been doing some work writing custom binary files, developing a scheme that I'm not going to be talking about. However, I also wanted to encrypt my files for an extra layer of protection. Catch is, of course, that I knew basically nothing about modern encryption. I would have been completely lost, but Michael linked me over to Protecting resources in iPhone and iPad apps by Robin Summerhill. This gave me an idea of what I should be looking at - AES symmetric encryption.

Knowing what my workflow would be down the line as the project nears launch, I wanted to encrypt my files using Python and decrypt them in Objective-C (for iOS). Robin's solution is great for the iOS side, but I figured I might as well not bother with the shell script. I started digging into doing the encryption in Python, which is where things started to get interesting.

The go-to cryptography library for Python is PyCrypto. It's fairly low-level; it encrypts chunks of data for you, but you have to do the rest of the work. There seem to be a lot of wrappers, but they seem overcomplicated. Fortunately, I found Eli Bendersky's lightweight implementation of reading in a file, breaking it into chunks, encrypting each chunk and writing the result out to a file. This code was the basis of what I needed, but I wasn't done yet.

Looking at Robin Summerhill's Objective-C code, I had spent a good amount of time puzzling over the arguments that were passed into the CCCryptorCreate function. Of particular interest was kCCOptionPKCS7Padding. A bit of Googling later and I learned that since we're encrypting chunks of data, our chunks have to be complete, which means padding the last chunk - in this case, its length needs to be a multiple of 16 bytes, the AES block size. That's where PKCS7 comes in. My understanding is that PKCS7 padding is the same scheme as PKCS5 padding, except that PKCS5 is only defined for 8-byte (64-bit) blocks, while PKCS7 works for any block size up to 255 bytes, which covers the 16-byte (128-bit) blocks AES uses. It's quite simple: you see how many bytes you need to add to the chunk, and then set all those bytes to that number. If the chunk is already a multiple of 16, you add a full extra block of 16 bytes, each with the value 16, so the padding can always be removed unambiguously. Eli Bendersky's code doesn't follow this padding scheme, but it was quite easy to substitute in.
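To make the padding rule concrete, here's a minimal sketch in plain Python 3. The function names `pkcs7_pad` and `pkcs7_unpad` are my own for illustration; they aren't taken from Eli's or Robin's code.

```python
BLOCK_SIZE = 16  # AES block size in bytes


def pkcs7_pad(data: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Append N bytes, each with value N, so the length is a multiple of block_size."""
    # pad_len is always between 1 and block_size; data that is already
    # aligned gets a whole extra block of padding.
    pad_len = block_size - (len(data) % block_size)
    return data + bytes([pad_len]) * pad_len


def pkcs7_unpad(padded: bytes, block_size: int = BLOCK_SIZE) -> bytes:
    """Strip PKCS7 padding, raising ValueError if it is malformed."""
    if not padded or len(padded) % block_size != 0:
        raise ValueError("padded data must be a non-empty multiple of the block size")
    pad_len = padded[-1]  # the last byte says how many bytes to remove
    if pad_len < 1 or pad_len > block_size:
        raise ValueError("invalid PKCS7 padding length")
    if padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid PKCS7 padding bytes")
    return padded[:-pad_len]
```

For example, padding the 5-byte string `b"hello"` adds 11 bytes, each with the value 11 (`0x0b`), to reach 16 bytes.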

The other catch was that the Python snippet used (and it seems like PyCrypto might require) an initialization vector (or IV). The IV is used in coordination with the secret key to encrypt data. Apparently your nefarious neighborhood black hat could learn something useful by examining similar files encrypted with the same key. However, using a different IV for every file keeps the manner in which each one is encrypted unique. The IV doesn't need to be kept secret, but since it's different for every file, we need to put it into the file so we can use it to build a cryptor when decrypting.
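Here's a small sketch of the "store the IV with the file" idea, using only the standard library. The helper names `write_iv_header` and `read_iv_header` are hypothetical, and I'm assuming a 16-byte IV to match the AES block size.

```python
import io
import os

IV_SIZE = 16  # assumption: IV length equals the AES block size


def write_iv_header(out_file) -> bytes:
    """Generate a fresh random IV, write it at the start of out_file, and return it."""
    iv = os.urandom(IV_SIZE)  # random bytes from the operating system
    out_file.write(iv)
    return iv


def read_iv_header(in_file) -> bytes:
    """Read the IV back from the first IV_SIZE bytes of in_file."""
    iv = in_file.read(IV_SIZE)
    if len(iv) != IV_SIZE:
        raise ValueError("file is too short to contain an IV")
    return iv


# Round trip through an in-memory buffer standing in for the encrypted file.
buffer = io.BytesIO()
written_iv = write_iv_header(buffer)
buffer.seek(0)
assert read_iv_header(buffer) == written_iv
```

The decrypting side does the same thing as `read_iv_header`: peel off the first 16 bytes, then treat everything after them as ciphertext.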

So, at the end of the day, we create an IV, write it onto the beginning of the encrypted file, and then encrypt our data in chunks, padding the final one using the PKCS7 scheme. (Eli was also writing the original size of the file, but since we're using a standardized padding scheme that Objective-C knows how to trim, we don't need that.) Then, Robin's code needed a quick modification to read off the first 16 bytes and use them as the IV (the original didn't use an IV), then proceed as normal.

If you check out the Xcode project I've included, you'll see a JPG that I encrypted using Python. The project decrypts it, reads out the first 100 bytes of the decrypted data and compares them to the bytes from the original, unencrypted file to verify that they're the same.
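The same kind of sanity check is easy to sketch in Python. This is only an illustration of the comparison, not the code from the Xcode project, and `first_bytes_match` is a hypothetical helper that works on any two binary file objects.

```python
import io


def first_bytes_match(file_a, file_b, count=100):
    """Return True if the first `count` bytes of both binary files are identical."""
    return file_a.read(count) == file_b.read(count)


# In-memory buffers standing in for the original file and the decrypted output.
original = io.BytesIO(b"\xff\xd8\xff\xe0" + b"\x00" * 200)
decrypted = io.BytesIO(b"\xff\xd8\xff\xe0" + b"\x00" * 200)
assert first_bytes_match(original, decrypted)
```

With real files you would open both in binary mode (`open(path, "rb")`) and pass the file objects in.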