0:00

In this segment, we're gonna construct authenticated encryption systems. Since we

Â already have CPA secured encryption, and we have secure MACs, the natural question

Â is whether we can combine the two somehow, in order to get authenticated encryption.

Â And if, that's exactly what we're gonna do in this segment. Authenticated encryption

Â was introduced in the year 2000, in two independent papers that I point to at the

Â end of this module. But before then, many crytpolibraries provided an API that

Â separately supported CPA secure encryption, and MAC-ing. So there was one

Â function for implementing CPA secure encryption. For example, CBC with a random

Â IV. And another function for implementing a MAC. And then every developer that

Â wanted to implement encryption, had to, himself, call separately the CPA secure

Â encryption scheme and the MAC scheme. In particular, every developer had to invent

Â his own way of combining encryption and MAC-ing to provide some sort of

Â authenticated encryption. But since the goals of combining encryption and MAC-ing

Â wasn't well understood since authenticated encryption hasn't yet been defined, it

Â wasn't really clear which combinations of encryption and MAC-ing are correct and

Â which aren't. And so, every project as I said had to invent its own combination.

Â And in fact, not all combinations were correct. And I can tell you that the most

Â common mistake in software projects were basically incorrectly combining the

Â encryption and integrity mechanisms. So hopefully, by the end of this module, you

Â will know how to combine them correctly, and you won't be making these mistakes

Â yourself. So let's look at some combinations of CPA secure encryption and

Â MAC, that were introduced by different projects. So here are three examples. So,

Â first of all, in all three examples, there's a separate key for encryption, and

Â a separate key for MACing. These two keys are independent of one another, and

Â both are generated at session setup time. And we're gonna see how to generate these

Â two keys later on in the course. So the first example is the SSL protocol. So the

Â way SSL combines encryption and MAC in the hope of achieving authenticated encryption

Â is the following. Basically you take the plain text, m, and then you compute a MAC

Â on the plain text, m. So you use your MAC key, kI, to compute tag for this message

Â m. And then you can concatenate the tag to the message and then you encrypt the

Â concatenation of the message and the tag and what comes out is the actual final cipher text.

Â So that's option number one. The second option is what IPsec does. So

Â here, you take the message. The first thing you do is you encrypt the message.

Â And then, you compute a tag on the resulting cipher text. So you notice the

Â tag itself is computed on the resulting cipher text. A third option is what the

Â SSH protocol does. So here, the SSH takes the message, and encrypts it using a CPA

Â secure encryption scheme. And then, to it, it concatenates a tag of the message. The

Â difference between IPsec and SSH, is that in IPsec, the tag is computed over the

Â cipher text, whereas, in SSH, the tag is computed over the message. And so these

Â are three completely different ways of combining encryption and MAC. And the

Â question is, which one of these is secure? So, I will let you think about this for a

Â second, and then when we continue we'll do the analysis together.

Â 3:13

Okay. So let's start with the SSH method. So in the SSH method you notice that the tag

Â is computed on the message and then concatenated in the clear to the cipher text.

Â Now this is actually quite a problem because MACs themselves are not designed

Â to provide confidentiality. MACs are only designed for integrity. And in fact, there's

Â nothing wrong with a MAC that as part of the tag outputs a few bits of the plain

Â text. Outputs a few bits of the message M. That would be a perfectly fine tag. And yet if

Â we did that, that would completely break CPA security here, because some bits of

Â the message are leaked in the cipher text. And so the SSH approach, even though the

Â specifics of SSH are fine and the protocol itself is not compromised by

Â this specific combination, generally it's advisable not to use this approach. Simply

Â because the output of the MAC signing algorithm might leak bits of the message. So

Â now let's look at SSL and IPsec. As it turns out, the recommended method actually

Â is the IPsec method. Because it turns out no matter what CPA secure system and MAC

Â key you use the combination is always gonna provide authenticated encryption.

Â Now let me very, very briefly explain why. Basically what happens is once we encrypt

Â the message well the message contents now is hidden inside the cipher text and now

Â when we compute a tag of the cipher text basically we're locking, this tag locks

Â the cipher text and makes sure no one can produce a different cipher text that would

Â look valid. And as a result this approach ensures that any modifications to the

Â cipher text will be detected by the decrypter simply because the MAC isn't

Â gonna verify. As it turns out, for the SSL approach, there actually are kind of

Â pathological examples, where you combine CPA secure encryption system with a secure

Â MAC. And the result is vulnerable to a chosen cipher text attack, so that it does

Â not actually provide authenticated encryption. And basically, the reason that

Â could happen, is that there's some sort of a bad interaction between the encryption

Â scheme and the MAC algorithm. Such that, in fact, there will be a chosen cipher

Â text attack. So if you're designing a new project the recommendation now is to

Â always use encrypt then MAC because that is secure no matter which CPA secure

Â encryption and secure MAC algorithm you're combining. Now, just to set the

Â terminology, the SSL method is sometimes called MAC-then-encrypt. And the

Â IPsec method is called encrypt-then-MAC. The SSH method even though you're

Â not supposed to use it, is called encrypt-and-MAC. Okay, so I'll often refer to

Â encrypt-then-MAC, and MAC-then-encrypt to differentiate SSL and IPsec. Okay, so

Â just to repeat what I've just said. The IPsec method encrypt-then-MAC always

Â provides authenticated encryption. If you start from a CPA secure cipher and a secure MAC

Â you will always get authenticated encryption. As I said, MAC-then-encrypt in

Â fact, there are pathological cases where the result is vulnerable to CCA attacks and

Â therefore does not provide authenticated encryption. However, the story's a little

Â bit more interesting than that, in that, it turns out, if you're actually using

Â randomized counter mode or randomized CBC, then it turns out, for those particular

Â CPA secure encryption schemes, MAC-then-encrypt actually does provide authenticated

Â encryption and therefore it is secure. In fact, there's even a more interesting

Â twist here in that if you're using randomized counter mode. Then, it's enough

Â that your MAC algorithm just be one time secure. It doesn't have to be a fully

Â secure MAC. It just has to be secure when a key is used to encrypt a single message,

Â okay? And when we talked about message integrity, we saw that there are actually

Â much faster MACs that are one time secure than MACs that are fully secure. As a

Â result, if you're using randomized counter mode MAC-then-encrypt could actually

Â result in a more efficient encryption mechanism. However, I'm going to repeat

Â this again. The recommendation is to use encrypt-then-MAC and we're going to see a

Â number of attacks on systems that didn't use encrypt-then-MAC. And so just to make

Â sure things are secure without you having to think too hard about this. Again, I am

Â going to recommend that you always use encrypt-then-MAC. Now, once the concept of

Â authenticated encryption became more popular, a number of standardized

Â approaches for combining encryption and MAC turned up. And those were even

Â standardized by the National Institute of Standards. So I'm just gonna mention three

Â of these standards. Two of these were standardized by NIST. And these are

Â called Galois counter mode and CBC counter mode. And so let me explain what they do.

Â Galois counter mode basically uses counter mode encryption, so a randomized counter

Â mode with a Carter-Wegman MAC, so a very fact Carter-Wegman MAC. And the way the

Â Carter-Wegman MAC works in GCM is it's basically a hash function of the message

Â that's being MACed. And then the result is encrypted using a PRF. Now this hash

Â function in GCM is already quite fast to the point where the bulk of the running

Â time of GCM is dominated by the counter mode encryption and it's even made more so

Â in that Intel introduces a special instruction PCLMULQDQ specifically

Â designed for the purpose of making the hash function in GCM run as fast as possible.

Â Now CCM counter mode is another NIST standard. It uses a CBC MAC and

Â then counter mode encryption. So this mechanism, you know, this uses MAC, then

Â encrypt, like SSL does. So this is actually not the recommended way of doing

Â things, but because counter mode encryption is used. This is actually a

Â perfectly fine encryption mechanism. One thing that I'd like to point out about

Â CCM, is that everything is based on AES. You notice, it's using AES for the CBC

Â MAC, and it's using AES for the counter mode encryption. And as a result, CCM can

Â be implemented with relatively little code. Cause all you need is an AES engine

Â and nothing else. And because of this, CCM actually was adopted by the Wi-Fi

Â alliance, and in fact, you're probably using CCM on a daily basis if you're using

Â encrypted Wi-Fi 802.11i then you're basically using CCM to encrypt traffic

Â between your laptop and the access point. There's another mode called a EAX that

Â uses counter mode encryption, and then CMAC. So, again you notice encrypt-then-MAC

Â and that's another fine mode to use. We'll do a comparison of all these

Â modes in just a minute. Now I wanted to mention that first of all, all these modes are

Â nonce-based. In other words, they don't use any randomness but they do take as

Â input a nonce and the nonce has to be unique per key. In other words, as you

Â remember, the pair (key, nonce) should never ever, ever repeat. But the

Â nonce itself need not be random, so it's perfectly fine to use a counter, for

Â example, as a nonce. And the other important point is that, in fact, all

Â these modes are what's called authenticated encryption with associated

Â data. This is an extension of authenticated encryption, that comes

Â up very often in networking protocols. So the idea between AEAD is that, in fact,

Â the message that's provided to the encryption mode is not intended to be fully

Â encrypted. Only part of the message is intended to be encrypted, but all of the

Â message is intended to be authenticated. A good example of this is a network packet.

Â Think of like a IP packet where there's a header and then there's a payload. And

Â typically the header is not gonna be encrypted. For example, the header might

Â contain the destination of the packet, but then the header had better not be

Â encrypted otherwise routers along the way wouldn't know where to route the packet.

Â And so, typically the header is sent in the clear, but the payload, of course, is

Â always encrypted, but what you'd like to do is have the header be authenticated.

Â Not encrypted but authenticated. So this is exactly what these AEAD modes do. They

Â will authenticate the header and then encrypt the payload. But the header and

Â the payload are bound together in the authentication so they can't

Â actually be separated. So this is not difficult to do. What happens is in these

Â three modes GCM, CCM, and EAX, basically the MAC is applied to the entire data. But

Â the encryption is only applied to the part of the data that needs to be encrypted.

Â So I wanted to show you what an API to these authenticated encryption with

Â associated data encryption schemes look like. So here's what it looks like in OpenSSL.

Â For example, this is, an API for GCM. So what you do is you call the

Â init function to initialize the encryption mode, and you notice you give it a key and

Â the nonce. The nonce again, doesn't have to be random, but it has to

Â be unique. And after initialization, you would call this encrypt function, where

Â you see that you give it the associated data that's gonna be authenticated, but

Â not encrypted. You give it the data, and it's gonna be both authenticated and

Â encrypted. And it gives you back the full cipher text, which is an encryption of the

Â data, but of course does not include the AEAD, because the AEAD is gonna be sent in

Â the clear. So now that we understand this mode of encrypt-then-MAC, we can go

Â back to the definition of MAC security and I can explain to you something that might

Â have been a little obscure when we looked at that definition. So if you remember,

Â one of the requirements that followed from our definition of secure MACs meant that

Â given a message-MAC pair on a message M, the attacker cannot produce another tag on

Â the same message M. In other words, even though the attacker already has a tag for

Â the message M, he shouldn't be able to produce a new tag for the same message M.

Â And it's really not clear, why does that matter? Who cares, if the adversary already

Â has a tag on the message M, who cares if he can produce another tag? Well, it turns

Â out if the MAC didn't have this property. In other words, given a message-MAC pair

Â you can produce another MAC on the same message, then that MAC would

Â result in an insecure encrypt-then-MAC mode. And so if we want our encrypt-then-MAC to

Â have cipher text integrity, it's crucial that our MAC security would imply this strong

Â notion of security, which, of course, it does because we defined it correctly.

Â So let's see what would go wrong, if, in fact, it was easy to produce this type of

Â forgery. So what I'll do is I'll show you a chosen cipher text attack on the

Â resulting encrypt-then-MAC system. And since the system has a chosen cipher text

Â attack on it, it necessarily means that it doesn't provide authenticated

Â encryption. So let's see. So the adversary's gonnna start by sending two

Â messages, M0 and M1. And he's gonna receive, as usual, the encryption of one

Â of them, either the encryption of M0 or the encryption of M1. And since we're

Â using encrypt-then-MAC, the adversary receives the cipher text we'll call it C0

Â and a MAC on the cipher text C0. Well now we said that given the MAC on

Â a message the adversary can produce another MAC on the same message. So what

Â he's gonna do is he's gonna produce another MAC on the message C0. Now he has

Â a new cipher text (C0,T'), which is a perfectly valid cipher text. T' is a

Â valid MAC of C0. Therefore, the adversary now can submit a chosen cipher text query

Â on C' and this is a valid chosen cipher text query because it's different

Â from C. It's a new cipher text. The poor challenger now is forced to decrypt this

Â cipher text C' so he's going to send back the decryption of C'. It's a

Â valid cipher text therefore the decryption of C prime is the message Mb but now the

Â attacker just learned the value of B because he can test whether Mb is equal to

Â M0 or MB is equal to M1. As a result he can just output B and he gets advantage

Â one in defeating the scheme. And so again if our MAC security did not imply

Â this property here. Then, there would be a chosen cipher text attack on encrypt-then-MAC.

Â And therefore, it would not be secure. So the fact that we define MAC security correctly

Â means that encrypt-then-MAC really does provide authenticated encryption. And

Â throughout all the MACs that we discussed actually do satisfy this strong notion of

Â unforgeability. So, interestingly, this is not the end of the story. So, as we said

Â before the concept of authenticated encryption was introduced everyone was

Â just combining MACs and encryption in various ways in the hope of achieving

Â some authenticated encryption. After the notion of authenticated encryption

Â became formalized and rigorous, people kind of started scratching their heads and said,

Â hey, wait a minute. Maybe we can achieve authenticated encryption more efficiently

Â than by combining a MAC and an encryption scheme. In fact, if you think about how

Â this combination of MAC and encryption works, let's say we combine counter mode

Â with CMAC, then for every block of plaintext, you first of all have to use

Â your block cipher for counter mode, and then you have to use to your block cipher

Â again, for the CBC-MAC. This means that if you're combining CPA secure encryption with a

Â MAC, for every block of plaintext, you have to evaluate your block cipher twice,

Â once for the MAC and once for the encryption scheme. So the natural question

Â was, can we construct an authenticated encryption scheme directly from a PRP,

Â such that we would have to only evaluate the PRP once per block? And it turns out

Â the answer is yes, and there's this beautiful construction called OCB, that

Â pretty much does everything you want, and is much faster than constructions that are

Â separately built from an encryption and a MAC. So I wrote down, kind of a schematic

Â of OCB. I don't want to explain it in detail. I'll just kind of explain it at a

Â high level. So here we have our input plain text, here at the top. And you

Â notice that, first of all, OCB is parallelizable, completely parallelizable.

Â So every block can be encrypted separately of every other block. The other thing to

Â notice is that as I promised, you only evaluate your block cipher once per plain

Â text block. And then you evaluate it one more time at the end to build your

Â authentication tag and then the overhead of OCB beyond just a block cipher is

Â minimal. All you have to do is evaluate a certain very simple function P. The

Â nonce goes into the P you notice, the key goes into this P and then there is a

Â block counter that goes into this P. So you just evaluate this function P, twice

Â for every block and you XOR the result before and after encryption using the

Â block cipher and that's it. That's all you have to do and then you get a very fast

Â and efficient authenticated encryption scheme built from a block cipher. So OCB

Â actually has a nice security theorem associated with it and I am going to point

Â to a paper on OCB when we get to end of this module where I list some further

Â reading papers that you can take a look at. So you might be wondering if OCB is so

Â much better than everything you've seen so far, all these three standards CCM, GCM and

Â EAX why isn't OCB being used or why isn't OCB the standard? And the answer is a

Â little sad. The primary answer that OCB is not being used is actually because

Â of various patents. And I'll just leave it at that. So to conclude this section I

Â wanted to show you some performance numbers. So here on the right I listed

Â performance numbers for modes that you shouldn't be using. So this is for

Â randomized counter mode, and this is for randomized CBC. And you can see also the

Â performance of CBC MAC is basically the same as the performance of CBC encryption.

Â Okay. Now here are the authenticated encryption modes, so these are the ones

Â that you're supposed to using, these you're not supposed to be using on their

Â own, right. These two, you should never ever use these two because they only

Â provide CPA security, they don't actually provide security against active

Â attacks. You're only supposed to use authenticated encryption for encryption.

Â And so I listed performance numbers for the three standards. And let me remind

Â you that GCM basically uses a very fast hash. And then it uses counter mode for

Â actual encryption. And you can see that the overhead of GCM over counter mode is

Â relatively small. CCM and EAX both use a block cipher based encryption and a

Â block cipher based MAC. And as a result they're about twice as slow as counter

Â modes. You see that OCB is actually the fastest of these, primarily because it

Â only use the block cipher once per message block. So based on these performance

Â numbers, you would think that GCM is exactly the right mode to always use. But

Â it turns out if you're on the space constrained hardware, GCM is not ideal.

Â Primarily because its implementation requires larger code than the other two

Â modes. However, as I said, Intel specifically added instructions to speed

Â up GCM mode. And as a result, implementing GCM on an Intel architecture takes

Â very little code. But on other hardware platforms, say in smart cards or other

Â constrained environments, the code size for implementing GCM would be considerably

Â larger than for the other two modes. But if code size is not a constraint then GCM

Â is the right mode to use. So to summarize this segment I want to say it one more

Â time that when you want to encrypt messages you have to use an authenticated

Â encryption mode and the recommended way to do it is to use one of the standards,

Â namely one of these three modes for providing authenticated encryption.

Â Don't implement the encryption scheme yourself. In other words don't implement

Â encrypt-then-MAC yourself. Just use one of these three standards. Many crypto libraries

Â now provide standard API's for these three modes and these are the one's you should

Â be using and nothing else. In the next segment we're going to see what else can

Â go wrong when you try to implement authenticated encryption by yourself.

Â