0:00

In this segment, we're going to construct two classic MACS, the CBC-MAC and the

Â NMAC. Recall in the last segment, we said that if you give me a secure PRF, then

Â that secure PRF can actually be used to construct a secure MAC, simply by defining

Â the signature on the message m as the value of the function at the point m. The

Â only caveat was that the output of the PRF F had to be large. For example, it

Â could be 80 bits or 128 bits, and that would generate a secure MAC Now we also

Â said that because AES is a secure PRF, essentially AES already gives us a secure

Â MAC, except that it can only process sixteen byte messages. And the question

Â now is, given a PRF for short messages like AES for sixteen byte messages, can we

Â construct a PRF for long messages that are potentially gigabytes long? And this is

Â shorthand for what's coming. I'm going to denote by X, the set {0,1} to the N where N

Â is the block size for the underlying PRF. So since we're always going to be thinking

Â of AES as the underlying PRF, you can think of N as essentially 128 bits. So

Â the first construction is called the encrypted CBC MAC, or ECBC for short.

Â Encrypted CBC MAC. So ECBC uses a PRF that takes messages in the set X {0,1} to the N

Â and outputs messages in the same set X. And what we're going to be building is a

Â PRF that basically takes pairs of keys. It takes very long messages, in fact

Â arbitrarily long messages, and I'll explain this in just a second and it

Â outputs also tags in {0,1} to the N. So that's our goal. Now what is this X to the

Â less than or equal to L? The point here is that in fact CBC MAC can take very long

Â messages up to L blocks. L could be a million or a billion. But it could also

Â take variable length messages as inputs. In other words, X less than or equal to L

Â means that we allow the input to be messages that contain an arbitrary number

Â of blocks between one and L. So each CBC can process messages that are one block

Â long, two blocks long, ten blocks long, 100 blocks long. It's perfectly fine to

Â give it variable size inputs. But just to keep the discussion simple, we up our

Â bounds the maximum length by capital L. So let's see how ECBC works. Well, we start

Â by taking our message and breaking it into blocks, each block is as long as a block

Â of the underlying function f, and then essentially we run through the CBC chain

Â except that we don't output intermediate values. So you notice we basically encrypt

Â the first block and then feed the results into the XOR with the second block and

Â then feed that into f again, and we do that again and again and again and finally

Â we get a value out here. Which is called the CBC outputs of this long chain. And

Â then, I would like to point your attention to the fact that we do one more encryption

Â step. And this step is actually done using an independent key, K1. That's different

Â and chosen independently of the key K, and finally the output gives us the tag. So in

Â this case the tag would be N bits long, but we also mentioned in the previous

Â segment that it's fine to truncate the tag to less than N bits long as long as one

Â over two to the end is negligible. So now you can see that FECBC takes a pair of

Â keys as inputs, it can process variable length messages and it produces

Â an output in the set X. So you might be wondering what this last encryption step

Â is for. And I'll tell you that the function that's defined without this last

Â encryption step is called the raw CBC function. In other words, if we simply

Â stop processing over here, and we take that as the output, that's called raw CBC.

Â And as we'll see in a minute, raw CBC is actually not a secure MAC. So this last

Â step is actually critical for making the MAC secure. So another class of

Â construction for converting a small PRF into a large PRF is called NMAC,

Â for Nested MAC. Now, the NMAC starts from PRF that, as before, takes inputs

Â in X, but outputs elements in the key space. And remember that for CBC, the

Â output has to be in the set X. Here, the output needs to be in the key space K. And

Â again, we basically obtain the PRF F-NMAC, which takes pairs of keys as inputs.

Â Again, can process variable length messages up until L blocks. And the output

Â is an element in the key space. And the way NMAC works is kind of, starts as

Â before. We take our message, and we break it into blocks. Each block is, again, as

Â big as the block length of the underlying PRF. And now we take our key and we feed

Â our key as the key input into the function F. And the message block is given as the

Â data input into the function F. What comes out is the key for the next block of NMAC.

Â So now we have a new key for the next evaluation of the PRF. And the data for

Â the next evaluation is the next message block and so on and so forth until we

Â reach the final output. The final output is gonna be an element in K. Okay? And

Â just as before, in fact, if we stop here. The function that we obtain is called a

Â cascade function. And we're gonna look at cascade in more detail in just a minute.

Â So cascade will output an element in K. However, that, as we'll see again, is not

Â a secure MAC. To get a secure MAC, what we do is we need to map this element T,

Â which is in K, into the set X. And so, typically, as we'll see, NMAC is used

Â with, PRFs, where the block length, X, is much bigger than the key length. And so

Â what we do is we simply append fixed pad. fpad is called a fixed pad that gets

Â appended to this tag T. And then this becomes, this input here, this block here

Â becomes an element of X. We feed this into the function, and again, notice here that

Â there's an independent key that's being used for the last encryption step. And

Â then finally, the last tag is an element of K which we output as the output of

Â NMAC. So remember without the last encryption step, the function is called a

Â cascade. With the last step as we'll see which is necessary for security, we

Â actually get a PRF which outputs elements in K, and can process variable length

Â messages that are up to L blocks long. Alright so that is the NMAC

Â construction. So now we have two MACs. That we can use to build a large PRF, from AES,

Â basically. So before I analyze the security of MAC constructions, I'd like

Â you to understand better what the last encryption step is for. So, let's start

Â with NMAC. So I explained that it is actually very easy too see that if we

Â omitted the last encryption step. In other words, if we just use the cascade

Â function, then the MAC will be completely insecure. Okay so suppose we look at this

Â MAC to find over here. In other words, all we do is we output directly the cascade

Â applied to m without the last encryption step. So let me ask you how do you forge

Â tags in this MAC. And I guess I've kinda given it away that this answer isn't

Â correct. So I hope everybody sees that the answer is, that, in fact, given one chosen

Â message query, you can mount an existential forgery. And the reason is,

Â I'll show you in a second in the diagram, but let me write it out in symbols first.

Â The reason if, if you're given the output of the cascade function applied to a

Â message M. I can derive from it, me being the adversary, I can derive from it the

Â cascade applied to the message M concatenated W for any message W, for any

Â W. So first of all, it should be clear that this is enough to mount an

Â existential forgery because essentially by asking for a tag on this message I obtain

Â the tag on this longer message which I can then output as my forgery. Okay, so the

Â MAC is insecure because given a MAC in one message, I can produce the MAC in another

Â message. But let's go back to the diagram describing cascade, and see why this is

Â true. And so let's see what happens if this last step isn't there. As an

Â attacker, what I can do, I can add one more block here, which we called W. And

Â then, then basically, I can take the output of cascade, which is this value T.

Â And we can simply apply the function F to it again. So here I'll take this T value.

Â I'll plug it in to F. And plug my last block W into it and what I'll get is T

Â prime which is, well, the evaluation of cascade on the message M concatenated W.

Â In [inaudible] calculated t prime, which I can use for my existential forgery. Okay, so this

Â kind of shows you why this property of cascade holes. This is called an extension

Â attack, Where giving a tag of the message m I can deduce the tag for the extension

Â of m. And in fact for any extension of my choice. So basically, Cascade is

Â completely insecure if we don't do this last encryption step, and the last

Â encryption step here basically prevents this type of extension attack. I can tell

Â you by the way that in fact extension attacks are the only attacks on a cascade

Â and that could be made to precise. But I'm not gonna do that here. The next question

Â is, why did we have the extra encryption block in the ECBC construction? So again,

Â let me show you that without this extra encryption block ECBC is insecure. So

Â let's define a map that uses raw CBC. In other words, it's the same as CBC MAC, but

Â without the last encryption step. And let's see that, that MAC is also insecure.

Â Except now, the attack is a little bit more difficult than a simple extension

Â attack. So suppose the attacker is given this value, the raw CBC value for a

Â particular message M. And now, the attacker wants to extend and compute the

Â MAC on some message M, concatenated on with some word W. So here's our W well the

Â core attacker can take this value and try to XOR the two together but now you

Â realize he has to evaluate the function. At this point. But he doesn't know what

Â this key K is. And as a result, he doesn't know what the output of the function is.

Â So he simply can't just depend on block W, and compute the value of

Â raw CBC on N concatenated W. However, it turns out they he can actually evaluate

Â this function. By using the chosen message attack. And I wanna show you how to do

Â that. Okay, so we said that basically so our goal is to show raw CBC is insecure.

Â So let's look at a particular attack. So in the attack, the adversary is going to

Â start by requesting the tag on a particular message m that's a one-block

Â message. So what does it mean to apply CBC to a one-block message? Well basically,

Â all you do is you just apply the function f directly. So what you get is the tag,

Â which is just the application of f directly to the one-block message m. Good,

Â so now the adversary has this value T. And now I claim that he can define this

Â message, M prime, which contains two blocks. The first block is M, the second

Â block is T XOR M. And I claim that the value T that he just received is a valid

Â tag for this two block message, M Prime. So let's see why that's true. Well, so

Â suppose we apply the raw CBC construction to this message M prime. So if you plug it

Â in directly what, let's see. The way it would work is first of all, the first

Â message M is processed by encrypting it directly using the function F. Then we XOR

Â the results with the second block with is T-XOR-M. And then we apply F to

Â the results of that. That is the definition of raw CBC. Now what do we know about

Â F(k,m)? F(k,m) is simply this value T by definition. So the next step we get is

Â essentially T-XOR-T-XOR-M. But this T- XOR-T simply goes away and what

Â we get is F(k,m). Which is, of course, T. And as a result, T is a valid MAC for the

Â two block message, (M, T-XOR-M). So the adversary was able to produce this

Â valid tag T for this two block message that he never queried. And therefore, he

Â was able to break the MAC. So let's look at the ECBC diagram for just one more

Â second. And let me point out that if you don't include this last encryption step in

Â the computation of the MAC, essentially, the MAC would be insecure because of the

Â attack that we just saw. And I'll tell you that there are many products that do this

Â incorrectly. And, in fact, there are even standards that do this incorrectly, so

Â that the resulting MAC is insecure. You now know that this needs to be done, and

Â you won't make this mistake yourself. So let's state the CBC and NMAC security

Â theorems. And so the statement is as usual for any message length that we'd like to,

Â apply the MAC to. Essentially for every PRF adversary A, there's an efficient

Â adversary B and, you know, these are kind of the usual statements. Where, the facts

Â that you need to know are the error terms which are kind of similar in both cases.

Â By the way, I'd like to point out that the analysis for CBC actually uses the fact

Â that F is a PRP even though we never had to invert F during the computation of

Â ECBC, the analysis is better if you assume that F is actually a PRP. In other words,

Â it's invertible, not just a function. For a MAC, the PRF need not be invertible.

Â So what these error terms basically say is that these MACs are secure, as long as,

Â key is not used to MAC more than square root of X or square root of K messages. So

Â for AES of course this would be a two to the 64. But I want to show you an example

Â of how you would use these bounds. So here, I've stated the security theorem again

Â for the CBC MAC, and Q here, again, is the number of messages that are MACed with a

Â particular key. So suppose we wanted to ensure that for all adversaries that the

Â adversary has an advantage of less than one over two to the 32 in distinguishing

Â the PRF from a truly random function. Suppose that is our goal. Well, by the

Â security theorem, what that means is we need to ensure that Q squared over X is

Â less than one over two to the 32, right. We want this term to be, well, I'm going

Â to ignore this two here just for simplicity. We want to ensure this term is

Â less than one over two to the 32 and this term, of course, is negligible, so we can

Â just ignore it. This would imply that this term is also less than one over two to the

Â 32. Okay, so if we want to ensure that the advantage is less than one over two to the

Â 32, we need to ensure that Q squared over X is less than one over two to the 32. For

Â AES, basically this means that after MACing two to the 48 messages, you have to

Â change your key. Otherwise, you won't achieve the security level. So you can

Â MAC, at most, two to the 48 messages. You notice that if I plug in triple DES, which

Â has a much shorter block, only 64 bits. The same result says you now have to

Â change your key every 65,000 messages. So this basically is quite problematic

Â whereas this is fine. This is actually a, a very fairly large number. For

Â AES this means you have to change your key only every 16 billion

Â messages which is perfectly reasonable. And so this is one of the reasons why AES

Â has a larger block, than triple DES. Some of these modes remain

Â secure and you don't have to change your key as often as you would with triple

Â DES. Okay, so I want to show you that in fact these attacks are not just in

Â the statements in the security theorem, in fact there really are real attacks that

Â correspond to these values. Now the macs really do become insecure after you sign,

Â you know, square root of X or square root of K messages. So I'm going to show you an

Â attack on both PRFs so either ECBC or NMAC. Assuming that the underlying

Â function is a PRP, is actually a block cipher like AES. Let's call F big, let's

Â say that F big is either F ECBC or F NMAC. So F big means that it's a PRF for large

Â messages. Now, it turns out both constructions have the following extension

Â property. Namely, if you give me a collision. On messages X and Y. Then, in

Â fact, that also implies a collision on an extension of X and Y. In other words, if I

Â append W to both X and Y, I'll also get a collision on the resulting words. So it's

Â fairly easy to convince yourself that this extension property holds, you do it just

Â by staring at the diagram for a second. And so imagine I give you two messages

Â that happen to collide at this point. Now remember, I assumed that F is a PRP. So

Â once you fix K1, it's a one to one function. So if the two messages happen to

Â map to the same value at the output. This means they also happen to map to the same

Â value at the output of the raw CBC function. But if they map to the same

Â value at the output of the raw CBC function, that means that if I add another

Â block, let's call it a W. And I take this output here. Then I'm computing the same

Â value for both messages. And I'll get, for both messages, I'll get the same value at

Â this point, too. And when I encrypt, again, with K1, I'll also get, you know,

Â so there's one more F here. I'll also get the same output, final output, after I've

Â appended the block W. Okay, so if the two values happen to be the same for two

Â distinct messages. If I appended block W to both messages, I'm still gonna get the

Â same value out. It's easy to convince yourself that the same is true for NMAC

Â as well. Okay, so both of these, PRFs have this extension property. So based on this,

Â we can define an attack. So here's the extension property stated again. And the

Â attack would work as follows. Suppose I issued square of Y chosen message

Â queries. So, for AES, remember, the value of Y is basically {0,1} to the 128. So this

Â would mean that I would be asking, two to the 64 shows in message queries. For just

Â arbitrary messages in the input space. Well, what will happen is, I'll obtain,

Â well, I'll obtain two to the 64 message MAC pairs. Now, we're gonna see in the

Â next module, actually, there's something called a birthday paradox. Some of you may

Â have heard of it already. It basically says that if I have two to the 64 random

Â elements of a space of size two to the 128, there's a good chance that two of

Â them are the same. So I'm gonna look for two distinct messages. MU and MV, for

Â which the corresponding MACs are the same. Okay, and as I said, by the birthday

Â paradox, these are very likely to exist. Once I have that, now I've basically found

Â MU and MV to have the same MAC. And as a result, what I can do is, I can extend MU

Â with an arbitrary word W, and ask for the tag for the word MU concatenated W. But

Â because NU and NV happen to have the same output, I know that NU concatenated W has

Â the same output as NV concatenated W. So now that I've obtained the output for NU

Â concatenated W, I also have the output for NV concatenated W. And therefore I have

Â obtained my forgery. Okay, so now T is also a forgery for the message MV

Â concatenated W which I've never asked before. And therefore, this is as valid as

Â a potential forgery. Okay, so this is kind of an acute attack and the bottom line

Â