0:00

In this segment, we're going to construct two classic MACS, the CBC-MAC and the

NMAC. Recall in the last segment, we said that if you give me a secure PRF, then

that secure PRF can actually be used to construct a secure MAC, simply by defining

the signature on the message m as the value of the function at the point m. The

only caveat was that the output of the PRF F had to be large. For example, it

could be 80 bits or 128 bits, and that would generate a secure MAC Now we also

said that because AES is a secure PRF, essentially AES already gives us a secure

MAC, except that it can only process sixteen byte messages. And the question

now is, given a PRF for short messages like AES for sixteen byte messages, can we

construct a PRF for long messages that are potentially gigabytes long? And this is

shorthand for what's coming. I'm going to denote by X, the set {0,1} to the N where N

is the block size for the underlying PRF. So since we're always going to be thinking

of AES as the underlying PRF, you can think of N as essentially 128 bits. So

the first construction is called the encrypted CBC MAC, or ECBC for short.

Encrypted CBC MAC. So ECBC uses a PRF that takes messages in the set X {0,1} to the N

and outputs messages in the same set X. And what we're going to be building is a

PRF that basically takes pairs of keys. It takes very long messages, in fact

arbitrarily long messages, and I'll explain this in just a second and it

outputs also tags in {0,1} to the N. So that's our goal. Now what is this X to the

less than or equal to L? The point here is that in fact CBC MAC can take very long

messages up to L blocks. L could be a million or a billion. But it could also

take variable length messages as inputs. In other words, X less than or equal to L

means that we allow the input to be messages that contain an arbitrary number

of blocks between one and L. So each CBC can process messages that are one block

long, two blocks long, ten blocks long, 100 blocks long. It's perfectly fine to

give it variable size inputs. But just to keep the discussion simple, we up our

bounds the maximum length by capital L. So let's see how ECBC works. Well, we start

by taking our message and breaking it into blocks, each block is as long as a block

of the underlying function f, and then essentially we run through the CBC chain

except that we don't output intermediate values. So you notice we basically encrypt

the first block and then feed the results into the XOR with the second block and

then feed that into f again, and we do that again and again and again and finally

we get a value out here. Which is called the CBC outputs of this long chain. And

then, I would like to point your attention to the fact that we do one more encryption

step. And this step is actually done using an independent key, K1. That's different

and chosen independently of the key K, and finally the output gives us the tag. So in

this case the tag would be N bits long, but we also mentioned in the previous

segment that it's fine to truncate the tag to less than N bits long as long as one

over two to the end is negligible. So now you can see that FECBC takes a pair of

keys as inputs, it can process variable length messages and it produces

an output in the set X. So you might be wondering what this last encryption step

is for. And I'll tell you that the function that's defined without this last

encryption step is called the raw CBC function. In other words, if we simply

stop processing over here, and we take that as the output, that's called raw CBC.

And as we'll see in a minute, raw CBC is actually not a secure MAC. So this last

step is actually critical for making the MAC secure. So another class of

construction for converting a small PRF into a large PRF is called NMAC,

for Nested MAC. Now, the NMAC starts from PRF that, as before, takes inputs

in X, but outputs elements in the key space. And remember that for CBC, the

output has to be in the set X. Here, the output needs to be in the key space K. And

again, we basically obtain the PRF F-NMAC, which takes pairs of keys as inputs.

Again, can process variable length messages up until L blocks. And the output

is an element in the key space. And the way NMAC works is kind of, starts as

before. We take our message, and we break it into blocks. Each block is, again, as

big as the block length of the underlying PRF. And now we take our key and we feed

our key as the key input into the function F. And the message block is given as the

data input into the function F. What comes out is the key for the next block of NMAC.

So now we have a new key for the next evaluation of the PRF. And the data for

the next evaluation is the next message block and so on and so forth until we

reach the final output. The final output is gonna be an element in K. Okay? And

just as before, in fact, if we stop here. The function that we obtain is called a

cascade function. And we're gonna look at cascade in more detail in just a minute.

So cascade will output an element in K. However, that, as we'll see again, is not

a secure MAC. To get a secure MAC, what we do is we need to map this element T,

which is in K, into the set X. And so, typically, as we'll see, NMAC is used

with, PRFs, where the block length, X, is much bigger than the key length. And so

what we do is we simply append fixed pad. fpad is called a fixed pad that gets

appended to this tag T. And then this becomes, this input here, this block here

becomes an element of X. We feed this into the function, and again, notice here that

there's an independent key that's being used for the last encryption step. And

then finally, the last tag is an element of K which we output as the output of

NMAC. So remember without the last encryption step, the function is called a

cascade. With the last step as we'll see which is necessary for security, we

actually get a PRF which outputs elements in K, and can process variable length

messages that are up to L blocks long. Alright so that is the NMAC

construction. So now we have two MACs. That we can use to build a large PRF, from AES,

basically. So before I analyze the security of MAC constructions, I'd like

you to understand better what the last encryption step is for. So, let's start

with NMAC. So I explained that it is actually very easy too see that if we

omitted the last encryption step. In other words, if we just use the cascade

function, then the MAC will be completely insecure. Okay so suppose we look at this

MAC to find over here. In other words, all we do is we output directly the cascade

applied to m without the last encryption step. So let me ask you how do you forge

tags in this MAC. And I guess I've kinda given it away that this answer isn't

correct. So I hope everybody sees that the answer is, that, in fact, given one chosen

message query, you can mount an existential forgery. And the reason is,

I'll show you in a second in the diagram, but let me write it out in symbols first.

The reason if, if you're given the output of the cascade function applied to a

message M. I can derive from it, me being the adversary, I can derive from it the

cascade applied to the message M concatenated W for any message W, for any

W. So first of all, it should be clear that this is enough to mount an

existential forgery because essentially by asking for a tag on this message I obtain

the tag on this longer message which I can then output as my forgery. Okay, so the

MAC is insecure because given a MAC in one message, I can produce the MAC in another

message. But let's go back to the diagram describing cascade, and see why this is

true. And so let's see what happens if this last step isn't there. As an

attacker, what I can do, I can add one more block here, which we called W. And

then, then basically, I can take the output of cascade, which is this value T.

And we can simply apply the function F to it again. So here I'll take this T value.

I'll plug it in to F. And plug my last block W into it and what I'll get is T

prime which is, well, the evaluation of cascade on the message M concatenated W.

In [inaudible] calculated t prime, which I can use for my existential forgery. Okay, so this

kind of shows you why this property of cascade holes. This is called an extension

attack, Where giving a tag of the message m I can deduce the tag for the extension

of m. And in fact for any extension of my choice. So basically, Cascade is

completely insecure if we don't do this last encryption step, and the last

encryption step here basically prevents this type of extension attack. I can tell

you by the way that in fact extension attacks are the only attacks on a cascade

and that could be made to precise. But I'm not gonna do that here. The next question

is, why did we have the extra encryption block in the ECBC construction? So again,

let me show you that without this extra encryption block ECBC is insecure. So

let's define a map that uses raw CBC. In other words, it's the same as CBC MAC, but

without the last encryption step. And let's see that, that MAC is also insecure.

Except now, the attack is a little bit more difficult than a simple extension

attack. So suppose the attacker is given this value, the raw CBC value for a

particular message M. And now, the attacker wants to extend and compute the

MAC on some message M, concatenated on with some word W. So here's our W well the

core attacker can take this value and try to XOR the two together but now you

realize he has to evaluate the function. At this point. But he doesn't know what

this key K is. And as a result, he doesn't know what the output of the function is.

So he simply can't just depend on block W, and compute the value of

raw CBC on N concatenated W. However, it turns out they he can actually evaluate

this function. By using the chosen message attack. And I wanna show you how to do

that. Okay, so we said that basically so our goal is to show raw CBC is insecure.

So let's look at a particular attack. So in the attack, the adversary is going to

start by requesting the tag on a particular message m that's a one-block

message. So what does it mean to apply CBC to a one-block message? Well basically,

all you do is you just apply the function f directly. So what you get is the tag,

which is just the application of f directly to the one-block message m. Good,

so now the adversary has this value T. And now I claim that he can define this

message, M prime, which contains two blocks. The first block is M, the second

block is T XOR M. And I claim that the value T that he just received is a valid

tag for this two block message, M Prime. So let's see why that's true. Well, so

suppose we apply the raw CBC construction to this message M prime. So if you plug it

in directly what, let's see. The way it would work is first of all, the first

message M is processed by encrypting it directly using the function F. Then we XOR

the results with the second block with is T-XOR-M. And then we apply F to

the results of that. That is the definition of raw CBC. Now what do we know about

F(k,m)? F(k,m) is simply this value T by definition. So the next step we get is

essentially T-XOR-T-XOR-M. But this T- XOR-T simply goes away and what

we get is F(k,m). Which is, of course, T. And as a result, T is a valid MAC for the

two block message, (M, T-XOR-M). So the adversary was able to produce this

valid tag T for this two block message that he never queried. And therefore, he

was able to break the MAC. So let's look at the ECBC diagram for just one more

second. And let me point out that if you don't include this last encryption step in

the computation of the MAC, essentially, the MAC would be insecure because of the

attack that we just saw. And I'll tell you that there are many products that do this

incorrectly. And, in fact, there are even standards that do this incorrectly, so

that the resulting MAC is insecure. You now know that this needs to be done, and

you won't make this mistake yourself. So let's state the CBC and NMAC security

theorems. And so the statement is as usual for any message length that we'd like to,

apply the MAC to. Essentially for every PRF adversary A, there's an efficient

adversary B and, you know, these are kind of the usual statements. Where, the facts

that you need to know are the error terms which are kind of similar in both cases.

By the way, I'd like to point out that the analysis for CBC actually uses the fact

that F is a PRP even though we never had to invert F during the computation of

ECBC, the analysis is better if you assume that F is actually a PRP. In other words,

it's invertible, not just a function. For a MAC, the PRF need not be invertible.

So what these error terms basically say is that these MACs are secure, as long as,

key is not used to MAC more than square root of X or square root of K messages. So

for AES of course this would be a two to the 64. But I want to show you an example

of how you would use these bounds. So here, I've stated the security theorem again

for the CBC MAC, and Q here, again, is the number of messages that are MACed with a

particular key. So suppose we wanted to ensure that for all adversaries that the

adversary has an advantage of less than one over two to the 32 in distinguishing

the PRF from a truly random function. Suppose that is our goal. Well, by the

security theorem, what that means is we need to ensure that Q squared over X is

less than one over two to the 32, right. We want this term to be, well, I'm going

to ignore this two here just for simplicity. We want to ensure this term is

less than one over two to the 32 and this term, of course, is negligible, so we can

just ignore it. This would imply that this term is also less than one over two to the

32. Okay, so if we want to ensure that the advantage is less than one over two to the

32, we need to ensure that Q squared over X is less than one over two to the 32. For

AES, basically this means that after MACing two to the 48 messages, you have to

change your key. Otherwise, you won't achieve the security level. So you can

MAC, at most, two to the 48 messages. You notice that if I plug in triple DES, which

has a much shorter block, only 64 bits. The same result says you now have to

change your key every 65,000 messages. So this basically is quite problematic

whereas this is fine. This is actually a, a very fairly large number. For

AES this means you have to change your key only every 16 billion

messages which is perfectly reasonable. And so this is one of the reasons why AES

has a larger block, than triple DES. Some of these modes remain

secure and you don't have to change your key as often as you would with triple

DES. Okay, so I want to show you that in fact these attacks are not just in

the statements in the security theorem, in fact there really are real attacks that

correspond to these values. Now the macs really do become insecure after you sign,

you know, square root of X or square root of K messages. So I'm going to show you an

attack on both PRFs so either ECBC or NMAC. Assuming that the underlying

function is a PRP, is actually a block cipher like AES. Let's call F big, let's

say that F big is either F ECBC or F NMAC. So F big means that it's a PRF for large

messages. Now, it turns out both constructions have the following extension

property. Namely, if you give me a collision. On messages X and Y. Then, in

fact, that also implies a collision on an extension of X and Y. In other words, if I

append W to both X and Y, I'll also get a collision on the resulting words. So it's

fairly easy to convince yourself that this extension property holds, you do it just

by staring at the diagram for a second. And so imagine I give you two messages

that happen to collide at this point. Now remember, I assumed that F is a PRP. So

once you fix K1, it's a one to one function. So if the two messages happen to

map to the same value at the output. This means they also happen to map to the same

value at the output of the raw CBC function. But if they map to the same

value at the output of the raw CBC function, that means that if I add another

block, let's call it a W. And I take this output here. Then I'm computing the same

value for both messages. And I'll get, for both messages, I'll get the same value at

this point, too. And when I encrypt, again, with K1, I'll also get, you know,

so there's one more F here. I'll also get the same output, final output, after I've

appended the block W. Okay, so if the two values happen to be the same for two

distinct messages. If I appended block W to both messages, I'm still gonna get the

same value out. It's easy to convince yourself that the same is true for NMAC

as well. Okay, so both of these, PRFs have this extension property. So based on this,

we can define an attack. So here's the extension property stated again. And the

attack would work as follows. Suppose I issued square of Y chosen message

queries. So, for AES, remember, the value of Y is basically {0,1} to the 128. So this

would mean that I would be asking, two to the 64 shows in message queries. For just

arbitrary messages in the input space. Well, what will happen is, I'll obtain,

well, I'll obtain two to the 64 message MAC pairs. Now, we're gonna see in the

next module, actually, there's something called a birthday paradox. Some of you may

have heard of it already. It basically says that if I have two to the 64 random

elements of a space of size two to the 128, there's a good chance that two of

them are the same. So I'm gonna look for two distinct messages. MU and MV, for

which the corresponding MACs are the same. Okay, and as I said, by the birthday

paradox, these are very likely to exist. Once I have that, now I've basically found

MU and MV to have the same MAC. And as a result, what I can do is, I can extend MU

with an arbitrary word W, and ask for the tag for the word MU concatenated W. But

because NU and NV happen to have the same output, I know that NU concatenated W has

the same output as NV concatenated W. So now that I've obtained the output for NU

concatenated W, I also have the output for NV concatenated W. And therefore I have

obtained my forgery. Okay, so now T is also a forgery for the message MV

concatenated W which I've never asked before. And therefore, this is as valid as

a potential forgery. Okay, so this is kind of an acute attack and the bottom line