0:00

In this segment, I want to tell you about another form of encryption, called format

preserving encryption. This is again something that comes up in practice quite

often, especially when it comes to encrypting credit card numbers. And, so,

let's see how it works. Remember that a typical credit card number is sixteen

digits, broken into four blocks of four digits each. You've probably heard before

that the first six digits are what's called the BIN number, which represent the

issuer ID. For example, all credit cards issued by VISA always start with the digit

four; all credit cards issued by MasterCard start with digits 51 to 55, and

so on and so forth. The next nine digits are actually the account number that is

specific to the, to the particular customer and the last digit is a check sum

that's computed from the previous fifteen digits. So there are about 20,000 BIN

numbers and so if you do the calculation it turns out there are about 2 to the 42

possible credit card numbers which corresponds to about 42 bits of

information that you need to encode if you want to represent a credit card number

compactly. Now the problem is this. Suppose we wanted to encrypt credit card

numbers, so that when the user swipes the credit card number at the point of sale

terminal, namely at the, you know, the merchant's cash register. The cash

register, this is called a point of sale terminal, goes ahead and encrypts the

credit card number using a key k and it's encrypted all the way until it goes

to the acquiring bank or maybe even beyond that, maybe even all the way when it goes

to Visa. Now, the problem is that these credit card numbers actually pass through

many, many, many processing points. All of them expect to get basically something

that looks like a credit card number as a result. So if we simply wanted to encrypt

something at the end point, at one end point, and decrypt it at the other end

point, it's actually not that easy because if all we did was encrypt it using AES,

even if we just used deterministic AES, the output of the encrypted credit card

number would be 128 bit block. Sixteen bytes that would be, that would need to be

sent from one system to the next, until it reaches its destination. But each one of

these processors, then, would have to be changed, because they're all expecting to

get credit card numbers. And so the question is, can we encrypt at the cash

register, such that the resulting encryption itself looks like a credit card

number. And as a result, none of these intermediate systems would have to be

changed. Only the endpoints would have to be changed. And in doing so we would

actually obtain end-to-end encryption without actually having to change a lot of

software along the way. So the question then is, again, can we have this mechanism

called format preserving encryption, where encrypting a credit card itself produces

something that looks like a credit card? So that's our goal. The encrypted credit

card number should look like a regular valid credit card number. So this is the

goal of format preserving encryption. More abstractly what it is we're trying to do,

is basically build a pseudo random permutation on the set zero to S minus one

for any given S. So for the set of credit card numbers, S would be roughly, you

know, two to the 42. In fact, it's gonna be, not exactly two to the 42. It's gonna

be some funny numbers that's around two to the 42. And our goal is to build a PRP

that acts exactly on the interval, zero to S minus one. And what we're given as input

is some PRF, say, something like AES, that acts on much larger blocks, say, acts on

sixteen byte blocks. So we're trying to, in some sense, shrink the domain of the

PRF to make it fit the data that we're given. And the question is basically how

to do that. Well, once we have such a construction we can easily use it to

encrypt credit card numbers. What we would do is we would take the, a given credit

card number. We would encode it such that now it's represented as a number between

zero and the total number of credit card numbers. Then we would apply our PRP so we

would encrypt this credit card number, you know, using construction number two from

the deterministic encryption segment. And then we would map the final number back to

be a regular, to look like val--, regular, valid credit card number and then send

this along the way. So this is, this is again non expanding encryption. The

best we can do, as we said before is to encrypt using a PRP except again the

technical challenge is we need a PRP that acts on this particular funny looking set

from zero to S-1 for this prespecified value of S. So my goal is to show you how

to construct this and once we see how to do it, you will also know how to encrypt

credit card number so that the resulting cipher text is itself a credit card

number. So the construction works in two steps. The first thing we do is we shrink

our PRF from the set {0,1} to the n, say {0,1} to the 128 in the case of AES,

to {0,1} to the t where t is the closest power of two to the value S.

4:44

So say S is some crazy number around two to the 41. T would basically be then 42

because that's the closest power of two that's just above the value S. So the

construction is basically gonna use the Luby-Rackoff construction. What we need

is a PRF F prime that acts on blocks of size t over two. So imagine for example

in the credit card case, t would be 42 because two to the 42 we said is the

closest power of two that's bigger than, than, than S. S is the number of credit,

total number of credit cards. Then we would need a PRF now that acts on 21-bit

inputs. So one way to do that is simply to take the AES block cipher, treat it as a

PRF on 128-bit inputs, and then simply truncate the output to the least

significant 21 bits. Okay, so we were given a 21 bit value. We would append

zeros to it so that we get 128 bits as a result. We would apply AES to that and

then we would truncate back to 21 bits. Now I should tell you that that's actually

a simple way to do it but it's actually not the best way to do it. There are

actually better ways to truncate a PRF on n bits to a PRF on t over two bits.

But for our pur--, for the purposes of this segment, the truncation method I just said

is good enough. So let's just use that particular method. Okay, so now, we've

converted our AES block cipher into a PRF on t over two bits, say, on 21 bits. But

what we really want is a PRP. And so what we'll do is we'll plug our new PRF, F prime,

directly into the Luby-Rackoff construction. If you remember, this is

basically a Feistel construction. We saw this when we talked about DES. It's a,

Luby-Rackoff used three rounds, and we know that this converts a secure PRF into

a secure PRP on twice the block size. In other words, we started from a secure PRF

on 21 bits, and that will give us, and Luby-Rackoff will give us a secure PRF on

42 bits, which is what we wanted. Now, I should tell you that the error parameters

in the Luby-Rackoff construction are not as good as they could be. And, in fact, a

better thing to do is to use seven rounds of Feistel. So in other words, we'll do

this seven times; every time we'll use a different key. So you notice, here, we're

only using three keys. We should be using, we should be doing this seven times using

seven different keys. And then there's a nice result, due to Patarin, that

shows that the seven-round construction basically has as good an error

terms as possible. So the error terms in the security theorem will basically be two

to the T. Okay. So now we have a pseudo random permutation that operates on close

power of two to the value of S. But that's not good enough. We actually want to get a

pseudo random permutation on the set zero to S minus one. So step two will take us

down from {0,1} to the T, to the actual set zero to the S minus one that we're

interested in. And this construction is, again, really, really cute, so let me show

you how it works. So, again, we're given this PRP that operates on a power of two.

And we wanna go down to a PRP that operates on something slightly smaller.

Okay, so here's on the construction works. Basically we're given input X in the set

zero to S minus one that we want. And what we're going to do is we're going to

iterate the following procedure again and again. So first of all we map X into

this temporary variable Y. And now we're just gonna encrypt the input X and put the

result into Y. If Y is inside of our target set, we stop and we output Y. If

not we iterate this again and again and again and again and again until finally Y

falls into our target set and then we output that value. So in a picture, let me

explain how this works. So we start from a point in our target set. And now, now we

apply the, the function E and we might be mapped into this point outside our target

set, so we're not gonna stop. So now we apply the function E again and we might

be mapped into this point and now we apply the function E again and now all of a

sudden we map into this point, and then we stop, and that's our output. Okay, so

that's how we map points between from zero to S minus one, to other points in the

zero to S minus one. It should be pretty clear that this is invertible. To invert,

all I'll do is I'll just decrypt and walk, kind of, in the opposite direction. So

I'll decrypt, and then decrypt, and then decrypt, until finally, I fall into the

set, zero to S minus one. And that gives me the inverse of the point that I wanted

to. So this is, in fact, a PRP. The question is whether it's a secure PRP, and

we'll see that in just a minute. But before that, let me ask you, how many iterations

do you expect that we're gonna need? And I wanna remind you again, before I ask you

that question, that, in fact, S is less than two to the T, but is more than two to the T-1.

So in particular S is greater than two to the T over two. And my question to you

now is how many iterations are we gonna need, on average, until this process converges?

9:32

So the answer is we're going to need two iterations,

so really just a small constant number of iterations. And so this

will give us a very, very efficient mechanism that will move us down from

{0,1} to the T to zero to the S-1. So now when we talk about security, it turns out

this step here is tight. In other words, there is really no additional security

cost to going down from two to the T to zero to S-1. And the reason that's true,

it's actually again a very cute theorem to prove, but I, I won't do it here. Maybe

I'll leave it as an exercise for you guys to argue. Which says that if you give me

any two sets Y and X, where Y is contained inside of X, then applying the

transformation that we just saw to a random permutation from X to X actually

gives a random permutation from Y to Y. So this means that if we started with a truly

random permutation on X, you'll end up with a truly random permutation on Y. And

since, well, the permutation we're starting with is indistinguishable from

random on X, we'll end up with a permutation that's indistinguishable from

random on Y. Okay, so this is a very tight construction and as I said, the first step

actually is basically, the analysis is the same as the Luby-Rackoff analysis. In

fact, it's better to use as I said, the Patarin analysis using seven rounds. So

actually here, there's a bit of cost in security. But, overall, we get a

construction that is a secure PRP for actually, not such good security

parameters, but nevertheless, this is a good secure PRP that we can construct that

actually will operate on the range zero to S minus one. This in turn will allow us to

encrypt credit card numbers such that the cipher text looks like a credit card

number. And again, I want to emphasize that there's no integrity here. The best

we can achieve here is just deterministic CPA security. No cipher text

integrity, and no randomness at all. Okay. So this concludes this module. And as

usual I want to point to a few reading materials that you can take a look at if

you're interested in learning more about anything that we discussed in this module.

So first of all, the HKDF construction that we talked about for key derivation is

described in a paper from 2010 by Hugo Krawczyk. Deterministic encryption, The

SIV mode that we described is discussed in this paper over here. The EME mode that we

described that allows us to build a, a wider block pseudo random permutation is

described in this paper here by Halevi and Rogaway. Tweakable block ciphers that

actually led to the XDS mode of operation that's used for disk encryption is

described in this paper here. And finally, a format preserving encryption is described

in this last paper that I point to over here. Okay so this concludes this module.

And in the next module we gonna start looking at how to do key exchange.