0:00

In this segment, I want to tell you about another form of encryption, called format

Â preserving encryption. This is again something that comes up in practice quite

Â often, especially when it comes to encrypting credit card numbers. And, so,

Â let's see how it works. Remember that a typical credit card number is sixteen

Â digits, broken into four blocks of four digits each. You've probably heard before

Â that the first six digits are what's called the BIN number, which represent the

Â issuer ID. For example, all credit cards issued by VISA always start with the digit

Â four; all credit cards issued by MasterCard start with digits 51 to 55, and

Â so on and so forth. The next nine digits are actually the account number that is

Â specific to the, to the particular customer and the last digit is a check sum

Â that's computed from the previous fifteen digits. So there are about 20,000 BIN

Â numbers and so if you do the calculation it turns out there are about 2 to the 42

Â possible credit card numbers which corresponds to about 42 bits of

Â information that you need to encode if you want to represent a credit card number

Â compactly. Now the problem is this. Suppose we wanted to encrypt credit card

Â numbers, so that when the user swipes the credit card number at the point of sale

Â terminal, namely at the, you know, the merchant's cash register. The cash

Â register, this is called a point of sale terminal, goes ahead and encrypts the

Â credit card number using a key k and it's encrypted all the way until it goes

Â to the acquiring bank or maybe even beyond that, maybe even all the way when it goes

Â to Visa. Now, the problem is that these credit card numbers actually pass through

Â many, many, many processing points. All of them expect to get basically something

Â that looks like a credit card number as a result. So if we simply wanted to encrypt

Â something at the end point, at one end point, and decrypt it at the other end

Â point, it's actually not that easy because if all we did was encrypt it using AES,

Â even if we just used deterministic AES, the output of the encrypted credit card

Â number would be 128 bit block. Sixteen bytes that would be, that would need to be

Â sent from one system to the next, until it reaches its destination. But each one of

Â these processors, then, would have to be changed, because they're all expecting to

Â get credit card numbers. And so the question is, can we encrypt at the cash

Â register, such that the resulting encryption itself looks like a credit card

Â number. And as a result, none of these intermediate systems would have to be

Â changed. Only the endpoints would have to be changed. And in doing so we would

Â actually obtain end-to-end encryption without actually having to change a lot of

Â software along the way. So the question then is, again, can we have this mechanism

Â called format preserving encryption, where encrypting a credit card itself produces

Â something that looks like a credit card? So that's our goal. The encrypted credit

Â card number should look like a regular valid credit card number. So this is the

Â goal of format preserving encryption. More abstractly what it is we're trying to do,

Â is basically build a pseudo random permutation on the set zero to S minus one

Â for any given S. So for the set of credit card numbers, S would be roughly, you

Â know, two to the 42. In fact, it's gonna be, not exactly two to the 42. It's gonna

Â be some funny numbers that's around two to the 42. And our goal is to build a PRP

Â that acts exactly on the interval, zero to S minus one. And what we're given as input

Â is some PRF, say, something like AES, that acts on much larger blocks, say, acts on

Â sixteen byte blocks. So we're trying to, in some sense, shrink the domain of the

Â PRF to make it fit the data that we're given. And the question is basically how

Â to do that. Well, once we have such a construction we can easily use it to

Â encrypt credit card numbers. What we would do is we would take the, a given credit

Â card number. We would encode it such that now it's represented as a number between

Â zero and the total number of credit card numbers. Then we would apply our PRP so we

Â would encrypt this credit card number, you know, using construction number two from

Â the deterministic encryption segment. And then we would map the final number back to

Â be a regular, to look like val--, regular, valid credit card number and then send

Â this along the way. So this is, this is again non expanding encryption. The

Â best we can do, as we said before is to encrypt using a PRP except again the

Â technical challenge is we need a PRP that acts on this particular funny looking set

Â from zero to S-1 for this prespecified value of S. So my goal is to show you how

Â to construct this and once we see how to do it, you will also know how to encrypt

Â credit card number so that the resulting cipher text is itself a credit card

Â number. So the construction works in two steps. The first thing we do is we shrink

Â our PRF from the set {0,1} to the n, say {0,1} to the 128 in the case of AES,

Â to {0,1} to the t where t is the closest power of two to the value S.

Â 4:44

So say S is some crazy number around two to the 41. T would basically be then 42

Â because that's the closest power of two that's just above the value S. So the

Â construction is basically gonna use the Luby-Rackoff construction. What we need

Â is a PRF F prime that acts on blocks of size t over two. So imagine for example

Â in the credit card case, t would be 42 because two to the 42 we said is the

Â closest power of two that's bigger than, than, than S. S is the number of credit,

Â total number of credit cards. Then we would need a PRF now that acts on 21-bit

Â inputs. So one way to do that is simply to take the AES block cipher, treat it as a

Â PRF on 128-bit inputs, and then simply truncate the output to the least

Â significant 21 bits. Okay, so we were given a 21 bit value. We would append

Â zeros to it so that we get 128 bits as a result. We would apply AES to that and

Â then we would truncate back to 21 bits. Now I should tell you that that's actually

Â a simple way to do it but it's actually not the best way to do it. There are

Â actually better ways to truncate a PRF on n bits to a PRF on t over two bits.

Â But for our pur--, for the purposes of this segment, the truncation method I just said

Â is good enough. So let's just use that particular method. Okay, so now, we've

Â converted our AES block cipher into a PRF on t over two bits, say, on 21 bits. But

Â what we really want is a PRP. And so what we'll do is we'll plug our new PRF, F prime,

Â directly into the Luby-Rackoff construction. If you remember, this is

Â basically a Feistel construction. We saw this when we talked about DES. It's a,

Â Luby-Rackoff used three rounds, and we know that this converts a secure PRF into

Â a secure PRP on twice the block size. In other words, we started from a secure PRF

Â on 21 bits, and that will give us, and Luby-Rackoff will give us a secure PRF on

Â 42 bits, which is what we wanted. Now, I should tell you that the error parameters

Â in the Luby-Rackoff construction are not as good as they could be. And, in fact, a

Â better thing to do is to use seven rounds of Feistel. So in other words, we'll do

Â this seven times; every time we'll use a different key. So you notice, here, we're

Â only using three keys. We should be using, we should be doing this seven times using

Â seven different keys. And then there's a nice result, due to Patarin, that

Â shows that the seven-round construction basically has as good an error

Â terms as possible. So the error terms in the security theorem will basically be two

Â to the T. Okay. So now we have a pseudo random permutation that operates on close

Â power of two to the value of S. But that's not good enough. We actually want to get a

Â pseudo random permutation on the set zero to S minus one. So step two will take us

Â down from {0,1} to the T, to the actual set zero to the S minus one that we're

Â interested in. And this construction is, again, really, really cute, so let me show

Â you how it works. So, again, we're given this PRP that operates on a power of two.

Â And we wanna go down to a PRP that operates on something slightly smaller.

Â Okay, so here's on the construction works. Basically we're given input X in the set

Â zero to S minus one that we want. And what we're going to do is we're going to

Â iterate the following procedure again and again. So first of all we map X into

Â this temporary variable Y. And now we're just gonna encrypt the input X and put the

Â result into Y. If Y is inside of our target set, we stop and we output Y. If

Â not we iterate this again and again and again and again and again until finally Y

Â falls into our target set and then we output that value. So in a picture, let me

Â explain how this works. So we start from a point in our target set. And now, now we

Â apply the, the function E and we might be mapped into this point outside our target

Â set, so we're not gonna stop. So now we apply the function E again and we might

Â be mapped into this point and now we apply the function E again and now all of a

Â sudden we map into this point, and then we stop, and that's our output. Okay, so

Â that's how we map points between from zero to S minus one, to other points in the

Â zero to S minus one. It should be pretty clear that this is invertible. To invert,

Â all I'll do is I'll just decrypt and walk, kind of, in the opposite direction. So

Â I'll decrypt, and then decrypt, and then decrypt, until finally, I fall into the

Â set, zero to S minus one. And that gives me the inverse of the point that I wanted

Â to. So this is, in fact, a PRP. The question is whether it's a secure PRP, and

Â we'll see that in just a minute. But before that, let me ask you, how many iterations

Â do you expect that we're gonna need? And I wanna remind you again, before I ask you

Â that question, that, in fact, S is less than two to the T, but is more than two to the T-1.

Â So in particular S is greater than two to the T over two. And my question to you

Â now is how many iterations are we gonna need, on average, until this process converges?

Â 9:32

So the answer is we're going to need two iterations,

Â so really just a small constant number of iterations. And so this

Â will give us a very, very efficient mechanism that will move us down from

Â {0,1} to the T to zero to the S-1. So now when we talk about security, it turns out

Â this step here is tight. In other words, there is really no additional security

Â cost to going down from two to the T to zero to S-1. And the reason that's true,

Â it's actually again a very cute theorem to prove, but I, I won't do it here. Maybe

Â I'll leave it as an exercise for you guys to argue. Which says that if you give me

Â any two sets Y and X, where Y is contained inside of X, then applying the

Â transformation that we just saw to a random permutation from X to X actually

Â gives a random permutation from Y to Y. So this means that if we started with a truly

Â random permutation on X, you'll end up with a truly random permutation on Y. And

Â since, well, the permutation we're starting with is indistinguishable from

Â random on X, we'll end up with a permutation that's indistinguishable from

Â random on Y. Okay, so this is a very tight construction and as I said, the first step

Â actually is basically, the analysis is the same as the Luby-Rackoff analysis. In

Â fact, it's better to use as I said, the Patarin analysis using seven rounds. So

Â actually here, there's a bit of cost in security. But, overall, we get a

Â construction that is a secure PRP for actually, not such good security

Â parameters, but nevertheless, this is a good secure PRP that we can construct that

Â actually will operate on the range zero to S minus one. This in turn will allow us to

Â encrypt credit card numbers such that the cipher text looks like a credit card

Â number. And again, I want to emphasize that there's no integrity here. The best

Â we can achieve here is just deterministic CPA security. No cipher text

Â integrity, and no randomness at all. Okay. So this concludes this module. And as

Â usual I want to point to a few reading materials that you can take a look at if

Â you're interested in learning more about anything that we discussed in this module.

Â So first of all, the HKDF construction that we talked about for key derivation is

Â described in a paper from 2010 by Hugo Krawczyk. Deterministic encryption, The

Â SIV mode that we described is discussed in this paper over here. The EME mode that we

Â described that allows us to build a, a wider block pseudo random permutation is

Â described in this paper here by Halevi and Rogaway. Tweakable block ciphers that

Â actually led to the XDS mode of operation that's used for disk encryption is

Â described in this paper here. And finally, a format preserving encryption is described

Â in this last paper that I point to over here. Okay so this concludes this module.

Â And in the next module we gonna start looking at how to do key exchange.

Â