This course will introduce you to the foundations of modern cryptography, with an eye toward practical applications.

Loading...

From the course by University of Maryland, College Park

Cryptography

424 ratings

This course will introduce you to the foundations of modern cryptography, with an eye toward practical applications.

From the lesson

Week 2

Computational Secrecy and Principles of Modern Cryptography

- Jonathan KatzProfessor, University of Maryland, and Director, Maryland Cybersecurity Center

Maryland Cybersecurity Center

[SOUND] In this lecture we're going to

talk about the very important concept of, Pseudorandomness.

This is going to serve as an important building block for

computationally secret encryption schemes.

And it's actually an important concept that comes up very frequently

in cryptography.

Let's first step back for a moment, and ask ourselves, what does random mean?

Or more precisely, what do we mean by uniform?

And actually I'm going to use the term, the terms random and

uniform interchangeably.

But, when I want to be precise, when I want to be careful I'll real,

I'll only use the term uniform, when I mean the uniform distribution.

Well we can ask ourselves the following question, if I give you three bit strings,

which of these three bit strings is a uniform string?

Well the first one, has a pattern.

010101 et cetera.

The third one, also has an even more obvious pattern.

It's just a string of all zeros.

So the second one, might look to you,

as if that's the only one of the three that should qualify as being uniform.

But in fact, if you generated a string uniformly at random,

then each of the three strings shown here,

occurs here with probability exactly 2 to the minus 16.

That is, each of these strings occurs with the same exact probability,

if we generate a 16-bit string uniformly at random.

So, we really can't say that any one of these strings is random,

and another one isn't, and in some sense they're all equally random.

What we see from this example, is that randomness is not a property of a string,

but rather it's a property of a distribution over strings.

What I mean by that is that no particular string, in the previous case no

particular 16-bit string, can be said to be anymore random or

anymore uniform, than any other 16-bit string.

But what we can talk about, is the uniformity of a distribution on strings,

a distribution that results in or generates, a 16-bit string.

Just to be a little bit more careful here, a distribution on n bit strings is

simply a function, called D here, that assigns to each possible

string of length n, a probability in the range of 0 to 1.

You can view this as defining, a random variable,

which takes on values that are n bit strings.

And furthermore we of course require the condition that the sum over all

these probabilities is equal to 1.

So we specify a distribution on n-bit strings, by specifying for

all, for each possible n-bit string,

what the probability is with which that string is sampled.

So a uniform distribution on n-bit strings is simply the distribution that I'll

denote by use of n, which assigns equal probability,

2 to the minus n, to every n-bit string.

That is the distribution in which Un of x, the probability with which the string x

occurs, is 2 to the minus n, for every string x of length n.

Give this preceding discussion, what should we say that pseudorandom means?

Well intuitively, what we're trying to capture is the idea,

that a particular string is pseudorandom,

if it can't be distinguished from a uniform or a random string.

But as we saw before, if we try to ask ourselves, which of some set of strings or

which of these three strings, is pseudorandom.

It doesn't really make sense,

the question doesn't make sense the same way it didn't make sense before.

And again, what we see here, is that pseudorandomness,

is not going to be defined as a property of any particular string, but

rather pseudorandomness is a property of a distribution on strings.

So we can't really speak, of a pseudorandom string, but

what we can talk about is a pseudorandom distribution on strings.

Our first definition or our first candidate definition,

of pseudorandomness, might be something like the follow, like the following.

And this is actually what people, did historically.

When they were looking at the case of pseudorandomness back in the 1950s.

Let's fix some distribution D, on n-bit strings.

So we just defined what a distribution is a, a moment ago.

And we'll just fix any such distribution.

By way of notation, I'll write x left arrow D,

to refer to sampling the string x, according to distribution D.

That is picking a string according to distribution D,

and assigning that string to the variable x.

Again, historically speaking, D was considered to

be a pseudorandom distribution, if it passed a bunch of statistical tests, so

what do I mean by that.

Well what I mean by that is, if we look at for example the probability,

with which the first bit of x is 1, when we sample x according to distribution D.

Then that probability should approximate, the probability with which the first bit

of x would be 1, if x were chosen from the uniform distribution.

If x were chosen from the uniform distribution,

the probability that it's first bit would be equal to 1, would be exactly one-half.

So we require that when we sample x according to distribution D,

the first bit of x should be 1 with probability roughly one-half.

Similarly, if we look at the probability with which the parody, of the bits of x,

that is the XOR, of all the bits in x, is 1.

Well, if D is a pseudorandom distribution then that probability should

approximate the probability with which,

the parody of x is 1, when x is sampled from the uniform distribution.

That probability of exactly one-half.

So therefore if D is pseudorandom, the probability with which the parody of

x is 1 when x is sampled from D, should also be close to one-half.

And more generally we can fix, some set, of statistical tests.

That is, any kind of predicate, we define on a string x.

And we can look at or we can compare, the probability with which Ai of x is equal to

1 when x is sampled from distribution D and the probability that A ek,

Ai of x is equal to 1, when x is sampled from the uniform distribution.

And those should be close.

And if we require that to hold, if we require closeness to,

to hold for say 20 different statistical tests A1 through A20,

then that can serve as our definition ,of what it means for D to be pseudorandom.

That is D is pseudorandom, if these probabilities are equal for

i equals 1 to 20 or are close rather, for i equals 1 to 20.

You should immediately see though, that this kind of a definition, while it may be

sufficient for some applications, is not going to be sufficient in cryptography,

where we have an adversary, who's specifically trying to distinguish.

Your string sampled from your distribution,

from string sampled from the uniform distribution.

That is, the definition on the previous slide,

is not going to suffice, in an adversarial setting.

And the reason for this is because we don't know,

what statistical test an attacker will come up with.

Right, we had defined on the previous slide that a distribution,

was pseudorandom, if it passed some set A1 through A20, of statistical tests.

But an attacker might come along with its own test, not a net set.

And if that test distinguishes the output the this string x chosen

from the uniform distribution from a string x sampled from your distribution D,

then D shouldn't count in the pseudorandom distribution.

So the cryptographic definition of pseudorandomness the modern definition,

is that a distribution D is pseudorandom, if it passes every,

efficient statistical test.

Now if we had left out the word efficient and

required it to be pseudorandom only if it passes every single statistical test,

it turns out that that's equivalent to requiring that D be uniform.

And therefore, uniform in pseudorandom would end up with the same meaning, and

we wouldn't get anywhere.

But as we've talked about, we're interested a computational relaxation, and

it's reasonable to restrict our attention, to only efficient attackers,

who can only perform efficient statistical tests.

And so, it'll suffice for our purposes if we define D in this way.

Concretely, if we try to formalize a definition on the previous slide,

using a concrete notion of security, we would get a definition like the following.

So fix some distribution D, on n bit strings.

Then we can say that the distribution D, is t epsilon pseudorandom.

If for all attackers A running in some, running in time at most t,

the probability with which A of x equals 1, when x is sampled according to D,

and the probability with which A of x is equal to 1,

when x is sampled from the uniform distribution.

Those two probabilities, are at most epsilon apart.

So the difference between those probabilities or

the absolute value of that difference, is at most epsilon.

This exactly corresponds to the intuitive notion we had on the previous slide.

If we view the attacker A, as being equivalent to

some kind of statistical test, on strings x sampled from the distribution.

Then we're requiring that no statistical test that runs in time, at most,

t, can distinguish between a string sampled from the uniform distribution, and

a string sampled from distribution D, with probability any better than epsilon.

The asymptotic definition of pseudorandomness which is the one we're

going to be using, from now on in the course, is a bit more complicated because

again we need to take into account this idea of having a security parameter.

So as before, we'll denote that security parameter by n.

And here the security parameter will correspond exactly,

to it will define for us some length of strings that we're considering.

So in addition to the security parameter n, we'll have some polynomial p, and

we'll let D sub n, be some distribution, over strings of length p of n.

So, that means that for every value of the security parameter n,

we're looking at distributions over string of length, over strings of length p of n.

You can think, if you like for now of p of n being equal to n.

But it will be more interesting later on if we let p of n be larger than n,

so you can think of p of n being n squared, for concreteness.

Now, in asymptotic setting,

pseudorandomness, will be a property of a sequence of dish, dis, distributions.

So rather than looking at any particular distribution,

DI, and looking at the probability with which some test can distinguish it.

What we're going to be interested in is the asymptotic probability with which any

efficient attacker, can distinguish D of n from a uniform distribution,

over strings of length p of n.

And again we're going to be interested in the asymptotic, performance here.

The asymptotic behavior of an algorithm or, or

of any algorithm trying to distinguish.

Or equivalently the asymptotic pseudorandomness,

of the distribution's Dn.

So I'm going to let this set Dn correspond to the set D1, D2, D3, et cetera.

So on up to infinity.

So we have now a sequence of distributions, defined or called Dn,

and we'll say that this sequence of distributions is pseudorandom.

If for every probabilistic, polynomial-time algorithm A,

there's some negligible function epsilon.

Such that if we look at the difference between the probability that A of x equals

1 when x is sampled from distribution Dn, and the probability that A of x outputs 1,

when x is sampled from the uniform distribution.

The difference or the absolute value of the difference will be at most, epsilon n.

Okay. Let's break this

down a little bit further.

So on the left hand side of this inequality, we have a term,

a difference or

an absolute value of a difference between two values, which is a function of n.

Fixing some algorithm A, for every value of the security parameter n,

we can explicitly compute or

evaluate the probability with which A of x equals 1, when x is sampled from D of n.

And the probability that A of x equals 1, when x

is sampled from the uniform distribution, over strings of length p of n.

Note that the input to A in both cases, is of the same length.

Because in one case we're sampling from the distribution D of n,

which is a distribution over strings of length p of n.

And in the other case, we're sampling from the uniform distribution over strings of

length p of n.

So for every value of the security parameter n,

we can compute the two probabilities, take their difference, take the absolute value.

And we get a number.

So what's on the left hand side is, something which is a function of n.

For every value of n, we get a real number.

And what we're asking, or what we're requiring, is that that function,

be negligible, meaning that asymptotically it will decay to zero,

faster than any inverse polynomial function.

So again, the sequence D of n is pseudorandom if for every efficient

algorithm A, there's some negligible function, which may depend on A.

But which will still be guaranteed to be negligible, such that the probability with

which A outputs 1, when given a string sample according to D of n,

and the probability with which A outputs 1,

when given something sampled according to the uniform distribution,

is at most epsilon of n, is at most that negligible function.

In the next lecture,

we'll talk about pseudorandom generators, which are deterministic functions,

that allow us to sample from a pseudorandom distribution.

Coursera provides universal access to the world’s best education,
partnering with top universities and organizations to offer courses online.