This course will introduce you to the foundations of modern cryptography, with an eye toward practical applications.

Loading...

From the course by University of Maryland, College Park

Cryptography

389 ratings

This course will introduce you to the foundations of modern cryptography, with an eye toward practical applications.

From the lesson

Week 2

Computational Secrecy and Principles of Modern Cryptography

- Jonathan KatzProfessor, University of Maryland, and Director, Maryland Cybersecurity Center

Maryland Cybersecurity Center

[SOUND].

In the last lecture we defined the notion of pseudorandomness and

I'll just recall that definition here.

If we have a sequence of distributions, Dn where each distribution Dn

is a distribution over strings of length p(n), for some function p,

will say that that sequence of distributions is pseudorandom if for

all probabilistic polynomial time adversaries A.

There's some negligible function epsilon such that if we look at the difference in

probabilities, between the probability that A(x) outputs one when x is sampled

according to D(n) and the probability that A(x) outputs one when x is sampled

according to the uniform distribution over strings of length p(n), then

the absolute value of the difference of those probabilities is at most epsilon n.

So intuitively, for any statistical test A, running in pro probabilistic polynomial

time, the distinguishing gap, right, the probability with which that statistical

test can distinguish whether it's given a string sampled according to D(n), or, or

whether he's given a string sampled according to the uniform distribution, or

the strings of the appropriate length.

It's at most epsilon(M).

We now want to define the important notion of pseudorandom generators.

I'll abbreviate this by PRG.

In practice, you may also hear of pseudorandom number generators

abbreviated PRNG, and for our purposes those are going to be the same.

But I'm going to stick with the terminology PRG, pseudorandom generator.

So, a PRG, a pseudorandom generator, is an efficient deterministic algorithm

that expands a short, uniform seed or input into a longer pseudorandom output.

We'll define this more formally in a moment.

But intuitively, this is just an algorithm that is given a short input,

converts it into a longer output and it has the property that if you sample its

input uniformly, then its output will be pseudorandom.

And pseudorandom generators are very useful whenever you have a small amount of

true random bits, but you want lots of random looking bits.

So just as a toy example or a made up example, say we have some algorithm that

requires 10,000 random bits, 10,000 uniform and independent bits.

We said in an earlier lecture that actually sampling true

randomness is very difficult.

You have to collect mouse movements or

collect timings between network events or typing at the keyboard.

And it takes actually quite a bit of effort and

time to collect true random bits.

So imagine that we've collected, say, 100 random bits,

100 uniform and independent bits.

This assumes we've collected them and then processed them appropriately to

smooth them out and get some number, say, 100, of true random bits.

Well, what we can do is we can apply a pseudorandom generator to those 100 bits

and get a 10,000 bit output with the guarantee that as long as our

input bits are random or close to random, i.e, close to uniform,

the output of that pseudorandom generator will be random looking.

And if we then run our algorithm that requires those 10,000 random bits using

as its random bits, the output of the pseudorandom generator, then the output of

that algorithm will be as good as if we had run it on 10,000 uniform and

independent bits.

So pseudorandom generators are very,

very useful when you have a small amount of true randomness and you want to

extend it into a larger amount of things, of something, which is as good as random.

More formally, let G be a deterministic, polynomial time algorithm.

So, not only is this algorithm efficient but it's also deterministic.

If you run it twice on the same input, you get the same output.

We'll require also that G be expanding.

This just means that the output of G is longer than its input.

And in particular we can denote the output length of

G(x) to be equal to p of the length of x.

So this means that when you feed,

when you feed G with some input string x that has length n.

Then the output has length p(n), and

will require that p(n) be greater than n for all n.

In pictures, what we have is this algorithm G which

we're viewing here as box.

And it takes as input a short seed.

Okay, we're going to continue to refer to the input to G as a seed.

It takes that short input and produces a longer output.

Now what's interesting to note here is that this deterministic algorithm,

or this deterministic function G, defines naturally a sequence of distributions.

And it does that in the following way.

We can define distribution Dn to be the distribution on strings of length p(n).

Remember that p(n) was the output length of G when given an input of length n.

It's going to be the distribution on strings of length p(n) defined by

choosing a uniform seed x and then outputting G(x).

This is a valid distribution on strings of length p(n) where it

is defined by this process by which we sample a uniform seed and

then apply G to that seed to obtain a string of length p(n).

If you wanted to work through this formally, what it means,

well then it means that we're defining the following distribution d(n).

We're defining a distribution d(n) where the probability of any particular string y

of length p(n) is equal to the probability with

which G(x) is equal to y when we sample x uniformly.

That is the probability that D(n) assigns to some string y is exactly

the probability with which G(x) equals y, which is exactly the summation over

all inputs x, such that G(x) equals y of the probability of x.

And you can continue to further manipulate this and what you get in the end is that

it's the number of inputs x, such as G(x) equals y divided by 2 to the n, right?

It's the number of x's that map onto this string y multiplied by 2 to

the minus n because we're sampling x from the uniform distribution on n bit strings.

And so the probability of any particular x in that set being chosen is 2 to

the minus n.

So you can think about it either way, you can think about it formally, or you can

think about it in terms of the randomized experiment, which is also formal,

it's just that we're not, we're not explicitly writing out the distribution.

These are exactly equivalent ways of looking at the same randomized process.

We can then define this particular algorithm G,

to be a pseudorandom generator,

if this sequence of distributions Dn defined by G, is pseudorandom.

If we expand that out.

What this means, is that for all probabilistic polynomial time attackers A,

there's some negligible function epsilon such that,

if we look at the probability with which A when given input G(x) outputs one, and

compare that to the probability with which A, given a uniform string y outputs one.

The absolute value of the difference in those probabilities will be

at most epsilon.

So, in, more intuitively, this means that no efficient algorithm a can

distinguish with probability better than epsilon whether it's given G(x),

right the output of a pseudorandom generator,

when run on a uniform string x or whether it's given a completely uniform string y.

Right, so that means that our pseudorandom generator really is giving us

something that looks random.

It's producing for us strings of length p(n) that are indistinguishable for

any efficient attacker from a uniform string of that length.

Do pseudorandom generators exist?

Unfortunately, we don't know.

This is one of those things where we can't hope to prove the existence,

that we can't hope to prove the existence of without resolving,

in particular, the question of whether p is equal to np.

And in fact we require a, a even stronger assumptions than that.

Never the less, what we can do is that we can assume that certain concrete functions

G are pseudorandom generators and in fact this is exactly what we do in practice.

In practice, we have particular examples of deterministic functions G,

that people have designed and people have studied and people have analyzed and

have found no way with which to distinguish their output from uniform,

a uniform string of the appropriate length.

So it's reasonable therefore to assume that those functions G

are indeed pseudorandom generators.

And what we can then do is use those pseudorandom generators or

use those functions G to build other cryptographic schemes.

In addition from a more theoretical point of view.

It's known that we can construct pseudorandom generators from

weaker assumptions.

The assumption that some algorithm is a pseudorandom generator is a relatively

strong assumption.

It's not really clear a priori why it might be reasonable to

make that assumption about any particular algorithm.

However, there are weaker assumptions we can make.

And, starting from those weaker assumptions,

we can then build a pseudorandom generator,

where we can prove that the resulting algorithm is a pseudorandom generator

given some assumption that we start with, which is more reasonable.

We're not going to cover the details of any such construction here, but

I refer you to my textbook for more details on that.

Now, in either case,

we can assume that we have some G which is a pseudorandom generator,

whether we take as that G some practical construction or whether we take for

that G something that we've constructed based on even weaker assumptions.

But we can take that G.

We can start with the assumption that it's pseudorandom.

And we can then use it to construct an indistinguishable encryption scheme that

is a scheme that, that achieves computational secrecy.

And if you recall back to our discussion of the principles of modern cryptography.

Remember how we talked about the importance of assumptions?

And then using those assumptions to come up with constructions that we

can prove secure.

This will be an example of that.

We'll start with the assumption that sum G is a pseudorandom generator.

We'll build from that an encryption scheme.

And then we'll be able to prove that the encryption scheme satisfies our

definition of security under the assumption that the G we

started with is indeed a pseudorandom generator.

And we'll see exactly how that's done in the next two lectures.

Coursera provides universal access to the world’s best education,
partnering with top universities and organizations to offer courses online.