So, here in Lecture 12.2, we're going to start our exploration of the logic side

of timing analysis. And so, what do we need to know?

We need to know what the basic assumptions are about the timing universe

and so that first assumption is going to be synchronous logic.

Things that leave storage elements, like flip flops, go through a big block of

logic and return to storage elements like flip flops.

And in this lecture, we're really just going to talk about the combinational side

of things. Next, we have to talk about, well, where's

the delay? And the first thing we're going to have

to talk about is, what do delay models for individual gates look like?

And, one of the things that's a little bit surprising is they can be really

complicated. And so we're going to talk about how

complicated they can be, but then we're going to restrict things to a

simple, reasonably realistic universe of delay models that are

commercially viable but simple enough that we can actually do

some examples. And then we are going to talk about the

fact that in the actual way we do timing for logic, we stop looking at the logic.

We stop looking at anything that looks like Boolean algebra.

And we just look at these things as big complicated graphs.

And so, we're going to talk about why, from a complexity point of view, we do

something called a topological timing analysis, and not logical timing

analysis. So, let's go start looking at sequential

things, the combinational parts thereof, the delays through the gates, and

topological timing analysis. >> So, when we say that we're

interested in doing timing analysis at the logic level, what are we

actually talking about? Well, our goal is to verify the timing

behavior of our logic design. So, here's the scenario.

I give you a gate-level netlist and I give you some timing models of the gates

and, maybe after the placement and the routing, I give you some timing models of

the wires. And you have tools in place that can answer

the following questions for me. When do signals arrive at various points in

the network? What are the longest delays through the gate-level network?

Does the network satisfy its timing requirements?

So, suppose I tell you that I want this chip to run at 1 gigahertz, which means

there's one nanosecond between the edges of the clock that control the flip flops.

Is it the case that all of the combinational logic is such that, if a

signal enters a block of combinational logic, it arrives at the output no more than one

nanosecond later? All right.

Those are the kinds of questions I want answered.

And if we do this analysis on our logic and it turns out that the answer is, oh

yeah, the logic's too slow, I can't get through all of the paths in the logic in 1

nanosecond. Some of them are 1.05 nanoseconds.

Where do I look? You know, a modern design has millions and

millions and millions of gates. It would be great if the analysis

techniques come back and they pinpoint exactly where my problems are.

And so, I'm going to show you some techniques that can answer all of those

questions. And in particular, and maybe in a

surprising way, answer the question exactly where's my problem?

What should I go focus on to fix? Now, the thing that's unfortunate is

that it is the nature of the way the

electrical and the physical models work that a lot of this delay stuff is just

complicated in the real world. So, we're going to talk about that for a

little bit. And we're going to, you know, talk about

how we sort of simplify that for the purposes of this lecture.

First, however, I just want to do a few acknowledgments.

Very early versions of this lecture used some material from my friends Karem

Sakallah, who is now at the University of Michigan, and Tom Szymanski, who at the

time was at AT&T Bell Labs. And this version has benefited

extensively from input from my friend Dave Hathaway at IBM.

Dave is actually the principal designer of EinsTimer, which is IBM's production

static timing tool. So, every processor, every ASIC,

every big chip that gets built by IBM runs through Dave's EinsTimer

tool, which is doing very, very sophisticated static timing analysis.

And, you know, the current version also benefited from versions of

these lectures that were taught by John Cohn, my former Ph.D.

student at IBM, and Dave, who were teaching this material to some folks at the

University of Vermont, and also some folks at IBM.

So, lots of thanks to everybody for actually giving me lots of

useful feedback and lots of useful criticism, which very much helped, I

believe, the quality of this lecture. I just want to acknowledge all of them

for the help. So, let's talk about analyzing the

performance of a design. So the first thing that I have to

assume, and this is really important, is that the design is synchronous.

And so, that means all of the storage is in explicit sequential elements.

So, you know, things like flip flops. So, the simplest way to draw this sort of

thing is that there is a whole bunch of flip flops,

and they are at, if you like, the start of the combinational logic, the input.

And then, there's a whole bunch of flip flops that are at the outputs of the

combinational logic. And there's a common clock that's

connecting all of those things. And you know, the clock edge comes along.

And it, I'm going to write this carefully.

And it launches the data out of the flip flops into the combinational logic and

so, it goes through the delays. Right?

It takes however much time it takes to get through the logic.

And then it arrives at some flip flops, where we hope it is captured.

And so, I'm just writing capture. And you know, we often draw the clock in

you know, kind of a very special way. Right?

So, there's just one cycle of the clock, you know.

We often talk about you know, sort of the launch edge of the clock and the capture

edge of the clock, assuming that we are actually talking about something like you

know, a positive edge-triggered D flip flop.

And although this is a highly stylized kind of diagram, you know, please

just be aware that, I mean, this is just a finite state machine.

Right? You know, some

flip flops and some logic that goes between the flip flops.

It's not necessarily the case that the flip flops on the left are different than

the flip flops on the right. I mean, we

really could just have a flip flop that I can draw over here, with a D

input and a Q output, and a place where the clock goes in.

And the output comes from Q and it goes to this cloud of

logic here, and it goes back in the D input.

You know, those are the kinds of circuits we're actually talking about.

We're just not going to be talking about any of the subtle timing things that happen

right at the inputs or the outputs of the flip flop.

I'll mention that again when we get to the end.

There are ways of incorporating all of those effects into the

models that I'm showing you. We just really don't have time to talk

about that stuff. So here's

a question you're very possibly thinking, if you haven't

encountered this kind of timing analysis in a real

commercial ASIC design scenario. Can we just simulate this stuff?

You know, we have great simulation tools. We

have [INAUDIBLE] simulation tools, and we have VHDL simulation tools, and we have

SystemC and all these other great things.

You know, if I want to know how fast the logic goes, can’t I just simulate it

really, really, really hard? You know, and run a really, really

large number of inputs into it and see how

slow it is? And the problem is that what

logic simulation does is determine how the system will behave.

It simulates the logical functions,

so it gives the most accurate answer when you have good simulation

models. But it's practically impossible to give a

complete answer, especially with respect to timing.

In order to be really confident that I understand what the

worst case delay of a big block of a few million gates is, I need some

exponential number of inputs, because I don't want to just know that for all of

the inputs I tried. You know, the delay is such and such.

I need to know that under any possible scenario of inputs, the delay will never

be longer than some number. And you just can't get that sort of a

guarantee from simulation, you know, you need, you need a different kind of a

technique. So there's no way I can examine all

possible input vectors with all possible relative timing.
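
To get a rough feel for why exhaustive simulation is hopeless, here is a minimal Python sketch that counts the input vectors for a block with n binary inputs. The simulation rate of one million vectors per second is an assumed figure chosen only for illustration, and the count ignores relative input timing, which would only make the numbers worse.

```python
# Rough illustration: exhaustive simulation needs 2**n input vectors.
# ASSUMPTION: the simulator rate below is a made-up figure for illustration.
VECTORS_PER_SECOND = 1_000_000
SECONDS_PER_YEAR = 365 * 24 * 3600


def exhaustive_vector_count(num_inputs: int) -> int:
    """Number of distinct input vectors for num_inputs binary inputs."""
    return 2 ** num_inputs


def years_to_simulate(num_inputs: int) -> float:
    """Years needed to apply every input vector once at the assumed rate."""
    seconds = exhaustive_vector_count(num_inputs) / VECTORS_PER_SECOND
    return seconds / SECONDS_PER_YEAR


for n in (16, 32, 64):
    print(n, exhaustive_vector_count(n), years_to_simulate(n))
```

Even at this optimistic rate, a block with only 64 inputs is far out of reach, and real blocks have many more inputs than that.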

And there's some nasty stuff that happens at the nanoscale with how

manufacturing imperfections change the timing behavior of transistors

that are, you know, a couple hundred atoms across.

We just need a whole different solution. So simulation is great.

We rely on it for functional correctness. We cannot rely on it for this kind of

timing. We need a whole different technology.

And it needs to be, you know, not only just different, it needs to be fast

because we're going to do this a lot. So, first the basic model for our timing

is that we know something about the clock cycle, right?

I need to know how fast this thing is supposed to run in order to understand if

it's running fast enough or if it's got a problem and is too slow.

So, just a concrete example:

let's say I assert that the clock is 1 gigahertz, which means there is one

nanosecond between the clock edges. And so, I'm just going to draw the little

picture over here. So, here's my clock, and you know, there's

a positive, upgoing edge, and then it goes over, it goes down, it

goes back up. And the time between

those rising edges is 1 nanosecond. And so, again, I've got my diagram of

the kind of logic that I'm looking to analyze.

There's a bunch of flip flops going in on the left of this logic and a bunch of

flip flops on the output of this logic. The flip flops on the left are launching

data into the logic. The flip flops on the right are capturing

data from the logic. Like I said before, they might not

be different flip flops, but this is

just, you know, sort of conceptually a nice way to think about this.

So, what do I know?

I know that for this logic to work successfully, the longest delay through

this network of logic must be shorter than 1 nanosecond.

And so, I'm just going to you know, put a great big arrow, you know, over the top

of the logic. I know that when things show up at the

outputs of the flip flops, they had better be able to get through that big gray

cloud of logic in less than 1 nanosecond, because 1 nanosecond later, the positive

edge of the clock comes along again and grabs the output of that logic and

captures it in the flip flops. So, I better be able to get through that

logic in less than a nanosecond. That's the kind of question that we're

going to answer. You give me a million gates of logic.

You ask, so how fast can it go? You tell me actually, I'd like it to go

at a gigahertz, please. I'll analyze it and I'll come back and

I'll be able to tell you things like: yes, I can qualify this, all the

paths are shorter than 1 nanosecond. Or: no, these 35,000 paths are longer than

one nanosecond and, by the way, here are your problem points.

These are the places in your logic you really ought to go look.

If you fix this, maybe you can fix the whole thing.
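
As a small sketch of the kind of report described here, the Python below converts a clock frequency to a period and flags the paths that are too slow. The path names and delays are invented for illustration, and "slack" is used to mean the clock period minus the path delay. A real timing tool does not work from a hand-written list of paths like this; the point is only the comparison against the clock period.

```python
# ASSUMPTION: path names and delays are hypothetical; times are in picoseconds.
def clock_period_ps(frequency_hz: float) -> int:
    """Clock period in picoseconds for a given frequency in hertz."""
    return round(1e12 / frequency_hz)


def failing_paths(path_delays_ps: dict[str, int], period_ps: int) -> list[tuple[str, int]]:
    """Return (path, slack) for paths longer than the period, worst first.

    slack = period - delay, so a negative slack means the path is too slow.
    """
    violations = [
        (name, period_ps - delay)
        for name, delay in path_delays_ps.items()
        if delay > period_ps
    ]
    return sorted(violations, key=lambda item: item[1])


period = clock_period_ps(1e9)  # 1 gigahertz clock
delays = {"path_a": 920, "path_b": 1050, "path_c": 1010}
for name, slack in failing_paths(delays, period):
    print(f"{name}: slack {slack} ps")
```

With these made-up numbers, the 1.05 nanosecond path comes out first as the worst offender, which is the "go look here" part of the answer.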

So, that's what we're about here. What do we need to do this?

Well, the first thing we need is gate delay models, right?

So I've got, you know, another picture of a cloud of logic here.

And I've just got a bunch of AND gates, kind of in a row.

And I've also got some AND gates where I've got the wires that are connecting

the AND gates. I've got a little fanout here: one AND

gate feeds a couple of AND gates. And, you know, at the top there's a big

question mark that says, so, what's the network delay?

Well, you know, in order to answer that sort of a question, the most

straightforward thing we've first got to be able to answer is, what's the

delay of one gate? All right?

So, I've got one gate here that's sort of highlighted, and I'm just going to say, well,

let's say that the delay through that gate is a number, and that number is

delta. Right?

That delta is, you know, probably measured in picoseconds,

thousandths of a nanosecond, these days.

And I'd like to be able to answer the question.

You know, how long does it take to get through one gate?

If you give me a million gates, you know, maybe I can figure it out.

I can figure out how long it takes to get through one gate.

And you might think, okay, how hard can that be?

And gosh, the answer is really just surprisingly hard, really surprisingly,

amazingly complex. So I'm going to give you just sort of

the high-level tour of what's going on here without a lot of details,

just so that we can get into the interesting heart of the

problem. Okay.

So, you know, what matters when we're talking about logic delay?

Well, the gate type affects the logic delay.

Not all gates are created equal. So, I've got a picture of an

AND gate here and I've got a picture of an OR gate, and the little picture says

that the AND gate's delta, its delay, is not equal to the OR gate's delta,

its delay. That certainly makes some sense.

You know, different gates have different transistor-level electrical

contents. You expect maybe an inverter is pretty quick

as a gate. You expect maybe, you know, a

great big exclusive OR gate with a lot of

transistor-level contents is kind of slow.

Maybe an AND-OR-INVERT or an OR-AND-INVERT gate is kind of slow.

Yes, correct. All right.

So, you know, what kind of a gate it is affects the delay.

So, you know, you have a few thousand gates in your technology library.

They've all got potentially different delays.
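
One simple way to picture this is a lookup table from gate type to a single fixed delay. The Python sketch below assumes made-up gate names and delay values in picoseconds, and it treats the delay of gates wired in series as the sum of the per-gate delays. Real delay models are considerably more complicated than one number per gate type.

```python
# ASSUMPTION: gate names and delay values (picoseconds) are invented for illustration.
GATE_DELAY_PS = {"INV": 10, "AND2": 25, "OR2": 28, "XOR2": 45}


def gate_delay_ps(gate_type: str) -> int:
    """Look up the fixed delay for one gate type in the toy library."""
    return GATE_DELAY_PS[gate_type]


def chain_delay_ps(gate_types: list[str]) -> int:
    """Delay through gates connected in series: the sum of per-gate delays."""
    return sum(gate_delay_ps(gate) for gate in gate_types)


print(chain_delay_ps(["AND2", "AND2", "OR2", "INV"]))
```

The only point of the table is that the AND entry and the OR entry need not match, so the analysis has to look up each gate's own number.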