And now we're going to move on to talking about basic pipelines. We'll start off by talking about how to build a pipeline, how to name things in a pipeline, how instructions flow down a pipeline, and why you want to build a pipeline in the first place. And we'll start off with a very gentle introduction to this, with me talking about an idealized pipeline, not necessarily in a processor. Pipelines show up in many places in the world. They show up in car factories and toy factories. Some people will say that when you go use the washing machines in a laundromat you're pipelining, because you first put your laundry in the washing machine, then take it out and put it in the dryer, and while it dries you can put more laundry in the first washing machine. So we see pipelines in lots of different places in the world, but in this class we're going to care about using pipelines in microprocessors.
So let's think about an idealized pipeline. I have a picture here of an idealized pipeline. What are some of the requisite conditions for an idealized pipeline? Well, first of all, all work, or all objects, should go through all the stages. Here we have four stages in this pipeline, and there are no squiggly lines where things come out of the pipeline, go around the pipeline, or exit early. That doesn't happen in this diagram. So in an idealized pipeline, you want all of the objects to go through all the stages.
Second, you don't want any resources shared between different stages. You don't want some resource here which is shared between stage two and stage three. An example of this in a car assembly line would be one tool which two different stages of the line both have to use; in processor design, that would cause what is known as a structural hazard. Third, the propagation delay of all the different stages should be the same. So the time taken for stage one should be the same as stage two, the same as stage three, the same as stage four in an idealized pipeline.
And then, finally, the scheduling of an operation, or a transaction, or an object going down the pipeline should not be affected by what's currently in the pipeline. These conditions actually hold true for most assembly lines in plants where people build cars or something like that: there's no dependence between the different parts, and the cars are all separate from each other, so it's not really a problem. Unfortunately, if you go and look at something like a microprocessor executing instructions, instructions actually depend on earlier instructions. And they depend on them in multiple different ways.
They can either depend on the data values or they can depend on the fact that you
took a branch or a control flow instruction.
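As a concrete illustration of those two kinds of dependence, here is a made-up snippet of my own, written as ordinary code rather than MIPS assembly:

```python
# Hypothetical example, just to illustrate the two kinds of dependence.
a, b, c = 1, 2, 3

# Data dependence: the second line reads x, which the first line produces,
# so as instructions they cannot simply be overlapped in a pipeline.
x = a + b      # roughly: add x, a, b
y = x - c      # roughly: sub y, x, c   <- needs the result of the add

# Control dependence: whether the next line runs at all depends on the
# outcome of a branch, i.e., on a control flow instruction before it.
if x > 0:      # roughly: a conditional branch on x
    y = y * 2
print(y)
```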
So we have to look at these different hazards and think about how to either solve them or deal with them; we have to think about how to deal with non-ideal pipelines.
So let's go one step beyond our microcoded processor design and look at an unpipelined datapath for the MIPS instruction set. If you have already taken a computer organization class, you've probably seen similar sorts of unpipelined datapaths. In an unpipelined datapath, we start with the program counter here, and we have some instruction memory where we go and fetch our instruction out. This flows through: we fetch our registers from the register file, we do the actual computation, we might have to go access data from the data memory if we do a load or a store, and then finally it comes back around. We write the result here, and we increment or change the program counter.
And this is all done in one cycle. It's a very long cycle, because you launch here, you go through all this work, you do all these things, the data gets put back here, and because we have an unpipelined processor, it all has to happen in one cycle. So this is not great from a cycle-time perspective. It is good from the perspective of the number of cycles per instruction, which is always one: you launch an instruction, you wait until it's done, then you launch the next instruction, and each instruction takes one cycle.
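To make that trade-off between cycle time and cycles per instruction concrete, here is a rough back-of-the-envelope sketch; the delay numbers are invented for illustration, and the pipelined line anticipates the split into stages that comes next.

```python
# Made-up delays, in nanoseconds, for the pieces of work in the
# unpipelined datapath described above (illustrative numbers only).
work = {
    "instruction fetch": 2.0,
    "register read":     1.0,
    "ALU":               2.0,
    "data memory":       2.0,
    "write back":        1.0,
}

# Unpipelined: CPI is 1, but the clock has to cover *all* of the work.
unpipelined_cycle = sum(work.values())   # 8.0 ns per instruction

# Pipelined: the clock only has to cover the slowest piece of work.
pipelined_cycle = max(work.values())     # 2.0 ns per cycle

n = 1_000_000  # number of instructions
print("unpipelined:", unpipelined_cycle * n, "ns")
print("pipelined (ideal, no hazards):", pipelined_cycle * (n + len(work) - 1), "ns")
```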
So let's simplify this design a little bit, so we can talk about what's going on here in a little more depth. We're going to simplify our unpipelined processor design, take out some of the extra stuff here, and focus on this: you go from the program counter, you access the instruction memory all combinationally, you go through the register file and the combinational logic, you go through the data memory, and you come back around. That would be something like a worst-case load instruction, and the value gets written back here into the general-purpose register file.
So let's now move on to thinking about how we build the same datapath, but we try to pipeline it now. Well, the first thing we want to do is cut our design up, and we're going to put in registers. It just so happens that this is a convenient place to put registers for something like a MIPS datapath. One important thing you might recall from your digital logic design course is that when you go to pipeline a circuit, every line crossing the cut in this direction needs to get a register. So if we were to cut here, we need to put a register here, here, and here, because there are three sets of wires running from left to right. This feedback path here doesn't need a pipeline register, because it's flowing back and it's effectively running back into the register file here combinationally. And the register file can either be clocked or unclocked, depending on how you actually do this design.
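Here is a tiny toy model of what those pipeline registers buy us (my own sketch, not something from the lecture): on every clock edge, each register captures what its stage computed from the previous register's contents, so a different piece of work can sit in every stage at once.

```python
# Toy pipeline model: regs[i] is the pipeline register after stage i,
# and stage_fns[i] is the combinational logic of stage i (stand-ins here).

def clock_edge(regs, stage_fns, new_input):
    """Advance the pipeline by one clock cycle."""
    # Each stage computes combinationally from the register feeding it
    # (stage 0 takes the new input), then everything latches at once.
    inputs = [new_input] + regs[:-1]
    return [fn(x) for fn, x in zip(stage_fns, inputs)]

stage_fns = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]  # placeholder "stages"
regs = [0, 0, 0]
for item in [10, 20, 30, 40]:          # four items flow down the pipeline
    regs = clock_edge(regs, stage_fns, item)
    print(regs)                        # a different item sits in each stage
```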
So we've pipelined our datapath now. Let's put some names on these things, names that we're going to be reusing throughout the class. We're going to call this the fetch phase, or the instruction fetch stage. In this class we'll either denote this with an uppercase F, or we'll denote it with IF, for instruction fetch. The next thing we're going to do is decode the instruction, and we're also going to fetch the registers out of the register file. Decoding the instruction is how we take the tightly packed bits in the instruction and blow them up into control wires, and in the register fetch phase we actually go and fetch from the register file. We'll sometimes denote this stage with an uppercase D, and sometimes with RF, for register fetch. This is the execute phase. We're actually doing real work here: we're using the ALU, doing some add, some multiply, maybe a subtract or a shift operation, something like that. We'll denote this with an uppercase X, or the letters EX, for execution. In the memory phase we're accessing the memory here, the data memory, and we'll denote this with M, or MEM.
And then finally, the write-back stage here we'll typically denote with WB or W, to
denote when we actually go to write the data into the register file.
We have a whole pipeline stage dedicated to that.
So, one of the important things here: I just picked five different places to pipeline this design, but this is not accidental. What we're trying to do is divide what was one long time period and cut it up into smaller chunks. So we're trying to reduce our clock period, and we can do this by dividing the execution up into multiple cycles. And if we look at the timing, the cycle time has to be greater than or equal to the maximum of the time to access the instruction memory, the time to access the register file, the time to go through the ALU, the time to access the data memory, and the time to write back. Those are the different pieces of work in the five stages here. So we're trying to balance the work across those five stages.
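Written out as a formula (the subscripts are just shorthand for the five access times listed above), the constraint on the clock period is:

$$
t_{\text{cycle}} \;\ge\; \max\left( t_{\text{IM}},\; t_{\text{RF}},\; t_{\text{ALU}},\; t_{\text{DM}},\; t_{\text{WB}} \right)
$$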