A non-pipelined instruction execution unit operating at 2 GHz takes an average of 6 cycles to execute an instruction of a program $P$. The unit is then redesigned to operate on a 5-stage pipeline at 2 GHz. Assume that the ideal throughput of the pipelined unit is 1 instruction per cycle. In the execution of program $P$, 20\% of the instructions incur an average stall of 2 cycles due to data hazards and 20\% of the instructions incur an average stall of 3 cycles due to control hazards. The speedup (rounded off to one decimal place) obtained by the pipelined design over the non-pipelined design is ............

cdquestions Admin · Accepted Answer

Step 1: Define throughput for a pipelined processor. The throughput of a pipelined processor is the number of instructions leaving the last stage of the pipeline per unit time. Here the time unit is 1 clock cycle, which at 2 GHz is $0.5 \, \text{ns}$. Each stage of the pipeline therefore takes 1 clock cycle (ignoring register delay and clock skew), and in the ideal case one instruction completes every cycle.

Step 2: Define CPI for pipelined and non-pipelined processors. The CPI (Clock Cycles Per Instruction) is the average number of clock cycles per instruction. The non-pipelined processor takes $6$ clock cycles on average to complete an instruction, whereas the pipelined processor, in the absence of stalls, completes one instruction per clock cycle. Thus: $$ \text{CPI (non-pipelined)} = 6, \quad \text{Ideal CPI (pipelined)} = 1. $$

Step 3: Define the speedup of the pipelined processor. For a program with instruction count $IC$, the execution time is $IC \times \text{CPI} \times \text{cycle time}$. Both designs run the same program $P$ at the same 2 GHz clock, so $IC$ and the cycle time cancel and the speedup reduces to a ratio of CPIs: $$ \text{Speedup} = \frac{\text{CPI (non-pipelined)}}{\text{Ideal CPI (pipelined)} + \text{Pipeline stall cycles per instruction}}. $$ The stall term is added because some clock cycles are wasted whenever the pipeline stalls.

Step 4: Calculate the speedup. $20\%$ of the instructions stall for an average of $2$ cycles (data hazards) and another $20\%$ stall for an average of $3$ cycles (control hazards). The effective CPI of the pipelined processor becomes: $$ \text{Effective CPI (pipelined)} = 1 + (0.20 \times 2) + (0.20 \times 3) = 1 + 0.4 + 0.6 = 2. $$ The speedup is then: $$ \text{Speedup} = \frac{\text{CPI (non-pipelined)}}{\text{Effective CPI (pipelined)}} = \frac{6}{2} = 3. $$

Step 5: Conclude the result. The pipelined processor is $3.0$ times faster than the non-pipelined processor.
Final Answer: $$ \boxed{\text{The pipelined processor is 3.0 times faster than the non-pipelined processor.}} $$
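The arithmetic above can be checked with a short script. This is only a minimal sketch: the function name `pipeline_speedup` and its parameters are hypothetical names chosen for illustration, and it assumes, as the question states, that both designs run at 2 GHz and that the stall penalties simply add to the ideal CPI.

```python
def pipeline_speedup(nonpipelined_cpi, ideal_cpi, hazards,
                     f_nonpipe_hz=2e9, f_pipe_hz=2e9):
    """Return (effective pipelined CPI, speedup over the non-pipelined unit).

    hazards is a list of (fraction_of_instructions, average_stall_cycles).
    All names here are illustrative assumptions, not part of the question.
    """
    # Average stall cycles contributed per instruction by every hazard type.
    stall_cpi = sum(fraction * stall for fraction, stall in hazards)
    effective_cpi = ideal_cpi + stall_cpi

    # Average time per instruction = CPI / clock frequency.
    t_nonpipe = nonpipelined_cpi / f_nonpipe_hz
    t_pipe = effective_cpi / f_pipe_hz
    return effective_cpi, t_nonpipe / t_pipe


# Values from the question: 20% data hazards (2 cycles), 20% control hazards (3 cycles).
effective_cpi, speedup = pipeline_speedup(6, 1, [(0.20, 2), (0.20, 3)])
print(f"effective CPI = {effective_cpi:.1f}, speedup = {speedup:.1f}")
# prints "effective CPI = 2.0, speedup = 3.0"
```

Keeping the clock frequencies as explicit parameters makes it visible that the CPI ratio equals the speedup only because both clocks are the same; with different frequencies the time-per-instruction ratio would be the quantity to report.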
