| dc.description.abstract |
This work analyses the performance of microcomputing systems, with Pentium-based systems as a case study, using the cycles per instruction (CPI) metric. It also examines how six different application programs use the architectural features of the Pentium processor, and how the various constituents of the CPI affect those programs. The analysis indicates where each application spends its time during execution, thereby giving designers a better understanding of the design trade-offs and the potential causes of performance bottlenecks.
A significant portion of the Pentium processor die is devoted to implementing superscalar capability. Unfortunately, several factors prevent the system from gaining performance through the V pipe. Among these factors are: (1) the Pentium's pairing rules are highly constrained and complicate compiler optimisations, (2) complex instructions cannot execute in parallel, and (3) when the basic cycles per instruction is large, the contribution of parallel instruction execution is relatively low.
Therefore, there is a trade-off between maximal speedup of microcoded instructions and improved parallelism. Speeding up a microcoded instruction requires it to use both pipes, whereas improving parallelism requires it to execute in only one pipe, in parallel with another instruction in the other pipe. Given that today's compilers use fewer and fewer microcoded instructions, the parallel approach may be better because it does not interrupt the normal advance of instructions through the pipes.
The Pentium processor's branch prediction mechanism is found not to be very accurate under the various applications. Prediction accuracy is an important parameter for the future design of Intel machines, and improving it is vital for superscalar and superpipelined machines. Further research is required to develop a new branch prediction mechanism. |
en_US |