Designers of microprocessor-based systems must constantly improve performance and increase computational efficiency in their designs to create value. To this end, it is increasingly common to see computation accelerators in general-purpose processor designs. Computation accelerators collapse portions of an application's dataflow graph, reducing the critical path of computations, easing the burden on processor resources, and reducing energy consumption in systems. There are many problems associated with adding accelerators to microprocessors, though. Design of accelerators, architectural integration, and software support all present major challenges.

This dissertation tackles these challenges in the context of accelerators targeting acyclic and cyclic patterns of computation. First, a technique to identify critical computation subgraphs within an application set is presented. This technique is hardware-cognizant and effectively generates a set of instruction set extensions given a domain of target applications. Next, several general-purpose accelerator structures are quantitatively designed using critical subgraph analysis for a broad application set.

The next challenge is architectural integration of accelerators. Traditionally, software invokes accelerators by statically encoding new instructions into the application binary. This is incredibly costly, though, requiring many portions of hardware and software to be redesigned. This dissertation develops strategies to utilize accelerators without changing the instruction set. In the proposed approach, the microarchitecture translates applications at run-time, replacing computation subgraphs with microcode to utilize accelerators. We explore the tradeoffs in performing difficult aspects of the translation at compile-time, while retaining run-time replacement. This culminates in a simple microarchitectural interface that supports a plug-and-play model for integrating accelerators into a pre-designed microprocessor.

Software support is the last challenge in dealing with computation accelerators. The primary issue is difficulty in generating high-quality code utilizing accelerators. Hand-written assembly code is standard in industry, and if compiler support does exist, simple greedy algorithms are common. In this work, we investigate more thorough techniques for compiling for computation accelerators. Where greedy heuristics only explore one possible solution, the techniques in this dissertation explore the entire design space, when possible. Intelligent pruning methods ensure that compilation is both tractable and scalable.
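To make the subgraph-identification idea concrete, the Python sketch below enumerates convex subgraphs of a tiny dataflow graph under assumed input/output port limits and ranks them by a crude estimate of cycles saved. The example graph, latency table, port limits, and excluded opcodes are all illustrative assumptions; this is not the dissertation's actual algorithm or cost model.

# Minimal sketch: enumerate convex dataflow subgraphs as candidate
# instruction-set extensions, ranked by estimated benefit.
# Graph, latencies, and port limits are invented for illustration.
from itertools import combinations

# A tiny dataflow graph: node -> (opcode, list of predecessor nodes)
DFG = {
    "a": ("load",  []),
    "b": ("add",   ["a"]),
    "c": ("shl",   ["b"]),
    "d": ("xor",   ["b", "c"]),
    "e": ("store", ["d"]),
}
LATENCY = {"load": 3, "store": 3, "add": 1, "shl": 1, "xor": 1}
MAX_INPUTS, MAX_OUTPUTS = 2, 1          # assumed accelerator port limits
EXCLUDED = {"load", "store"}            # memory ops stay on the core

def is_convex(nodes):
    """No value may leave the subgraph and later feed back into it."""
    succs = {n: [m for m in DFG if n in DFG[m][1]] for n in DFG}
    for src in nodes:
        stack, seen = [s for s in succs[src] if s not in nodes], set()
        while stack:
            n = stack.pop()
            if n in seen:
                continue
            seen.add(n)
            if n in nodes:
                return False            # path re-enters the subgraph
            stack.extend(succs[n])
    return True

def ports(nodes):
    ins = {p for n in nodes for p in DFG[n][1] if p not in nodes}
    outs = {n for n in nodes
            if any(n in DFG[m][1] for m in DFG if m not in nodes)}
    return len(ins), len(outs)

def candidates():
    legal = [n for n in DFG if DFG[n][0] not in EXCLUDED]
    for size in range(2, len(legal) + 1):
        for combo in combinations(legal, size):
            nodes = set(combo)
            n_in, n_out = ports(nodes)
            if n_in <= MAX_INPUTS and n_out <= MAX_OUTPUTS and is_convex(nodes):
                # Benefit: serial latency collapsed into an assumed single cycle.
                saved = sum(LATENCY[DFG[n][0]] for n in nodes) - 1
                yield sorted(nodes), saved

for nodes, saved in sorted(candidates(), key=lambda c: -c[1]):
    print(nodes, "saves ~%d cycles" % saved)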
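The run-time replacement idea can be pictured as a decode-stage translator that recognizes a marked region of the instruction stream and substitutes a single accelerator micro-op. The marker encoding, configuration table, and micro-op format in this sketch are hypothetical, chosen only to convey the plug-and-play flavor of the interface rather than the actual microarchitectural design.

# Minimal sketch: replace a marked subgraph with one accelerator micro-op
# at "decode" time; unrecognized patterns fall back to the native ops.
ACCEL_TABLE = {
    # opcode sequence -> assumed accelerator configuration id
    ("add", "shl", "xor"): 0x2A,
}

def translate(instruction_stream):
    out, i = [], 0
    while i < len(instruction_stream):
        inst = instruction_stream[i]
        if inst[0] == "SUBGRAPH_BEGIN":                  # assumed marker
            j = instruction_stream.index(("SUBGRAPH_END",), i)
            body = instruction_stream[i + 1:j]
            cfg = ACCEL_TABLE.get(tuple(op for op, *_ in body))
            if cfg is not None:
                out.append(("ACCEL_OP", cfg))            # one micro-op drives the accelerator
            else:
                out.extend(body)                         # fall back to native execution
            i = j + 1
        else:
            out.append(inst)
            i += 1
    return out

stream = [
    ("load", "r1"),
    ("SUBGRAPH_BEGIN",),
    ("add", "r2", "r1"), ("shl", "r3", "r2"), ("xor", "r4", "r2", "r3"),
    ("SUBGRAPH_END",),
    ("store", "r4"),
]
print(translate(stream))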
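Finally, the contrast between greedy heuristics and pruned full-space search during compilation can be sketched as a small branch-and-bound selection of non-overlapping subgraphs. The candidate sets and benefit numbers are invented; the point is only that an optimistic-bound prune keeps exhaustive search tractable while still finding solutions the single greedy pass misses. The dissertation's actual selection problem and pruning rules are richer than this toy.

# Minimal sketch: greedy pick vs. pruned exhaustive search over candidate
# subgraphs to offload. Benefits and node sets are made up for illustration.
CANDIDATES = [  # (nodes covered, estimated cycles saved)
    ({"b", "c", "d"}, 4),
    ({"a", "b"},      3),
    ({"c", "d", "e"}, 3),
]

def greedy(cands):
    chosen, used, benefit = [], set(), 0
    for nodes, gain in sorted(cands, key=lambda c: -c[1]):
        if not nodes & used:
            chosen.append(nodes)
            used |= nodes
            benefit += gain
    return benefit, chosen

def branch_and_bound(cands):
    best = [0, []]
    def search(i, used, benefit, chosen):
        # Optimistic bound: pretend every remaining candidate still fits.
        if benefit + sum(g for _, g in cands[i:]) <= best[0]:
            return                              # prune this branch
        if benefit > best[0]:
            best[0], best[1] = benefit, list(chosen)
        if i == len(cands):
            return
        nodes, gain = cands[i]
        if not nodes & used:                    # branch 1: take candidate i
            search(i + 1, used | nodes, benefit + gain, chosen + [nodes])
        search(i + 1, used, benefit, chosen)    # branch 2: skip candidate i
    search(0, set(), 0, [])
    return best[0], best[1]

print("greedy:        ", greedy(CANDIDATES))
print("branch & bound:", branch_and_bound(CANDIDATES))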