A central challenge in microprocessor design today is achieving power efficiency, performance scalability, and reliability simultaneously as technology scales, in the face of extreme process variations. The within-die and die-to-die variation in the delay behavior of the same design increases significantly at technology nodes below 10nm. In addition, timing varies during chip operation because of dynamically changing factors such as workload, temperature, and aging. To guarantee lifetime operational correctness under these timing uncertainties, designers incorporate safety margins in the form of one-time, worst-case guard-bands. Microprocessor pipelines are margined by operating the design at a higher voltage or a lower frequency than it could otherwise support. Incorporating safety margins is only a temporary fix for the growing timing variation problem at lower technology nodes, for two reasons: (1) the amount of guard-banding needed to guarantee reliable operation under delay variations is not known, so speed- and timing-related bugs that are difficult to model or detect may escape into the field and cause system crashes (a "blue screen") during operation; and (2) the amount of guard-banding needed to ensure reliability continues to increase, resulting in significant power and performance inefficiency. The first part of this thesis describes a low-cost, post-manufacturing self-testing and speed-tuning methodology that tops up speed coverage and finds the maximum reliable clock frequency of each processor pipeline in a multi-processor system. The second part details the design and operation of a novel timing-variation-tolerant pipeline design, which eliminates the need for timing safety margins. Quantitative and qualitative analyses demonstrate strong potential for power efficiency, performance efficiency, and reliability to coexist in microprocessor pipelines at lower technology nodes.
Self-adjusting pipeline designs and tuning methods for timing variation tolerance in multi-processor systems