Al-Otoom, Muawya Mohamed ; Dr. Eric Rotenberg, Committee Chair ; Dr. Suleyman Sair, Committee Member ; Dr. W. Rhett Davis, Committee Member
Conventional superscalar processors recover from a mispredicted branch by squashing all instructions after the branch. While simple, this approach needlessly re-executes many future control-independent (CI) instructions after the branch's reconvergent point. Selective recovery is possible, but is complicated by the fact that some control-independent instructions must be singled out for re-execution, namely those that depend on data influenced by the mispredicted branch. That is, control-independent data-dependent (CIDD) instructions must be singled out for re-execution, thus avoiding needless re-execution of control-independent data-independent (CIDI) instructions.

To contrast different recovery models, we abstract the recovery process as constructing a "recovery sub-program" for repairing partially incorrect future state. In this conceptual framework, selective recovery constructs a shorter recovery sub-program than full recovery. In current selective recovery microarchitectures, the recovery sub-program is constructed on-the-fly after detecting a mispredicted branch, by sequencing through all CI instructions and singling out only the CIDD instructions among them. Not only is this discriminating approach complex, but the same recovery sub-program is repeatedly constructed every time this branch is mispredicted.

We propose constructing the recovery sub-program for each branch once and caching it for future use. In particular, traces of CIDD instructions are pre-constructed and stored in a recovery trace cache. When a misprediction is detected, first, the branch's correct control-dependent instructions are fetched from the conventional instruction cache as usual. Then, at the reconvergent point, fetching simply switches from the instruction cache to the recovery trace cache. The appropriate recovery trace is fetched from the recovery trace cache at this time.
In this way, fetching only the CIDD instructions is as simple as fetching all CI instructions from a conventional instruction cache. No explicit singling-out process is needed, as this was done a priori on the fill side of the trace cache. Therefore, the recovery trace cache is efficient on multiple levels, combining the simplicity of full recovery with the performance of selective recovery.

This thesis explains the proposed trace-cache-based control independence architecture at a high level. Preliminary studies are also presented to project the potential of exploiting control independence, as well as the effectiveness of a trace-cache-based approach in particular. The results include (i) breakdowns of retired dynamic instructions into different categories, based on their control and data dependences with respect to prior mispredicted branches, (ii) contributions of individual recovery traces to total CIDI instruction savings, and (iii) hit ratios of finite recovery trace caches.
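The fill-side construction and the later lookup can be illustrated with a small software model. This is a minimal sketch under stated assumptions, not the mechanism evaluated in the thesis: it tracks register dependences only (memory dependences are ignored), indexes traces by branch PC alone, and uses LRU replacement for the finite cache. The names `Instr`, `build_recovery_trace`, and `RecoveryTraceCache` are hypothetical and chosen for illustration.

```python
from collections import OrderedDict
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass(frozen=True)
class Instr:
    """Hypothetical instruction record: PC, destination register, source registers."""
    pc: int
    dest: Optional[str]
    srcs: Tuple[str, ...]


def build_recovery_trace(ci_instrs: Iterable[Instr],
                         poisoned_regs: Iterable[str]) -> List[Instr]:
    """Single out the CIDD instructions among control-independent instructions.

    Assumption: `poisoned_regs` holds the registers whose values are
    influenced by the mispredicted branch at the reconvergent point.
    """
    poisoned = set(poisoned_regs)
    trace = []
    for ins in ci_instrs:
        if any(src in poisoned for src in ins.srcs):
            # Reads branch-influenced data: CIDD, and its result is poisoned too.
            trace.append(ins)
            if ins.dest is not None:
                poisoned.add(ins.dest)
        elif ins.dest is not None:
            # CIDI: it overwrites its destination with a correct value.
            poisoned.discard(ins.dest)
    return trace


class RecoveryTraceCache:
    """Finite cache of recovery traces (assumed: indexed by branch PC, LRU)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.traces = OrderedDict()
        self.lookups = 0
        self.hits = 0

    def lookup(self, branch_pc: int) -> Optional[List[Instr]]:
        """Return the cached trace on a misprediction, or None on a miss."""
        self.lookups += 1
        trace = self.traces.get(branch_pc)
        if trace is not None:
            self.hits += 1
            self.traces.move_to_end(branch_pc)  # mark as most recently used
        return trace

    def fill(self, branch_pc: int, trace: List[Instr]) -> None:
        """Store a pre-constructed trace, evicting the least recently used one."""
        self.traces[branch_pc] = trace
        self.traces.move_to_end(branch_pc)
        if len(self.traces) > self.capacity:
            self.traces.popitem(last=False)

    def hit_ratio(self) -> float:
        return self.hits / self.lookups if self.lookups else 0.0
```

In this model, the first misprediction of a branch misses in the cache and the trace is built and filled once; later mispredictions of the same branch fetch the stored CIDD trace directly, which is the reuse the proposal relies on. On a miss, a real design would presumably fall back to some other recovery scheme; the sketch leaves that choice open.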
Preliminary Study of Trace-Cache-Based Control Independence Architecture.