With the advent of multicores, parallel programming has gained a lot ofimportance. For parallel programming to be viable for the predicted hundreds ofcores per chip, shared memory programming languages and environments mustevolve to enforce disciplined practices like ``determinism-by-defaultsemantics'' and ban ``wild shared-memory behaviors'' like arbitrary data racesand potential non-determinism everywhere. This evolution can not only benefitsoftware development, but can also greatly reduce the complexity in hardware.DeNovo is a hardware architecture designed from the ground up to exploit theopportunities exposed by such disciplined software models to make the hardwaremuch simpler and efficient at the same time.This thesis describes an effort to formally verify and evaluate the DeNovocache coherence protocol. By using a model checking tool, we uncovered threebugs in the protocol implementation which had not been found either in thetesting phase or in the simulation runs. All of these bugs were caused byerrors in translating the high level description into the implementation.Surprisingly, we also found six bugs in a state-of-the-art implementation ofthe widely used MESI protocol. Most of these bugs were hard to analyze and tookseveral days to fix. We provide quantitative evidence that DeNovo is a muchsimpler protocol by showing that the DeNovo protocol has about 15X fewerreachable states when compared to MESI when using the Murphi model checkingtool for verification. This translates to about 20X difference in the runtimeof the tool. Finally, we show that this simplicity of the DeNovo protocol doesnot compromise performance for the applications we evaluated.On the contrary,for some applications, DeNovo achieves up to 67\% reduction in memory stalltime and up to 70\% reduction in network traffic when compared to MESI.
【 预 览 】
附件列表
Files
Size
Format
View
Verification and Performance of the DeNovo cache coherence protocol