Computer Science Colloquia
Tuesday, December 9, 2014
Advisor: Kevin Skadron
Attending Faculty: Mircea Stan, Chair; Gabe Robins, Jim Cohoon and Pradip Bose (IBM)
10:00 AM, Rice Hall, Rm. 536
Ph.D. Proposal Presentation
PEARL: A First-order Modeling Framework for Power-Efficient and Reliable Multiprocessing System
Recent trends in technology scaling tighten power constraints even further, a problem known as the "Power Wall". In particular, threshold voltage scales down slowly in current and future technology nodes in order to keep leakage power under control. As Moore's Law continues to double transistor density across technology nodes, total power consumption will soon exceed the thermal design power (TDP). If a high supply voltage must be maintained, future chips will be able to power only a small fraction of their transistors at any time, leaving the rest inactive, a phenomenon referred to as "dark silicon".
Reliability has always been a design constraint, but traditionally it has been a serious concern only for mission-critical or high-end server systems. Lower supply voltages have now made it a more prominent design consideration. System designers are forced to apply error detection and/or recovery mechanisms to ensure computational correctness. These mechanisms cost performance and put even more pressure on an already stringent power budget.
With the research completed so far and the proposed research, I will contribute a set of modeling frameworks that address three important design factors: power, performance, and resilience. Lumos/Lumos+ provide a mechanism for system designers to rapidly explore design points of future heterogeneous architectures and to make early decisions before spending significant effort on a single path. AFI/AFIpar help application developers better understand their applications' resilience. System designers can also benefit from resilience characterization by applying application-specific tuning algorithms. Finally, I demonstrate an optimization scenario in which trade-offs are made among power, performance, and resilience to achieve the best energy efficiency while still meeting real-time constraints.