People/Web Search Calendar Emergency Info A-Z Index UVA Email University of Virginia

Computer Science Colloquia

Tuesday, April 21, 2014
John "Jack" Wadden
Advisor: Kevin Skadron
Attending Faculty: Sudhanva Gurumurthi (Chair), Jack Davidson, and Worthy Martin

11:00 AM, Rice Hall, Rm. 504

Ph.D.Qualifying Exam Presentation
Real-World Design and Evaluation of Compiler-Managed GPU Redundant Multithreading

ABSTRACT

Reliability for general purpose processing on the GPU (GPGPU) is becoming a weak link in the construction of reliable supercomputer systems. Because hardware protection is expensive to develop, requires dedicated on-chip resources, and is not portable across different architectures, the efficiency of software solutions such as redundant multithreading (RMT) must be explored. This paper presents a real-world design and evaluation of automatic software RMT on GPU hardware. We first describe a compiler pass that automatically converts GPGPU kernels into redundantly threaded versions. We then perform detailed power and performance evaluations of three RMT algorithms, each of which provides fault coverage to a set of structures in the GPU. Using real hardware, we show that compiler managed software RMT has highly variable costs. We further analyze the individual costs of redundant work scheduling, redundant computation, and inter-thread communication, showing that no single component in general is responsible for high overheads across all applications; instead, certain workload properties tend to cause RMT to perform well or poorly. Finally, we demonstrate the benefit of architectural support for RMT with a specific example of fast, register-level thread communication.