Computer Science Colloquia
Monday, January 28, 2013
Advisors: Jack Davidson and Mary Lou Soffa
Attending Faculty: Sudhanva Gurumurthi (Chair), Mary Jane Irwin (Pennsylvania State University), and John Lach (Minor Representative)
4:00 PM, Rice Hall, Rm. 242
Ph.D. Project Proposal Presentation
Addressing Performance and Reliability in Multi-core Architectures via Effective Thread-mapping
Chip multiprocessors (CMPs) have become the mainstream architecture for delivering higher computing power. As technology scaling continues and more transistors fit on a chip, the number of cores per CMP is growing, and multi-core machines are scaling up to many-core machines. This scaling raises two major problems: shared-resource contention and increased susceptibility to soft errors (transient faults). Shared-resource contention can degrade an application’s performance by more than 50%, and fault-tolerance mechanisms impose significant performance overhead, exceeding 30%. Together, these two issues prevent CMPs from delivering the expected scalable performance improvements.
The proposed research aims to address the challenges posed by shared-resource contention and soft errors, and to improve application performance and reliability by exploiting an application’s inherent characteristics in its use of different resources. First, a general methodology will be designed to characterize a multi-threaded application by identifying its key behavior and the way its performance is affected by shared-resource contention and its vulnerability is affected by soft errors. The contention characterization will be based on the application’s memory-access behavior, and the soft-error characterization will be based on its resource-occupancy behavior. Using these characterizations, two thread-mapping algorithms will be designed: one to mitigate shared-resource contention in the memory hierarchy and one to improve application reliability. Novel run-time systems will be designed and developed to implement these algorithms, dynamically mapping an application’s threads in the presence of any co-runner(s). Because fault-tolerance mechanisms typically impose high performance overhead, it is desirable to improve an application’s reliability while maintaining its performance. However, improving performance and reliability simultaneously involves conflicting goals: the thread mapping that yields the best performance may not yield the best reliability. To address this issue, the contention and soft-error characterizations will be integrated to design a third thread-mapping algorithm that balances the performance and reliability of multi-threaded applications. The algorithms will be evaluated using multi-threaded applications from the PARSEC and NAS Parallel Benchmark suites on state-of-the-art multi- and many-core machines.
If successful, this research will enable contention mitigation and minimize the effect of soft errors for workloads consisting of multiple multi-threaded applications, improving both application performance and reliability.