Faster Systems Through Commodity Math Acceleration

Dr. Douglas M. Pase

IBM xSeries Performance Development & Analysis

June 16th, 12-1pm
FAB 310-18

ABSTRACT

Over the past several decades, Moore's Law allowed microprocessors to enjoy exponential increases in performance. In recent years, however, commodity microprocessors have run up against fundamental limits of power, clock frequency and heat dissipation, forcing designers to use more creative means to increase performance. Some recent solutions include vapor phase transition or water-cooled heat sinks to increase the thermal tolerance of processors. Other solutions include dual-core processors, thermal throttling, and increased internal parallelism to accomplish more work per clock.

In this talk I examine a solution popular in the 1970s and 1980s that has once again become highly effective -- the attached processor. ClearSpeed, a company based in Bristol, UK, manufactures a math accelerator card that fits into a standard PCI-X I/O slot in an inexpensive server or workstation, consumes 25 watts, and computes at rates in excess of 50 gigaflops. The processing elements (PEs) of this dual-processor card use a SIMD architecture where each PE implements a fully pipelined integer and 64-bit ALU. This talk describes in some detail the architecture and capabilities of this new product.


BIOGRAPHY

Douglas Pase earned a Bachelor of Science in Mathematics and Computer Science from Northern Arizona University in 1982, and a Doctorate in Computer Science and Engineering from Oregon Graduate Center. His doctoral research investigated parallel languages and architectures, and algorithms for scheduling tasks in parallel. He has been part of the compiler development teams at Honeywell Large Information Systems, McDonnell Douglas and Floating-Point Systems. At NASA Ames Research Center he investigated the capabilities of parallelizing compilers and the future of parallel computing architectures. He later joined Cray Research to co-author the CRAFT Programming Model and to develop the MPP Apprentice performance analysis tool. Afterwards he moved to IBM to create the Dynamic Probe Class Library (DPCL), based on DynInst from the University of Wisconsin-Madison. DPCL allows instrumentation to be placed into a parallel application while it is running, and provides a complete infrastructure to support that instrumentation. Dr. Pase currently leads the HPC effort on the IBM xSeries Performance Development and Analysis Team. He is the author of 23 technical papers and 8 patents, a member of the Linux Cluster Institute steering committee, and regularly teaches graduate courses at North Carolina State University.

HOST
Dr. Karen Karavanic