The MIT alewife machine
01 March 1999
A variety of models for parallel architectures, such as shared memory, message passing, and data flow, have converged in the recent past to a hybrid architecture form called distributed shared memory (DSM). By using a combination of hardware and software mechanisms, DSM combines the nice features of all the above models and is able to achieve both the scalability of message-passing machines and the programmability of shared memory systems. Alewife, an early prototype of such DSM architectures, uses a hybrid of software and hardware mechanisms to support coherent shared memory, efficient user-level messaging, fine-grain synchronization, and latency tolerance. Alewife supports up to 512 processing nodes connected over a scalable and cost-effective mesh network at a constant cost per node. Four mechanisms combine to achieve Alewife' s goals of scalability and programmability: software-extended coherent shared memory provides a global, linear address space; integrated message passing allows compiler and operating system designers to provide efficient communication and synchronization; support for fine-grain computation allow many processors to cooperate on small problem sizes; and latency-tolerance mechanisms-including block multithreading and prefetching-mask unavoidable delays due to communication. Extensive results from microbenchmarks, together with over a dozen complete applications running on a 32-node prototype, demonstrate that integrating message passing with shared memory enables a cost-efficient solution to the cache coherence problem and provides a rich set of programming primitives. Our results further show that messaging and shared memory operations are both important because each helps the programmer to achieve the best performance for various machine configurations. Block multithreading and prefetching improve performance significantly, and language constructs that allow programmers to express fine-grain synchronization can improve performance by over a factor of two.