SET™ Frequently Asked Questions
What does your Supercomputing Engine Technology (SET™) do? Does it support both multicore and distributed computing parallelism? What's going on under the covers?
SET™ applies the parallel computing paradigm of distributed-memory MPI, proven over the last twenty years to achieve efficient parallelism from multicore to clusters to clouds and supercomputers. However, it has three defining differences from MPI.
The first is that it provides a support architecture and framework that covers common parallel computing patterns. Beyond simply message-passing patterns, SET™ "owns" the data to be manipulated across the parallel computer, so that SET™ can organize and rearrange the data as needed for the parallel computing pattern. SET™ supports parallel data structures, such as partitioning with guard cells and element management, and parallel execution patterns, such as divide-and-conquer array generation, that are common to many parallel codes, including MPI codes. Because that support has never made it into MPI itself, every MPI programmer has had to rewrite the same parallel data structures and execution patterns again and again. SET™ makes parallel code writing easier by implementing them once, so that users do not have to write or debug that part of the code.
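As an illustration of what "partitioning with guard cells" means, here is a minimal single-process sketch in Python. It is not SET™ code: the names partition_with_guards and owned_cells are hypothetical, and the layout (contiguous 1-D chunks, each padded with copies of its neighbours' edge elements) is an assumption chosen for simplicity.

```python
def partition_with_guards(data, parts, guard=1):
    """Split `data` into `parts` contiguous chunks.

    Each chunk also carries up to `guard` elements copied from each
    neighbouring chunk (the guard cells), so a local stencil can be
    evaluated without asking a neighbour for data mid-computation.
    """
    n = len(data)
    base, extra = divmod(n, parts)
    chunks = []
    start = 0
    for p in range(parts):
        size = base + (1 if p < extra else 0)  # spread the remainder
        stop = start + size
        lo = max(0, start - guard)             # clamp at the global edges
        hi = min(n, stop + guard)
        chunks.append({
            "owned": (start, stop),            # global index range owned here
            "offset": start - lo,              # where owned data begins in "cells"
            "cells": data[lo:hi],              # owned data plus guard cells
        })
        start = stop
    return chunks


def owned_cells(chunk):
    """Return only the elements this chunk owns, without guard cells."""
    start, stop = chunk["owned"]
    off = chunk["offset"]
    return chunk["cells"][off:off + (stop - start)]
```

In a real distributed-memory code the guard cells would be refreshed by messages between neighbouring ranks after each update; that exchange is the kind of bookkeeping the answer above says is written once rather than by every application.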
The second defining difference from MPI is that the application is organized into a "Front End" and a "Back End", with distinct purposes. The Front End is the "Captain" of the application, directing the entire application and making global decisions, much like the main() or main loop of a code. The Back End does the grunt work, the raw and low-level calculations. SET™ is the bridge between the Front End and the Back End, and that division allows SET™ to organize the work performed by the many Back End codes as appropriate for parallelism. In particular, SET™ runs many Back End codes simultaneously. Because each Back End by definition simply does its own work on its own chunk of data, the writer of the Back End code does not have to think about parallelism.
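The division of labour can be sketched in a few lines of Python. This is only an analogy, not the SET™ API: run_back_ends is a hypothetical stand-in for the framework layer, and it uses a thread pool in one process purely for brevity, whereas the text describes distributed-memory execution.

```python
from concurrent.futures import ThreadPoolExecutor


def back_end(chunk):
    """Back End: plain sequential work on one chunk; no parallel logic here."""
    return sum(x * x for x in chunk)


def run_back_ends(back_end_fn, data, workers):
    """Hypothetical framework layer: split the data, run one Back End per
    chunk concurrently, and return the results in chunk order."""
    step = max(1, -(-len(data) // workers))  # ceiling division, at least 1
    chunks = [data[i:i + step] for i in range(0, len(data), step)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(back_end_fn, chunks))


def front_end(data, workers=4):
    """Front End: directs the run and makes the global decision,
    which here is just combining the partial results."""
    partials = run_back_ends(back_end, data, workers)
    return sum(partials)
```

Note that back_end contains nothing about chunking, workers, or communication; only the middle layer knows the work is split up.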
The third is that the result is a parallel computing approach that is much easier to use for application developers. As much as possible, the details of parallel computing are handled by SET™, whether it be data or execution management across the cluster. The application-specific pieces are in the Front End, which defines the high-level execution of the parallel application, and the Back End, where the low-level calculations are actually performed.
Many programmers with parallel computing needs see MPI as too much like assembly language. We designed SET™ with the scope necessary to cover parallel computing details while enabling the application writer to think sequentially as much as possible.
What types of dependencies does SET™ have on the underlying platform (OS, hardware, etc.)?
Fundamentally, SET™ needs a parallel system with some equivalent of MPI_Irecv, MPI_Isend, and MPI_Test, plus the usual metrics of the system (rank and size). This makes it possible to port SET™ to shared-memory as well as standard MPI systems. At present, the implementation of SET™ runs on all the major Unix-compatible platforms. We've run it on OS X and 64-bit Linux clusters as well as larger systems like SGI. As ACS's resources allow, we will expand SET™ to other operating systems.
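To show how small that dependency surface is, here is a toy in-process transport in Python that offers the five items listed above: rank, size, and nonblocking send, receive, and test. All class and method names are hypothetical and this is not how SET™ is implemented; it is a single-threaded sketch assuming buffered sends and per-source FIFO mailboxes.

```python
from collections import deque


class Request:
    """Handle for a pending nonblocking operation."""
    def __init__(self, source=None):
        self.source = source
        self.done = False
        self.payload = None


class LocalFabric:
    """Shared in-process mailboxes standing in for the interconnect."""
    def __init__(self, size):
        self.size = size
        # mailboxes[dest][source] holds messages in the order they were sent
        self.mailboxes = [[deque() for _ in range(size)] for _ in range(size)]


class Endpoint:
    """One participant: exposes rank, size, isend, irecv, and test."""
    def __init__(self, fabric, rank):
        self.fabric = fabric
        self.rank = rank
        self.size = fabric.size

    def isend(self, payload, dest):
        # Analogue of MPI_Isend: enqueue the message and return at once.
        self.fabric.mailboxes[dest][self.rank].append(payload)
        req = Request()
        req.done = True  # a buffered send completes immediately in this toy
        return req

    def irecv(self, source):
        # Analogue of MPI_Irecv: post a receive without blocking.
        return Request(source=source)

    def test(self, req):
        # Analogue of MPI_Test: poll for completion; never blocks.
        if not req.done:
            box = self.fabric.mailboxes[self.rank][req.source]
            if box:
                req.payload = box.popleft()
                req.done = True
        return req.done
```

A layer written only against these five operations could sit on top of real MPI, or on top of a shared-memory queue like this one, which is the portability point the answer makes.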
How would a sequential program need to be modified so that it could tap into the SET™ technology? How long would this typically take?
The application would be organized into a "Front End" and a "Back End". The Front End is the "captain" of the application, directing the entire application and making global decisions, much like the main() or main loop of a code. The Front End is also where the user interface, if any, resides. The Back End does the grunt work, the raw and low-level calculations. Any modern modular code should be able to be factored relatively easily this way, as it is an excellent and well-accepted approach for reusing code between projects.
How long would this typically take? Factoring the application into the Front End and Back End should be straightforward for a modular or other modern, well-organized application. After that, one adds "glue code" between SET™ and the Front End, and between SET™ and the Back End, which typically consists of wrapper calls or minor replacements. Then there's testing and optimization. Most projects using conventional approaches allow a year to accomplish this; with SET™ this can take under a month.
How well does the technology scale in the multicore, multiprocessor, and multi-server dimensions?
Since the underlying paradigm is that of distributed-memory MPI, it scales almost as well as distributed-memory MPI on all parallel computing implementations. Where SET™ might do poorly is also where other parallel approaches do poorly, such as when communication time is far greater than computation time. The purpose of SET™ is to make it much easier for the software writer to quickly produce an application that can achieve scale.
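The communication-versus-computation point can be made concrete with a back-of-the-envelope model. The Python function below is an illustrative assumption, not a SET™ measurement or formula: it treats each step as computation followed by communication with no overlap, and defines efficiency as the fraction of time spent computing.

```python
def parallel_efficiency(compute_time, comm_time):
    """Fraction of each step spent on useful computation, assuming
    communication and computation do not overlap (a simplification)."""
    if compute_time < 0 or comm_time < 0:
        raise ValueError("times must be non-negative")
    total = compute_time + comm_time
    if total == 0:
        raise ValueError("at least one time must be positive")
    return compute_time / total
```

Under this model, a step with 9 units of computation and 1 of communication is 90% efficient, while the reverse ratio is 10% efficient, whatever framework sits on top.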
Compared to a hand-coded MPI application, how well does SET™ perform?
The SET™ approach has produced codes that scale almost as well as traditional hand-coded MPI applications. In some cases the results are indistinguishable from what is accomplished via hand-coded MPI applications.