The purpose of this paper is neither to compare and contrast different concurrent programming languages nor to teach the reader how to program in any of them, but rather to convey basic, introductory information about the most popular parallel languages. A section of the paper is devoted to each language and aims to answer four basic questions about the language: "Who designed it?", "What architecture does it run on?", "What are its new language features?", and "How is parallel data treated?". Based on this introductory information, the reader should be able to make an informative decision on which language to use or to explore more thoroughly for his or her specific needs.
C* is an extension of the C language and was designed for Thinking Machine's SIMD (Single Instruction Multiple Data) computer, the Connection Machine. The language design assumes a Von-Newmann style host computer working in cooperation with an array of identical processors, each with local memory. The SIMD philosophy results in the processors in the array simultaneously executing the same instruction stream while communicating data between each other and the host computer. (Barszcz 1992)
C* has only three new language features over C. A poly type attribute identifies parallel data. The multiplication operation a*b specified on two poly operands a and b will result in all processors performing the multiplication simultaneously. Parallel data can be organized with a domain feature which is similar to the C struct and C++ class. Also, a selection statement is included to specify the activation and deactivation of parallel execution. (Barszcz 1992)
All data in C* is represented as either scalar or parallel. Scalar data is identified by the keyword mono and exists in the memory of the host computer. Parallel data identified by the keyword poly exists in the local memory of the parallel processors. Data declared within a domain is by default poly, whereas data declared outside a domain is by default mono. Otherwise, C* has the same declaration, definition, and initialization rules as C.
The keyword domain is used in two different contexts. Used analogously to the C keyword struct, it's used to describe the memory of a data processor. Processors can only parallel execute the same instruction synchronously if each processor contains the same memory layout. Domain is also used to scope portions of a C* program. Processors are activated by entering a domain if the processors contain operands of that domain type. Processors are deactivated within a domain using familiar conditional C operators. Each processor in the processor array performs the conditional test independently and determines whether it's to remain active or to deactivate itself. (Barszcz 1992)
Occam arises from a collaboration between David May and Tony Hoare, known for Communicating Sequential Processes (CSP). David May spent several years formulating his ideas for programming concurrent processes into EPL (Experimental Programming Language). Since 1982 these ideas have evolved at INMOS Ltd. into Occam as it is today. Occam is named after the named after the 14th century philosopher William of Occam who proposed that invented things should not be duplicated beyond necessity. Thus, the founding principle of Occam is "to avoid unnecessary duplication of language mechanisms" (INMOS Limited 1988).
Occam1 was released in January 1983 and provided the main constructions and mechanisms of the language. The language has since been refined into Occam2 which introduces floating point representations of real numbers, multi-dimensional arrays, and side-effect free functions. (INMOS Limited 1988)
The Occam programming language is a high-level language designed for implementing concurrent programs on a network of processors. The language can be implemented on any general purpose computer, but it is most efficient on the transputer. The development of the INMOS transputer has been closely retaed to Occam. The transputer can be considered an "Occam machine". When used to program a single transputer or a network of transputers, Occam provides the equivalent efficiency of programming a conventional computer at assembler level. (Gaudiot 1992)
An Occam program is a collection of processes. Like other high-level languages, each process is in turn a collection of processes which form a block structure. However, in Occam indentation is not only for readability, but also for program structure. Each sub-block of a block must be indented exactly two spaces. Processes may be performed sequentially with SEQ, in parallel with PAR, or in alternation with ALT.
The Occam network is composed of a host node with a unique program for communicating with the user and direct access devices. The remaining "worker nodes" usually run the same program which differs from the host's program. Since each transputer has its own program and data set, there does not exist what conventionally might be called parallel data. When data needs to be shared between two or more transputers, it's sent on a network path. In most cases the channel protocol (covered below) has a variable determining who the message is for. Each node on the network path must listen for the message, determine if it is theirs and, if necessary, relay it on down the network.
An essential part of concurrent programming is interaction between concurrent processes. Occam provides channels for communicating between two concurrent processes and two communication primitives, input and output. Input and output processes work only on channels and are denoted by ? and !, respectively as follows:
The format and type of values passed on a channel is specified by the channel protocol specified in the channel declaration. Channel protocols enable the compiler to check the usage of channels.
Consider the following channel declaration and output channel:
This declaration declares a channel named screen with a protocol of type BYTE. (INMOS Limited 1988)
Communications over a channel can occur only when both input and output processes are ready. If a process on one transputer reaches an input statement before a process or another transputer reaches its output statement, the first process must wait on the second. This wait condition can be alleviated in some instances through the ALT statement. Using ALT, processes are done in alternation. If an ALT statement followed by two channel input statements indented two spaces is reached, the effect is to listen to both channels at the same time. (Gaudiot 1992)
Sisal, Streams and Iterations in a Single Assignment Language, began as a collaborative effort between Lawrence Livermore National Laboratory, Colorado State University, University of Manchester, and Digital Equipment Corporation. Initially defined in 1983, it was revised in 1985 and the language definition has since remained constant. This consistency has allowed Sisal to be used by research groups around the world. (Feo 1992, 337)
Sisal is a functional language designed to achieve execution performance comparable to imperative languages. A primary objective in designing the language was to validate the functional style of programming for large-scale scientific applications. Functional programs are free of aliases, side effects, and time dependent errors. Thus, functional languages promote the development of correct, determinate parallel programs. (Feo 1992) All in all, this contributes to a major feature of Sisal and a goal many other parallel languages, that "A well-designed language can shield the programmer from timing problems while automatically exploiting any parallelism expressed in the program" (Skedzielewski 1991, 106). This makes the programmer's job much easier since the Sisal compiler is responsible for communicating data values, synchronizing operations, and managing memory.
The Sisal language is not explicitly targeted to multiprocessors. Through multiple functional units, single processors can also exploit low-level parallelism that can be extracted from a Sisal program. (Skedzielewski 1991, 109) In fact, nearly all Sisal programs will "run on any computer system, regardless of topology or number of processors, in parallel and without rewriting" (Feo 1992, 376).
Variable names in Sisal stand for values rather than memory locations. Remember from above that the last three letters of "Sisal" stand for Single Assignment Language. Since it is a "single assignment" language, a name may be assigned a value only once within each scope. After that assignment it becomes a read-only value, which can be shared among any number of parallel processes. This eliminates one of the major causes of nondeterminacy, the race condition. (Skedzielewski 1991) For example, consider two pieces of code running in parallel and sharing x. Each process initializes x to a different value, increments x by a different value, and prints x. The result is a pair of values. But how many pairs can appear? With both processes manipulating the variable, twenty different pairs of values could appear. Any experienced parallel programmer would avoid this specific example, but the same programmer could easily generate an equivalent mistake in a large program. The real problem then becomes finding the error. Sisal avoid this problem entirely and keeps parallel values clear and easy to understand.
Functions are a large source of parallelism in Sisal. As in other languages, the type of each parameter and result value is declared in the function header. A function has access only to is arguments, there are no global values, and a function does not retain its values between calls. Like LISP and other functional languages, Sisal functions are closely related to mathematical functions and given a fixed set of inputs, a function will always return the same result. Therefore, functions are great contenders for implementation as parallel tasks on a multiprocessor. (Skedzielewski 1991)
Sisal can draw on parallelism not only in functions, but also in many other ways. Forall expressions, multi-expressions, and streams are also candidates for parallel execution, and it become the compiler's responsibility to decide which of these sources to exploit. (Skedzielewski 1991)
Ada is the most widely known concurrent programming language. Ada was developed for the Department of Defense (DoD) and is the result of an extensive international design competition. By 1974, hundreds of different programming languages were in use for DoD projects. None of them were standardized by DoD and the cost of developing, maintaining, and adapting software was rapidly escalating. (Andrews 1991) Malcolm Curie, Director of Defense Research and Engineering, in January 1975 formed the High-Order Language Working Group (HOLWG).
With representatives from all of the military services and liaisons with England, France, and West Germany. HOLWG's requirements were to determine if a new language was needed, if so to identify its requirements, and if not, to determine which existing language or set of languages should be adopted. In April 1977, the Ironman document set the language requirements and made the request for proposals public. In May 1979 the Cii Honeywell/Bull language's design was chosen as the winner. After acceptance by members of the field, Ada was standardized in 1983 by the American National Standards Institute. (Sebesta 1993)
The name for the new language, Ada, was recommended by Jack Cooper of the Navy Material Command in the spring of 1979 and later adopted. Augusta Ada Byron (1815-1851), Countess of Lovelace, mathematician and daughter of poet Lord Byron, is generally recognized as being the world's first programmer. Working with Charles Babbage, she wrote programs for several numerical processes on his first mechanical computers, the Difference and Analytical Engines. (Sebesta 1993)
In 1974 over half of DoD's applications were embedded systems. In an embedded system "the computer hardware is embedded in the device it controls or for which it provides services" (Sebesta 1993, 79). Ada's intended use was in programming embedded systems that control aircraft, submarines, and so on. It's concurrency features were very important for this purpose. (Andrews 1991) Today, it's used on many multiprocessor and uniprocessor systems. Ada does not provide any way to assign processes to processors; if it's necessary to control the layout of a distributed program, it has to be expressed in "an implementation-specific configuration language" (Andrews 1991, 569). Usually, however, concurrent Ada programs can be written without regard to any specific hardware architecture, because they will run on uniprocessors. In which case, Ada achieves the illusion of concurrency by interleaving or time-slicing among the Ada processes, otherwise known as tasks. (Dritz 1992, 11)
Ada is a complex high-level language with many features. It provides encapsulation for data types, data objects, and procedures through packages. Through these, generic program units can be instantiated. Excellent exception handling is one of Ada's many low-level features. However, with respect to concurrent programming, the main language mechanisms are tasks and rendezvous.
Tasks, or processes, in Ada are declared much as procedures are, within a subprogram or package. A task is essentially a process. A task has a task specification which declares visible objects and a task body which contains additional local declarations and statements. The two can be separately compiled. (Snow 1992)
The most common form of a task specification is: (Andrews 1991, 568)
task identifier is entry declarations end;
The most common form of a task body is: (Andrews 1991, 568)
task body identifier is local declarations begin statements end identifier;
Tasks declarations are elaborated one at a time, in the order in which they appear. Elaborating a task declaration creates an instance of the task and begins upon entry into the procedure in which they are declared. The procedure is known as the parent, and the new task as the child task. (Snow 1992) After all declarations are elaborated, the sequential statements in the subprogram begin execution as an anonymous task. The procedure can not terminate until all its child tasks have completed. This provides some synchronization in initialization and termination of tasks, guaranteeing that the number of concurrently active tasks is the same when a procedure completes as it was when that procedure was called. (Schiper 1989; Andrews 1991)
Tasks execute in parallel with the rest of the program until they terminate or reach some kind of synchronization point. Those that work on their own sets of data and never cooperate are uninteresting. Usually, tasks share global data and have to synchronize their references to them to avoid conflicts with different tasks accessing the same data; or tasks own their own data and values must be transmitted between interacting tasks. Rendezvous provides the synchronization and the communication aspects of this interaction. (Dritz 1992)
A rendezvous is the action of a task accepting a message sent by another task, allowing tasks to communicate with each other. Ada includes simple and complex methods for controlling rendezvous. Here, only a brief introduction to the basic concepts of simple rendezvous will be presented with no coverage of syntax due the wordiness of syntax in Ada and the variations involved.
Tasks can be active or passive. An active task is a client process which performs certain functions and which requires a service from another process from time to time. These services are provided by a passive task (or server). The passive task is so called because it spends most of its time waiting for active tasks to request some kind of service from it. In the simplest kind of rendezvous, the "calling task" asks for some service (make an entry call) from a "called task" . (Dritz 1992) (Readers familiar with the producer-consumer model for processes synchronization should begin seeing similarities here.) A passive task may offer to make a rendezvous with any process which wishes to meet with it; whereas, an active task must request a particular rendezvous with a particular passive task.
The "calling" is done through an entry call. The entry call can optionally include parameters, so it can pass and receive values to and from the called task. The called task's role is expressed by an accept statement. It's obviously very unlikely that two tasks wishing to make a rendezvous will arrive at the point of synchronization at the same time. Thus one of processes must wait for the other, and when the two processes do meet, the code of the entry is executed "jointly" by the two processes. This is only a barest outline of the rendezvous mechanism, but its use in ensuring process synchronization and in resolving shared data conflicts should be clear. (Snow 1992)
These few concurrent languages here represent only a few of many languages designed for parallel programming. Each was designed for different problems, different data, and different architectures, and each provides its own new features for handling these. It is hoped that the reader has gained an understanding of the decisions incorporated into each language to better facilitate his own decision of language choice.
Andrews, Gregory R. 1991. Concurrent Programming: principles and practice. Redwood City, California: The Benjamin/Cummings Publishing Company.
Barszcq, Eric., and David L. Andrews. 1992. "The C* Parallel Programming Language". In A Comparative Study of Parallel Programming Languages: The Salishan Problems, edited by J. T. Feo, 93-132. Amsterdam: Elsevier Science Publishers B.V.
Dritz, Kenneth W. 1992. "Ada Solutions to the Salishan Problems". In A Comparative Study of Parallel Programming Languages: The Salishan Problems, edited by J. T. Feo, 9-92. Amsterdam: Elsevier Science Publishers B.V.
Feo, John. 1992. "Sisal". In A Comparative Study of Parallel Programming Languages: The Salishan Problems, edited by J. T. Feo, 337-396. Amsterdam: Elsevier Science Publishers B.V.
Gaudiot, Jean-Luc., and Dae-Kyun Yoon. 1992. "Occam". In A Comparative Study of Parallel Programming Languages: The Salishan Problems, edited by J. T. Feo, 217- 262. Amsterdam: Elsevier Science Publishers B.V.
INMOS Limited. 1988. occam2 Reference Manual. Cambridge: Cambridge University Press.
Schiper, Andre. 1989. Concurrent Programming. Cambridge: Cambridge University Press.
Sebesta, Robert W. 1992. Concepts of Programming Languages. Redwood City, California: The Benjamin/Cummings Publishing Company.
Snow, C. R. 1992. Concurrent Programming. Cambridge: Cambridge University Press.
Szymanski, Boleslaw K. 1991. Parallel Functional Languages and Compilers. New York: ACM Press.