
Note: The following represents work compiled during a Scientific Visualization course under Dr. Bouvier at the University of Arkansas at Fayetteville. The bibliography is by no means complete or up-to-date, nor is it meant to be. It merely represents a handful of papers on parallel volume visualization from both the hardware and software sides.


Parallel Volume Visualization Bibliography



by Terry Smith
Compiled April 1996


Kruger, Wolfgang and Peter Schroder, "Data Parallel Volume Rendering", Scientific Visualization: Advances and Challenges, Academic Press, 1994, 37-52.

The authors present image space and object space algorithms for massively parallel volume rendering, both of which they have implemented on the CM2 and both of which are well-suited to interactive work. The authors use a scale/shear technique for their image space algorithm and compare it to similar, previously implemented techniques. Their object space algorithm is a ray parallel algorithm which carefully chooses ray alignment to achieve a one-to-one mapping of rays onto voxels. The next section of the paper describes, in highly technical mathematics, the transport theory model of light propagation, only to seemingly come to the conclusion that workstations can perform fast rendering when using shading based on look-up tables, whereas on machines such as the CM2 one might as well use a high-fidelity shading model. The only useful thing presented in the paper is the use of scale/shear transformations for fast 3D rotations.
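To give a feel for why shears are attractive for rotation, here is a minimal sketch of the well-known three-shear decomposition of a 2D rotation. This is not the authors' CM2 algorithm or their exact 3D scale/shear factorization; it is a related textbook identity, and every function name below is made up for illustration. Each shear only slides rows or columns past one another, which is the kind of regular data movement a data parallel machine handles well.

```python
import math

def shear_x(point, a):
    """Horizontal shear: x moves in proportion to y, y is unchanged."""
    x, y = point
    return (x + a * y, y)

def shear_y(point, b):
    """Vertical shear: y moves in proportion to x, x is unchanged."""
    x, y = point
    return (x, y + b * x)

def rotate_by_shears(point, theta):
    """Rotate counter-clockwise by theta (|theta| < pi) using three shears."""
    a = -math.tan(theta / 2.0)
    b = math.sin(theta)
    return shear_x(shear_y(shear_x(point, a), b), a)

def rotate_direct(point, theta):
    """Reference rotation using the usual 2x2 rotation matrix."""
    x, y = point
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)
```

The two routines agree up to floating-point rounding, so `rotate_by_shears((3.0, -2.0), 0.7)` and `rotate_direct((3.0, -2.0), 0.7)` return nearly identical points.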

Deering, Michael, Stephanie Winner, Bic Schediwy, Chris Duffy, and Neil Hunt, "The Triangle Processor and Normal Vector Shader: A VLSI System for High Performance Graphics", Computer Graphics 22(4), 1988, 21-30.

Describes a system consisting of a pipeline of triangle processors which rasterize the geometry, followed by a pipeline of shading processors. Each Triangle Processor is dedicated to the rasterization of a single triangle. The triangle processor pipeline performs 100 billion additions per second, and the shading pipeline performs two billion multiplies per second. This allows more than one million triangles to be displayed per second. Also, note that the "Normal Vector Shader" chips apply a full multiple-light-source Phong illumination model independently at each pixel. Basically, this thing is big and very, very fast, and most of the paper describes the internal architecture of the two chips presented.

Dippe, Mark and John Swensen, "An Adaptive Subdivision Algorithm and Parallel Architecture for Realistic Image Synthesis", Computer Graphics 18(3), 1984, 149-158.

Presents a ray tracing algorithm that adaptively subdivides scenes into subregions so that each has roughly uniform load; the subregions are not uniform in size, but vary according to the complexity of the scene. Note that their algorithm doesn't subdivide the 2D projection of 3D space when rendering, but subdivides the 3D space itself. Their architecture, a 3D array of processors, is very appropriate for this. Interestingly, they use a feedback scheme to allow neighboring subregions to share each other's load information, and to allow relatively more loaded subregions to adjust their boundaries to reduce load. The paper discusses message routing, which is quite a problem here, and goes on to give many details about the subdivision algorithm and its time/cost performance.
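The feedback idea can be illustrated with a toy one-dimensional version. This is only a sketch under heavy assumptions: the paper works with 3D subregions, and the update rule, the `rate` parameter, and the function name here are all invented for illustration rather than taken from the authors. Each interior boundary looks at the measured load on its two neighboring subregions and slides toward the busier one, shrinking it.

```python
def rebalance(boundaries, loads, rate=0.25):
    """One feedback step for n subregions laid out along a line.

    boundaries: sorted list of n + 1 positions delimiting the subregions.
    loads:      list of n measured loads, one per subregion.
    rate:       fraction of a region's width a boundary may move per step
                (hypothetical knob; keep it below 0.5 so regions never invert).
    """
    new = list(boundaries)
    for i in range(1, len(boundaries) - 1):
        left, right = loads[i - 1], loads[i]
        total = left + right
        if total == 0:
            continue  # nothing measured on either side, leave the boundary alone
        imbalance = (left - right) / total  # in [-1, 1], positive if left is busier
        if imbalance > 0:
            # Left region is busier: move the boundary left to shrink it.
            new[i] -= rate * imbalance * (boundaries[i] - boundaries[i - 1])
        else:
            # Right region is busier (or equal): move the boundary right.
            new[i] -= rate * imbalance * (boundaries[i + 1] - boundaries[i])
    return new
```

Only neighboring loads are consulted, which mirrors the local information sharing described above; repeating the step frame after frame lets the boundaries drift toward a balanced layout.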

Ginsberg, Myron, "Challenges to the Use of Supercomputers and Scientific Visualization for Automotive Applications", Computers and Graphics, 17(5), 1993, 507-515.

Most of this paper tells the reader that the number of supercomputers among automotive manufacturers is increasing, but they're not being used to their full potential. It does convey the interesting fact that a physical crash test can take months of planning and prototype construction yet lasts only 80 ms, while a computer simulation may take 20 hours of CPU time on a single processor of a Cray Y-MP and generate paper output over 4 feet high. (What kind of printer do you buy for a Cray Y-MP?) Another useful bit of information in the paper is the use of visualization in designing a car's air-conditioning system before it's built. Researchers can model the cool-down transition from "boiling" hot to a comfortable temperature level. By looking at the animation, they can determine how many air vents are needed and the "optimal placement of them". (I guess if you live in Michigan this isn't part of your common sense.)

Lang, Ulrich, Ruth Lang, and Roland Ruhle, "Scientific Visualization in a Supercomputer Network at RUS", Computers and Graphics, 17(1), 1993, 15-22.

Describes the software application environment at the University of Stuttgart known as RSYST. RSYST is a database-oriented system that allows transparent access to any data object on any machine in a network. At RUS this network consists of a Cray-2, a Cray Y-MP, and several SGI workstations. Machine-dependent data type conversion is done automatically during access. The user interface of RSYST is based on X Windows, and the visualization module is based on GL. PHIGS has been distributed between the Cray-2 and Sun and SGI workstations and is used to distribute the visualization of fluid flow calculations.

Levinthal, Adam and Thomas Porter, "A SIMD Graphics Processor", Computer Graphics 18(3), 1984, 77-82.

Purely hardware oriented and a bit dated by now, this paper still bears attention due to its conception at Lucasfilm Ltd. and its application to the movie industry. The paper presents the first machine to be completed by the Lucasfilm Pixar project, whose goal is to produce machines for film-quality image creation. This system, called the Lucasfilm Compositor, extends the range of a conventional optical film printer by using digital signal processing techniques, allowing merging of multiple images, creation of mattes, filtering, color correction, and other operations. The paper brings to light the importance of an alpha channel for retaining transparency information used in movie-quality images. Most of the paper deals with the details of the SIMD processors. Of note, however, is that the Compositor is designed to execute several key algorithms at an average rate of one microsecond per pixel.
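To see why the alpha channel matters when merging images, here is a minimal sketch of the standard "over" compositing operator on a single pixel. This is the common textbook formulation with premultiplied color, not code or notation from the paper, and the function name and tuple layout are assumptions made for illustration.

```python
def over(fg, bg):
    """Composite a foreground pixel over a background pixel.

    Both pixels are (r, g, b, a) tuples with components in [0, 1] and
    color already multiplied by alpha (premultiplied). The foreground
    contributes fully; the background shows through the uncovered
    fraction (1 - foreground alpha).
    """
    fg_alpha = fg[3]
    return tuple(f + (1.0 - fg_alpha) * b for f, b in zip(fg, bg))
```

For example, a half-transparent red pixel `(0.5, 0.0, 0.0, 0.5)` over an opaque blue pixel `(0.0, 0.0, 1.0, 1.0)` gives `(0.5, 0.0, 0.5, 1.0)`. Without the alpha component there would be no record of how much of the background should remain visible.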

Ma, Kwan-Liu and James S. Painter, "Parallel Volume Visualization on Workstations", Computers and Graphics 17(1), 1993, 31-37.

For those of us who don't have a Cray or Connection Machine handy, this paper discusses distributing volume data sets and computational demands across a network of general purpose workstations. In some cases this may allow the entire data set and/or all components of the volume to be loaded into memory at one time, whereas on a single machine this may not be possible. Using networked workstations also allows the user to watch multiple views of a particular structure in the data. Multiple-variable visualization is also possible. The authors describe an example volume of 35 scalar data sets which, in most circumstances, would be difficult to load into memory and view concurrently. By distributing the 35 scalar data sets to multiple networked workstations and sending the output to the host for display, the authors are able to do so. The paper also explains methods for subdividing a data set across multiple computers.

Neumann, Ulrich, "Interactive Volume Rendering on a Multicomputer", Computer Graphics 26, 1992, 87-93.

Describes a volume rendering algorithm for MIMD message-passing multicomputers and the issues involved. The algorithm is first implemented on a 1D ring network and then extended to a 2D mesh topology. The rendering method used is a parallelized splatting approach on the Pixel-Planes 5 machine, and pseudocode for this is provided. The algorithm presented makes use of a static, interleaved, slab distribution of data among nodes. Each Renderer processor is assigned a unique 128 x 128 pixel screen region. This is a very easy paper to understand, and the author does a great job of explaining his algorithm and the graphic and architectural issues involved in implementing it on the two topologies.
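A static, interleaved slab distribution can be pictured with the short sketch below. The round-robin rule, the slab thickness parameter, and the function name are assumptions for illustration; the summary above does not give the paper's exact slab sizes or assignment order. The point of interleaving is that each node ends up holding slabs from all over the volume, so a dense region of the data is unlikely to land entirely on one node.

```python
def interleaved_slabs(num_slices, slab_thickness, num_nodes):
    """Assign slabs of consecutive volume slices to nodes in round-robin order.

    Returns a dict mapping node index -> list of (start, stop) slice ranges,
    where stop is exclusive. The assignment is static: it is computed once
    and does not change from frame to frame.
    """
    assignment = {node: [] for node in range(num_nodes)}
    for slab_index, start in enumerate(range(0, num_slices, slab_thickness)):
        stop = min(start + slab_thickness, num_slices)
        assignment[slab_index % num_nodes].append((start, stop))
    return assignment
```

With 16 slices, slabs two slices thick, and 4 nodes, node 0 receives slices 0-1 and 8-9, node 1 receives 2-3 and 10-11, and so on.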

Pineda, Juan, "A Parallel Algorithm for Polygon Rasterization", Computer Graphics 22(4), 1988, 17-20.

Presents an algorithm for rasterizing 3D Z-buffered polygons in parallel. Previously, the "PIXEL-PLANES" system used a linear function to interpolate polygon edges. It was highly parallel, but required custom memory chips for the frame buffers. The author's algorithm also uses a linear function to define polygon edges, but it's better suited to frame buffers using conventional DRAM and VRAM. The function mentioned is called the "edge function". It classifies points on a 2D plane that is subdivided by a line into three regions: points to the left of the line have a value greater than zero, points to the right of the line have a value less than zero, and points on the line have a value of zero. With edges oriented consistently, the interior of a convex polygon is the region where all of its edge functions are positive. So for each pixel we can compute an R,G,B value, a Z-buffer value, and the values E1...En of the edge functions. The paper goes on to give several traversal algorithms and a very short explanation of doing the algorithm in parallel by dividing the triangle into blocks.
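The edge function test is easy to sketch in a few lines. The version below is a minimal illustration, not the paper's hardware-oriented formulation: it assumes a y-up coordinate system and counter-clockwise triangles so that "left of the edge" comes out positive as described above, it simply visits every pixel instead of using one of the paper's traversal schemes, and all names are hypothetical.

```python
def edge_function(x0, y0, x1, y1, x, y):
    """Signed value of point (x, y) relative to the directed edge (x0, y0) -> (x1, y1).

    Positive to the left of the edge, negative to the right, zero on the line
    (assuming y points up).
    """
    return (x1 - x0) * (y - y0) - (y1 - y0) * (x - x0)

def inside_triangle(tri, x, y):
    """True if (x, y) is inside or on the boundary of a counter-clockwise triangle."""
    (ax, ay), (bx, by), (cx, cy) = tri
    return (edge_function(ax, ay, bx, by, x, y) >= 0 and
            edge_function(bx, by, cx, cy, x, y) >= 0 and
            edge_function(cx, cy, ax, ay, x, y) >= 0)

def rasterize(tri, width, height):
    """Brute-force scan: return every integer pixel covered by the triangle."""
    return [(x, y)
            for y in range(height)
            for x in range(width)
            if inside_triangle(tri, x, y)]
```

Because the function is linear in x and y, stepping one pixel to the right changes its value by the constant `-(y1 - y0)`, so neighboring pixels can be evaluated with one addition each. Every pixel's test is also independent of every other pixel's, which is what makes evaluating a block of pixels in parallel straightforward.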

Potmesil, Michael and Eric M. Hoffert, "Architecture and Applications of the Pixel Machine", Frontiers of Scientific Visualization, John Wiley & Sons Press, 1994, 213-243.

The Pixel Machine was the first commercially available programmable parallel processor for geometry and image computing. The first part of this paper describes the details of the Pixel Machine, an MIMD architecture consisting of a pipeline of pipe nodes which execute sequential algorithms and an array of m x n pixel nodes which execute parallel algorithms. Communication, virtual memory, and performance analysis are examined for the machine. The paper contains interesting graphs of time versus number of nodes for the parallel performance of raster operations, ray tracing, volume rendering, and image processing. Note that in ray tracing each pixel node contains a copy of the entire display list for the scene. Volume rendering is accomplished with ray casting.