
1 Overview

It is not necessary to write a lower level program in order to compromise abstraction and portability. As a simple example, suppose a program multiplies a value x by 2. The obvious way to write this in C is "2*x". But if a programmer knows the machine that will execute this program has a slow multiplication instruction, and knows that integers are represented in binary, she can use the expression "x << 1" instead. The resulting program is less abstract than the original, since one of its operations is defined in machine level terms (shifting a pattern of bits) instead of mathematical terms (multiplication). It is less portable, since it now runs only on machines that use binary to represent integers (a pretty safe bet) and is efficient only on machines that shift bits in a single operation.
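To make the contrast concrete, here is a minimal C sketch of the two forms; the function names are ours, chosen for illustration:

    /* Abstract, portable form: multiplication by 2 is stated in
       mathematical terms, and the compiler picks the instruction. */
    int double_portable(int x)
    {
        return 2 * x;
    }

    /* Machine-level form: assumes integers are stored in binary and
       that the machine can shift a bit pattern in one operation. */
    int double_shifted(int x)
    {
        return x << 1;
    }

Note that in standard C, left-shifting a negative value is formally undefined behavior, which underlines the portability point: the shifted form depends on properties of the machine rather than of the mathematics.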

There are situations where programmers need to use knowledge of the underlying computer system to optimize programs written in high level languages, and computational scientists often find themselves in these situations. If a program runs for days, or even weeks, an optimization that improves execution time by just a few percent can save many hours, which translates into real savings if the program runs at a supercomputer center where the scientist pays for CPU time.

Another factor is that the high performance computers used by computational scientists are much more complicated than other machines, and a compiler may not be able to translate a program efficiently without a little help from the programmer. A common situation is a loop written in Fortran which, if written carefully, can be translated into a single instruction for a vector processor. Computational scientists often use the newest machines, and these are the machines most likely to have immature compilers. It takes several years' experience with real programs for compiler writers to learn how to develop optimizations that more fully exploit the capabilities of the underlying machine, and in many cases the theory behind the optimizations has yet to be worked out. For example, the optimal mapping of independent pieces of a parallel program, so that they can be executed simultaneously on different nodes of a parallel processor, is an active area of research in computer science. Programmers who use parallel processors often need to allocate tasks themselves, using system-dependent library routines to send information from one task to another. Knowing how the processors are interconnected has an impact on how efficiently messages are passed.
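The vectorizable loop can be sketched as follows; the text's example is a Fortran loop, but the same shape in C makes the point, and the names vector_add, a, b, c, and n are illustrative:

    /* Elementwise addition with no loop-carried dependence: every
       iteration is independent of the others, which is what lets a
       vectorizing compiler map the loop onto vector instructions. */
    void vector_add(int n, const double *a, const double *b, double *c)
    {
        for (int i = 0; i < n; i++)
            c[i] = a[i] + b[i];
    }

As for sending information from one task to another, the routines in question are system-dependent; as a stand-in, here is a minimal sketch using the standard MPI interface, assuming the program is launched with at least two tasks:

    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        int rank, value = 0;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        if (rank == 0) {            /* task 0 sends one integer */
            value = 42;
            MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
        } else if (rank == 1) {     /* task 1 receives it */
            MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                     MPI_STATUS_IGNORE);
            printf("task 1 received %d from task 0\n", value);
        }

        MPI_Finalize();
        return 0;
    }

How that value actually travels between the two tasks depends on the interconnection network, which is why knowing the topology matters for message-passing performance.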