2.5 Operating Systems




The user's view of a computer system is of a complex set of services that are provided by a combination of hardware (the architecture and its organization) and software (the operating system). Attributes of the operating system also affect the performance of user programs.

Operating systems for all but the simplest personal computers are multi-tasking operating systems. This means the computer will be running several jobs at once. A program is a static description of an algorithm. To run a program, the system determines how much memory the program needs and then starts a process for it; a process (also known as a task) can be viewed as a dynamic copy of a program. For example, the C compiler is a program. Several different users can be compiling their code at the same time; there will be a separate process in the system for each of these invocations of the compiler.
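The distinction between a program and its processes can be illustrated with a short sketch. It assumes a Python interpreter is available; the helper name run_copies and the one-line "program" are hypothetical and chosen only for illustration. The same program is started three times, and each invocation gets its own process with its own process ID.

```python
import subprocess
import sys

def run_copies(n):
    """Start n processes from the same program and return their process IDs."""
    # The "program" is a one-line script that prints the ID of the process running it.
    program = [sys.executable, "-c", "import os; print(os.getpid())"]
    # Start every process before waiting on any of them, so they coexist in the system.
    procs = [subprocess.Popen(program, stdout=subprocess.PIPE, text=True)
             for _ in range(n)]
    # communicate() waits for each process to finish and collects its output.
    return [int(p.communicate()[0]) for p in procs]

if __name__ == "__main__":
    print(run_copies(3))  # three different process IDs, one per invocation
```

The actual ID values depend on the system and will differ from run to run.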

Processes in a multi-tasking operating system will be in one of three states. A process is active if the CPU is executing the corresponding program. In a single-processor system there will be only one active process at any time. A process is idle if it is waiting to run. In order to allocate time on the CPU fairly to all processes, the operating system will let a process run for a short time (known as a time slice; typically around 20 ms) and then interrupt it, change its status to idle, and install one of the other idle tasks as the new active process. The previous task goes to the end of a process queue to wait for another time slice.
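The time-slice mechanism described above can be sketched as a small simulation. This is a simplified model, not the scheduler of any real system: the function name round_robin, the process names, and the CPU demands are all made up for illustration, and every process is assumed to use its full slice unless it finishes early.

```python
from collections import deque

def round_robin(work_ms, slice_ms=20):
    """Simulate time slicing on a single CPU.

    work_ms maps a process name to the CPU time (in ms) it still needs.
    Returns the order in which processes were made active.
    """
    queue = deque(work_ms.items())           # idle processes waiting to run
    schedule = []
    while queue:
        name, remaining = queue.popleft()    # head of the queue becomes active
        schedule.append(name)
        remaining -= min(slice_ms, remaining)
        if remaining > 0:
            # Time slice is over: back to idle, at the end of the queue.
            queue.append((name, remaining))
    return schedule

print(round_robin({"editor": 30, "compiler": 50, "mailer": 10}))
# ['editor', 'compiler', 'mailer', 'editor', 'compiler', 'compiler']
```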

The third state for a process is blocked. A blocked process is one that is waiting for some external event. For example, if a process needs a piece of data from a file, it will call the operating system routine that retrieves the information and then voluntarily give up the remainder of its time slice. When the data is ready, the system changes the process's state from blocked to idle, and the process will be resumed when its turn comes.
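The three states and the moves between them can be written down as a small transition table. The state names follow the text; the event names (dispatch, time_slice_over, wait_for_io, io_ready) are labels invented for this sketch and are not taken from any particular operating system.

```python
from enum import Enum

class State(Enum):
    ACTIVE = "active"
    IDLE = "idle"
    BLOCKED = "blocked"

# Transitions described in the text; the event names are illustrative only.
TRANSITIONS = {
    (State.IDLE, "dispatch"): State.ACTIVE,           # chosen to run next
    (State.ACTIVE, "time_slice_over"): State.IDLE,    # interrupted by the system
    (State.ACTIVE, "wait_for_io"): State.BLOCKED,     # gives up the rest of its slice
    (State.BLOCKED, "io_ready"): State.IDLE,          # data arrived; waits for its turn
}

def step(state, event):
    """Return the next state, or raise ValueError if the event is not allowed."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"{event!r} is not valid in state {state.value}") from None

state = State.IDLE
for event in ["dispatch", "wait_for_io", "io_ready", "dispatch"]:
    state = step(state, event)
    print(event, "->", state.value)
```

Note that in this model a blocked process never becomes active directly; it always passes through idle first, as the text describes.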

The predominant operating system for workstations is Unix, developed in the 1970s at Bell Labs and made popular in the 1980s by the University of California at Berkeley. Even though there may be just one user, and that user is executing only one program (e.g. a text editor), there will be dozens of tasks running. Many Unix services are provided by small system programs known as daemons, each dedicated to one special purpose. There are daemons for sending and receiving mail, using the network to find files on other systems, and several other jobs.

The fact that there may be several processes running in a system at the same time as your computational science application has ramifications for performance. One is that it makes it slightly more difficult to measure performance. You cannot simply start a program, look at your watch, and then look again when the program stops to measure the time spent. This measure is known as real time or "wall-clock time," and it depends as much on the number of other processes in the system as it does on the performance of your program. Your program will take longer to run on a heavily loaded system since it will be competing for CPU cycles with those other jobs. To get an accurate assessment of how much time is required to run your program, you need to measure CPU time. Unix and other operating systems have system routines that can be called from an application to find out how much CPU time has been allocated to the process since it was started.
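As one illustration of the difference between the two measures, the sketch below uses Python's standard time module: time.perf_counter() reads a wall-clock timer and time.process_time() reads the CPU time charged to the current process. The helper timed and the workload are hypothetical. The workload sleeps for 0.2 s, which adds to wall-clock time but uses almost no CPU time.

```python
import time

def timed(func):
    """Run func once and return (wall-clock seconds, CPU seconds)."""
    wall_start = time.perf_counter()    # elapsed real time
    cpu_start = time.process_time()     # CPU time used by this process
    func()
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return wall, cpu

def workload():
    time.sleep(0.2)                          # waiting: no CPU time is consumed
    sum(i * i for i in range(200_000))       # computing: CPU time is consumed

wall, cpu = timed(workload)
print(f"wall-clock: {wall:.3f} s, CPU: {cpu:.3f} s")
```

The exact figures depend on the machine and its load; on a busy system the wall-clock figure grows while the CPU figure should stay roughly the same.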

Another impact of having several other jobs in the process queue is that as they are executed they work themselves into the cache, displacing your program and data. During your application's time slice its code and data will fill up the cache. But when the time slice is over and a daemon or another user's program runs, its code and data will soon replace yours, so that when your program resumes it will have a higher miss rate until it reloads the code and data it was working on when it was interrupted. This period during which your information is being moved back into the cache is known as a reload transient. The longer the interval between time slices, and the more processes that run during this interval, the longer the reload transient.
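A toy simulation makes the reload transient visible. It is a deliberately simplified model and rests on several assumptions that are not from the text: a fully associative cache of 64 lines with least-recently-used replacement, an application that cycles through 48 lines, and a daemon that touches 64 lines of its own. Real caches differ in size, associativity, and replacement policy.

```python
from collections import OrderedDict

class LRUCache:
    """Fully associative cache with least-recently-used replacement."""
    def __init__(self, capacity):
        self.capacity = capacity
        self.lines = OrderedDict()

    def access(self, address):
        """Return True on a hit; on a miss, load the line and return False."""
        if address in self.lines:
            self.lines.move_to_end(address)      # mark as most recently used
            return True
        if len(self.lines) >= self.capacity:
            self.lines.popitem(last=False)       # evict the least recently used line
        self.lines[address] = None
        return False

def run_slice(cache, owner, working_set, accesses):
    """Run one time slice for a process; return the number of cache misses."""
    misses = 0
    for i in range(accesses):
        if not cache.access((owner, i % working_set)):
            misses += 1
    return misses

cache = LRUCache(capacity=64)
first = run_slice(cache, "app", 48, 1000)      # cold start: every line misses once
warm = run_slice(cache, "app", 48, 1000)       # working set is already cached
run_slice(cache, "daemon", 64, 1000)           # another process displaces the app
resumed = run_slice(cache, "app", 48, 1000)    # reload transient
print(first, warm, resumed)                    # 48 0 48
```

In this model the resumed slice pays the same 48 misses as the cold start, because the daemon's working set fills the whole cache.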

Supercomputers and parallel processors also use variants of Unix for their runtime environments. You will have to investigate whether daemons run on the main processor or on a "front end" processor, and how the operating system allocates resources. As an example of the range of alternatives, on an Intel Paragon XPS with 56 processors some processors will be dedicated to system tasks (e.g. file transfers) and the remainder will be split among users so that applications do not have to share any one processor. The MasPar 1104 consists of a front end (a DEC workstation) that handles the system tasks and 4096 processors for user applications. Each processor has its own 64 KB of RAM. More than one user process can run at any one time, but instead of allocating a different set of processors to each job, the operating system divides up the memory. The memory is split into equal-size partitions, for example 8 KB, and when a job starts the system figures out how many partitions it needs. All 4096 processors execute that job, and when the time slice is over they all start working on another job in a different set of partitions.
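The partition arithmetic can be worked through in a few lines. The 64 KB per-processor memory and the 8 KB partition size come from the example above. The job names, their memory needs, and the first-come, first-served admission rule are assumptions made for this sketch; the text does not say how the system behaves when the partitions run out.

```python
import math

RAM_PER_PROCESSOR_KB = 64   # per-processor memory, from the text
PARTITION_KB = 8            # example partition size, from the text

def partitions_needed(job_kb, partition_kb=PARTITION_KB):
    """Number of equal-size partitions a job needs on each processor."""
    return math.ceil(job_kb / partition_kb)

def admit(jobs, ram_kb=RAM_PER_PROCESSOR_KB, partition_kb=PARTITION_KB):
    """Assumed policy: admit jobs in order while enough partitions are free."""
    free = ram_kb // partition_kb            # 64 KB / 8 KB = 8 partitions
    resident, waiting = {}, []
    for name, job_kb in jobs.items():
        need = partitions_needed(job_kb, partition_kb)
        if need <= free:
            resident[name] = need
            free -= need
        else:
            waiting.append(name)
    return resident, waiting

# Hypothetical jobs and their per-processor memory needs in KB.
resident, waiting = admit({"job_a": 20, "job_b": 8, "job_c": 40})
print(resident, waiting)   # {'job_a': 3, 'job_b': 1} ['job_c']
```

Here job_a needs 3 partitions and job_b needs 1, leaving 4 free, so job_c (which needs 5) does not fit alongside them.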






verena@csep1.phy.ornl.gov