2.6 Data Representations



next up previous
Next: 2.7 Performance Models Up: 2 Basic Computer Architecture Previous: 2.5 Operating Systems

2.6 Data Representations

 

Another important interaction between user programs and computer architecture is in the representation of numbers. This interaction does not affect performance as much as it does portability. Users must be extremely careful when moving programs and/or data files from one system to another because numbers and other data are not always represented the same way. Recently programming languages have begun to allow users to have more control over how numbers are represented and to write code that does not depend so heavily on data representations that it fails when executed on the ``wrong'' system.

The binary number system is the starting point for representing information. All items in a computer's memory - numbers, characters, instructions, etc. - are represented by strings of 1's and 0's. These two values designate one of two possible states for the underlying physical memory. It does not matter to us which state corresponds to 1 and which corresponds to 0, or even what medium is used. In an electronic memory, 1 could stand for a positively charged region of semiconductor and 0 for a neutral region, or on a device that can be magnetized a 1 would represent a portion of the surface that has a flux in one direction, while a 0 would indicate a flux in the opposite direction. It is only important that the mapping from the set {1,0} to the two states be consistent and that the states can be detected and modified at will.

Systems usually deal with fixed-length strings of binary digits. The smallest unit of memory is a single bit, which holds a single binary digit. The next largest unit is a byte, now universally recognized to be eight bits (early systems used anywhere from six to eight bits per byte). A word is 32 bits long in most workstations and personal computers, and 64 bits in supercomputers. A double word is twice as long as a single word, and operations that use double words are said to be double precision operations.

Storing a positive integer in a system is trivial: simply write the integer in binary and use the resulting string as the pattern to store in memory. Since numbers are usually stored one per word, the number is padded with leading 0's first. For example, the number 52 is represented in a 16-bit word by the pattern 0000000000110100.

The meaning of an -bit string when it is interpreted as a binary number is defined by the formula, , i.e. bit number i has weight:

Compiler writers and assembly language programmers often take advantage of the binary number system when implementing arithmetic operations. For example, if the pattern of bits is ``shifted left'' by one, the corresponding number is multiplied by two. A left shift is performed by moving every bit left and inserting 0's on the right side. In an 8-bit system, for example, the pattern 00000110 represents the number 6; if this pattern is shifted left, the resulting pattern is 00001100, which is the representation of the number 12. In general, shifting left by bits is equivalent to multiplying by .

Shifts such as these can be done in one machine cycle, so they are much faster than multiplication instructions, which usually takes several cycles. Other ``tricks'' are using a right shift to implement integer division by a power of 2, in which the result is an integer and the remainder is ignored (e.g. 15 4 = 3) and taking the modulus or remainder with respect to a power of 2 (see problem 8).

A fundamental relationship about binary patterns is that there are 2 distinct -digit strings. For example, for there are = 256 different strings of 1's and 0's. From this relationship it is easy to see that the largest integer that can be stored in an -bit word is : the patterns are used to represent the integers in the interval .

An overflow occurs when a system generates a value greater than the largest integer. For example, in a 32-bit system, the largest positive integer is . = 4,294,976,295. If a program tries to add 3,000,000,000 and 2,000,000,000 it will cause an overflow. Right away we can see one source of problems that can arise when moving a program from one system to another: if the word size is smaller on the new system a program that runs successfully on the original system may crash with an overflow error on the new system.

There are two different techniques for representing negative values. One method is to divide the word into two fields, i.e. represent two different types of information within the word. We can use one field to represent the sign of the number, and the other field to represent the value of the number. Since a number can be just positive or negative, we need only one bit for the sign field. Typically the leftmost bit represents the sign, with the convention that a 1 means the number is negative and a 0 means it is positive. This type of representation is known as a sign-magnitude representation, after the names of the two fields. For example, in a 16-bit sign-magnitude system, the pattern 1000000011111111 represents the number and the pattern 0000000000000101 represents +5.

The other technique for representing both positive and negative integers is known as two's complement. It has two compelling advantages over the sign-magnitude representation, and is now universally used for integers, but as we will see below sign-magnitude is still used to represent real numbers. The two's complement method is based on the fact that binary arithmetic in fixed-length words is actually arithmetic over a finite cyclic group. If we ignore overflows for a moment, observe what happens when we add 1 to the largest possible number in an -bit system (this number is represented by a string of 1's):

The result is a pattern with a leading 1 and 0's. In an -bit system only the low order bits of each result are saved, so this sum is functionally equivalent to 0. Operations that lead to sums with very large values ``wrap around'' to 0, i.e. the system is a finite cyclic group. Operations in this group are defined by arithmetic modulo 2.

For our purposes, what is interesting about this type of arithmetic is that , which is represented by a 1 followed by 0's, is equivalent to 0, which means for all between 0 and . A simple ``trick'' that has its roots in this fact can be applied to the bit pattern of a number in order to calculate its additive inverse: if we invert every bit (turn a 1 into a 0 and vice versa) in the representation of a number and then add 1, we come up with the representation of . For example, the representation of 5 in an 8-bit system is 00000101. Inverting every bit and adding 1 to the result gives the pattern 11111011. This is also the representation of 251, but in arithmetic modulo 2 we have so this pattern is a perfectly acceptable representation of (see problem 7).

In practice we divide all -bit patterns into two groups. Patterns that begin with 0 represent the positive integers and patterns beginning with 1 represent the negative integers . To determine which integer is represented by a pattern that begins with a 1, compute its complement (invert every bit and add 1). For example, in an 8-bit two's complement system the pattern 11100001 represents , since the complement is . Note that the leading bit determines the sign, just as in a sign-magnitude system, but one cannot simply look at the remaining bits to ascertain the magnitude of the number. In a sign-magnitude system, the same pattern represents .

The first step in defining a representation for real numbers is to realize that binary notation can be extended to cover negative powers of two, e.g. the string ``110.101'' is interpreted as

Thus a straightforward method for representing real numbers would be to specify some location within a word as the ``binary point'' and give bits to the left of this location weights that are positive powers of two and bits to the right weights that are negative powers of two. For example, in a 16-bit word, we can dedicate the rightmost 5 bits for the fraction part and the leftmost 11 bits for the whole part. In this system, the representation of 6.625 is 0000000011010100 (note there are leading 0's to pad the whole part and trailing 0's to pad the fraction part). This representation, where there is an implied binary point at a fixed location within the word, is known as a fixed point representation.

There is an obvious tradeoff between range and precision in fixed point representations. bits for the fraction part means there will be numbers in the system between any two successive integers. With 5 bit fractions there are 32 numbers in the system between any two integers; e.g. the numbers between 5 and 6 are 5 (5.03125), 5 (5.03125), etc. To allow more precision, i.e. smaller divisions between successive numbers, we need more bits in the fraction part. The number of bits in the whole part determines the magnitude of the largest positive number we can represent, just as it does for integers. With 11 digits in the whole part, as in the example above, the largest number we can represent in 16 bits is . Moving one bit from the whole part to the fraction part in order to increase precision cuts the range in half, and the largest number is now .

To allow for a larger range without sacrificing precision, computer systems use a technique known as floating point. This representation is based on the familiar ``scientific notation'' for expressing both very large and very small numbers in a concise format as the product of a small real number and a power of 10, e.g. . This notation has three components: a base (10 in this example); an exponent (in this case 23); and a mantissa (6.022). In computer systems, the base is either 2 or 16. Since it never changes for any given computer system it does not have to be part of the representation, and we need only two fields to specify a value, one for the mantissa and one for the exponent.

As an example of how a number is represented in floating point, consider again the number 6.625. In binary, it is

If a 16-bit system has a 10-bit mantissa and 6-bit exponent, the number would be represented by the string 1101010000 000010. The mantissa is stored in the first ten bits (padded on the right with trailing 0's), and the exponent is stored in the last six bits.

As the above example illustrates, computers transform the numbers so the mantissa is a manageable number. Just as is preferred to or in scientific notation, in binary the mantissa should be between and . When the mantissa is in this range it is said to be normalized. The definition of the normal form varies from system to system, e.g. in some systems a normalized mantissa is between and .

Since we need to represent both positive and negative real numbers, the complete representation for a real number in a floating point format has three fields: a one-bit sign, a fixed number of bits for the mantissa, and the remainder of the bits for the exponent. Note that the exponent is an integer, and that this integer can be either positive or negative, e.g. we will want to represent very small numbers such as . Any method such as two's complement that can represent both positive and negative integers can be used within the exponent field. The sign bit at the front of the number determines the sign of the entire number, which is independent of the sign of the exponent, e.g. it indicates whether the number is or .

In the past every computer manufacturer used their own floating point representation, which made it a nightmare to move programs and datasets from one system to another. A recent IEEE standard is now being widely adopted and will add stability to this area of computer architecture. For 32-bit systems, the standard calls for a 1-bit sign, 8-bit exponent, and 23-bit mantissa. The largest number that can be represented is , and the smallest positive number (closest to 0.0) is . Details of the standard are presented in an appendix to this chapter.


Figure 2 Distribution of Floating Point Numbers View Figure

Figure 2 illustrates the numbers that can be stored in a typical computer system with a floating point representation. The figure shows three disjoint regions: positive numbers , 0.0, and negative numbers . is the largest number that can be stored in the system; in the IEEE standard representation . is the smallest positive number, which is in the IEEE standard.

Programmers need to be aware of several important attributes of the floating point representation that are illustrated by this figure. The first is the magnitude of the range between and . There are about integers in this range. However there are only different 32-bit patterns. What this means is there are numbers in the range that do not have representations. Whenever a calculation results in one of these numbers, a round-off error will occur when the system approximates the result by the nearest (we hope) representable number. The arithmetic circuitry will produce a binary pattern that is close to the desired result, but not an exact representation. An interesting illustration of just how common these round-off errors are is the fact that 1 does not have a finite representation in binary, but is instead the infinitely repeating pattern .

The next important point is that there is a gap between , the smallest positive number, and 0.0. A round-off error in a calculation that should produce a small non-zero value but instead results in 0.0 is called an underflow. One of the strengths of the IEEE standard is that it allows a special denormalized form for very small numbers in order to stave off underflows as long as possible. This is why the exponent in the largest and smallest positive numbers are not symmetrical. Without denormalized numbers, the smallest positive number in the IEEE standard would be around .

Finally, and perhaps most important, is the fact that the numbers that can be represented are not distributed evenly throughout the range. Representable numbers are very dense close to 0.0, but then grow steadily further apart as they increase in magnitude. The dark regions in Figure 2 correspond to parts of the number line where representable numbers are packed close together. It is easy to see why the distribution is not even by asking what two numbers are represented by two successive values of the mantissa for any given exponent. To make the calculations easier, suppose we have a 16-bit system with a 7-bit mantissa and 8-bit exponent. No matter what the exponent is, the distance between any two successive values of the mantissa, e.g. between and , will be . For numbers closest to 0.0, the exponent will be a negative number, e.g. , and the distance between two successive floating point numbers will be . At the other end of the scale, when exponents are large, the distance between two numbers will be approximately , namely .



next up previous
Next: 2.7 Performance Models Up: 2 Basic Computer Architecture Previous: 2.5 Operating Systems



verena@csep1.phy.ornl.gov