IEEE 754 floating point standard is the most common representation today for real numbers on computers. The IEEE (Institute Of Electrical And Electronics Engineers) has produced a standard to define floating –point representation and arithmetic. Although there are other representation used for floating point numbers. The standard brought out by the IEEE come to be known as IEEE 754.It is interesting to note that the string of significant digits is technically termed the mantissa of the number, while the scale factor is appropriately called the exponent of the number. The general form of the representation is the following (-1)S * M* 2E . Where S represents the sign bit, M represents the mantissa and E represents the exponent. When it comes to their precision and width in bits, the standard defines two groups: base and extended format. The basic format is further divided into Single –Precision format with 32-bits wide, and double-precision format with 64-bits wide. The three basic components are the sign, exponent, and mantissa.

1.1 IEEE 754 Floating Point Standard:

The IEEE 754 is a floating-point standard established by the IEEE in 1985. It contains two representations for floating-point numbers—the IEEE single precision format and the IEEE double precision format. The IEEE 754 single precision representation uses 32 bits, and the double precision system uses 64 bits. Although 2’s complement representations are very common for negative numbers, the IEEE floating-point representations do not use 2’s complement for either the fraction or the exponent. The designers of IEEE 754 desired a format that was easy to sort and hence adopted a sign-magnitude system for the fractional part and a biased notation for the exponent.

The IEEE 754 floating-point formats need three sub-fields: sign, fraction, and exponent. The fractional part of the number is represented using a sign-magnitude representation in the IEEE floating-point formats—that is, there is an explicit sign bit (S) for the fraction. The sign is 0 for positive numbers and 1 for negative numbers. In a binary normalized scientific notation, the leading bit before the binary point is always 1. Therefore, the designers of the IEEE format decided to make it implied, representing only the bits after the binary point. In general, the number is of the form

N 5 (21)S 3 (1 1 F ) 3 2E

where S is the sign bit

F is the fractional part

E is the exponent.

The base of the exponent is 2. The base is implied—that is, it is not stored anywhere in the representation. The magnitude of the number is 1 1 F because of the omitted leading 1. The term significand means the magnitude of the fraction and is 1 1 F in the IEEE format. But often the terms significand and fraction are used interchangeably by many and are so used in this book. The exponent in the IEEE floating-point formats uses what is known as a biased notation. A biased representation is one in which every number is represented by the number plus a certain bias. In the IEEE single precision format, the bias is 127. Hence, if the exponent is 11, it will be represented by 11 1 127 5 128. If the exponent is 22, it will be represented by 22 1 127 5 125. Thus, exponents less than 127 indicate actual negative exponents, and exponents greater than 127 indicate actual positive exponents. The bias is 1023 in the double precision format. If a positive exponent becomes too large to fit in the exponent field, the situation is called overflow, and if a negative exponent is too large to fit in the exponent field, that situation is called underflow.

1.2 Single Precision floating point Numbers:

The IEEE single precision format uses 32 bits for representing a floating-point number, divided into three subfields, as illustrated in Figure 7-1. The first field is the sign bit for the fractional part. The next field consists of 8 bits, which are used for the exponent. The third field consists of the remaining 23 bits and is used for the fractional part.

Fig.1.1: Single-precision floating point representation.

The sign bit reflects the sign of the fraction. It is 0 for positive numbers and 1 for negative numbers. In order to represent a number in the IEEE single precision format, first it should be converted to a normalized scientific notation with exactly one bit before the binary point, simultaneously adjusting the exponent value.

The exponent representation that goes into the second field of the IEEE 754 representation is obtained by adding 127 to the actual exponent of the number when it is represented in the normalized form. Exponents in the range 1–254 are used for representing normalized floating-point numbers. Exponent values 0 and 255 are reserved for special cases, which will be discussed subsequently.

The representation for the 23-bit fraction is obtained from the normalized scientific notation by dropping the leading 1. Zero cannot be represented in this fashion; hence, it is treated as a special case (explained later). Since every number in the normalized scientific notation will have a leading 1, this leading 1 can be dropped so that one more bit can be packed into the significand (fraction). Thus, a 24-bit fraction can be represented using the 23 bits in the representation. The designers of the IEEE formats wanted to make the highest use of all the bits in the exponent and fraction fields.