IEEE vs. Microsoft Binary Format; Rounding Issues (Complete)

ID: q35826

SUMMARY

This article discusses the following:

1. Why Microsoft uses the IEEE Floating Point format instead of the
   Microsoft Binary Format (MBF) in the following products:

    - Microsoft QuickBasic versions 4.00, 4.00b, and 4.50 for the IBM PC.
    - Microsoft Basic Compiler versions 6.00 and 6.00b for MS-DOS and MS
      OS/2.
    - Microsoft Basic PDS version 7.00 for MS-DOS and MS OS/2.

2. Differences between IEEE Floating Point format and the Microsoft
   Binary Format (MBF). Numeric rounding issues in IEEE. For more
   information, search for a separate article on the following words:

      IEEE and tutorial and rounding

3. Microsoft plans for using IEEE instead of Microsoft Binary Format (MBF)
   in the future.

MORE INFORMATION

                        IEEE and Rounding
                        =================

1. Why use IEEE instead of MBF?

   IEEE was chosen as the math package for QuickBasic version 4.00 and
   Microsoft Basic Compiler 6.00 to allow mixed-language calling, a
   very desirable feature. IEEE is also more accurate than Microsoft
   Binary Format (MBF): calculations are performed in an 80-bit
   temporary area rather than a 64-bit area. (Note: The Alternate-Math
   Libraries use a 64-bit temporary area.) The additional bits allow
   more accurate calculations and reduce the chance that the final
   result is degraded by excessive roundoff errors. Keep in mind that
   precision errors are inherent in any binary floating-point math,
   because not all numbers can be represented exactly in binary
   floating-point notation.
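
   The point about inexact representation can be checked with a short
   C sketch. It assumes a C99 compiler whose double is the 64-bit IEEE
   format; the function names are hypothetical and chosen only for
   illustration.

```c
#include <stdio.h>

/* Writes x with 17 significant digits, enough to expose the binary
   approximation actually stored in a 64-bit IEEE double. */
int format_stored_value(double x, char *buf, size_t size)
{
    return snprintf(buf, size, "%.17g", x);
}

/* Returns 1 if 0.1 + 0.2 compares equal to 0.3, otherwise 0. */
int tenths_add_exactly(void)
{
    volatile double a = 0.1, b = 0.2, c = 0.3;
    volatile double sum = a + b;   /* volatile forces a 64-bit store */
    return sum == c;
}
```

   Formatting .07 this way produces a string longer than "0.07",
   because the stored value is only the nearest representable binary
   fraction, and tenths_add_exactly() returns 0 for the same reason.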

   IEEE can also take advantage of a math coprocessor chip (such as
   the 8087, 80287, or 80387) for much faster calculations. MBF cannot
   take advantage of a coprocessor.

2. If the calculations are more accurate, why are numbers such as
   .07#, 8.05#, and 9.96# displayed with a 1 in the 16th digit?
   Microsoft Binary Format (MBF) does not do this.

   MBF is accurate to 15 digits, while IEEE is accurate to 15 or 16
   digits. Since the numbers are stored in different formats, the
   last digit may vary. MBF double-precision values are stored in
   the following format:

      -------------------------------------------------
     |              |    |                             |
     |8 Bit Exponent|Sign|   55 Bit Mantissa           |
     |              | Bit|                             |
      -------------------------------------------------

   IEEE double precision values are stored in the following format:

      -------------------------------------------------
     |    |                | |                         |
     |Sign| 11 Bit Exponent|1|  52 Bit Mantissa        |
     | Bit|                | |                         |
      -------------------------------------------------
                            ^
                            Implied Bit (always 1)

   You will notice that Microsoft Binary Format (MBF) has 3 more bits
   of precision in the mantissa (55 versus 52). However, this does not
   mean that the value is any more accurate. Precision is the number
   of bits you are working with, while accuracy is how close you are
   to the real number. In most cases, the IEEE value will be more
   accurate because it was calculated in an 80-bit temporary area.
   (When the IEEE standard was proposed, the main consideration for
   double-precision values was range. As a minimum, the desire was
   that the product of any two 32-bit numbers should not overflow the
   64-bit format.)
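
   The IEEE layout shown above can be inspected directly. The following
   C sketch is illustrative only: it assumes a C99 compiler, a 64-bit
   IEEE double, and the hypothetical names ieee_fields and
   split_double.

```c
#include <stdint.h>
#include <string.h>

/* The three stored fields of a 64-bit IEEE double. */
struct ieee_fields {
    unsigned sign;       /* 1 bit                                   */
    unsigned exponent;   /* 11 bits, stored with a bias of 1023     */
    uint64_t mantissa;   /* 52 stored bits; the leading 1 is implied */
};

/* Copies the raw bits of x and separates them into fields. */
struct ieee_fields split_double(double x)
{
    uint64_t bits;
    struct ieee_fields f;

    memcpy(&bits, &x, sizeof bits);          /* avoids pointer aliasing */
    f.sign     = (unsigned)(bits >> 63);
    f.exponent = (unsigned)((bits >> 52) & 0x7FF);
    f.mantissa = bits & (((uint64_t)1 << 52) - 1);
    return f;
}
```

   For example, split_double(1.0) gives sign 0, exponent 1023, and
   mantissa 0, because the only significant bit of 1.0 is the implied
   one.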

3. Why doesn't my rounding algorithm eliminate the 1's in the 16th
   place?

   Your rounding algorithm rounds the numbers correctly; the extra
   digit comes from the inherent rounding errors and format
   differences. For example, 6.99999999999999D-2 is rounded to .07,
   but the internal IEEE representation of that value is
   7.000000000000001D-2. (It is true that MBF displays the value as
   .07, but the difference in values is not considered a problem. It
   is a difference between math packages.)
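
   The same effect can be reproduced in C. This sketch assumes a C99
   compiler with 64-bit IEEE doubles and a C library that prints
   two-digit exponents; round_to_cents and format_16_digits are
   hypothetical names, and the rounding shown handles only
   non-negative values.

```c
#include <stdio.h>

/* Rounds a non-negative value to two decimal places (half up). */
double round_to_cents(double x)
{
    long long scaled = (long long)(x * 100.0 + 0.5);
    return (double)scaled / 100.0;
}

/* Writes x in scientific notation with 16 significant digits. */
void format_16_digits(double x, char *buf, size_t size)
{
    snprintf(buf, size, "%.15e", x);
}
```

   round_to_cents(6.99999999999999e-2) returns the double closest to
   .07, yet format_16_digits() renders that result as
   7.000000000000001e-02. The rounding worked; the trailing 1 belongs
   to the stored binary value.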

4. Why doesn't the STR$ function get the proper strings from either
   single- or double-precision numbers?

   The STR$ function works correctly. The value placed in the string
   is the same as the value displayed on the screen with an
   unformatted PRINT. If the IEEE representation of .07 is
   7.000000000000001D-2, then the STR$ will return
   7.000000000000001D-2.

   There are a few ways to generate the desired string. The method
   used depends on the range of numbers, the other resources
   available, and the programmer's preference. Three possible routines
   are listed below. Keep in mind that as soon as the string is
   converted back to a number, the value is no longer truncated,
   because the conversion reintroduces the binary representation.

   Method 1
   --------

   If the numbers fall between -2^31/100 and 2^31/100 (so that the
   value multiplied by 100 fits in a long integer), the following
   method can be used:

   FUNCTION round2$ (number#)
   n& = number# * 100#
   hold$ = LTRIM$(RTRIM$(STR$(n&)))

   IF (MID$(hold$, 1, 1) = "-") THEN
      hold1$ = "-"
      hold$ = MID$(hold$, 2)
   ELSE
      hold1$ = ""
   END IF

   length = LEN(hold$)
   SELECT CASE length
   CASE 1
      hold1$ = hold1$ + ".0" + hold$
   CASE 2
      hold1$ = hold1$ + "." + hold$
   CASE ELSE
      hold1$ = hold1$ + LEFT$(hold$, LEN(hold$) - 2)
      hold1$ = hold1$ + "." + RIGHT$(hold$, 2)
   END SELECT
   round2$ = hold1$
   END FUNCTION

   The value being rounded is multiplied by 100# and the result is
   stored in a long integer. The long integer is converted to a string
   and the decimal point is inserted in the correct location.
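
   For comparison, the same scaled-integer idea can be sketched in C.
   This is an illustration, not a translation of round2$: the name
   format_scaled is hypothetical, rounding is half away from zero, and
   the output keeps a leading zero (0.07 rather than .07). It is valid
   only while the value multiplied by 100 fits in a long.

```c
#include <stdio.h>

/* Formats x with two decimal places by way of a scaled long integer.
   Returns the value returned by snprintf. */
int format_scaled(double x, char *buf, size_t size)
{
    /* Scale by 100 and round half away from zero. */
    long n = (long)(x * 100.0 + (x < 0 ? -0.5 : 0.5));
    const char *sign = (n < 0) ? "-" : "";
    unsigned long mag = (n < 0) ? 0UL - (unsigned long)n
                                : (unsigned long)n;

    /* Whole part, decimal point, then exactly two fraction digits. */
    return snprintf(buf, size, "%s%lu.%02lu", sign, mag / 100, mag % 100);
}
```

   Because the digits come from integer division and remainder, the
   resulting string never carries a stray digit in the 16th place.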

   Method 2
   --------

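   This routine is much more complicated than the first method, though
   it handles a much larger range of values. The value being rounded
   is multiplied by 100#, and this result must fit within the range of
   valid double-precision numbers. Note that the function assigns the
   rounded value back to number#, so a variable passed as the argument
   is changed.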

   FUNCTION round$ (number#) STATIC
   number# = INT((number# + .005) * 100#) / 100#
   hold$ = STR$(number#)
   hold$ = RTRIM$(LTRIM$(hold$))

   IF (MID$(hold$, 1, 1) = "-") THEN
     new$ = "-"
     hold$ = MID$(hold$, 2)
   ELSE
     new$ = ""
   END IF

   x = INSTR(hold$, "D")
   DecimalLocation = INSTR(hold$, ".")

   IF (x) THEN  'scientific notation
     exponent = VAL(MID$(hold$, x + 1, LEN(hold$)))
     IF (exponent < 0) THEN
       new$ = new$ + "."
       new$ = new$ + STRING$(ABS(exponent) - 1, ASC("0"))
       round$ = new$ + MID$(hold$, 1, 1)
     ELSE
       new$ = new$ + MID$(hold$, 1, DecimalLocation - 1)
       num = LEN(hold$) - 6
       IF num < 0 THEN
         num = exponent
       ELSE
         num = exponent - num
      new$ = new$+MID$(hold$, DecimalLocation+1, x-DecimalLocation-1)
       END IF
       new$ = new$ + STRING$(num, ASC("0")) + ".00"
       round$ = new$
     END IF

   ELSE  'not scientific notation
     x = INSTR(hold$, ".") 'find decimal point
     IF (x) THEN
       IF MID$(hold$, x + 3, 1) = "9" THEN
         xx = VAL(MID$(hold$, x + 2, 1)) + 1
         hold1$ = LEFT$(hold$, x)
         IF xx = 10 THEN
     hold1$ = hold1$+LTRIM$(STR$(VAL(MID$(hold$, x + 1, 1)) + 1))+"0"
           round$ = new$ + hold1$
         ELSE
           hold1$ = hold1$ + MID$(hold$, x + 1, 1) + LTRIM$(STR$(xx))
           round$ = new$ + hold1$
         END IF
       ELSE
         round$ = new$ + LEFT$(hold$, x + 2)
       END IF
     ELSE
      round$ = new$ + hold$
     END IF
   END IF
   END FUNCTION

   Method 3
   --------

   This method requires the Microsoft C Compiler 5.x. It uses the C
   library routine sprintf(), which formats output the same way
   printf() does but stores the result in a string instead of writing
   it to the screen.

   C Routine:

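   #include <stdio.h>

   /* Basic string descriptor: length, then address of the data */
   struct basic_string {
      int length;
      char *address;
   };

   /* Writes *number, formatted to two decimal places, into the Basic
      string. The string must already be long enough for the result. */
   void round(number, string)
   double *number;
   struct basic_string *string;
   {
      sprintf(string->address, "%.2f", *number);
   }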

   Basic Program:

   DECLARE SUB Round CDECL (number#, answer$)
   CLS
   b# = .05#
   FOR i = 1 TO 10
        b# = b# + .01#
        answer$ = SPACE$(50)
        CALL Round(b#, answer$)
        PRINT b#, LTRIM$(RTRIM$(answer$))
        PRINT
        cnt = cnt + 4
        IF cnt > 40 THEN
           cnt = 0
           INPUT a$
        END IF
   NEXT i
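
   Note that sprintf() also stores a terminating NUL character after
   the digits, so the Basic string receives that character as well.
   One possible variation, sketched below, formats into a scratch
   buffer and copies only the visible characters. The name
   round_padded, the 64-byte scratch size, and the use of snprintf()
   (a C99 routine, not part of Microsoft C 5.x) are assumptions made
   for illustration; the descriptor layout simply repeats the one
   above.

```c
#include <stdio.h>
#include <string.h>

/* Stand-in for the Basic string descriptor used above. */
struct basic_string {
    int length;
    char *address;
};

/* Formats *number with two decimals, copies at most string->length
   characters without the terminating NUL, and pads the rest of the
   Basic string with spaces. Assumes string->length is not negative. */
void round_padded(const double *number, struct basic_string *string)
{
    char scratch[64];
    int n = snprintf(scratch, sizeof scratch, "%.2f", *number);

    if (n < 0)
        n = 0;                               /* formatting failed  */
    if (n >= (int)sizeof scratch)
        n = (int)sizeof scratch - 1;         /* output was clipped */
    if (n > string->length)
        n = string->length;                  /* fit the Basic string */

    memcpy(string->address, scratch, (size_t)n);
    memset(string->address + n, ' ', (size_t)(string->length - n));
}
```

   With this variation, RTRIM$ alone is enough on the Basic side,
   since the string holds only the digits followed by spaces.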

   The same screen formatting can be accomplished with Basic's PRINT
   USING statement. However, Basic has no direct means of storing the
   formatted output in a string. The output can be sent to a
   sequential file and then read back into string variables.

   You can also write the information to the screen and read this
   information using the SCREEN function. The SCREEN function returns
   the ASCII value of the specified screen location. Consider the
   following example:

   x# = 7.000000000000001D-02
   CLS
   LOCATE 1, 1
   PRINT USING "#################.##"; x#
   FOR i = 1 TO 20
   num = SCREEN(1, i)
   SELECT CASE num
     CASE ASC(".")
       number$ = number$ + "."
     CASE ASC("-")
       number$ = "-"
     CASE ASC("0") TO ASC("9")
       number$ = number$ + CHR$(num)
     CASE ELSE
   END SELECT
   NEXT i
   PRINT number$

   Because the format contains digit positions before the decimal
   point, the PRINT USING statement displays a leading zero: 16 spaces
   and then 0.07. The value of number$ would be 0.07.
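
   The filtering performed by the SELECT CASE loop above can also be
   sketched in C for a fixed-width field held in memory. The name
   strip_field is hypothetical, and unlike the Basic loop this version
   appends a minus sign instead of restarting the string.

```c
#include <stddef.h>

/* Copies only '-', '.', and the digits from a fixed-width field into
   out, dropping the padding. out is always NUL-terminated when
   out_size is greater than zero. */
void strip_field(const char *field, size_t width,
                 char *out, size_t out_size)
{
    size_t i, n = 0;

    if (out_size == 0)
        return;
    for (i = 0; i < width && field[i] != '\0' && n + 1 < out_size; i++) {
        char c = field[i];
        if (c == '-' || c == '.' || (c >= '0' && c <= '9'))
            out[n++] = c;                    /* keep numeric characters */
    }
    out[n] = '\0';
}
```

   Passing a 20-column field that ends in -12.50 leaves only "-12.50"
   in out.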

5. Does Microsoft plan to use Microsoft Binary Format (MBF) in future
   versions of Basic?

   At this time, there are no plans to return to MBF. The benefits of
   IEEE (interlanguage calling and coprocessor support) are far greater
   than those of MBF.

Additional reference words: QuickBas BasicCom KBCategory: kbprg kbcode KBSubcategory:

Last Reviewed: January 12, 1995