Excel AppNote: Opening and Saving Text Files (WE0801)

Last reviewed: October 6, 1997
Article ID: Q99844
4.00 WINDOWS Appnote kbusage

The information in this article applies to:

  • Microsoft Excel for Windows, version 4.0

The Application Note "Opening and Saving Text Files" (WE0801), discusses how Microsoft Excel parses text and formats numbers and alphanumeric characters in text files. In addition, it includes the Text Reader add-in macro (TXTRDR.XLA), which you can use to import text files. This macro parses your text according to a delimiter you select and prevents Microsoft Excel from applying automatic number formatting to the entries in your file.

You can obtain this Application Note from the following sources:

You can obtain this Application Note from the following sources:

  • Microsoft's World Wide Web Site on the Internet
  • The Internet (Microsoft anonymous ftp server)
  • The Microsoft Network (MSN)
  • Microsoft Download Service (MSDL)
  • Microsoft Product Support Services

For complete information, see the "To Obtain This Application Note" section at the end of this article.

THE TEXT OF WE0801

  Microsoft(R) Product Support Services Application Note (Text File)
                 WE0801: OPENING AND SAVING TEXT FILES
                                                   Revision Date: 5/93
                                                                1 Disk

The following information applies to Microsoft(R) Excel for Windows(TM), version 4.0.

  --------------------------------------------------------------------
| INFORMATION PROVIDED IN THIS DOCUMENT AND ANY SOFTWARE THAT MAY     |
| ACCOMPANY THIS DOCUMENT (collectively referred to as an Application |
| Note) IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER      |
| EXPRESSED OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE IMPLIED      |
| WARRANTIES OF MERCHANTABILITY AND/OR FITNESS FOR A PARTICULAR       |
| PURPOSE. The user assumes the entire risk as to the accuracy and    |
| the use of this Application Note. This Application Note may be      |
| copied and distributed subject to the following conditions:  1) All |
| text must be copied without modification and all pages must be      |
| included;  2) If software is included, all files on the disk(s)     |
| must be copied without modification (the MS-DOS(R)  utility         |
| diskcopy is appropriate for this purpose);  3) All components of    |
| this Application Note must be distributed together;  and  4) This   |
| Application Note may not be distributed for profit.                 |
|                                                                     |
| Copyright (C) 1993 Microsoft Corporation.  All Rights Reserved.     |
| Microsoft and MS-DOS are registered trademarks and Windows is a     |
| trademark of Microsoft Corporation.                                 |
 ---------------------------------------------------------------------


OVERVIEW

This Application Note discusses how Microsoft Excel parses text and formats numbers and alphanumeric characters in text files. In addition, it includes the Text Reader add-in macro (TXTRDR.XLA), which you can use to import text files. This macro parses your text according to a delimiter you select and prevents Microsoft Excel from applying automatic number formatting to the entries in your file.

INTRODUCTION

In Microsoft Excel version 4.0, you can open and save files in text file format. Your text file can consist of rows of text in a single column or rows and columns of text. When you open a text file in Microsoft Excel, the way the text is broken into columns depends on the character you choose to delimit your fields or columns of text. You can also choose no delimiter and parse your text after the file is opened. The most common column delimiters are commas, tabs, and spaces.

When you open or save text files in Microsoft Excel, you may find that some of the text entries in your file do not appear as expected. These unexpected results may occur because of the way Microsoft Excel parses text and applies formatting to numbers and alphanumeric characters.

OPENING AND PARSING TEXT FILES

How Text Is Parsed

When you open a text file, Microsoft Excel checks the file for the column delimiter you selected in the Text File Options dialog box (from the File menu, choose Open and then choose the Text button). In this dialog box, the default Column Delimiter option is Tab; you can also select the Comma, Space, Semicolon, None, or Custom option (when you select Custom, you can specify a custom delimiter). Once you choose a column delimiter, Microsoft Excel places the text following each instance of the specified delimiter in a separate cell. For example, if your text is delimited by spaces and if you select the Space delimiter, Microsoft Excel places the text following each space in a separate cell. This process is called parsing and occurs automatically when you specify a delimiter for your text file.

If you do not select the Space delimiter before you open this space- delimited file, or if you select the None (no delimiter) option, the text in your file will not be parsed; that is, each line of text will appear in a separate row in column A only. In this case, after you open the file in Microsoft Excel, you can break the text into separate columns using one of the parsing commands.

The parsing commands, located on the Data menu, are:

  • Parse - This command parses text based on a fixed length or number of characters per column, rather than parsing your text based on a specified character.

    -and-

  • Smart Parse - This command is added to the Data menu when you load the Flat File add-in macro. Smart Parse gives you the option of parsing your data based on a specified delimiter; this is similar to specifying a delimiter in the Text File Options dialog box, with the exception that the parsing is done after you've opened the text file and there is not an option to parse based on a tab delimiter.

    NOTE: When you open a file with a .CSV extension, Microsoft Excel for Windows will always assume that this file is a comma-separated value (CSV) file regardless of the selected column delimiter. For files with any other filename extension, for example, .TXT, Microsoft Excel will use the delimiter you specify in the Text File Options dialog box.

How Text Entries Are Formatted

The following example illustrates how Microsoft Excel treats different types of entries when you open a text file with a specified delimiter or when you parse a text file after it's been opened. If you have a text file containing the following data, where each entry is separated by a tab

   09876     00000001  1/93  12-31-92 2E9
   12345     67-890    01234-5678     alpha     text string

and you select the Tab delimiter in the Text File Options dialog box, the entry following each tab is placed in a separate column and each line is placed on a separate row. Each text string is evaluated when it is placed in its respective cell. If the contents can be interpreted as a value, Microsoft Excel will attempt to apply a built- in number format to that value. If there is no corresponding number format, the contents will be interpreted as text and placed in the cell as is.

When you open this file in Microsoft Excel, it will resemble the following example:

      |    A   |     B     |      C      |       D   |       E
   ---|------- |-----------|-------------|-----------|---------------
   1  |   9876 |         1 |      Jan-93 |  12/31/92 |        2.00E+09
   ---|--------|-----------|-------------|-----------|----------------
   2  | 12345  | 67-890    | 01234-5678  |  alpha    |    text string

Note that the appearance of some text entries changes. Entries that are interpreted as values are right-aligned and those that are interpreted as text are left-aligned -- these are the alignment defaults for Microsoft Excel.

The following table explains how each entry from the sample data is

   formatted in Microsoft Excel.

   Original     In Microsoft   How entry is formatted in
   entry        Excel          Microsoft Excel
   -------------------------------------------------------------------
   09876 and    9876 and 1     These entries are interpreted as
   00000001                    values in the General number format.
                               Because leading zeros in a number are
                               insignificant, Microsoft Excel drops
                               the zeros.


   1/93         Jan-93         If there are any hyphens or slashes
                               separating values, Microsoft Excel
                               will examine the individual values to
                               see if they can be construed as a day
                               and/or a month and/or a year. If so, a
                               date format is applied. Because the 1
                               can be interpreted as a month value
                               and 93 is a valid year, this entry
                               matches the mmm-yy built-in number
                               format.

   12-31-92     12/31/92       Because the individual values
                               separated by the hyphens can be
                               interpreted as a valid month, day, and
                               year, Microsoft Excel applies the
                               closest matching date format, m/d/yy.
                               Note that when the format is applied,
                               the hyphens are replaced with slashes.
   2E9          2.00E+09       This entry is interpreted to match the
                               0.00E+00 built-in number format.

                                 NOTE: If the item had been 2x9,
                                 where x is any letter other than e
                                 or E, the item would have remained a
                                 text string and would not have
                                 changed.

   12345        12345          This entry matches the General built-
                               in number format. The application of
                               this format does not alter the value's
                               appearance because there are no
                               leading zeros.

   67-890 and   67-890 and     These entries
   01234-5678   01234-5678     are interpreted as text because
                               hyphens separate the numbers and
                               because the numbers on either side of
                               the hyphens cannot be interpreted as a
                               month, day, or year.

   alpha and    alpha and      These entries are not changed because
   text string  text string    they are text.

Using Custom Number Formats to Change Text Entries

You can use custom number formats to return entries to their original appearance (that is, the way they looked before you opened the text file in Microsoft Excel). For example, to change 9876 to 09876, do the following:

  1. Select cell A1 (the cell where the value is located).

  2. From the Format menu, choose Number.

  3. In the Code box, type 00000. This formats the number to always have 5 digits, displaying leading zeros even when you've entered a single digit value. For example, with this number format applied, 1 will be displayed as 00001, and 25 will be displayed as 00025.

  4. Choose OK to accept the number format. 9876 is now displayed as 09876.

In date formats in Microsoft Excel, hyphens and slashes are interchangeable. If you would rather use hyphens than slashes (or vice versa), apply a custom number format that uses the desired character. For example, to change the value 12/31/92 to 12-31-92, apply a m-d-yy custom number format .

    Important: If you apply custom number formats or make other
    formatting changes to font size, borders, shading, row height, or
    column width, you must save the file in the Normal file format to
    retain the formatting. (From the File menu, choose Save As and
    under Save File As Type, select the Normal format.)

Some entries cannot be returned to their original appearance with a custom number format. For example, there is no custom number format that will restore 2.00E+09 to 2E9. To retain the format 2E9, you can either add a text character to the string using a word processor or other text editor, or you can use the Text Reader add-in macro to import the file. For information on how to use the Text Reader add-in macro, see "Importing Text Files with the Text Reader Add-in Macro" in this Application Note.

For more information about custom number formats, see pages 221-227 of "User's Guide 1" or see the "Creating or Deleting Custom Number Formats" topic under "Number Formats" in Microsoft Excel Help.

SAVING AND EXPORTING TEXT FILES

In Microsoft Excel, you can save worksheets as text files in a comma- delimited (CSV) or tab-delimited (Text) text format. You can select either one of these options by choosing Save As from the File menu and selecting the desired format from the Save File As Type list. You can also export data from your worksheet to a text file that is space delimited using the Flat File add-in macro discussed in this Application Note. After you load this add-in macro, the Export command will be available on the Data menu. For additional information on using the Flat File add-in macro, see pages 151-152 of the "User's Guide 2".

How Text Entries Are Formatted

In Microsoft Excel, when you save a file in CSV or Text file format, each entry in the file retains its number format.

  NOTE: In the following example, the value 2E9 in cell E1 must be
  entered as a text value (by preceding the entry with an apostrophe)
  to prevent it from being converted to the scientific notation
  format.

For example, if you have a worksheet that resembles the following

      |    A   |       B     |      C      |      D      |      E
   ---|------- |-------------|-------------|-------------|----------
   1  |   9876 |    00000001 |        1/93 |    12/31/92 | 2E9
   ---|--------|-------------|-------------|-------------|----------
   2  | 12345  | 67-890      | 01234-5678  | alpha       | text string

when you open the file in a text editor, the table will resemble one of the following examples depending on how you saved the file in Microsoft Excel:
  • If you saved the file in the CSV file format, the file will resemble the following:

             09876,00000001,1/93,12-31-92,2E9
             12345,67-890,01234-5678,alpha,text string
    
    
  • If you saved the file in the Text file format, the file will resemble the following:

             09876     00000001  1/93  12-31-92      2E9
             12345     67-890    01234-5678  alpha   text string
    
    

Using the Flat File Add-in Macro to Export Data

When you export a text file, you can use the export feature of the Flat File add-in macro to retain not only the built-in and custom number formats of your selection, but the alignment of the cell contents as well. The resulting text file will contain evenly aligned columns.

When you load the Flat File add-in macro, two commands are added to the Data menu:

  • SMART PARSE - Use this command to parse a text file after it is opened.

    -and-

  • EXPORT - Use this command to export selected data to a space- delimited text file.

To retain custom number formats and the alignment of the cell contents, choose Export from the Data menu and select the Retain Cell Formats option in the Export dialog box. The Retain Cell Formats option allows you to retain number formats and alignments in the cell entries.

If you export the sample data with the Retain Cell Formats option enabled, the result will be:

   09876  00000001        1/93  12-31-92  2E9
   12345  67-890    01234-5678  alpha     text string

If you want evenly aligned output, where number formats are retained and alignments are the same, do the following before you export your data:

  1. Select the cells you want to export.

  2. From the Format menu, choose Alignment.

  3. Under Horizontal, select the Left or Right option (instead of the General option).

If you then export this sample data with the Retain Cell Formats option enabled, your data will resemble the following (assuming that you choose the Left option in the previous procedure):

   09876  00000001  1/93     12-31-92  2E9
   12345  67-890    01234-5678  alpha     text string

If you do not retain cell formats, the numbers will revert to the General number format; for example, dates will be converted to their appropriate serial numbers and all entries will be left-aligned -- regardless of whether a cell contains text or numbers.

For example, if you export this sample data when the Retain Formats option is not enabled, your data will resemble the following:

   9876   1       33970          33969     2E9
   12345  67-890  01234-5678   alpha  text string

  NOTE: When cells are formatted using the Justify, Fill, or Center
  Across Selection alignment options, choosing the Export command will
  result in the error message, "Error writing to disk."

For more information about the Flat File add-in macro, see pages 151- 152 of "User's Guide 2".

IMPORTING TEXT FILES WITH THE TEXT READER ADD-IN MACRO

You can use the Text Reader (TXTRDR.XLA) add-in macro, provided with this Application Note, to import a delimited text file into Microsoft Excel version 4.0 without changing the format or appearance of any entries in the file. For example, 09876 will still appear as 09876, 12- 31-92 will still appear as 12-31-92, and so on.

When you import a file using the Text Reader add-in macro, the file itself is not opened. Instead, the data is imported and each entry is read from the source file and placed in a new worksheet. The Text Reader add-in macro reads in each text entry individually and adds an apostrophe to the beginning of the entry. For example, the Text Reader add-in macro turns 09876 into '09876. Microsoft Excel interprets any value that is preceded by an apostrophe as a text value. As a result, when the entry is placed in a cell, it is displayed as it appeared in the original text file. Once the data has been imported, you can save the new worksheet in any available file format.

  NOTE: This method of importing a text file may take longer than
  opening the file by choosing the Open command from the File menu.

To use the Text Reader add-in macro with Microsoft Excel:

  1. In File Manager, copy the TXTRDR.XLA file from the enclosed WE0801 disk to the LIBRARY directory on your hard disk (this directory should be located in the same directory where Microsoft Excel is installed).

  2. Start Microsoft Excel.

  3. From the File menu, choose Open. Change to the LIBRARY directory, select TXTRDR.XLA from the list of files, and choose OK.

To import your text file, do the following:

  1. From the File menu, choose Open Text (this command is added to the File menu by the Text Reader add-in macro).

  2. Select the text file you want to import and choose OK.

  3. Specify the appropriate column delimiter. The default delimiter is Tab; however, you also have the option to select Comma, Space, or Other (if you select Other, you can specify any keyboard character to be the delimiter).

  4. Choose OK to import the file.

Converting Text Entries to Values

When you use the Text Reader add-in macro, each imported entry is formatted in text format and placed in a cell. If some of the columns contain entries that you want to use in calculations, you must convert these text entries to values. If you attempt to perform calculations on these cells without first converting them to values, you will receive a 0 (zero) or a #VALUE! error value.

For example, if you import the following text

  |       A      |      B     |      C     |     D
--|--------------|------------|------------|--------- 1 | Product No. | Month-Year | Units Sold | Revenue --|--------------|------------|------------|---------
2 | 2e190        | 1-93       | 1000       | 5000
--|--------------|------------|------------|---------
3 | 2e190        | 2-93       | 2000       | 10000
--|--------------|------------|------------|---------
4 | 2e190        | 3-93       | 3000       | 15000

and want to calculate totals for the entries in the Units Sold and Revenue columns, do the following to convert these entries to values:

  1. In cell E1 (or any blank cell), type 1.

  2. With cell E1 selected, choose Copy from the Edit menu.

  3. Select cells C2:D4 (the range of cells you want to convert).

  4. From the Edit menu, choose Paste Special. Under Operation, select the Multiply option and choose OK.

Multiplying each text entry by the number 1 will convert it to a value. You can then apply number formats and perform calculations.

Converting Fractions to Values

The exception to using the method described in the previous section is when you have fractions that could be interpreted as dates. For example, if you import the following fractions

  |  A
--|------ 1 | 1/4 --|------ 2 | 1/3 --|------ 3 | 1/2 --|------ 4 | 3/4

you need to format a blank range of cells with a Fraction number format and then use a formula to correctly convert them. If you use the procedure outlined in the previous section (choosing the Paste Special command and selecting the Multiply option), these fractions would be converted to dates.

To convert these text-formatted fractions to values:

  1. Select cells B1:B4 (or a range of blank cells equal in size to the range you want to convert to values). This range should remain selected through step 7.

  2. From the Format menu, choose Number. Under Category, select Fraction. Under Format Codes, select # ?/?. Choose OK.

  3. Type the following formula

          =VALUE(0&" "&A1)
    

    where A1 is the first cell in the range you want to convert.

  4. Hold down the CTRL key and press ENTER. This will automatically copy the formula to all cells in the selected range, B1:B4.

  5. From the Edit menu, choose Copy.

  6. From the Edit menu, choose Paste Special. In the Paste box, select the Values option and choose OK. This converts each of the VALUE() formulas to their underlying values.

  7. From the Edit menu, choose Cut.

  8. Select cells A1:A4. From the Edit menu, choose Paste. This replaces the text fractions with values.

You can now perform calculations using the fractional values.


KBCategory: kbusage
KBSubcategory:

Additional reference words: noupd
Keywords : kbusage
Version : 4.00
Platform : WINDOWS


THE INFORMATION PROVIDED IN THE MICROSOFT KNOWLEDGE BASE IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND. MICROSOFT DISCLAIMS ALL WARRANTIES, EITHER EXPRESS OR IMPLIED, INCLUDING THE WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT SHALL MICROSOFT CORPORATION OR ITS SUPPLIERS BE LIABLE FOR ANY DAMAGES WHATSOEVER INCLUDING DIRECT, INDIRECT, INCIDENTAL, CONSEQUENTIAL, LOSS OF BUSINESS PROFITS OR SPECIAL DAMAGES, EVEN IF MICROSOFT CORPORATION OR ITS SUPPLIERS HAVE BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. SOME STATES DO NOT ALLOW THE EXCLUSION OR LIMITATION OF LIABILITY FOR CONSEQUENTIAL OR INCIDENTAL DAMAGES SO THE FOREGOING LIMITATION MAY NOT APPLY.

Last reviewed: October 6, 1997
© 1998 Microsoft Corporation. All rights reserved. Terms of Use.