|
MICROSOFT EXCEL FILE FORMAT(BIFF2.1)(1)
MICROSOFT Excel FILE FORMATXML:namespace prefix = o ns = "urn:schemas-microsoft-com:office:office" /> Microsoft Excel is a popular spreadsheet. It uses a file format called BIFF (Binary File Format). There are many types of BIFF records. Each has a 4 byte header. The first two bytes are an opcode that specifies the record type. The second two bytes specify record length. Header values are stored in byte-reversed form (less significant byte first). The rest of the record is the data itself (Figure 2-1). Figure 2-1. BIFF record header. Record Header Record Body Byte Number 0 1 2 3 0 1 ... ----------------------------------- Record Contents XX XX XX XX XX XX ... ----------------------------------- opcode length data Each X represents a hexadecimal digit Two X's form a byte. The least significant (low) byte of the opcode is byte 0 and the most significant (high) byte is byte 1. Similarly, the low byte of the record length field is byte 2 and the high byte is byte 3. BOF (Beginning of File) The first record in every spreadsheet is always of the BOF type (Figure 2-2). Figure 2-2. BOF record. Record Header Record Body Byte 0 1 2 3 0 1 2 3 ----------------------------------------- Contents 09 00 04 00 02 00 10 00 ----------------------------------------- opcode length version file number type The first two bytes, arranged with the low byte first, show that the opcode for BOF is 09h. The second two bytes indicate that the record body is 4 bytes long. The first two bytes of the body are the version number (2 for the initial version of Excel). The last two bytes are the file type. Type 10h is a worksheet file. Relating Spreadsheet Cells to Record Data Bytes A spreadsheet appears on a screen or printout as a matrix of rectangular cells. Each column is identified by a letter at its top, and each row is identified by a number. Thus cell A1 is in the first column and the first row. Cell C240 is in the third column and the 240th row. This scheme identifies cells in a way easily understood by people. However, it is not particularly convenient for computers, as they do not handle letters efficiently. They are best at dealing with binary numbers. Thus, Excel stores cell identifiers as binary numbers, that people can read as hexadecimal. The first number in the system is 0 rather than 1. Figure 2-3, which shows the form of an INTEGER record, illustrates the storage of column and row information. Figure 2-3. INTEGER record. Record Header Record Body Byte 0 1 2 3 0 1 2 3 4 5 6 7 8 ------------------------------------------------------------------ Value 02 00 09 00 00 00 02 00 00 00 00 39 00 ------------------------------------------------------------------ opcode length row column rgbAttr w Opcode 2 indicates an integer record. The length bytes show that the record body is 9 bytes long. Row CSC="0" NumberType="1" Negative="False" HasSpace="True" SourceValue="0" UnitName="in">0 in the body corresponds to spreadsheet row 1. Row 1 corresponds to spreadsheet row 2, and so on. Column 2 corresponds to spreadsheet column C. Thus, Figure 2-3 deals with cell C1. The next three bytes, labeled "rgbAttr," specify cell attributes (Table 2-3). The final pair of bytes, (labeled "w") holds the integer's value. Here it is 39H or 57 decimal. Thus the record specifies that cell C1 of the spreadsheet contains an integer with the value 57. Standard File Record Order Excel worksheet files have each record type in a predetermined position. A file need not have all types, but the ones that are present are always be in the same order. Table 2-1 lists the record types for Excel document (spreadsheet) files, in the order they would appear in a BIFF file. Table 2-2 lists the types in opcode order. Several record types in a BIFF file, namely, ROW, BLANK, INTEGER, NUMBER, LABEL, BOOLERR, FORMULA, and COLUMN DEFAULT, describe the contents of a cell. These records contain a 3 byte attribute field labeled "rgbAttr". The following table describes how the bits in the field correspond to cell attributes. Table 2-1. Cell Attributes Byte Offset Bit Description Contents 0 7 Cell is not hidden 0b Cell is hidden 1b 6 Cell is not locked 0b Cell is locked 1b 5-0 Reserved, must be 0 000000b 7-6 Font number (4 possible) 5-0 Cell format code 2 7 Cell is not shaded 0b Cell is shaded 1b 6 Cell has no bottom border 0b Cell has a bottom border 1b 5 Cell has no top border 0b Cell has a top border 1b 4 Cell has no right border 0b Cell has a right border 1b 3 Cell has no left border 0b Cell has a left border 1b 2-0 Cell alignment code general 000b left 001b center 010b right 011b fill 100b Multiplan default align. 111b The font number field is a zero-based index into the document's table of fonts. the cell format code is a zero-based index into the document's table of picture formats. There are 21 different standard formats. Additional custom formats may be defined by the user. See the FONT and FORMAT record descriptions form additonal details. Table 2-2. Excel Record Type in Order of Appearance Record Type Opcode (Hexadecimal) BOF 09 FILEPASS 2F INDEX 0B CALCCOUNT 0C CALCMODE 0D PRECISION 0E REFMODE 0F DELTA 10 ITERATION 11 1904 22 BACKUP 40 PRINT ROW HEADERS 2A PRINT GRIDLINES 2B HORIZONTAL PAGE BREAKS 1B VERTICAL PAGE BREAKS 1A DEFAULT ROW HEIGHT 25 FONT 31 FONT2 32 HEADER 14 FOOTER 15 LEFT MARGIN 26 RIGHT MARGIN 27 TOP MARGIN 28 BOTTOM MARGIN 29 COLWIDTH 24 EXTERNCOUNT 16 EXTERNSHEET 17 EXTERNNAME 23 FORMATCOUNT 1F FORMAT 1E NAME 18 DIMENSIONS 00 COLUMN DEFAULT 20 ROW 08 BLANK 01 INTEGER 02 NUMBER 03 LABEL 04 BOOLERR 05 FORMULA 06 ARRAY 21 CONTINUE 3C STRING 07 TABLE 36 TABLE2 37 PROTECT 12 WINDOW PROTECT 19 PASSWord 13 NOTE 1C WINDOW1 3D WINDOW2 3E PANE 41 SELECTION 1D EOF 0A Table 2-3. Excel Record Types in Opcode Order Record Type Opcode (hexadecimal) DIMENSIONS 00 BLANK 01 INTEGER 02 NUMBER 03 LABEL 04 BOOLERR 05 FORMULA 06 STRING 07 ROW 08 BOF 09 EOF 0A INDEX 0B CALCCOUNT 0C CALCMODE 0D PRECISION 0E REFMODE 0F DELTA 10 ITERATION 11 PROTECT 12 PASSWORD 13 HEADER 14 FOOTER 15 EXTERNCOUNT 16 EXTERNSHEET 17 NAME 18 WINDOW PROTECT 19 VERTICAL PAGE BREAKS 1A HORIZONTAL PAGE BREAKS 1B NOTE 1C SELECTION 1D FORMAT 1E FORMATCOUNT 1F COLUMN DEFAULT 20 ARRAY 21 1904 22 EXTERNNAME 23 COLWIDTH 24 DEFAULT ROW HEIGHT 25 LEFT MARGIN 26 RIGHT MARGIN 27 TOP MARGIN 28 BOTTOM MARGIN 29 PRINT ROW HEADERS 2A PRINT GRIDLINES 2B FILEPASS 2F FONT 31 FONT2 32 TABLE 36 TABLE2 37 CONTINUE 3C WINDOW1 3D WINDOW2 3E BACKUP 40 PANE 41 Worksheet Record Types in Opcode Order The following section lists all record types in opcode order. It gives a specification and byte-by-byte breakdown of each type. Note that Excel terminology refers to spreadsheets or worksheets as "documents." DIMENSIONS 00h 0d Record Type: DIMENSIONS Description: Entire dimensions or range of a spreadsheet Record Body Length: 8 bytes Record Body Byte StrUCture: Byte Number Byte Description Contents (hex) 0-1 First row 2-3 Last row plus 1 4-5 First column 6-7 Last column plus 1 Note: The last row and column in the record are both one greater than the highest numbered occupied ones. BLANK 01h 1d Record Type: BLANK Description: Cell with no formula or value Record Body Length: 7 bytes Record Body Byte Structure: Byte Number Byte Description Contents (hex) 0-1 Row 2-3 Column 4-6 Cell attributes (rgbAttr) (Table 2-3) INTEGER 02h 2d Record Type: INTEGER Description: Constant unsigned integer Record Body Length: 9 bytes Record Body Byte Structure: Byte Number Byte Description Contents (hex) 0-1 Row 2-3 Column 4-6 Cell attributes (rgbAttr) (Table 2-3) 7-8 Unsigned integer value (w) NUMBER 03h 3d Record Type: NUMBER Description: Constant floating point number Record Body Length: 15 bytes Record Body Byte Structure: Byte Number Byte Description Contents (hex) 0-1 Row 2-3 Column 4-6 Cell attributes (rgbAttr) (Table 2-3) 7-14 Floating point number value (IEEE format, see Appendix A) LABEL 04h 4d Record Type: LABEL Description: Constant string Record Body Length: 8 to 263 bytes Record Body Byte Structure: Byte Number Byte Description Contents (hex) 0-1 Row 2-3 Column 4-6 Cell attributes (rgbAttr) (Table 2-3) 7 Length of string 8-263 ASCII string, 0 to 255 bytes long BOOLERR 05h 5d Record Type: BOOLERR Description: Boolean constant or error value Record Body Length: 9 bytes Record Body Byte Structure: Byte Number Byte Description Contents (hex) 0-1 Row 2-3 Column 4-6 Cell attributes (rgbAttr) (Table 2-3) 7 Boolean or error value Boolean true 1 false 0 Error #NULL! 0 #DIV/0! 7 #VALUE! 0Fh #REF! 17h #NAME? 1Dh #NUM! 24h #N/A 2Ah 8 Specifies Boolean or error Boolean 0 Error 1 FORMULA 06h 6d Record Type: FORMULA Description: Name, size, and contents of a formula cell Record Body Length: 17-272 bytes Record Body Byte Structure: Byte Number Byte Description Contents (hex) 0-1 Row 2-3 Column 4-6 Cell attributes (rgbAttr) (see Table 2-3) 7 Current value of formula (IEEE format, see Appendix A) 15 Recalc flag 16 Length of parsed eXPression 17 Parsed expression If a formula must be recalculated whenever it is loaded, the recalc flag (byte 15) must be set. Any nonzero value is a set recalc flag. However, a flag value of 3 indicates that the cell is a part of a matrix, and the entire matrix must be recalculated. Bytes 7 through 14 may contain a number, a Boolean value, an error code, or a string. The following tables apply. Case 1: Bytes 7 - 14 contain a Boolean value. Byte Number Byte Description Contents (hex) 7 otBool 1 8 Reserved 0 9 Boolean value 10-12 Reserved 0 13-14 fExprO FFFFh Case 2: Bytes 7 - 14 contain an error code. Byte Number Byte Description Contents (hex) 7 otErr 2 8 Reserved 0 9 error code 10-12 Reserved 0 13-14 fExprO FFFFh Case 3: Bytes 7 - 14 contain a string. Byte Number Byte Description Contents (hex) 7 otString 0 8-12 Reserved 0 13-14 fExprO FFFFh The string value itself is not stored in the field, but rather in a separate record of the STRING type. STRING 07h 7d Record Type: STRING Description: Value of a string in a formula Record Body Length: variable Record Body Byte Structure: Byte Number Byte Description Contents (hex) 0 Length of the string 1-256 (max) The string itself The STRING record appears immediately after the FORMULA record that evaluates to the string, unless the formula is in an array. In that case, the string record immediately follows the ARRAY record. ROW 08h 8d Record Type: ROW Description: Specifies a spreadsheet row Record Body Length: 16 bytes Record Body Byte Structure: Byte Number Byte Description Contents (hex) 0-1 Row number 2-3 First defined column in the row 4-5 Last defined column in the row plus 1 6-7 Row height 8-9 RESERVED 0 10 Default cell attributes byte Default attributes 1 Not default attributes 0 11-12 Offset to cell records for this row 13-15 Cell attributes (rgbAttr) (see Table 2-3) BOF 09h 9d Record Type: BOF Description: Beginning of file Record Body Length: 4 bytes Record Body Byte Structure: Byte Number Byte Description Contents (hex) 0-1 Version number Excel 2 Multiplan 3 2-3 Document type worksheet 10h chart 20h macro sheet 40h If bit 8 of the version number byte pair is high (mask with 0100h to find out), the BIFF file is a Multiplan document. EOF 0Ah 10d Record Type: EOF Description: End of file Record Body Length: 0 bytes The EOF record is the last one in a BIFF file. It always takes the form 0A000000h. INDEX 0Bh 11d Record Type: INDEX Description: Contains pointers to other records in the BIFF file, and defines the range of rows used by the document. It is used to simplify searching a file for a particular cell or name. Record Body Length: variable Record Body Byte Structure: Byte Number Byte Description Contents (hex) 0-3 Absolute file position of first NAME record 4-5 First row that exists 6-7 Last row that exists plus 1 8-on Array of absolute file positions of the blocks of ROW records. The INDEX record is optional. If present, it must immediately follow the FILEPASS record. IF there is no FILEPASS record, it must follow the BOF record. CALCCOUNT 0Ch 12d Record Type: CALCCOUNT Description: Specifies the iteration count Record Body Length: 2 Record Body Byte Structure: Byte Number Byte Description Contents (hex) 0-1 Iteration Count CALCMODE 0Dh 13d Record Type: CALCMODE Description: Specifies the calculation mode Record Body Length: 2 Record Body Byte Structure: Byte Number Byte Description Contents (hex) 0-1 Calculation mode Manual 0 Automatic 1 Automat |