DBF File Read/Write Package Philip J. Erdelsky 03-04-2013 1. Introduction This package reads a table from a .DBF file, writes a table to a .DBF file or updates records in a .DBF file with most of the generality of a database engine but without the overhead. It does not support memo fields. The source code for the package resides in the following files: dbftable.cpp dbftable.h The files are compiled and combined into the object file dbftable.obj, which must be linked to every application that calls on the package. The header file dbftable.h must be included in every module that calls on the package. The package is written in C++, with two features that may be nonstandard. A class type called "__property" is called as though it were a class member, but specified member functions are called when the member is read or written. The type called "__int64" is a 64-bit signed integer. 2. Opening and Closing a Table To read or write a table, first declare a variable of type dbf_table: dbf_table t; Then open the table by calling the member function open(): result = t.open(filespecs, mode); const char *filespecs; DBF file specifications int mode; one of the following: DBF_READ reading only DBF_WRITE create new file for writing DBF_APPEND append to existing file DBF_UPDATE update records in existing file bool result; true for a successful operation, false if the file cannot be opened in the desired mode or is not in the proper format When you are finished with the table, close it by calling the member function close(): result = t.close(); bool result; false if a table opened in the DBF_WRITE or DBF_APPEND mode encountered a file write error If this function is called for a closed table, it does nothing. The destructor automatically closes the table if it is still open. Most other member functions do nothing and return nonsense if the table is closed. The following property determines whether the table is open: result = t.is_open; bool result; true if the table is open 3. Information About an Open Table The following property gives the mode: result = t.mode; int result; DBF_READ, DBF_WRITE, DBF_APPEND or DBF_UPDATE The date when the file was last altered is contained in the header. You can retrieve this date as result = t.header_date; struct date result; The file date and time can be retrieved by two calls: dresult = t.file_date; tresult = t.file_time; struct date dresult; struct time tresult; Of course, file dates and times are meaningful only for tables in the DBF_READ, DBF_APPEND and DBF_UPDATE modes. When writing to a table opened in the DBF_WRITE or DBF_APPEND mode, a file write error can occur at any time. When this happens, the following property becomes true and further file writes are inhibited: result = t.error; bool result; true if there has been a file write error The number of records in the file is a table property that you can read but cannot change directly: n = t.number_of_records; long n; number of records (including deleted records) The number of records in a table opened in the DBF_WRITE mode is zero until records have been written to it. You can get information about the table fields by calling either of the following two member functions: pfield = t.field_info(n); pfield = t.field_info(name); int n; number of desired field, where first field is number zero const char *name; name of desired field const dbf_field_descriptor *pfield; pointer to field descriptor, or NULL if there is no such field The field descriptor has the following members: char type; field type: 'C' character 'N' numeric 'D' date 'L' logical char name[12]; field name, nul terminated int width; width of field int precision; precision of numeric field int offset; internal use only Field names are case-insensitive, and generally do not contain blanks. The width of a character field is the actual width of a field value, which is padded with blanks at the right end, if necessary. The width of a numeric field is the number of characters required to express a field value as a decimal number, with a prefixed minus sign if there are negative entries and a decimal point if the precision is nonzero. The width of a date field is 8, since it is expressed in character form as YYYYMMDD. The width of a logical field is 1. The precision of a numeric field is the number of digits to the right of the decimal point. 4. Building a New Table A table opened in the DBF_WRITE mode initially has no fields. You can add a field to it by calling the following member function: pfield = t.add_field(type, name, width, precision); char type; field type: 'C' character 'N' numeric 'D' date 'L' logical const char *name; name of field int width; width (ignored for date or logical field) int precision; precision (ignored for non-numeric field) const dbf_field_descriptor *pfield; pointer to field descriptor, or NULL if there were errors Ignored arguments may be omitted. For a Numeric field, the width must be positive and no greater than 19. The precision must be either zero or a positive value no greater than 15 and also no greater than two less than the width. These restrictions ensure that the number will fit into its field. Fields will appear in the table in the order in which they were added. You must add all fields before appending the first record. The following member function copies the field definitions from an open table to one that has been opened in the DBF_WRITE mode: t.copy_fields(u); const dbf_table &u; open table from which field definitions are to be copied dbf_table &t; table to which field definitions are to be copied; it must have been opened in the DBF_WRITE mode The function preserves the order of the fields. 5. Reading from a Table in the DBF_READ or DBF_UPDATE Mode For the purposes of this package, records are numbered starting at record 1 (not record 0), and deleted records are included. Immediately after a table is opened in the DBF_READ mode, the current record is undefined. You can change the current record by assigning to the following property: t.record_number = n; long n; number of desired record It is advisable to use a direct assignment, since changes made indirectly, such as t.record_number++, may not be handled correctly. If the record number is out of range (less than one or greater than t.number_of_records), then functions and properties dealing with the current record will do nothing and return nonsense. The following property indicates whether the current record is deleted: result = t.deleted; bool result; true if current record is deleted You can get a field value for the current record by calling one of the following member functions: result = t.get(buffer, buflen, name); character field result = t.get(buffer, buflen, pfield); character field result = t.get(nvalue, name); numeric field result = t.get(dvalue, name); numeric field result = t.get(ivalue, name); numeric field result = t.get(nvalue, pfield); numeric field result = t.get(dvalue, pfield); numeric field result = t.get(ivalue, pfield); numeric field result = t.get(bvalue, name); logical field result = t.get(bvalue, pfield); logical field result = t.get(dvalue, name, defdv); date field result = t.get(dvalue, pfield, defdv); date field bool result; true if the field was successfully read; false if the table was not open, the current record number was out of range, the field did not exist or was not of the proper type const dbf_field_descriptor *pfield; pointer to field descriptor (returned by t.field(n) or t.field(name)) const char *name; name of field char *buffer; buffer for character field value int buflen; size of buffer __int64 &nvalue; value of a numeric field double &dvalue; value of a numeric field int &ivalue; value of a numeric field bool &bvalue; value of a logical field struct date &dvalue; value of a date field struct date defdv; date to be used if a date field is blank The __int64 returned value for a numeric field is its value with the decimal point removed. The limitations of the DBF file format ensure that this is always a 64-bit signed integer. The floating-point value is (double) nvalue / pow10(pfield->precision), which is the value returned when the returned value for a numeric field is of type double. The value returned for a numeric field to a variable of type "int" is its value truncated to an integer value. The value returned for a character field is always of pfield->width characters wide, and it is nul-terminated. Hence buflen should be at least pfield->width+1. If the buffer is too small, the value will be truncated to fit, and it will still be nul-terminated. A numeric, date or logical field may be read as though it were a character field. The returned value is the representation used in the .DBF file. The following functions can be used to determine whether a field is blank, which is the usual default or "null" value for numeric and date fields: result = t.isblank(name); result = t.isblank(pfield); bool result; true if the field is blank const dbf_field_descriptor *pfield; pointer to field descriptor (returned by t.field(n) or t.field(name)) const char *name; name of field The functions containing pfield as a parameter will run somewhat faster than the others. 6. Writing to a Table in the DBF_WRITE, DBF_APPEND or DBF_UPDATE Mode You may start appending records to a table in the DBF_APPEND mode as soon as it is successfully opened. For a table in the DBF_WRITE mode, you must add all required fields to it before you start appending records to it. To create a record, first fill the fields in it by calling the following member functions. You need not call them in any particular order. Fields left unspecified will be filled with default values. result = t.put(buffer, name); character field result = t.put(buffer, pfield); character field result = t.put(nvalue, name); numeric field result = t.put(dvalue, name); numeric field result = t.put(ivalue, name); numeric field result = t.put(nvalue, pfield); numeric field result = t.put(dvalue, pfield); numeric field result = t.put(ivalue, pfield); numeric field result = t.put(bvalue, name); logical field result = t.put(bvalue, pfield); logical field result = t.put(dvalue, name); date field result = t.put(dvalue, pfield); date field bool result; true if the field was successfully written into the record; false if the table was not open, was out of range, the field did not exist or was not of the proper type, or the data was invalid or did not fit into the field const dbf_field_descriptor *pfield; pointer to field descriptor (returned by t.field(n) or t.field(name)) const char *name; name of field const char *buffer; buffer for character field value, nul terminated __int64 nvalue; value of a numeric field double nvalue; value of a numeric field int ivalue; value of a numeric field bool bvalue; value of a logical field struct date dvalue; value of a date field The field name is case-insensitive; any lowercase letters in it will be converted to uppercase. A numeric, logical or date field may be written as though it were a character field. However, the data being written must conform to dBase conventions for the kind of field being written. When a character field is written, or when a field of another type is written as a character field, the data are truncated or padded with blanks at the right end, if necessary, to make the data fit. A numeric or date field may be cleared to its default value by writing the empty string ("") to it. The formats of field values are consistent with those used by the get() functions. To mark a record as deleted, you can assign to the property deleted: t.deleted = true; If a record is deleted, you can undelete it in the same way: t.deleted = false; After assembling the record, you can append the record to the table by calling the following function: result = t.append(); bool result; false if the write failed (usually for lack of disk space) Default values are as follows: type default value --------- ------------- character blanks numeric blanks date blanks logical false By default, a record is not deleted. If the table was opened in the DBF_UPDATE mode, you can update the most recently read record by calling put() functions to make any necessary changes and then calling the following function: t.update(); Fields left unspecified will retain their previous values. 7. Special Functions and Operations If two tables have the same field definitions, the following operator copies the contents of the current record from one to the other: t << u; const dbf_table &u; table from which record is to be copied dbf_table &t; table to which record is to be copied CAUTION: This function was designed for speed, not safety. It does no checking and may wreak havoc if used improperly. Of course, t.append() or t.update() member function must be called to write the changed values to the table t. The following function scans a character string for a signed integer with the specified precision and returns a 64-bit integer containing its value without the decimal point: result = dbf_table::scan(n, buffer, precision); const char *buffer; buffer containing nul-terminated string __int64 &n; scanned integer int precision; number of digits to the right of the decimal point bool result; false if no number was present Leading and trailing spaces and embedded commas are ignored. If the number of digits to the right of the decimal point exceeds the specified precision, the excess digits are ignored. Some examples: call value returned in n --------------------------------- ------------------- dbf_table::scan(n, "12.3456", 2); 1234 dbf_table::scan(n, "12.3456", 6); 12345600 The following function reverses the action of dbf_table::scan(). It takes a 64-bit signed integer and edits it into a string with the specified number of digits to the right of the decimal point. width = dbf_table::edit(buffer, n, precision); const char *buffer; buffer to receive nul-terminated result, it must be at least 22 bytes long to handle the worst possible case __int64 &n; integer to be edited int precision; number of digits to the right of the decimal point int width; width of edited result, including sign and decimal point if the precision is nonzero The following member function removes all deleted records from a file: result = t.pack(filespecs); const char *filespecs; specifications of file to receive the packed result long result; for a successful operation, the number of records in the packed table; negative if the file cannot be opened or is not in the proper format The following command-line application removes all deleted records from a table: dbfpack The input and output files must be different, and the output file, if it already exists, will be overwritten. The following member function converts the current record in a table to a single line in comma-delimited format: result = t.delimited(buffer, buflen); const dbf_table t; current table char *buffer; buffer for comma-delimited format int buflen; size of buffer bool result; false if the table was not open, was not opened in the DBF_READ or DBF_UPDATE format, the buffer was not large enough, or the record number was out of range; true otherwise Each field is padded with spaces, if necessary, so it occupies the same space, regardless of its contents. Hence fields will be aligned when lines for different records are adjacent. An empty field is represented by two commas, spaced out in this manner; or it is simply omitted if it is at the end of the line. 8. Sorting a Table The following function sorts a table which has been opened in the DFB_READ mode: result = t.sort(filespecs, keys); const char *filespecs; file specifications for table to receive sorted result bool result; true for a successful operation, false records in the sorted table; negative if the file cannot be opened or is not in the proper format, or if the field list contains an unrecognizable field name or an invalid option const char *keys; space-delimited list of key field names; the maximum number of keys is 10 (If more than 10 are specified, only the first 10 will be used in the sort.) If the sorting order for a key is to be descending or case-insensitive, or both, its name must be followed by /D, /I or /DI, respectively. The usual (ascending) order is the alphabetical, numerical or chronological order, according to the key type. It also puts false values before true values for logical fields and blank dates before valid dates. A blank numeric key is sorted as though it were zero. The following command-line application can be used to sort a table: dbfsort key1 key2 ... keyn The input and output files must be different, and the output file, if it already exists, will be overwritten. The sort is stable, i.e., the order of records with equal sort keys is preserved. 9. Creating a Table with Specified Fields The following command-line application creates an empty table with specified fields: dbfcreate The structure file is a text file with one line for each field, containing the type, name, width and precision, respectively, of the field. The type must be C, N, D or L, as defined for the add_field() function, and the width and precision are expressed in decimal. The width is not required for fields of type D and L, and the precision is required only for fields of type N. The items on a line are separated by one or more spaces or tabs. A semicolon (;) and anything to the right are treated as a comment. A line containing only spaces, tabs and a comment will be ignored. Small letters are considered equivalent to the corresponding capital letters. Here is a typical structure file: ; This is my table structure D DATE C NAME 15 N PRICE 8 2 C SIZE 1 ; must be S, M, L or X L LISTED The following command-line application creates an empty table with the same fields as an existing table: dbfcopystruct
10. Listing Table Contents in Text Form The following command-line application lists the contents of a table in comma-delimited text form: dbflist The result is sent to the standard output, and may be redirected to a printer or file. The result is in comma-delimited form, with spacing added as required so fields will be aligned vertically. The format for dates is YYYYMMDD. Quotation marks and control characters in character fields are replaced with spaces. Character fields are enclosed in quotation marks. The output is compatible with the dBase APPEND command. The following command-line application shows the structure of a table: dbfstruct The structure includes the filespecs, the number of records, the date of most recent change, and a list of the names, types, widths and precisions of the fields. The result is sent to the standard output, and may be redirected to a printer or file. 11. Editing a Table The following Windows application can be used to edit the records of a table: editdbf It can be called up from an appropriately configured shortcut, or the file can be dragged and dropped onto a shortcut icon. The application cannot be used to add or remove records, and it will behave anomalously when asked to edit a table with no records. The application cannot be used with a table containing more than 25 fields. When the application is invoked, it displays the first record of the table. Buttons are available to move to the adjacent records, or to the last record. The desired record number can also be typed into a box near the top of the window. Each character and numeric field is displayed in an edit box. Standard Windows techniques can be use to make changes. Each date field is displayed in an edit box, accompanied by buttons to insert today's date, or to change to the next or previous date. The date can also be typed into the edit box in any of several reasonable formats in which the month, day and year appear in that order. The date will automatically be edited to a standard form when the edit box loses the input focus. The day of the week is shown automatically. Dates more than about a century ago or hence may not be processed correctly. Each numeric field is automatically edited to a standard form with the precision required for the field when the edit box loses the input focus. Each logical field is represented by a checkbox, which is checked if the value is TRUE and unchecked if the value is FALSE. Date and numeric fields have blank values, which can be entered as such in their respective edit boxes. When changes have been made, THE UPDATE BUTTON MUST BE PRESSED to preserve the changes before the application is closed or another record is invoked. If this is not done, the changes will be lost! When the UPDATE button is pressed (or the equivalent Alt-U key combination is pressed), the changes are written to the table, and the next following record is brought up. This facilitates editing records in sequence. When the Alt-U key combination is used, it also keeps the input focus on the same field. This is especially helpful when changes are being made to the same field in consecutive records. This application will not add or remove records, but one or more records can be marked as deleted. The dbfpack application can then be used to remove the deleted records. The following command-line application appends a single blank record to a table: dbfappendblank The editdbf application can be used to edit the new record. 12. Replacing Zeros with Blanks in Numeric Fields Most database programs will interpret a blank numeric field as a zero, as this package will do if it is called upon to read such a field into a numeric variable. However, some programs go so far as to REPLACE such fields by zeros when performing certain operations, such as sorting. The following command-line application undoes this damage by replacing every zero numeric field by a blank: dbfzero2blank 13. Conversion to HTML Format The following command-line application converts a DBF table to an HTML table: dbf2html The result is sent to the standard output. It may be redirected to a file or printer.