Setting Up the Data Load with the Load Specification Script
The Record Specification
The next section (called the record specification or RECORD section) describes the format of the input records and assigns each field a label and a length, expressed as a number of characters (bytes). The field label can be any string that uniquely identifies the field within the script; it does not need to match the name of the target column in the database table. Variable-length fields are designated by an asterisk ( * ), which must be followed by a delimiter character. Every field in the input data must be described in the record specification, and every record in the input file must have the same format.
The following example illustrates the general form of the record specification. The keyword RECORD is followed by the actual record description, which is enclosed within braces ( { } ).
RECORD
{
field1 8
field2 *\n
}
This example describes an input record consisting of two fields, the first having a length of eight characters, the second being of variable length. The last field description also indicates the end-of-record delimiter in the source data by including the ASCII new-line character (\n).
Note: The end-of-record delimiter is \n on UNIX systems. In Windows, the end of record is indicated by the carriage return and new-line characters together: \r\n. For the sake of simplicity, the rest of the examples in this chapter use the UNIX end-of-record indicator only. In practice, if any data files are created in Windows, the \r\n characters should be used when describing the data in the specification file.
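To make the effect of this record layout concrete, the following Python sketch mimics how a record described as `field1 8` / `field2 *\n` would be divided. It is only an illustration of the semantics described above, not the loader's implementation; the function name `split_record` and its parameters are hypothetical.

```python
def split_record(record: str, fixed_len: int = 8, end_of_record: str = "\n"):
    """Split one raw record into (field1, field2).

    Hypothetical helper: field1 is a fixed-length field of `fixed_len`
    characters; field2 is variable-length and runs up to the
    end-of-record delimiter.
    """
    if not record.endswith(end_of_record):
        raise ValueError("record is missing its end-of-record delimiter")
    # Drop the delimiter, then cut the fixed-length field off the front.
    body = record[: -len(end_of_record)]
    if len(body) < fixed_len:
        raise ValueError("record is shorter than the fixed-length field")
    return body[:fixed_len], body[fixed_len:]


# A UNIX-style record and the same record as written on Windows.
unix_fields = split_record("ABCDEFGHvariable part\n")
windows_fields = split_record("ABCDEFGHvariable part\r\n", end_of_record="\r\n")

# Describing a Windows file with \n alone leaves a stray \r in field2.
stray = split_record("ABCDEFGHx\r\n")
```

The last call shows why the delimiter in the specification should match the platform that produced the file: with `\n` alone, the carriage return ends up inside the variable-length field.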
Mapping Fixed-Length Character Fields
The format of fixed-length character fields can be specified in an alternative way, by mapping fields in terms of their position within the record and their length. The basic method described above must be used when describing variable-length or numeric fields.
The alternative field specification syntax has the following form:
[ <delimiter> ] ( starting-position : field-length ) [ <delimiter> ]
where starting-position and field-length are integers.
In this syntax, starting-position indicates the beginning of the field, measured in characters from the beginning (or leftmost position) of the record. As the example below shows, the first character of the record is position 1. The field-length parameter must be separated from the starting-position entry by a colon ( : ), and the two parameters must be enclosed in parentheses.
Example
The following format specifications define a record composed of two consecutive fixed-length fields, each 15 characters long. The fields are not separated by a delimiter, but the end of the record is signaled by the new-line character (\n).
field1 (1:15)
field2 (16:15)\n
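As a rough illustration of what the (1:15) and (16:15) entries select, the Python sketch below slices a sample record by position and length. It assumes positions are counted from 1, as the example above implies; the helper `slice_field` and the sample data are hypothetical and do not represent the loader's own code.

```python
def slice_field(record: str, start: int, length: int) -> str:
    """Return the field beginning at 1-based position `start`.

    Hypothetical helper mirroring the (starting-position : field-length)
    notation; raises if the record is too short to hold the field.
    """
    if start < 1 or length < 0:
        raise ValueError("start must be >= 1 and length must be >= 0")
    begin = start - 1  # convert the 1-based position to a 0-based index
    field = record[begin : begin + length]
    if len(field) != length:
        raise ValueError("record is too short for the requested field")
    return field


# Two consecutive 15-character fields followed by the new-line delimiter.
record = "Smith".ljust(15) + "Anna".ljust(15) + "\n"

field1 = slice_field(record, 1, 15)   # corresponds to: field1 (1:15)
field2 = slice_field(record, 16, 15)  # corresponds to: field2 (16:15)\n
```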
Remapping Fields
A field-format entry in the record specification can also define the remapping of a fixed-length, character-type input field onto one or more subsequent fields. That is, one field, or a part of it, can be used to create another field (in effect creating "subfields"). Field remapping is defined using the following syntax:
( reference-field-name : position : length )
where:
- reference-field-name is the label of the source field to be parsed
- position specifies the starting position of the subfield within the reference field
- length defines the length of the subfield (in characters).
These elements must be separated by colons ( : ), and the entire entry must be enclosed in parentheses. Because the presence of field delimiters would interfere with the interpretation of the remapping syntax, reference fields cannot contain delimiter symbols.
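The remapping idea can be pictured with a short Python sketch that derives subfields from an already parsed reference field. The field names (`phone`, `area_code`, `exchange`), the helper `remap`, and the assumption that the position is counted from 1 (by analogy with the positional syntax) are all illustrative and are not taken from the product.

```python
def remap(fields: dict, reference: str, position: int, length: int) -> str:
    """Return a subfield of fields[reference].

    Hypothetical helper mirroring ( reference-field-name : position : length ).
    `position` is assumed to be 1-based.
    """
    if position < 1 or length < 0:
        raise ValueError("position must be >= 1 and length must be >= 0")
    source = fields[reference]
    begin = position - 1
    subfield = source[begin : begin + length]
    if len(subfield) != length:
        raise ValueError("subfield extends past the end of the reference field")
    return subfield


# A fixed-length, character-type reference field with no delimiters inside it.
fields = {"phone": "6175551234"}

# Two subfields created from parts of the reference field.
fields["area_code"] = remap(fields, "phone", 1, 3)
fields["exchange"] = remap(fields, "phone", 4, 3)
```

The reference field stays available as a field in its own right; the subfields are additional fields built from parts of it.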