SAND CDBMS Tools Guide - SAND Data Loader (ndlm) - Creating the Import Specification Script

SAND CDBMS Tools Reference Guide
Data Loader (ndlm)

Previous Topic:
Importing Data with ndlm

Chapter Index

Next Topic:
Import Record Specification

Creating the Import Specification Script

The import specification script contained in the load specification file instructs ndlm to perform an import operation, specifies the data source and the arrangement of data therein, identifies the database table which will receive the data, and indicates the correspondence between fields in the input file and the columns in the table. It also specifies any data filtering or manipulation operations to be performed before insertion into the database table. The import specification file is a standard ASCII text file that can be created using most standard text editors or word processors.
ndlm supports the processing of multiple input files in one import operation. The formatting of data in each input file must be identical.
An import specification script has the following structure:
IMPORT @input-file-name [@input-file-name1 ... @input-file-nameN]
{

[ SCTFILE:@<SCT output file> ]
[ DDLFILE:@<DDL file> ]

[ LOAD:<SCT load option> ]

[ SKIPHEADER ('line-delimiter', number-of-rows ) ]

[ <options> ]

[ LIBRARY @user-library { <function declarations> } ... ]

[ INIT
{
<initialization statements>
} ]

[ TYPE
{
<data type declarations>
} ]

[ VAR
{
<variable declarations>
} ]

RECORD
{
<record specification>
}

[ SKIPIF
{
[ {<logical expression> @skipfile-name}... ]
<logical expression> [ @skipfile-name ]
} ]

[schema-name.]table-name
{
<map specification>
}

}

Each element of the import specification script is described below.
Note:
Comment lines (beginning with // and ending with a line feed, or surrounded by /*... */ ) can be included anywhere in the load specification script, except in the PIC formatting string (see below).
By default, the field, column, and table names in the script will be read in a case-insensitive manner, and so can be specified in any combination of upper- and lowercase letters; to enable case-sensitivity, include the -q option flag in the ndlm invocation.

Specification Script Elements

IMPORT

Instructs ndlm to perform an import operation. Make sure a space character separates the IMPORT keyword and the @ symbol.

@Input-file-name

Identifies the ASCII file containing the data to be imported. This file name cannot be longer than 260 characters and must be prefixed by the "at" symbol ( @ ). Include a full path specification if the file is not located in the current directory. Note that if any referenced file or path contains a blank space, it must be enclosed in double quotation marks; that is, "<path with spaces>".
Multiple flat files (up to a maximum of 16,384) can be specified for a single import operation. The maximum number of records that can be loaded in a single import session, across all specified flat files, is 2,147,483,648 (2³¹), which is also the maximum number of unique values that can be loaded into a single domain during the import. The formatting of data in each file must be identical. Each file name is preceded by the "at" sign ( @ ), and separated from the other file specifications by one or more spaces, or by a line feed character. Note that when a specified input file is not found by ndlm, processing continues with the next specified file that is present.
Wildcard characters may also be used in the input file name specification: '?' represents a single variable character in the file name; '*' represents zero or more variable characters in the file name. All data files with names corresponding to the file name mask will be used for the import (up to the maximum of 16,384 files). For example, including the following at the beginning of the specification file:

IMPORT @*sample?.???

instructs ndlm to load data from all files in the execution directory that have in their name the substring 'sample' followed by any single character, a dot (.), and a three character file extension.
This file name mask will load data from these files:

mysample1.dat

samples.txt

2002sampled.000

but not from these files:

sample.txt

mysample5.data

sample2002.dat

A note about 8.3 file names and wildcards:
Windows server 8.3 file name creation should be disabled if wildcards are used to reference files in loader scripts. When 8.3 file name creation is turned on, the "?" wildcard character is treated as a "*" by the operating system. To avoid unexpected loader results from this Windows feature, disable 8.3 file name creation through the Windows registry. This can be accomplished by changing the value of the following registry entry to 1 (from the default 0).

HKLM\system\CurrentControlSet\Control\FileSystem\NtfsDisable8dot3NameCreation

Note that files created on or transferred to the Windows server while 8.3 file name creation was enabled are still susceptible to this wildcard issue, even after 8.3 file name creation is turned off. One way to correct this problem is to rename each relevant file that has an 8.3 file name, then change each file back to its original name.

{ }

Braces group components of the specification script. They must be placed in the relative positions indicated in the import specification structure outline. In the outline, braces are placed on separate lines to make the script more readable: tabs, spaces, and line feed characters are ignored by ndlm when reading the script, and so may be included to make the file easier to read.

SCTFILE
The presence of the SCTFILE parameter instructs ndlm to load the data described by the specification script into an SCT File, instead of, or in addition to, a standard database. The path to and name of the resulting SCT File is the value of the SCTFILE parameter. The path and file name cannot be longer than 260 characters and must be prefixed by the "at"symbol ( @ ). The path can be omitted if the SCT File will be located in the current directory.
Note that a file name or path containing spaces must be enclosed in double quotation marks.

DDLFILE
The DDLFILE parameter specifies the location of a text file containing SQL DDL commands that fully describe the destination table. When this optional parameter is included (pointing to a valid DDL file), a running SAND database is not required to create an SCT file.

The file referenced by DDLFILE can contain any number of CREATE TABLE and CREATE DOMAIN statements, not necessarily related only to one load specification script, but to any number of scripts. When the load operation is started, ndlm will look for the first CREATE TABLE statement that corresponds to the table specified in the load script. If that CREATE TABLE statement references a custom domain, ndlm will also look for the appropriate CREATE DOMAIN statement. If all of the required SQL statements are found in the DDL file, ndlm has sufficient information to create the SCT file.

The path and file name cannot be longer than 260 characters and must be prefixed by the "at"symbol ( @ ). A file name or path containing spaces must be enclosed in double quotation marks.

LOAD
The LOAD setting determines whether the data involved in an SCT File creation operation will also be loaded into the specified database. Setting LOAD to TRUE, YES, or 1 turns on simultaneous loading, so that the data will be written to both SCT File and database. Setting LOAD to any other value is equivalent to setting it to FALSE, which means that the data will be loaded only into the SCT File. If the LOAD parameter is empty or absent, the LOAD value defaults to TRUE.

SKIPHEADER ( 'line-delimiter', n )
The SKIPHEADER directive instructs ndlm to skip the first n rows of an input file, where the end of each row is identified by a user-specified delimiter ('line-delimiter'). The row delimiter consists of one or more characters, and must be contained in single quotation marks, for example, '\r\n'.
Note that the following restrictions apply to this option:

SKIPHEADER can be used only when there is a single input file

SKIPHEADER and the IGNORE option are mutually exclusive: they cannot both appear in the same script

If the -j option is used in the ndlm invocation, the -j value must not be greater than 1.

options

ndlm processing options may be specified before the RECORD section of the script. In the event of a conflict between a flag entered in the ndlm invocation and an option in the specification file, the flag in the invocation takes precedence. The available options are as follows (for each, n represents an integer):
IGNORE n
Instructs ndlm to ignore the first n bytes in an input file. This processing flag performs the same function as (and may be overridden by) the -i invocation flag.
SKIP n
Instructs ndlm to skip the first n records of an input file. This processing flag performs the same function as (and may be overridden by) the -s invocation flag.

PROCESS n
Instructs ndlm to process (that is, import) only n records of the input file. This processing flag performs the same function as (and may be overridden by) the -p invocation flag. Refer to the -p description for information about how the -j invocation flag interacts with the PROCESS option.

MAXERR n
Instructs ndlm to allow n errors before stopping the import. This processing flag performs the same function as (and may be overridden by) the -m invocation flag.

Note:
ndlm sends data to the server in "chunks" rather than as single records, making it possible for the MAXERR value to be exceeded. Ndlm reports all errors in a given chunk before testing whether the MAXERR value has been exceeded.

ERRINC n
When specified in conjunction with the MAXERR (or -m) flag, causes ndlm to pause and ask whether to continue after every n errors. This processing flag performs the same function as (and may be overridden by) the -v invocation flag.

NULL string
Establishes the specified character string as the null indicator. During import, ndlm inserts a NULL value into the appropriate database row and column for each occurrence of string in the input file. Do not enclose the string argument in single quotes (�). This processing flag performs the same function as (and may be overridden by) the -n invocation flag.

LIBRARY
The optional LIBRARY section contains a reference to a custom library file and declares one or more functions from the library. By specifying a custom library, a user may use declared functions from the library in NDL++ statements within the script. Each individual library requires its own separate LIBRARY section. See the Custom Library Functions section below for more information.

INIT
The optional INIT section is where custom libraries and LOOKUP function instances are initialized. See the Custom Library Functions section below for more information about custom libraries. See the LOOKUP function description for more information about the LOOKUP instances.

TYPE

The optional TYPE section contains the data-type declarations for any numeric data used in NDL++ statements within the script. See the Data Type Declaration section below for more information.

VAR

The optional VAR section contains the declarations of any variables used in NDL++ statements within the script. See the Declaring Variables section below for more information.

RECORD

The RECORD keyword is required in all import specification files; it signifies that the record specification follows.

record specification

The record specification is a "picture-like" description of the arrangement of data in the input data file. That is, it describes the fields that constitute each record, in the precise order that they occur (from left to right in the input file). NULLIF and/or MISSING IF clauses can be included to instruct the data loader to insert a NULL or "missing" (zero-length) value in a field when specified conditions are present. The record specification can also contain the definition of filtering options declared for the import process (in a PIC clause, see below). Record specifications are described in detail in the Import Record Specification section below.

SKIPIF

The SKIPIF clause specifies conditions under which input records will be skipped and written to a specified file instead of being loaded into the database. If there are multiple conditions, they must be contained between curly brackets { }. The brackets can be omitted if there is only one condition.
See the SKIPIF Clause section below for more information.

table-name

The table-name argument identifies the table into which the data will be imported. This table must exist in the database named in the ndlm invocation. The table must not have any PRIMARY KEY, FOREIGN KEY, or UNIQUE constraints. If the table does not belong to the default schema of the user authorization specified in the ndlm invocation, table-name must include a qualifying schema-name prefix as follows: schema-name.table-name.
If either the schema or table name contains spaces or other special characters, the name will have to appear between double quotation marks. If both schema and table names need to be delimited in this manner, there must be separate sets of quotation marks for each. The period ( . ) that separates the schema and table names must never be contained between quotation marks, otherwise it will be interpreted as part of an object name.
For example:

Legal
Illegal

"schema 1".table1
schema 1.table1

schema1."table\/1" 'schema 1'.table1

"schema.1"."table.1"
schema1.table"\/"1

"schema1"."table1" "schema1.table1"

map specification

The map specification describes the correspondence between the fields in the input data file and the columns in the table identified by table-name. NDL++ statements can be included in the map specification, allowing for specification of a broad range of data conversion and manipulation options (for details, refer to the Import Map Specification section below).

Previous Topic:
Importing Data with ndlm

Chapter Index

Next Topic:
Import Record Specification

Legal	Illegal
"schema 1".table1	schema 1.table1
schema1."table\/1"	'schema 1'.table1
"schema.1"."table.1"	schema1.table"\/"1
"schema1"."table1"	"schema1.table1"

SAND CDBMS Tools Reference Guide Data Loader (ndlm)

SAND CDBMS Tools Reference Guide
Data Loader (ndlm)