Help file for DBCSV
DBCSV is a utility to export IMAGE databases (all or part) to
comma separated value (CSV) files. CSV files are suitable for
importing into many applications, including Microsoft Excel and
Access.
We recommend that first-time users read the entire help text
(HELP OFFLINE ... will print the entire help file to LP).
The EXPORT command, which creates the CSV file(s), has many
options to give you complete control over various aspects of the
CSV files:
- is the output disk file fixed record, variable record, or
bytestream?
- should fields be separated by a tab, a blank, a comma, or a
comma + blank?
- should fields be emitted in a fixed width format?
- should the first output record be the name of each field
(i.e., a header line)?
- should quote marks (") that happen to be found in items of
type X or U be double-quoted ("") or escaped (\")?
- should numeric items (e.g., I1, J2, R2) be quoted ("123")?
- how many digits to the right of the decimal point should be
emitted for floating point numbers?
- should items of type P and Z be emitted with leading zeros?
- if quoting is being done, should items of type X and U have
trailing spaces trimmed, and items of type P and Z have the
leading space trimmed?
- do you want Omnidex datasets exported?
- do you want to skip automatic master datasets?
- do you want MANMAN databases recognized, and MANMAN / ASK
dates converted?
...in addition to other useful options.
For help on a specific command, enter: HELP
For a list of commands, enter: HELP COMMANDS
For information on EXPORT options, enter: HELP OPTIONS
To see pseudo-code describing how the output text is
generated for a single record, enter: HELP LOGIC
-----------------------------------------------------------------
Commands:
BASE CLOSEBASE DECLARE DIGITS DO
ECHO EXIT EXPORT HELP ITEMS
LISTREDO REDO RESET SET SHOWsets
USEQ VERSION :mpecommand
Each command is documented below, in alphabetic order:
BASE basename [OPENALLsets] [DBOPENSETS]
Tells DBCSV that you want to look at the specified IMAGE
database. If you later wish to export all or part of this
database, you can specify "*" for the database name in the
EXPORT command.
OPENALLSETS tells DBCSV to immediately open all of the
associated datasets of the database. This is normally not
necessary, but can be convenient if you suspect that some
of the datasets might not be on disk.
CLOSEBASE
Tells DBCSV to close the previously open database.
DECLARE itemname itemtype
Tells DBCSV that you are overriding the actual type
of an IMAGE item. (e.g., DECLARE POSTDATE CALENDAR)
Note: all DECLAREs are forgotten when the next database
is opened (whether explicity via BASE, or implicitly
via some command with a basename).
DIGits R&D testing command
ECHO text
Echos the rest of the line to $stdlist.
EXPORT [options]*
which_set ::=
output_filename ::= (an MPE name, or an HFS name,
or a star (*), or a dash (-),
or the keyword NULL)
options ::= |
noALL
noCOMMABLANK
noCRUNCH
DBEXPORT
noDELete
noDOUBLEQuote
noEMITTABS
noESCAPEDchars
EXCEL
noFASTUX
noFIXEDW
noHEADer
noHEADers
noMANMan
MAXFIND #
MAXLINES #
MBSPERchunk #
noNUMBERedfiles
noONEBLank
noQUOTEAll
RDIGits #
RECSIZE #
noSHOWAskdates
noSIZECAPacity
noSKIPAUTOs
noSKIPOMnidex
noSTRIP0
noTIMEs
noTRIM
noVERBOSE
noVERBOSEOPEN
Exports the specified set, or all sets in the databse, into
one (or more) CSV files. The various options can be
specified, and will override the current "global" value of
the options (see SET/RESET). The database name can be
replaced by a "*" if you previously opened the database
with the BASE command.
BYTESTREAM
FIXED
VARIABLE
... these options tell EXPORT that the output files
should be bytestream files, or fixed-record ASCII files,
or variable-record ASCII files (default).
Note: if FIXED is specified and if the output record
width is more than about 1024 characters, a single IMAGE
entry may require multiple output records to the file
... which could cause problems for some applications.
ALL ... this option tells EXPORT to temporarily set all
"config" options to "on". Thus, "EXPORT ... ALL NOTRIM"
would turn on all options except TRIM.
COMMABLANK ... this option tells EXPORT that you would like
to have a ", " in between fields, not just a ",". The
blank makes it easier for humans to read the output
file, and seems to be acceptable to most applications.
(This was called BLANKOK in some older versions of
DBCSV)
CHUNKED ... (obsolete)
(This option now simply implies MBSPERCHUNK 2047)
CRUNCH ... this option tells EXPORT you wish to "crunch"
the output file at the end (FCLOSE (file, 9, 0)) ...
this will lower the limit to the EOF, and release space
after the EOF on disk. SET NOCRUNCH tells EXPORT not
to crunch the file at the end of the export.
DELETE ... this option tells EXPORT you wish to delete any
existing output file of the same name as
"output_filename". (Note: if ALL is specified, DELETE is
assumed for the individual dataset filenames.)
DOUBLEQUOTES ... this option tells EXPORT to turn any quote
(") in an X/U field into a doubled-quote ("").
EMITTABS ... this option tells EXPORT that fields should be
separated by a tab (ASCII #8) instead of a blank.
ESCAPEDCHARS ... this option tells EXPORT that text from
fields of type U and X should be checked for unprintable
characters, double-quotes, and backslashes ... and (if
found) each should be "escaped", Unix-style.
Unprintable characters are ASCII 0 through ASCII 31, and
ASCII 127 through ASCII 255. If you would prefer to
have some of those characters "printable", please
contact support@allegro.com. Unprintables are converted
to a backslash followed by the three digit octal value
of their ASCII character code.
Printable escapable characters (double-quote and
backslash) have a backslash emitted first (e.g.: A"BC
--> A\"BC)
DOUBLEQUOTES, ESCAPEDCHARS, and FIXED interaction
An X field with an uppercase A, then a double quote, then
an uppercase BC (e.g.: 'A"BC') would be rendered as follows:
if FIXED then
A"BC (four bytes)
else
begin
if ESCAPEDCHARS then
if QUOTEALL
"A\"BC" (seven bytes)
else
A\"BC (five bytes)
else if NODOUBLEQUOTE then
if QUOTEALL then
"A"BC" (six bytes)
else
A"BC (four bytes)
else
"A"BC" (six bytes)
end;
EXCEL ... shorthand for: SET MAXLINES 65000 HEADERS
FIXEDW ... this option tells EXPORT that the output should
be in "fixed" (unchanging) width fields, and that you do
not want quote characters around the values. Some
applications can handle this kind of data easily, where
they might otherwise have trouble with quotes.
HEADER ... this option tells EXPORT that the first record
exported for each dataset should have a "header", which
is the list of the item names used in the set.
For example, if a dataset has only two items, CUSTOMER#
and CUST_NAME, then the HEADER option would emit:
"CUSTOMER#", "CUST_NAME"
HEADERS ... this option affect the "HEADER" option, and
only applies to chunks 2..n of a multiple-chunk output
file.
Normally (RESET HEADERS), only the first chunk has the
headers written to it (if at all).
SET HEADERS tells DBCSV to write the header to each
chunk, not just to the first chunk.
(SET/RESET HEADERS implies SET/RESET HEADER.)
LIMIT # ... this option tells EXPORT how large the output
file should be. Normally, EXPORT sets the size to be
the (dataset capacity * 1.1) + 100. LIMIT allows you to
override the default.
MANMAN ... this option tells EXPORT that the databases are
ManMan databases. This causes DBCSV to assume that
certain items are ManMan/ASK date items.
MAXFIND # ... this option tells EXPORT to stop extracting
data from each dataset after # entries are found. A
value of 0 for # means "don't limit the extraction".
MAXLINES # ... this option tells EXPORT to switch to the
next output file after # lines of output. This is
useful when creating csv files for applications like
EXCEL that limit the size of their input.
MBSPERCHUNK # ... this option tells EXPORT that each output
file should be a maximum of the specified number of
megabytes. This means that outut files will have ".##"
appended to their file name. (In turn, this implies
that output file names will be in the Hierarchical File
Space.)
A value of 0 means "ignore this option" ... note that
this will result in a Large File if you have more than
3.9 GB of output! (And if the output format is FIXED,
and not bytestream or variable record.)
For bytestream and variable record files, MBSPERCHUNK
may not exceed 2047 (and will be limited to that value
by DBCSV). In some cases, DBCSV will automatically
assume MBSPERCHUNK if you have not specified it.
(E.g., on a pre-MPE/iX 6.5 system)
Example: export sales 23 foo variable mbsperchunk 100
would create ./FOO.01, ./FOO.02, etc.
NUMBEREDFILES ... this option tells EXPORT that each output
file should have a numeric suffix (e.g., "001") instead
of a set name suffix (e.g., "CUSTOMER#S").
ONEBLANK ... this option tells EXPORT to put emit at least
one blank for a completely blank X or U field.
(e.g.: " " instead of "")
QUOTEALL ... this option tells EXPORT to put quotes around
all data (both text and numbers).
RDIGITS # ... this option tells EXPORT how many digits you
want to the right of the decimal point for floating
point (F, E) numbers.
RECSIZE # ... this option tells EXPORT that each output
file should have the specified number of bytes as a
record size (ignored for bytestream files).
A value of 0 means "ignore this option".
This option is normally not specified.
SHOWASK ... this option tells EXPORT that you want a
summary of the ManMan/ASK dates seen during extraction.
SKIPOMNIDEX ... this option tells EXPORT to skip Omnidex
datasets during an "EXPORT ... ALL".
SKIPAUTOS ... this option tells EXPORT that you wish to
skip extracting data from automatic masters.
NOSKIPAUTOS tells EXPORT to read and export automatic
masters.
STRIP0 ... this option tells EXPORT that numbers should not
have leading zeroes. Normally, items of type P and Z
(packed and zoned decimal numbers) are emitted with
leading zeroes to maintain the alignment of output
columns (to aid human readability).
TIMES ... this option tells DBCSV to report elapsed and CPU
time after some its operations.
TRIM ... this option tells EXPORT to "trim" any trailing
blanks from an X or U field before putting quotes around
it, and to drop the leading space before a non-negative
P or Z item.
If you had three entries in a set, where the item is an
X4 and the values are "FRED", "BOB ", and "A ":
SET TRIM SET NOTRIM
-------- ----------
"FRED" "FRED"
"BOB" "BOB "
"A" "A "
VERBOSE ... this option tells DBCSV to be a little more
verbose in output. For example, when opening an
output file, VERBOSE will report the requested
record size and limit.
VERBOSEOPEN ... this option tells DBCSV to be a little more
verbose when opening the CSV output file. (If either
VERBOSE or VERBOSEOPEN is set, this will happen.)
Output filenames:
There are five kinds of output filename you can use:
1. an MPE filename (e.g., FOO)
2. an HFS filename (e.g., /tmp/FOO_)
3. a star (e.g., *)
4. a dash (e.g., -)
5. "NULL" (without the quotes)
Case 1 (MPE) is suitable when exporting only a single
dataset.
Case 2 (HFS) is suitable in all cases. (see NOTE below)
(the NOTE explains the trailing underscore)
Case 3 (*) means "use the root database name (in HFS
format) as the base output name". This is valid only
when the ALL option is specified. If the database is
SALES.PUB.SYS, then "*" would be equivalent to saying:
./SALES (Note that the group and account parts of the
database name are not used here.)
Case 4 (-) means "send the output to $stdlist".
Case 5 (NULL) means "send the output to $NULL".
(This is primarily used for internal DBCSV testing.)
If you use the ALL option, then the output filenames
will be converted to HFS syntax (i.e., POSIX filenames).
If you use the NONUMBER option, then each POSIX filename
will end with the set name as the suffix, otherwise if
will end with the set number as the suffix.
The base output filename (specified in the EXPORT
command) must almost always be a filename that is not
in an MPE group. I.e., FOO is bad, /tmp/FOO_ is good.
(This is because DBCSV might generate a dataset output
filename like ./FOO_CUSTOMER_DATA_INFO, which is longer
than the longest filename that can reside in an MPE
*group* (files directly within a group are limited to
16 character filenames).)
NOTE: the suggested trailing "_" character in the
filename /tmp/FOO_. That trailing "_" is useful to
"break up" the left-hand part of the final filename
from the "right-hand" part. E.g., if DBCSV is emitting
set names, then a set named CUSTOMERS might become
/tmp/FOO_CUSTOMERS.
Another option is specifying an output filename like:
./foo/ which will create a local directory called ./foo,
and will have no limitations as to filename length.
If you use ALL along with an output filename like ./foo/
or /tmp/SALES_, you should have no naming problems.
Note that if you have "chunked" output (see
MBSPERCHUNK), the output filename will always be an HFS
name (e.g., ./FOO.01) with ".##" appended to each
chunk. (If more than 99 chunks are created, ".###" is
appended for chunks 100 through 999.)
You can specify "SET DBEXPORT" to have the options
setup to be similar to Eloquence's DBEXPORT utility.
Examples.
For the following examples, we will assume that the
SALES database has two datasets, CUSTOMERS and PRODUCTS.
Example: EXPORT SALES ALL /tmp/SALES_
Exports all datasets into separate files:
/tmp/SALES_ summary
/tmp/SALES_CUSTOMERS CUSTOMERS dataset
/tmp/SALES_PRODUCTS PRODUCTS dataset
Example: EXPORT SALES CUSTOMERS /tmp/SALES_CUSTOMERS
Exports just the CUSTOMERS dataset into a CSV file:
/tmp/SALES_CUSTOMERS CUSTOMERS dataset
Handling questionable/bad data:
IMAGE allows users to put invalid data into some fields
(e.g., non-digits into P and Z fields). If DBCSV sees a
non-digit in a P or Z field, the field is reported (if
it's one of the first five such problems), and the text
"ILLEGAL" is emitted instead.
IMAGE allows users to put lower-case letters in U
fields). Like IMAGE and QUERY, DBCSV does not object to
lower case letters in U fields, and they will be emitted
unchanged (and with no comment/warning).
DO [prefix]
If a prefix is specified, finds the more recent redo line
that matches the prefix and re-submits it. If no prefix is
specified, the most recent input is re-submitted.
Exit
Terminates DBCSV.
Help [OFFLINE] [command_name_prefix]
HELP displays the entire help text.
HELP E displays help for all commands starting
with E.
The OFFLINE option routes the help text to LP instead
of to the terminal.
ITEMS [ALL] [Verbose]
Lists all of the items for the currently open database (if
any ... see the BASE command), and their types. (Item
types may be overridden with the DECLARE command.)
The ALL option tells DBCSV to display the CSV output width
each item would require.
LISTREDO
Lists the REDO stack, which persists across runs (while in
the same session or job).
REDO [prefix]
If a prefix is specified, finds the more recent redo line
that matches the prefix and presents it for editing and
re-submitting.
If no prefix is specified, the most recent input is
presented for editing and re-submitting.
The editor used by REDO is QZMODIFY. Pressing ^W and then
H will result in help information.
RESET [...]
See SET
SET (or RESET) [ALL | ] [] [...]
options ::= (see the EXPORT command)
general_options ::= <
other_options ::= <
[no]ALLOWMPE |
DEFAULT |
[no]LP |
[no]MANMAN |
MAXFIND # |
[no]PAGING |
RESETq |
RDIGits # |
SET >*
Turns on (or off) the specified flags.
The state of all options is reported at the end of the SET
(or RESET) command. To see the options and flags without
changing any of them, enter: SET
SETQ and RESETQ act like SET and RESET, but do not report
the settings at the end of the command.
SET ALL will set all of the EXPORT command's options to
"on". RESET ALL will set all of the EXPORT command's
options to "off".
Each of the general options is documented next.
SET ALLOWMPE tells DBCSV that the user may enter MPE
commands (e.g., :showtime). This option may be SET only
from the startup.dbcsv.allegro file, or by a user with
SM capability. NOTE: allowing MPE commands might open a
security hole. The default is RESET ALLOWMPE.
SET DEFAULT will set all "config" options to their default
values:
General controls:
Reset ALLOWMPE Set PAGING
Reset LP Reset TIMES
Reset VERBOSE Reset VERBOSEOPEN
EXPORT file options:
Set FIXEDformat Set CRUNCH
Set LIMIT 0 Set MAXFIND 0
Set MBSPERCHUNK 0 Set RDIGITS 9
Set RECSIZE 0
EXPORT options:
Reset DELETE Reset NUMBEREDfiles
Set COMMABLANK Reset EMITTABS
Reset HEADER Reset HEADERS
Set SKIPAUTOS Set SKIPOMNIDEX
Reset STRIP0
Text quoting/escaping options:
Reset DOUBLEQUOTE Reset ESCAPEDCHARS
Reset FIXEDW Set ONEBLANK
Reset QUOTEALL Set TRIM
(Allowed to create Large Files for output (MBSPERCHUNK = 0).)
The default file type varies:
Set FIXEDformat (if on MPE/iX 6.5 or later)
Set VARiable (if on pre-MPE/iX 6.5)
The MANMAN and SHOWASKdates are not reported if or you are
not logged into the MANMAN account.
SET LP routes all subsequent output to the printer
instead of to the terminal. RESET LP cancels this.
SET MANMAN tells DBCSV that if the database name is a known
ManMan database, then we should automatically declare
certain items as ASKDATEs instead of as I1 items. Known
ManMan database names are: ARKDB, CRPDB, FINDB, GLDB,
HISDB, MANDB, PAYDB, PHYDB, RESDB, and SYSDB.
SET MAXFIND #, where # is a positive integer,
tells DBCSV to stop exporting after # entries.
This can be useful for generating a small CSV file for
testing your exports. A value of 0 means "export
all entries".
SET PAGING tells DBCSV to paginate the output sent to
the terminal. This is the default for ordinary
interactive (session) runs.
SET RDIGits # tells DBCSV how many digits you want to
see to the right of the decimal point in floating point
numbers. E.g., for the value 3.14159, SET RDIGITS 3
would cause DBCSV to emit 3.141.
The "options" options are the documented as in
the EXPORT command documentation. Setting/resetting them
via SET/RESET makes the new values the default for
subsequent EXPORT commands. For example:
SET FIXEDW
export ... (will use FIXEDW)
export ... NOFIXEDW (will NOT use FIXEDW)
export ... (will use FIXEDW)
SET DEFAULT
export ... (will NOT use FIXEDW)
(The "..." in the immediate above lines indicates that we're
not listing the other options to the "export" command here;
the user did not (and would not) enter "...".)
Example:
SET HEADER
EXPORT sales 1 sales001.csv
... will have a header line (due to prior "SET HEADER")
EXPORT sales 2 sales002.csv NOHEADER
... will not have a header line
EXPORT sales 3 sales001.csv
... will have a header line, because the "NOHEADER"
on the prior "EXPORT" command affected just that
one command, and didn't change the default
(global) value for HEADER.
SHOWsets [] [ITEMS | ALL] [options] [Verbose]
Displays the names of all datasets in the database, and
optionally of all items in each set.
The "CSV size" of each set is shown, which is how long
comma-separated output lines for entries in the set are
expected to be. (If "ITEMS" is specified, the "CSV size"
of each item is shown, which is how many characters each
item is expected to be).
The database name can be replaced by a "*" (or omitted) if
you have previously opened the database with the BASE
command.
For a description of "[options]", enter: HELP SET
(Note: not all options are applicable to the SHOW command.)
See also the ITEMS command, which lists items in item
number order.
USEq filename
Tells DBCSV to read commands from the specified file.
Normally, USE will echo the lines as it reads them.
USEQ tells DBCSV to not do such an echo.
VERSION
Reports the version number of DBCSV.
:mpecommand
DBCSV allows you to execute MPE commands, but only if
a "SET ALLOWMPE" has been done.
-----------------------------------------------------------------
At startup, DBCSV does an implicit:
USEQ startup.dbcsv.allegro
Note: file equates are disallowed for this USEQ command.
This allows you to have common system-wide startup commands
(e.g., SET/RESET).
It then does an implicit:
USEQ dbcsv.cfg
This allows a user to have account-specific startup commands.
-----------------------------------------------------------------
Large Files
DBCSV is aware that the CSV output from a big database might
exceed 3.9 GB. MPE/iX limits the size of bytestream files to
2 GB, and the size of variable record files to 2 GB. Prior to
MPE/iX 6.5, MPE limited fixed record files to about 3.9 GB.
As of MPE/iX 6.5, "Large Files" are supported, which are fixed
record files of 4 GB up to 127 GB.
DBCSV automatically determines if you are on MPE/iX 6.5 or later,
and will default to creating fixed record (possibly Large) files
in that case. Otherwise, DBCSV defaults to creating (possibly
multiple) variable record files of about 2 GB.
-----------------------------------------------------------------
WRQ Reflection File transfer
When using WRQ Reflection File transfer to move a DBCSV output
file to a PB, we have noticed significantly better performance
when using VARIABLE or FIXED output files (as opposed to
BYTESTREAM).
-----------------------------------------------------------------
FTP
When FTP'ing a byte stream CSV file from the HP 3000, we have had
the best results by telling the FTP process that the file is
binary. (When saying ASCII, the Windows FTP client sometimes
injects a after every character from the HP 3000.)
-----------------------------------------------------------------
This is how DBCSV strings values together when building a line
of text for output:
for each field do:
if X or U field then
begin
get n characters of data ("the field").
if TRIM then
if text is all blanks and ONEBLANK then
use ' ' (single blank)
else
trim trailing blanks, use result.
if FIXEDW then
use the field as is
else
begin
if ESCAPEDCHARS then
escape any unprintable characters, and any double quotes
(e.g., 'A"B" --> (A\"B")
if DOUBLEQUOTES then
double any doublequotes ('"' --> '""')
if QUOTEALL or any characters in QUOTECHARS then
use a quote ("), the field, and a quote (")
else
use the field
end;
end
else
convert number to text, use that text.
if FIXEDW then
append blanks (if needed) to make the text a uniform length
emit the text
if field# < last_field# then
begin (more fields to come, emit separator)
if FIXEDW then
append a blank
else if EMITTABS then
append a tab
else if COMMABLANK then
append ", " (a comma and a blank)
else
append "," (a comma)
end;
end;
-----------------------------------------------------------------
Notes on FTP (download/upload)
Most users will want to transfer the csv files created by DBCSV
to another computer. If you are experiencing slow transfers via
FTP from your HP 3000, the following tips may be of use, because
they generate a smaller csv file (which, therefore, has less data
to transfer over the network).
- use VARIABLE or BYTESTREAM output
(this may avoid some trailing blanks at the end of records)
- use SET NOFIXEDW
(this will avoid some trailing blanks at the end of records)
If you plan on compressing the csv output file prior to transfer,
you should be aware that some compression programs on the HP 3000
do not accept files larger than 2 GB. If you are using such a
program, you will probably need to specify MBSPERCHUNK to tell
DBCSV to emit multiple "chunks" (separate files) of the specified
size instead of a single file. (E.g., SET MBSPERCHUNK 200 will
create output files of up to 200 MB.)
-----------------------------------------------------------------
News (recent changes)
2009-10-24
Fixed problem that might have led to extraneous data in
header for chunks 2..n.
Added MAXLINES option, to allow users to set limit to 65000
output lines for Microsoft Excel.
Added EXCEL option ... shorthand for "HEADERS MAXLINES 65000".
2009-10-20
Changed quoting a bit to avoid quoting if not requested/needed.
Fixed interaction between QUOTEALL, ESCAPEDCHARS, and DOUBLEQUOTE.
Fixed MAXFIND to impose limit more accurately.
Added HEADERS option to emit column headers on every chunk file.
2009-04-21
Expanded SHOWSETS a bit, removed SETS command (moved its
functionality into SHOWSETS).
Reworked start of help file a bit.
2008-10-10
Added undocumented ROSETTA option.
Added code to notice fserr 426 and try to adapt to it.
Changed ".info" suffix to be "info" if filename ends in
"-" or in "/".
2008-10-09
Added conditional display of bad Z string character.
Added code to handle blank Z string characters as 0.
Changed ".info" suffix to be "info" if filename ends in "_".
2008-10-10
Added code to notice fserr 426 and try to adapt to it.
Changed ".info" suffix to be "info" if filename ends in
"-", "_", or "/".
Added conditional display of bad Z string character.
Added code to treat blank Z string characters as 0.
Changed ".info" suffix to be "info" if filename ends in "_".
Revised progress to report percent done (based on number of
entries in a dataset, not the capacity).
2008-02-07
Fixed problem eating prompt
HSIT 2007-10-27
Fixed escapedchars to escape backslash.
2007-10-25
Revised progress to also report # entries read.
Added ONEBLANK
-----------------------------------------------------------------
NOTE: if you find problems, please email support@allegro.com
//