java.lang.Object PIRL.Conductor.Conductor
public class Conductor
Conductor is a queue management mechanism for the sequential processing of data files.
A Conductor processes a list of data source files through a list of
procedures to be invoked on each file. These lists are obtained from
a Database
as a pair of tables. The list of files is
contained in a Sources table and the procedures are defined in
a Procedures table. A pair of Sources and Procedures tables
constitutes a Pipeline: each Sources record is processed in
the order it occurs in the table (FIFO); each procedure specified by
a Procedures record is executed in sequence number order. Each
pipeline has a name that is used to find its database tables, where
the pipeline tables are named:
<pipeline>_Sources
<pipeline>_Procedures
A Conductor must be run with a command line
argument
that specifies the pipeline it is to process. Multiple Conductors
may safely process the same pipeline from the same or separate host
systems.
Database
package.
This package abstracts the particulars of database access. The
Conductor configuration file specifies the database access
information.
A Sources table must contain at least these fields:
Status_Indicators
method which, along
with other Status_xxx
methods, provides a convenient
means for other Java classes to interpret and manipulate these field
values. Note: This field should initially be NULL or empty. It
is important that this field only be modified consistent with the
operation of Conductor (i.e. change at your own risk!).
A Sources table may contain the required fields in any order, and it may contain additional fields as desired. For example, it is recommended that a timestamp field be provided that will automatically be updated with the last update time of each record.
A Procedures table must contain at least these fields:
references
to be substituted
with database field values and/or configuration parameter values.
Obviously this field must be neither empty nor NULL.
pattern
.
A Procedures table may contain the required fields in any order, and it may contain additional fields as desired. If a "Description" field is present, which is highly recommended, it is used as a text description of each procedure included in the processing log. It is also recommended that a timestamp field be provided that will automatically be updated with the last update time of each record.
A Conductor maintains a connection with the Database
server
while it is operating. If any access to the Database by Conductor
fails due to a loss of the connection, Conductor will attempt to
reconnect and, if successful, repeat the access operation (a
connection failure of the repeated access operation does not result
in a reconnection attempt). If the reconnection fails because the
connection can not be established with the server, the attempt will
be retried after a delay period of 5 minutes (this can be overridden
with the RECONNECT_DELAY_PARAMETER
). Up to 16 retries (this
can be overridden with the RECONNECT_TRIES_PARAMETER
) will
be attempted before a database access failure is deemed to have
occurred. N.B.: Database connection resilience does not
automatically apply to any database access operations by pipeline
procedures.
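The reconnection policy described above can be sketched as follows. This is a minimal illustration, not Conductor's implementation: the class, the Connector interface and the ConnectionLostException type are hypothetical stand-ins, and "up to 16 retries" is assumed to mean one initial reconnection attempt followed by at most 16 further attempts.

```java
import java.util.concurrent.Callable;

// Hypothetical sketch of the reconnect-and-retry policy; none of these
// names belong to the PIRL Conductor or Database packages.
class ReconnectSketch {
    // Stand-in for a failure caused by a lost database connection.
    static class ConnectionLostException extends Exception {
        ConnectionLostException(String message) { super(message); }
    }

    // Stand-in for whatever (re)establishes the database connection.
    interface Connector {
        void connect() throws ConnectionLostException;
    }

    static final long DEFAULT_DELAY_SECONDS = 300; // 5 minutes
    static final int DEFAULT_RETRIES = 16;

    // Runs the access operation; on a lost connection it reconnects and
    // repeats the operation once. A failure of the repeated operation is
    // propagated without another reconnection attempt.
    static <T> T access(Callable<T> operation, Connector connector,
                        int maxRetries, long delaySeconds) throws Exception {
        try {
            return operation.call();
        } catch (ConnectionLostException lost) {
            reconnect(connector, maxRetries, delaySeconds);
            return operation.call();
        }
    }

    // Tries to connect, then retries up to maxRetries times with a delay
    // before each retry. The last failure is rethrown when retries run out.
    static void reconnect(Connector connector, int maxRetries, long delaySeconds)
            throws ConnectionLostException, InterruptedException {
        int retries = 0;
        while (true) {
            try {
                connector.connect();
                return;
            } catch (ConnectionLostException failed) {
                if (retries >= maxRetries) {
                    throw failed;
                }
                retries++;
                Thread.sleep(delaySeconds * 1000L);
            }
        }
    }
}
```

A caller would wrap each database access in a Callable and pass the default retry count and delay unless the configuration overrides them.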
When a Conductor is started it first reads its Configuration
file. This is "Conductor.conf" by default, but another
filename may be specified on the command line
. The
configuration file contains parameter definitions in Parameter Value
Language (PVL) format. This file
contains the information needed to access the database used by
Conductor as well as any other parameters that may be useful to
resolve parameter references embedded
in procedure definition record field values. Since references may
contain nested references it is quite appropriate for users to
provide configuration parameters with values that are database field
references (perhaps with complex conditionals and multiple field
combinations) so that Command_Line (for example) definitions use the
user specified parameter references rather than the more complicated
definitions. This also makes it easy to modify the field reference
definitions, just by editing the configuration file, without
necessarily needing to change the contents of a Procedures pipeline
table.
N.B.: By default, when the configuration file is read during
startup by the application's main method, if any parameters have
the same pathname, the last duplicate encountered is given preference.
This is especially important to keep in mind when the configuration
file includes
another configuration
file, such as a site-wide configuration. For example, a site-wide
configuration file included in all pipeline-specific configuration
files (a typical scenario) might have a Conductor group that includes
default parameters such as Stop_on_Failure with a small value (e.g. 1
or 2) to prevent a bug in a pipeline procedure from generating a
large number of source processing failures, while a configuration
file for a specific pipeline that is expected to have failures (which
may actually be branches off to some other pipeline depending on the
outcome of some condition testing procedure) might have a Conductor
group that includes a Stop_on_Failure with a large (or possibly zero)
value. As long as the site-wide configuration file is included before
the pipeline-specific Conductor/Stop_on_Failure parameter is
specified the latter will take precedence over the former.
The Conductor automatically provides a set of parameters in the configuration Conductor group:
Database
access
package.
The following parameters are reset for each source record being processed:
<Pipeline>-<Source_ID>_<Source_Number>.log
The Pipeline name includes the leading database catalog name separated by a period ('.') character.
The Source_ID and Source_Number are obtained from the current source record. Note, however, that there is a chance that the Source_ID will include characters that are unsafe for use as part of a filename. Assuming that the only unsafe character is the system property "file.separator" character ('/' for Unix), it will be replaced with a percent ('%') character.
Note: If either the source record field or configuration parameter value is not empty and does not refer to an existing directory, then that value will be used unconditionally to determine the log filename.
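The default log filename pattern given above can be illustrated with a short sketch. The class and method names are hypothetical, and the separator replacement is assumed to apply only to the Source_ID portion, as the description suggests.

```java
import java.io.File;

// Hypothetical helper, not part of the Conductor API: builds the default
// log filename <Pipeline>-<Source_ID>_<Source_Number>.log.
class LogFilenameSketch {
    static String defaultLogFilename(String pipeline, String sourceId,
                                     String sourceNumber) {
        // The file separator ('/' on Unix) is unsafe inside a filename,
        // so it is replaced with '%'.
        String safeId = sourceId.replace(File.separatorChar, '%');
        return pipeline + "-" + safeId + "_" + sourceNumber + ".log";
    }
}
```

Here the pipeline argument is expected to already include the leading catalog name and period.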
The following parameters are reset for each procedure record being processed:
Status_Conductor_Code_Description
static method may be used to
obtain a brief, one line description of this code.
The following configuration parameters will be used if they are present in the Conductor group or parent of this group:
The Conductor will try to connect to a Stage_Manager process on the local host system. The Stage_Manager provides remote Management capabilities. A host system may be running multiple Stage_Managers, each using its own communications port, to provide multiple Theater management contexts for different sets of Conductors. A Conductor can operate in only one Theater as determined by its Stage_Manager parameters. The following configuration parameters, which must be in a Stage_Manager sub-group of the Conductor group, control the Conductor connection to the Theater's Stage_Manager:
HELLO_PORT_PARAMETER_NAME
- Get and Set
HELLO_ADDRESS_PARAMETER_NAME
- Get and Set
After the Conductor has been initialized using its configuration file and connected to the database it begins to process the pipeline. Note: If the Conductor is run with a Manager (using the -Manager command line option), processing does not begin immediately; the Conductor will wait until it is told to start by the Manager. The records from the Procedures table are read, checked to confirm that they contain all necessary fields, and sorted into Sequence number order. A Conductor may be told to stop processing by a Manager (including a remote Manager), which will cause the Conductor to stop further processing when the current source record processing is complete. Processing is resumed with the next source record when a Manager sends the start signal. Each time pipeline processing is started the Procedures table is loaded again and the configuration file is read again and used to reconfigure the Conductor. Changes to the Procedures records and configuration file can safely be made while Conductor is running.
When pipeline processing has been started Conductor begins processing records from its Sources table. All unprocessed records, up to a maximum (set by default to 1000 to prevent memory exhaustion) configurable with the Max_Source_Records parameter, are read into an internal cache. An unprocessed record has a NULL in its Conductor_ID field. These records are processed in first-in first-out (FIFO) order.
An exclusive lock must be acquired on a record before it can be processed. To acquire a lock on a source record an attempt is made to update the record's Conductor_ID field to the Conductor's identification value (hostname and possibly process ID) with the condition that the field value is currently NULL. The update operation by the database server is atomic; once the operation is started by the database server it will go to completion without the possibility of interruption by any other database operation. This guarantees that only one process will be able to gain access to any source record even in the context of multiple processes contending for the same record at the same time. If some other Conductor has already acquired the record, as indicated by the failure of the update operation (because the Conductor_ID field is no longer NULL as required), the record will be removed from the cache and the Conductor will try to acquire a lock on the next record in the cache. If the update succeeds then the Conductor has acquired exclusive control of the record. It will be safe to process the source without concern that some other process may interfere.
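The locking scheme above is a conditional update: the claim only succeeds if the Conductor_ID field is still NULL. The sketch below shows the general shape of such a statement and an in-memory analogue of its behaviour. It is illustrative only: Conductor issues its updates through the Database package, the exact statement text is an assumption, and using Source_Number to select the record is also an assumption.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration of claiming a source record atomically.
class SourceLockSketch {
    // Builds a conditional UPDATE of the kind described above. The text is
    // an assumed example; production code should use parameterized
    // statements rather than string concatenation.
    static String lockStatement(String sourcesTable, long sourceNumber,
                                String conductorId) {
        return "UPDATE " + sourcesTable
             + " SET Conductor_ID = '" + conductorId + "'"
             + " WHERE Source_Number = " + sourceNumber
             + " AND Conductor_ID IS NULL";
    }

    // In-memory analogue of the same idea: the field is set only if it is
    // currently null, and the check-and-set happens as one atomic step.
    // Returns true if this caller acquired the record.
    static boolean tryAcquire(AtomicReference<String> conductorIdField,
                              String conductorId) {
        return conductorIdField.compareAndSet(null, conductorId);
    }
}
```

With the database form, the number of records changed by the update (one or zero) would tell the caller whether the lock was acquired; in the analogue the boolean result plays that role.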
The first step in processing a source record is to open a log file to
record the processing. If the source record Log_Pathname field is
empty the user's Log_Pathname configuration parameter will be used.
If that is absent or empty the Log_Directory parameter will be used.
If that, too, is absent or empty a default filename will be generated
- as described in the Log_Filename parameter description, above - and
the log file will be written to the current working directory. If either
the source record field or configuration parameter is not empty the
value is resolved for any embedded references. If the resulting pathname refers to an existing
directory the default filename is added. Otherwise the pathname is
taken to refer to a file to which processing log output is to be
appended (the file will be created if it does not exist).
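The log file selection rules above can be sketched as a small decision function. The names are hypothetical and the resolution of embedded references is omitted; the sketch assumes the values passed in have already been resolved.

```java
import java.io.File;

// Hypothetical sketch of choosing the log file for a source record.
class LogPathSketch {
    // Returns the first candidate that is neither null nor empty,
    // or null if there is none.
    static String firstNonEmpty(String... candidates) {
        for (String candidate : candidates) {
            if (candidate != null && !candidate.isEmpty()) {
                return candidate;
            }
        }
        return null;
    }

    // Order of preference: the source record Log_Pathname field, the
    // Log_Pathname parameter, then the Log_Directory parameter.
    static File logFile(String recordLogPathname, String parameterLogPathname,
                        String parameterLogDirectory, String defaultFilename) {
        String chosen = firstNonEmpty(recordLogPathname, parameterLogPathname,
                                      parameterLogDirectory);
        if (chosen == null) {
            // Nothing specified: default filename in the working directory.
            return new File(defaultFilename);
        }
        File path = new File(chosen);
        if (path.isDirectory()) {
            // An existing directory receives the default filename.
            return new File(path, defaultFilename);
        }
        // Anything else names the file to which log output is appended.
        return path;
    }
}
```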
The log file always begins with the Conductor class identification:
PIRL.Conductor.Conductor (2.47 2012/04/16 06:04:09)
This line is immediately followed by a
The Status field of the source record is checked for any status
indicators from possible previous processing. If present they are
logged. If the last status indicator is for a failure condition
this is logged and any further processing of the source is skipped.
Note: It is possible to (re)start processing of a source
midway in its procedures pipeline. This is done by first setting
its Status field to include status indicators for procedures to be
skipped; e.g. by removing a failure indicator at the end of the
list after correcting the cause of the failure. Then the
Conductor_ID field is set to NULL (as long as this is done last it
will be safe even if the pipeline is actively being processed by a Conductor).
When the source record is acquired by a Conductor its processing
will begin with the next procedure without a status indicator, and
log output will be appended to its previous log file if present or
a new file if needed. Caution: Procedure pipelines to be
used in this way must, of course, ensure that any dependencies on
previous procedures are taken into account.
At this point configuration parameters dependent on the current
source record are updated.
The Source_Pathname file is confirmed to be a normal file (not a
directory) that is readable. If the file is not accessible the
Status field of the source record is updated in the database with
the
The source now enters the procedures pipeline. The procedure
definition records are all cached and sorted by their Sequence
number when Conductor first starts. Thus changes to a Procedures
table will not take effect until after a Conductor (re)starts, and
it is safe to change a Procedures table while a Conductor is
running.
Each procedure record is applied to the current source record in
Sequence number order; the Sequence parameter is updated before
each procedure is processed.
All of the required Procedure fields, except the Sequence number,
may contain embedded references.
Each embedded reference is effectively a variable that is
substituted with the value from a database field or configuration
parameter specified by the reference. References may be arbitrarily
nested; for example, the condition for selecting a record in a
database field reference may be a parameter reference supplied in a
configuration parameter. Reference resolution is also recursive;
the value obtained from resolving a reference may itself contain
embedded references. Thus a parameter reference may resolve to a
parameter value that contains references. This allows the values of
database fields to contain references to user defined parameters
that are set as desired in the configuration file without needing
to change the contents of database tables to effect the change.
Reference resolved values in Procedure fields allow dynamic
definition of procedure attributes. References that are unresolved
are fatal to Conductor unless the Unresolved_Reference
configuration parameter has been set to a substitute string (e.g.
""). References that have incorrect syntax (e.g. unbalanced curly
brace enclosures) are always fatal to Conductor.
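The nested and recursive resolution described above can be illustrated with a simplified resolver. This is a sketch under stated assumptions: it handles only parameter references, it uses a hypothetical ${name} notation that is not claimed to be Conductor's reference syntax, and the depth limit that guards against circular references is an addition of the sketch.

```java
import java.util.Map;

// Hypothetical, simplified reference resolver using an assumed ${name}
// notation. It is not the PIRL Reference_Resolver.
class ReferenceSketch {
    private static final int MAX_DEPTH = 32; // guard against circular references

    // unresolvedDefault plays the role of the Unresolved_Reference
    // parameter: null makes an unresolved reference an error.
    static String resolve(String text, Map<String, String> parameters,
                          String unresolvedDefault) {
        return resolve(text, parameters, unresolvedDefault, 0);
    }

    private static String resolve(String text, Map<String, String> parameters,
                                  String unresolvedDefault, int depth) {
        if (depth > MAX_DEPTH) {
            throw new IllegalArgumentException("References nested too deeply: " + text);
        }
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < text.length()) {
            int start = text.indexOf("${", i);
            if (start < 0) {
                out.append(text, i, text.length());
                break;
            }
            out.append(text, i, start);
            int end = matchingBrace(text, start + 2);
            // Nested: the reference name may itself contain references.
            String name = resolve(text.substring(start + 2, end), parameters,
                                  unresolvedDefault, depth + 1);
            String value = parameters.get(name);
            if (value == null) {
                if (unresolvedDefault == null) {
                    throw new IllegalArgumentException("Unresolved reference: " + name);
                }
                value = unresolvedDefault;
            }
            // Recursive: the value obtained may contain further references.
            out.append(resolve(value, parameters, unresolvedDefault, depth + 1));
            i = end + 1;
        }
        return out.toString();
    }

    // Finds the brace that closes the reference opened just before 'from'.
    // Unbalanced braces are always an error.
    private static int matchingBrace(String text, int from) {
        int level = 1;
        for (int i = from; i < text.length(); i++) {
            char c = text.charAt(i);
            if (c == '{') {
                level++;
            } else if (c == '}' && --level == 0) {
                return i;
            }
        }
        throw new IllegalArgumentException("Unbalanced braces in: " + text);
    }
}
```

Passing an empty string as the third argument mirrors setting Unresolved_Reference to "", while passing null mirrors the default behaviour in which an unresolved reference is fatal.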
The Command_Line value is reference resolved and parsed into an initial command name and
command arguments. An empty or NULL Command_Line is fatal. Before
each procedure is run the log file is written with the
The command name and arguments are passed to the Java Runtime for
execution as a Process by the host operating system. Note:
the command is not run in a shell. It is, however, quite
appropriate to run shell, or any other interpreted language,
scripts (e.g. PERL). The only restriction on the procedure to be
run is that it is accessible and executable. If the procedure can
not be executed for any reason the source record's Status field
value is appended with Conductor's
All output from the procedure is copied into the log file
with an annotation before each line that indicates whether the
source is the procedure's stdout or stderr stream. Because these
are separate streams read by asynchronous threads attached to each
process stream there can be no guarantee of the relative logging
order of lines from the two sources; while each stream is always
logged in the order in which the procedure output to it, the
uncertainties of system stream buffering and thread scheduling are
likely to result in lines from stdout appearing in the log before
or after where they might occur relative to stderr lines appearing
in a shell terminal listing.
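The two-thread logging arrangement described above can be sketched as follows. The class name and the annotation text passed as the tag are hypothetical; the sketch only shows why per-stream order is preserved while the interleaving between the two streams is not.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.List;

// Hypothetical sketch: one instance runs in its own thread per stream and
// copies each line, prefixed with a tag, into a shared log.
class StreamLoggerSketch implements Runnable {
    private final InputStream stream;
    private final String tag;
    private final List<String> log;

    StreamLoggerSketch(InputStream stream, String tag, List<String> log) {
        this.stream = stream;
        this.tag = tag;
        this.log = log;
    }

    @Override
    public void run() {
        try (BufferedReader reader =
                 new BufferedReader(new InputStreamReader(stream))) {
            String line;
            while ((line = reader.readLine()) != null) {
                // Lines from one stream are appended in the order read; the
                // position relative to the other stream's lines depends on
                // buffering and thread scheduling.
                synchronized (log) {
                    log.add(tag + line);
                }
            }
        } catch (IOException e) {
            synchronized (log) {
                log.add(tag + "[read error: " + e.getMessage() + "]");
            }
        }
    }
}
```

With a real procedure the two streams would come from Process.getInputStream() and Process.getErrorStream(), each handed to its own thread.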
Conductor waits for the procedure to complete before proceeding.
However, it will not wait longer than the number of seconds from
the Time_Limit field (which may have embedded references and may be
a mathematical expression). If the value is NULL, empty or zero
then there is no limit to the amount of time Conductor will wait
for the procedure to complete. It is generally a good idea to place
a maximum running time limit on any procedure that could become
"hung" (for example in a loop or on an inaccessible resource). If the time
limit is reached Conductor will destroy the procedure. This is done
by sending the procedure a terminate signal (SIGTERM). This signal
can be caught by the procedure so it has an opportunity to clean up
open files or child processes of its own. For scripts that have
launched long running computational programs it is correct practice
to catch the terminate signal and halt these programs; failure to
do so is likely to leave these child programs running as orphans.
If a procedure does not catch the terminate signal it will be
automatically terminated by the operating system. If Conductor must
terminate a procedure due to a timeout the source record's Status
field will be updated with Conductor's
When the procedure execution is done the Completion_Number
parameter is updated. This will be a negative value if the
procedure did not run to completion (could not be executed or
exceeded the Time_Limit), otherwise it will be the procedure's exit
status value.
When a procedure completes normally Conductor uses either the
Success_Status or Success_Message field values to determine if the
procedure completed successfully. Usually the exit status is set by
the procedure to a value that indicates if it succeeded. However, it
may be necessary to examine the output of the procedure if the exit
status is not reliable. There may also be unfortunate cases where
there is no reliable indicator and all that can be done is assume
that because the procedure completed it was successful.
If neither the Success_Status nor the Success_Message field value
has been set to a non-empty value and an Empty_Success_Any
configuration parameter was found with a "true" value then the
success of the procedure is implied (i.e. in this case
the procedure is always successful if it completes normally);
otherwise the Success_Status value is asserted to be "0".
If the Success_Status field is not empty it is reference resolved.
The result is evaluated as a logical expression and if a result is
obtained it determines if the procedure was successful. Typically,
the expression uses a reference to the Completion_Number parameter.
The logical operators &, |, ~, =, <, >, <>, <=, >= may be used. The
words "and", "or", and "not" can be used in place of &, |, and ~.
Caution: Use &, not &&; |, not ||; ~, not !; =, not ==; and
<>, not !=. A logical expression may contain embedded numeric
expressions as well.
If the Success_Status does not contain a valid logical expression it
is evaluated as a numeric expression. If the result, cast as an
integer value, is equal to the procedure's exit status, then the
procedure succeeded; otherwise it failed. A numeric expression may
simply be a constant value (the symbols "pi" and "e" are recognized
as constants) or may use the +, -, *, /, ^ operators; ** may be used
instead of the ^ exponentiation operator. The ternary operator ?
with : may be used following an embedded logical expression such that
if the logical expression is true then the following value before the
: is used, else the value after the : is used (e.g. (4<5)?1:2
evaluates to 1). The functions sin, cos, tan, cot, sec, csc, arcsin,
arccos, arctan, exp, ln, log2, log10, sqrt, cubert, abs, round,
floor, ceiling, trunc may also be used with their argument following
inside parentheses.
The Success_Message, if not empty, is used if the Success_Status is
empty. It is reference resolved and then matched, as a regular expression, against what was
obtained from the procedure's stdout and stderr. If there is a match
with either output, then the procedure succeeded; otherwise it
failed. A resolved Success_Message value that does not produce a
valid regular expression is fatal to Conductor. Note: Regular
expressions are a very powerful pattern matching syntax similar to
that used by PERL, but they can be daunting to the beginner.
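The order in which the success criteria are applied can be summarized in a short sketch. It is deliberately simplified and hypothetical: Success_Status is handled only as an integer constant (the logical and numeric expression evaluation described above is not implemented), reference resolution is omitted, and a regular expression "match" is assumed to mean that the pattern is found anywhere in the output.

```java
import java.util.regex.Pattern;

// Hypothetical, simplified sketch of the success decision for a procedure
// that ran to completion.
class SuccessSketch {
    static boolean succeeded(String successStatus, String successMessage,
                             boolean emptySuccessAny, int exitStatus,
                             String stdout, String stderr) {
        boolean hasStatus = successStatus != null && !successStatus.trim().isEmpty();
        boolean hasMessage = successMessage != null && !successMessage.isEmpty();

        if (!hasStatus && !hasMessage) {
            if (emptySuccessAny) {
                return true;          // success is implied by normal completion
            }
            successStatus = "0";      // otherwise Success_Status is taken as "0"
            hasStatus = true;
        }
        if (hasStatus) {
            // Simplification: only an integer constant is supported here;
            // anything else raises NumberFormatException.
            return Integer.parseInt(successStatus.trim()) == exitStatus;
        }
        // Success_Message is used only when Success_Status is empty. An
        // invalid pattern raises PatternSyntaxException.
        Pattern pattern = Pattern.compile(successMessage);
        return pattern.matcher(stdout).find() || pattern.matcher(stderr).find();
    }
}
```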
Regardless of the outcome of procedure execution the Status field of
the source record is updated in the database with the procedure's
status indicator. This indicator
always includes the Conductor completion code which can be translated
into a descriptive line of text by the
When Conductor determines that the procedure completed successfully
it repeats the procedure execution operation with the next procedure
in the pipeline Sequence. If Conductor determines that the procedure
did not complete successfully, then it resolves any embedded
references in the On_Failure field value and uses that as a command
line for a procedure to be executed. This procedure is executed
without any time limit. In the log file, where a normal procedure
would have a
When the number of sequential source processing failures reaches
the Stop_on_Failure amount further processing is halted after
the On_Failure procedure has been run.
When operating under the direction of a Manager (local and/or remote)
if the Conductor halts for any reason it will send an email message
to its Notify list and then wait to be told to start processing again.
Without the possibility of a Manager to take charge the Conductor
will exit after halting.
The completion of the last of the Procedures in pipeline sequence,
or the first On_Failure procedure, completes the processing of a
source record. The log file is now closed. While there is another
source record in the cache Conductor will continue trying to
acquire an exclusive database lock. Once the cache is exhausted
Conductor refreshes it from the Sources table with any new
unprocessed records. If no unprocessed records are available then
Conductor will sleep for the number of seconds indicated by the
Poll_Interval configuration parameter. If no Poll_Interval
parameter is present the default interval of 30 seconds is used. If
the interval is less than or equal to 0, then Conductor processing
will stop when it can find no source records to process.
Conductor is known to compile and run correctly with Java 1.4 and
1.5. Java was chosen for the implementation to maximize portability:
as long as the host system provides a standard Java environment it
should be able to run Conductor.
The Conductor_ID value will include the process ID of Conductor if
it is available. Obtaining the process ID (PID) of Conductor - i.e.
the Java Virtual Machine (JVM) that runs the Java classes - requires
using a Java Native Interface (JNI) to the host system function that
provides this information. Though this is trivial to implement it is
outside the pure Java implementation of Conductor. Without the JNI
code the Conductor_ID will only be the hostname of the system on
which Conductor is running. With the JNI code the Conductor_ID will
include the JVM PID after the hostname separated by a colon (':')
character. The availability of the JVM PID will not have any effect
on Conductor's operation. However, it is quite useful to have the
JVM PID for procedures to use in disambiguating filenames in a
parallel processing shared storage environment, and it can assist in
systems administration work. A Native_Methods.c file that provides
JNI access to the required system function is included in the source
code distribution of Conductor. When the Conductor source code is
compiled Native_Methods.c is also compiled to produce a dynamically
loadable Native_Methods.so (or .jnilib on Apple OS X/Darwin systems)
shared object library file in the Conductor/
A null value means that the Reference_Resolver will throw an exception
if a reference can not be resolved.
This value may be overridden by the
When operating in batch mode, this value controls the minimum
number of source records to be processed before stopping.
This value may be set in the configuration by the
Due to NFS filesystem latency it is possible that a source file
that has just been created and registered in a Sources table by a
Conductor on one system will not appear if accessed too soon by a
Conductor on another system. The workaround is to repeatedly try to
see the file, with a ten second delay between tries. The difficulty
is knowing how many tries to make before giving up. As of this
writing it seems that sufficient tries should be allowed for up to
two minutes of trying.
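The workaround above amounts to a bounded polling loop. In this sketch the names are hypothetical, the accessibility check (a readable normal file) follows the source file confirmation described for source record processing, and the figure of 12 tries is an assumption derived from a ten second delay over roughly two minutes.

```java
import java.io.File;

// Hypothetical sketch of repeatedly checking for a source file that may
// not yet be visible because of filesystem latency.
class SourceFileWaitSketch {
    static final long DELAY_MILLIS = 10_000L; // ten seconds between tries
    static final int MAX_TRIES = 12;          // assumed: about two minutes

    // Returns true as soon as the file is seen as a readable normal file,
    // or false after maxTries unsuccessful checks.
    static boolean waitForFile(File file, int maxTries, long delayMillis)
            throws InterruptedException {
        for (int attempt = 1; attempt <= maxTries; attempt++) {
            if (file.isFile() && file.canRead()) {
                return true;
            }
            if (attempt < maxTries) {
                Thread.sleep(delayMillis); // no delay after the final try
            }
        }
        return false;
    }
}
```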
The Configuration
This value may be set by the configuration
A value of zero sets batch mode: Conductor is to stop when no
unprocessed source records are available and at least
Min_Source_Records have been processed. If polling for more sources
is required in batch mode a polling interval of
This value may be overridden by the
Between attempts to
N.B.: If the Conductor is repeatedly trying to confirm the
existence of a source file or to reconnect to the database when the
stop request is received retrying will be discontinued which can be
expected to result in a source record processing failure or
database connectivity failure halt.
This state is only entered when the user requests that processing
The problem may be the result of the maximum number of sequential
source records processing procedure failures having occurred, a
database access failure, or some other system error. This state
remains in effect until a
If true and a Stage_Manager connection can not be established the
Conductor will throw an exception; otherwise the Conductor will
proceed without a Stage_Manager.
The initial value is false;
The Configuration object is expected to contain all the necessary
information Conductor needs to connect to the database as well as
any other Conductor parameters it might use.
The database server access parameters will be obtained from the
Configuration Group named by the first entry in the Server parameter
list.
If no Configuration is provided an effort is made to load the
Essential configuration information is obtained from, and set in,
the
The following
The parameter is sought in the CONDUCTOR_GROUP, but defaults are
acceptable.
If a value is found it is resolved for embedded references. An
unresolved reference that would throw a Database_Exception results
in a null value. A syntax error (ParseException) leaves the value
unchanged.
The parameter is set in the
If the specified name is not an absolute pathname the
N.B.: All "password" (case insensitive) parameters will have
their values masked out.
The connection will be retried up to the number of times specified
by the RECONNECT_TRIES_PARAMETER with the number of seconds delay
between retries specified by the RECONNECT_DELAY_PARAMETER.
Connection retries stop when the connection to the database is
successful, the number of retries is exhausted, or an exception
occurs that is not associated with a connection failure.
If the reference can not be resolved because the Database has
become disconnected the Database is
Normally when the Conductor Reference_Resolver is unable to resolve a
reference it will throw an
If the specified value is different from the current Reference_Resolver
default value the Configuration
The table is expected to be delivered with the record fields in
N.B.: The entire contents of the database procedures table is
delivered.
The entire Procedures_Table is obtained from the Database.
If a
The entire Procedures table is cached. The first, field names,
record is removed and used to construct a Field_Map
of required field names/indexes to the available field names.
The database table is first loaded into a temporary records list. A
temporary Fields_Map is constructed from the first, field names,
record which is removed from the table. The map is used to confirm
that all the required fields are present. Then the fields of each
record are mapped to their expected order. All the records are then
sorted on the
If the tentative new table content is different from the current
Procedure_Records table the latter is set to the former, and the
Procedures_Map set to the new map, while synchronized to prevent the
Management from obtaining it while it is in an inconsistent state.
Then the Configuration is updated: the
N.B.: The Procedure_Records and the Procedures_Map are not
changed, nor a processing event reported, unless a valid table with
different contents from the current table is loaded.
The table will be delivered with the record fields in
N.B.: Only the contents of the Conductor source records cache
is delivered. The contents of the database sources table may be
much, much larger.
If the source records cache is not empty nothing is done.
Records in the Sources table for which the Processing_Host field is
NULL are cached. The maximum size of the cache is determined by the
The first, field names, record from the Database is removed and
used to construct the Sources_Map Field_Map of required field
names/indexes to the available field names.
The possible processing state values are:
The WAITING and HALTED state codes are negative; all others are positive.
All Processing_Changes state variables will be set to the current
values from this Conductor, except the flag variables -
If a pipeline processing thread is not running one is started.
Otherwise, if the
The Conductor reconfigures itself by re-reading its Configuration
source in case it was changed while processing was waiting.
N.B.: The
The
Source processing will continue indefinitely unless: the
The current contents of the
The new value will override the value set when the Conductor is
The new value will override the value set when the Conductor is
N.B.: A processing event notification is sent to all listeners
If the
If the count of sequential source processing failures and/or the
processing state was changed a processing event notification with
these changes is sent to all listeners.
N.B.: When Conductor processing is
If the Conductor is in a positive
Any open log file is closed. The database server is disconnected. An
N.B.: If source processing is
The Conductor sends its
The command line arguments are delimited by any combination of
space (' '), tab ('\t'), new-line ('\n') or carriage return ('\r')
characters. However, character sequences in quotes - either single
('\'') or double ('"') quote characters - remain unbroken.
After parsing each argument String is also scanned to convert
escaped characters - preceded by a backslash ('\') - into their
unescaped character equivalents.
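The parsing rules above can be sketched as a small tokenizer. Several details are assumptions of the sketch rather than documented behaviour: the enclosing quote characters are dropped from the argument, a backslash-escaped character never acts as a delimiter or quote, and the escapes \t, \n and \r become tab, new-line and carriage return while any other escaped character stands for itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of splitting a command line into arguments.
class CommandLineSketch {
    static List<String> parse(String commandLine) {
        List<String> arguments = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        boolean inArgument = false;
        char quote = 0; // the active quote character, or 0 when not quoting

        for (int i = 0; i < commandLine.length(); i++) {
            char c = commandLine.charAt(i);
            if (c == '\\' && i + 1 < commandLine.length()) {
                // Keep the escape pair intact; it is unescaped afterwards.
                current.append(c).append(commandLine.charAt(++i));
                inArgument = true;
            } else if (quote != 0) {
                if (c == quote) {
                    quote = 0;          // closing quote
                } else {
                    current.append(c);  // quoted text remains unbroken
                }
            } else if (c == '\'' || c == '"') {
                quote = c;              // opening quote
                inArgument = true;
            } else if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
                if (inArgument) {
                    arguments.add(unescape(current.toString()));
                    current.setLength(0);
                    inArgument = false;
                }
            } else {
                current.append(c);
                inArgument = true;
            }
        }
        if (inArgument) {
            arguments.add(unescape(current.toString()));
        }
        return arguments;
    }

    // Converts backslash escapes into their unescaped equivalents.
    static String unescape(String argument) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < argument.length(); i++) {
            char c = argument.charAt(i);
            if (c == '\\' && i + 1 < argument.length()) {
                char next = argument.charAt(++i);
                switch (next) {
                    case 't': out.append('\t'); break;
                    case 'n': out.append('\n'); break;
                    case 'r': out.append('\r'); break;
                    default:  out.append(next);
                }
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

The first element of the returned list would be the command name and the remainder its arguments.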
A Source table Status field value contains a comma delimited list
of procedure status indicators. Each has the form:
The PID is the positive, non-zero value of the system's process ID
for a running procedure. This value should only be present if the
procedure is currently executing.
When the procedure processing has completed the PID is replaced
with a Conductor procedure completion code. This will be zero if
Conductor determined that the procedure completed successfully. It
will be one (1) if the procedure completed unsuccessfully. It will
be a negative value if the procedure could not be run to completion
for any reason; the code value in this case indicates the reason.
When the procedure has been run to completion, whether successful
or not, its exit status value is appended inside parentheses to the
Conductor procedure completion code.
If the code value is not a recognized value, the description will
be "Unknown procedure completion code (
This method is the converse of the
The Conductor writes its processing log reports, including the
output from all pipeline procedures it runs, to all registered
log Writers.
The default text style is used.
The identity Message contains the following parameters:
The -Pipeline switch specifies which pipeline the
Conductor is to manage. A pipeline must be specified. A command line
argument without a preceding switch name is assumed to be the
pipeline name. The pipeline name has the form:
where <pipeline> is the name of the pipeline and is
used as the prefix of the Sources and Procedures table names:
The -Configuration option is used to specify the
filename, or URL (http or ftp), where the configuration parameters
are to be found. If this option is not specified the
"Conductor.conf" filename will be used. If the configuration file is
not in the current working directory, it will be looked for in the
user's home directory.
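The lookup order for a configuration filename might be sketched as follows. This is an assumption-laden illustration, not the Conductor code: the class ConfigLookupSketch and its locate method are hypothetical, and only plain filenames are handled (the URL case mentioned above is not covered).

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

/** Hypothetical illustration of the configuration file search order. */
final class ConfigLookupSketch {

    static final String DEFAULT_FILENAME = "Conductor.conf";

    /** Looks in the working directory first, then in the user's home directory. */
    static Optional<Path> locate(String filename, Path workingDir, Path homeDir) {
        String name = (filename == null || filename.isEmpty()) ? DEFAULT_FILENAME : filename;
        Path inWorkingDir = workingDir.resolve(name);
        if (Files.isRegularFile(inWorkingDir)) return Optional.of(inWorkingDir);
        Path inHome = homeDir.resolve(name);
        if (Files.isRegularFile(inHome)) return Optional.of(inHome);
        return Optional.empty();
    }

    public static void main(String[] args) {
        Path cwd = Paths.get(System.getProperty("user.dir"));
        Path home = Paths.get(System.getProperty("user.home"));
        System.out.println(locate(args.length > 0 ? args[0] : null, cwd, home));
    }
}
```

Passing the two directories as arguments keeps the search order easy to exercise with temporary directories.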
The configuration file must contain the necessary database access
information needed to identify and connect with the database server
(as required by the
In addition to the "Catalog" parameter (described below), the
"Log_Directory" parameter may optionally be located in the
"/Conductor" group. The "Log_Directory" parameter specifies the
directory where log files will be written. The value of this
parameter may contain embedded references for the database
The -Server or -Database option may be
used to specify the Group of database server access parameters in the
configuration file. A configuration file may contain more than one
Group of database server access parameters where the name of each
such Group must be in the Server parameter list. By default the first
name in the Server list is the Group of database access parameters
that will be used.
The -CAtalog option may be used to specify the name of
the database catalog containing the pipeline tables. However, it is
ambiguous command line syntax to use this option and include a
different <catalog> in the pipeline name.
The <catalog> name may alternatively be specified by a
"Catalog" parameter in the configuration file. This parameter is first
sought in the "/Conductor" parameters Group, then in the database server
parameters Group or any parent of that group. It is necessary that a
<catalog> name be found somewhere.
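The search for a catalog name can be pictured with the sketch below. The class and method names are hypothetical. Treating conflicting catalog names as an error, and joining catalog and table name with a period, are assumptions made for illustration; the text above only says that the conflicting case is ambiguous and that a name must be found somewhere.

```java
/** Hypothetical illustration of the catalog name search order. */
final class CatalogSketch {

    /**
     * Returns the first available catalog name: from the pipeline name,
     * the command line option, the "/Conductor" group, then the server group.
     */
    static String resolveCatalog(String fromPipelineName, String fromOption,
                                 String fromConductorGroup, String fromServerGroup) {
        if (fromPipelineName != null && fromOption != null
                && !fromPipelineName.equals(fromOption)) {
            // Assumption: the ambiguous case is rejected outright.
            throw new IllegalArgumentException("Ambiguous catalog: "
                + fromPipelineName + " vs. " + fromOption);
        }
        String[] candidates =
            {fromPipelineName, fromOption, fromConductorGroup, fromServerGroup};
        for (String candidate : candidates) {
            if (candidate != null && !candidate.isEmpty()) return candidate;
        }
        throw new IllegalStateException("No catalog name found");
    }

    /** Builds a qualified Sources table name (the "." separator is an assumption). */
    static String sourcesTable(String catalog, String pipeline) {
        return catalog + "." + pipeline + "_Sources";
    }

    /** Builds a qualified Procedures table name (the "." separator is an assumption). */
    static String proceduresTable(String catalog, String pipeline) {
        return catalog + "." + pipeline + "_Procedures";
    }
}
```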
The -Manage or -Monitor option may be
used to run Conductor with a Manager GUI. In this mode Conductor will
not proceed to process the pipeline immediately. Instead, the Manager
will enable processing to be started, stopped, aborted, and the
pipeline processing continuously monitored. By default the Conductor
is run without a Manager.
The -Wait-to-start option will cause the Conductor to
remain in the wait state until it receives a message to start source
record processing. If the Conductor is run with a Manager
-wait-to-start is implicit. Remote management can be provided via a
Stage_Manager and a Kapellmeister client. By default the Conductor
will not wait-to-start unless it is run with a Manager.
The -Version option will cause the Conductor version
identification to be listed without running the Conductor.
The -Help option will list the brief command line
syntax usage and then exit.
N.B.: This method always results in a System.exit with the
SOURCE_FILE_LOG_DELIMITER
line which is expected to be 70 equals
('=') characters. This is followed by a date and time stamp and the
source record description including the database server type and
hostname, the fully qualified name of the Sources Table
(Sources_Table) and the Source_Number, Source_ID, and
Source_Pathname values.
INACCESSIBLE_FILE
Conductor completion code, the
Completion_Number parameter is set to the same value, the condition
is logged and further processing of the source record is canceled.
Procedures Pipeline
Embedded References
Procedure Execution
PROCEDURE_LOG_DELIMITER
line. This is followed by a date and time
stamp, the Sequence number, the Description field value (if it is
not empty), and then the command line to be executed.
NO_PROCEDURE
error
status. Otherwise the Status field is updated with the host system
Process ID for the executed procedure; this is always an integer
value greater than 1 that uniquely identifies the executing
procedure in the host operating system.
PROCEDURE_TIMEOUT
error status and the log will be written with notice of the
timeout. If the procedure completes normally, then the standard
output streams are drained and copied to the log file and the exit
status from the procedure is also noted in the log file.
Procedure Status
Status_Conductor_Code_Description
static method. If the procedure
completed with an exit status that value is included in the status
indicator. The meaning of this value is procedure dependent. Of
course the log file is also annotated accordingly.
On Failure
PROCEDURE_LOG_DELIMITER
the On_Failure
procedure has an ON_FAILURE_PROCEDURE_LOG_DELIMITER
and no
Description. Although the completion status of this procedure is not
included in the final status indicator of the source record's Status
field it is noted in the log file.
Sources Completion
System Dependencies
Java
Conductor_ID
PIRL.Database, PIRL.Conductor.Maestro
Field Summary
static int
BAD_REGEX
Conductor procedure completion code.
static int
BATCH_POLL_INTERVAL
The polling interval, in seconds, for unprocessed source
records when no unprocessed source records are obtained from
the Sources table.
protected String
Catalog
The name of the database catalog containing the pipeline tables.
static String
CATALOG_PARAMETER
Conductor Configuration parameters.
static int
COMMAND_LINE_FIELD
Procedures table fields indexes.
static String
CONDUCTOR_GROUP
Conductor Configuration parameters.
static int
CONDUCTOR_ID_FIELD
Sources table fields indexes.
static String
CONDUCTOR_ID_PARAMETER
Conductor Configuration parameters.
static String
CONFIGURATION_SOURCE_PARAMETER
Conductor Configuration parameters.
static String
DATABASE_HOSTNAME_PARAMETER
Conductor Configuration parameters.
static String
DATABASE_SERVER_NAME_PARAMETER
Conductor Configuration parameters.
static String
DATABASE_SERVER_PARAMETER
Conductor Configuration parameters.
static String
DATABASE_TYPE_PARAMETER
Conductor Configuration parameters.
static String
DEFAULT_CONFIGURATION_FILENAME
The default configuration filename.
static int
DEFAULT_DUPLICATE_PARAMETER_ACTION
The default action should a duplicate parameter pathname occur in the
Conductor Configuration file.
static boolean
DEFAULT_EMPTY_SUCCESS_ANY
The default for whether or not empty Success_Status and
Success_Message fields in a procedure definition may imply any
completion of the procedure is a success.
static int
DEFAULT_POLL_INTERVAL
The polling interval, in seconds, for unprocessed source
records when no unprocessed source records are obtained from
the Sources table.
static int
DEFAULT_STOP_ON_FAILURE
The default maximum number of sequential source processing failures.
static int
DESCRIPTION_FIELD
Procedures table fields indexes.
static String
EMPTY_SUCCESS_ANY_PARAMETER
Conductor Configuration parameters.
static int
EXIT_COMMAND_LINE_SYNTAX
Command line syntax problem exit status (1).
static int
EXIT_CONFIGURATION_PROBLEM
Configuration problem exit status (2).
static int
EXIT_DATABASE_PROBLEM
Database problem exit status (3).
static int
EXIT_IO_FAILURE
I/O failure exit status (4).
static int
EXIT_STAGE_MANAGER
The required Stage_Manager connection could not be established.
static int
EXIT_SUCCESS
Conductor success exit status (0).
static int
EXIT_TOO_MANY_FAILURES
The number of sequential source processing failures reached the
Stop-on-Failure amount.
static int
EXIT_UNEXPECTED_EXCEPTION
An unexpected exception occurred (9).
static String[]
FAILURE_DESCRIPTION
Conductor status failure code descriptions.
static int
HALTED
Processing state: A failure condition caused processing to halt.
static String
HELLO_ADDRESS_PARAMETER_NAME
Conductor Configuration parameters.
static String
HELLO_PORT_PARAMETER_NAME
Conductor Configuration parameters.
static String
HOSTNAME_PARAMETER
Conductor Configuration parameters.
static String
ID
Class identification name with source code version and date.
static int
INACCESSIBLE_FILE
Conductor procedure completion code.
static int
INVALID_DATABASE_ENTRY
Conductor procedure completion code.
static String
LOG_DIRECTORY_PARAMETER
Conductor Configuration parameters.
static String
LOG_FILENAME_PARAMETER
Conductor Configuration parameters.
static int
LOG_PATHNAME_FIELD
Sources table fields indexes.
static String
LOG_PATHNAME_PARAMETER
Conductor Configuration parameters.
static int
MAX_SOURCE_RECORDS_DEFAULT
The maximum number of unprocessed source records that will be
obtained when the Conductor cache is refreshed.
static String
MAX_SOURCE_RECORDS_PARAMETER
Conductor Configuration parameters.
static int
MIN_SOURCE_RECORDS_DEFAULT
The minimum value for the Max_Source_Records value.
static String
MIN_SOURCE_RECORDS_PARAMETER
Conductor Configuration parameters.
protected static String
NL
static int
NO_PROCEDURE
Conductor procedure completion code.
static String
NOTIFY_PARAMETER
Conductor Configuration parameters.
static int
ON_FAILURE_FIELD
Procedures table fields indexes.
static String
ON_FAILURE_PROCEDURE_LOG_DELIMITER
Marks the beginning of On_Failure procedure processing in a log file.
protected static String
Pipeline
The name of the pipeline (
static String
PIPELINE_PARAMETER
Conductor Configuration parameters.
static String
POLL_INTERVAL_PARAMETER
Conductor Configuration parameters.
static int
POLLING
Processing state: No unprocessed source records are available
and the poll interval for new records is positive.
static String
PROCEDURE_COMPLETION_NUMBER_PARAMETER
Conductor Configuration parameters.
static String
PROCEDURE_COUNT_PARAMETER
Conductor Configuration parameters.
static int
PROCEDURE_FAILURE
Conductor procedure completion code.
static String
PROCEDURE_LOG_DELIMITER
Marks the beginning of procedure processing in a log file.
protected Vector<Vector<String>>
Procedure_Records
The content of the pipeline procedures table, without the field
names, sorted by sequence number.
static String
PROCEDURE_SEQUENCE_PARAMETER
Conductor Configuration parameters.
static int
PROCEDURE_SUCCESS
Conductor procedure completion code.
static int
PROCEDURE_TIMEOUT
Conductor procedure completion code.
static String[]
PROCEDURES_FIELD_NAMES
Procedures table field names.
protected Fields_Map
Procedures_Map
protected String
Procedures_Table
The name of the pipeline procedures table in the database.
static String
PROCEDURES_TABLE_NAME_SUFFIX
Procedures table name suffix.
static String
PROCEDURES_TABLE_PARAMETER
Conductor Configuration parameters.
static String
RECONNECT_DELAY_PARAMETER
Conductor Configuration parameters.
static String
RECONNECT_TRIES_PARAMETER
Conductor Configuration parameters.
static boolean
Require_Stage_Manager
Flag that determines if the Conductor requires a Stage_Manager.
static String
REQUIRE_STAGE_MANAGER_PARAMETER
Conductor Configuration parameters.
protected Reference_Resolver
Resolver
The Reference_Resolver object being used.
static String
RESOLVER_DEFAULT_VALUE
The default value to be used by the Reference_Resolver if a
reference can not be resolved.
static int
RUN_TO_WAIT
Processing state: When the current source record completes processing
the WAITING state will be entered unless a failure condition
caused the HALTED state to occur.
static int
RUNNING
Processing state: Source records are being processed.
static int
SEQUENCE_FIELD
Procedures table fields indexes.
static int
SOURCE_AVAILABLE_NO_CHECK
When the SOURCE_AVAILABLE_TRIES_PARAMETER is this value,
source file availability confirmation is disabled.
static int
SOURCE_AVAILABLE_TRIES_DEFAULT
Default number of source file availability tests.
static int
SOURCE_AVAILABLE_TRIES_MAX
Maximum number of source file availability tests.
static String
SOURCE_AVAILABLE_TRIES_PARAMETER
Conductor Configuration parameters.
static String
SOURCE_DIRECTORY_PARAMETER
Conductor Configuration parameters.
static String
SOURCE_FAILURE_COUNT
Conductor Configuration parameters.
static String
SOURCE_FILE_LOG_DELIMITER
Marks the beginning of source file processing in a log file.
static String
SOURCE_FILENAME_EXTENSION_PARAMETER
Conductor Configuration parameters.
static String
SOURCE_FILENAME_PARAMETER
Conductor Configuration parameters.
static String
SOURCE_FILENAME_ROOT_PARAMETER
Conductor Configuration parameters.
static int
SOURCE_ID_FIELD
Sources table fields indexes.
static String
SOURCE_ID_PARAMETER
Conductor Configuration parameters.
static int
SOURCE_NUMBER_FIELD
Sources table fields indexes.
static String
SOURCE_NUMBER_PARAMETER
Conductor Configuration parameters.
static int
SOURCE_PATHNAME_FIELD
Sources table fields indexes.
static String
SOURCE_PATHNAME_PARAMETER
Conductor Configuration parameters.
static String
SOURCE_SUCCESS_COUNT
Conductor Configuration parameters.
static String[]
SOURCES_FIELD_NAMES
Sources table field names.
static String
SOURCES_TABLE_NAME_SUFFIX
Sources table name suffix.
static String
SOURCES_TABLE_PARAMETER
Conductor Configuration parameters.
static String
STAGE_MANAGER_PASSWORD_PARAMETER
Conductor Configuration parameters.
static String
STAGE_MANAGER_PORT_PARAMETER
Conductor Configuration parameters.
static String
STAGE_MANAGER_TIMEOUT_PARAMETER
Conductor Configuration parameters.
static int
STATUS_FIELD
Sources table fields indexes.
static String
STDERR_NAME
Prefix applied to procedure stderr lines.
static String
STDOUT_NAME
Prefix applied to procedure stdout lines.
static String
STOP_ON_FAILURE_PARAMETER
Conductor Configuration parameters.
static int
SUCCESS_MESSAGE_FIELD
Procedures table fields indexes.
static int
SUCCESS_STATUS_FIELD
Procedures table fields indexes.
protected Configuration
The_Configuration
The Configuration object containing the configuration parameters.
protected Database
The_Database
The Database object used to access the database server.
static int
TIME_LIMIT_FIELD
Procedures table fields indexes.
static String
TOTAL_FAILURE_COUNT
Conductor Configuration parameters.
static String
TOTAL_PROCEDURE_RECORDS_PARAMETER
Conductor Configuration parameters.
static int
UNRESOLVABLE_REFERENCE
Conductor procedure completion code.
static String
UNRESOLVED_REFERENCE_PARAMETER
Conductor Configuration parameters.
static String
UNRESOLVED_REFERENCE_THROWS
Conductor Configuration parameters.
static int
WAITING
Processing state: Idle; waiting for a start request.
Constructor Summary
protected
Conductor()
Constructs an uninitialized Conductor.
Conductor(String pipeline,
Configuration configuration)
Construct a Conductor for a pipeline from a Configuration.
Conductor(String pipeline,
Configuration configuration,
String database_server_name)
Construct a Conductor for a pipeline from a Configuration.
Method Summary
Management
Add_Log_Writer(Writer writer)
Register a Writer to receive processing log stream output.
Management
Add_Processing_Listener(Processing_Listener listener)
Register a processing state change listener.
static String
Config_Pathname(String name)
Get an absolute Conductor Configuration pathname.
protected String
Config_Value(String name)
Get a String parameter value from the configuration.
protected boolean
Config_Value(String name,
Object value)
Set a parameter in the configuration.
Configuration
Configuration()
Get the Conductor Configuration.
protected Database_Exception
Connect_to_Database()
Establish the Database connection.
boolean
Connected_to_Stage_Manager()
Test if the Conductor is connected to a Stage_Manager.
Management
Enable_Log_Writer(Writer writer,
boolean enable)
Enable or disable output to a registered log stream Writer.
protected Vector<Vector<String>>
Get_Procedures_Table()
Get the Procedures_Table from the Database.
Message
Identity()
Get the identity description Message for this Conductor.
protected void
Load_Procedure_Records()
Load the Procedure_Records table.
protected boolean
Load_Source_Records()
Load the Sources_Records table.
protected void
Log_Message(String message)
Logs a message to the Logger.
protected void
Log_Message(String message,
AttributeSet style)
Logs a message to the Logger.
static void
main(String[] args)
Instantiate a Conductor application.
static String[]
Parse_Command_Line(String command_line)
Parse a String into command line arguments.
int
Poll_Interval()
Get the interval at which the Conductor will poll for unprocessed
source records.
Management
Poll_Interval(int seconds)
Set the time interval to poll for unprocessed source records.
protected void
Postconfigure(Configuration configuration)
Update the configuration and application control values after
the Database and Reference_Resolver have been constructed.
protected Configuration
Preconfigure(Configuration configuration)
Set the effective configuration.
Vector<Vector<String>>
Procedures()
Get the procedures table.
Exception
Processing_Exception()
Get the exception that caused Conductor processing to halt.
int
Processing_State()
Get the current Conductor processing state.
void
Quit()
Immediately stop processing and exit.
boolean
Remove_Log_Writer(Writer writer)
Unregister a log Writer.
boolean
Remove_Processing_Listener(Processing_Listener listener)
Unregister a processing state change listener.
Management
Reset_Sequential_Failures()
Reset the count of sequential source processing failures that the
Conductor has accumulated.
protected String
Resolve(String reference)
Resolve a reference
String
Resolver_Default_Value()
Get the Conductor default Reference_Resolver value.
Management
Resolver_Default_Value(String value)
Set the Conductor default Reference_Resolver value.
int
Sequential_Failures()
Get the count of sequential source processing failures that the
Conductor has accumulated.
Vector<Vector<String>>
Sources()
Get the current cache of source records.
void
Start()
Start pipeline processing.
Processing_Changes
State()
Get the current Conductor processing conditions state.
static String
Status_Conductor_Code_Description(int code)
Get a description String for a Conductor procedure completion code.
static int
Status_Conductor_Code(String status)
Get the Conductor procedure completion status code value from a
procedure status indicator String.
static String
Status_Field_Value(Vector<String> status)
Assemble a properly formatted String for a Source table Status field
value.
static String
Status_Indicator(int conductor_status)
Assemble a properly formatted procedure status indicator String
as used in a Sources table Status field value.
static String
Status_Indicator(int conductor_status,
int procedure_status)
Assemble a properly formatted procedure status indicator String
as used in a Sources table Status field value.
static Vector<String>
Status_Indicators(String status_field)
Parse a Source table Status field value into a Vector of procedure
status indicator Strings.
static int
Status_Procedure_Exit_Value(String status)
Get the procedure exit status value from a procedure status indicator
String.
int
Stop_on_Failure()
Get the number of Conductor sequential source processing failures
at which to stop processing.
Management
Stop_on_Failure(int failure_count)
Set the sequential failure limit at which to halt processing source
records.
void
Stop()
Stop pipeline processing after the current source processing has
completed.
static void
Usage()
Prints the command line usage syntax.
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Field Detail
ID
public static final String ID
DEFAULT_CONFIGURATION_FILENAME
public static final String DEFAULT_CONFIGURATION_FILENAME
The_Configuration
protected Configuration The_Configuration
DEFAULT_DUPLICATE_PARAMETER_ACTION
public static final int DEFAULT_DUPLICATE_PARAMETER_ACTION
action
should a duplicate parameter pathname occur in the Conductor
Configuration file.
CONDUCTOR_GROUP
public static final String CONDUCTOR_GROUP
CONFIGURATION_SOURCE_PARAMETER
public static final String CONFIGURATION_SOURCE_PARAMETER
DATABASE_SERVER_NAME_PARAMETER
public static final String DATABASE_SERVER_NAME_PARAMETER
DATABASE_SERVER_PARAMETER
public static final String DATABASE_SERVER_PARAMETER
HOSTNAME_PARAMETER
public static final String HOSTNAME_PARAMETER
DATABASE_HOSTNAME_PARAMETER
public static final String DATABASE_HOSTNAME_PARAMETER
DATABASE_TYPE_PARAMETER
public static final String DATABASE_TYPE_PARAMETER
PIPELINE_PARAMETER
public static final String PIPELINE_PARAMETER
CATALOG_PARAMETER
public static final String CATALOG_PARAMETER
PROCEDURES_TABLE_PARAMETER
public static final String PROCEDURES_TABLE_PARAMETER
SOURCES_TABLE_PARAMETER
public static final String SOURCES_TABLE_PARAMETER
UNRESOLVED_REFERENCE_PARAMETER
public static final String UNRESOLVED_REFERENCE_PARAMETER
UNRESOLVED_REFERENCE_THROWS
public static final String UNRESOLVED_REFERENCE_THROWS
EMPTY_SUCCESS_ANY_PARAMETER
public static final String EMPTY_SUCCESS_ANY_PARAMETER
MIN_SOURCE_RECORDS_PARAMETER
public static final String MIN_SOURCE_RECORDS_PARAMETER
MAX_SOURCE_RECORDS_PARAMETER
public static final String MAX_SOURCE_RECORDS_PARAMETER
POLL_INTERVAL_PARAMETER
public static final String POLL_INTERVAL_PARAMETER
SOURCE_AVAILABLE_TRIES_PARAMETER
public static final String SOURCE_AVAILABLE_TRIES_PARAMETER
STOP_ON_FAILURE_PARAMETER
public static final String STOP_ON_FAILURE_PARAMETER
RECONNECT_TRIES_PARAMETER
public static final String RECONNECT_TRIES_PARAMETER
RECONNECT_DELAY_PARAMETER
public static final String RECONNECT_DELAY_PARAMETER
NOTIFY_PARAMETER
public static final String NOTIFY_PARAMETER
SOURCE_NUMBER_PARAMETER
public static final String SOURCE_NUMBER_PARAMETER
SOURCE_PATHNAME_PARAMETER
public static final String SOURCE_PATHNAME_PARAMETER
SOURCE_ID_PARAMETER
public static final String SOURCE_ID_PARAMETER
CONDUCTOR_ID_PARAMETER
public static final String CONDUCTOR_ID_PARAMETER
SOURCE_DIRECTORY_PARAMETER
public static final String SOURCE_DIRECTORY_PARAMETER
SOURCE_FILENAME_PARAMETER
public static final String SOURCE_FILENAME_PARAMETER
SOURCE_FILENAME_ROOT_PARAMETER
public static final String SOURCE_FILENAME_ROOT_PARAMETER
SOURCE_FILENAME_EXTENSION_PARAMETER
public static final String SOURCE_FILENAME_EXTENSION_PARAMETER
LOG_PATHNAME_PARAMETER
public static final String LOG_PATHNAME_PARAMETER
LOG_DIRECTORY_PARAMETER
public static final String LOG_DIRECTORY_PARAMETER
LOG_FILENAME_PARAMETER
public static final String LOG_FILENAME_PARAMETER
SOURCE_SUCCESS_COUNT
public static final String SOURCE_SUCCESS_COUNT
SOURCE_FAILURE_COUNT
public static final String SOURCE_FAILURE_COUNT
TOTAL_FAILURE_COUNT
public static final String TOTAL_FAILURE_COUNT
TOTAL_PROCEDURE_RECORDS_PARAMETER
public static final String TOTAL_PROCEDURE_RECORDS_PARAMETER
PROCEDURE_COUNT_PARAMETER
public static final String PROCEDURE_COUNT_PARAMETER
PROCEDURE_SEQUENCE_PARAMETER
public static final String PROCEDURE_SEQUENCE_PARAMETER
PROCEDURE_COMPLETION_NUMBER_PARAMETER
public static final String PROCEDURE_COMPLETION_NUMBER_PARAMETER
REQUIRE_STAGE_MANAGER_PARAMETER
public static final String REQUIRE_STAGE_MANAGER_PARAMETER
STAGE_MANAGER_PASSWORD_PARAMETER
public static final String STAGE_MANAGER_PASSWORD_PARAMETER
STAGE_MANAGER_PORT_PARAMETER
public static final String STAGE_MANAGER_PORT_PARAMETER
STAGE_MANAGER_TIMEOUT_PARAMETER
public static final String STAGE_MANAGER_TIMEOUT_PARAMETER
HELLO_PORT_PARAMETER_NAME
public static final String HELLO_PORT_PARAMETER_NAME
HELLO_ADDRESS_PARAMETER_NAME
public static final String HELLO_ADDRESS_PARAMETER_NAME
The_Database
protected Database The_Database
Catalog
protected String Catalog
Pipeline
protected static String Pipeline
Resolver
protected Reference_Resolver Resolver
RESOLVER_DEFAULT_VALUE
public static final String RESOLVER_DEFAULT_VALUE
Reference_Resolver
if a
reference can not be resolved.
UNRESOLVED_REFERENCE_PARAMETER
of the configuration file. A
parameter value of UNRESOLVED_REFERENCE_THROWS
(case
insensitive) is equivalent to a null default value.
SOURCES_FIELD_NAMES
public static final String[] SOURCES_FIELD_NAMES
SOURCE_NUMBER_FIELD
public static int SOURCE_NUMBER_FIELD
SOURCE_PATHNAME_FIELD
public static int SOURCE_PATHNAME_FIELD
SOURCE_ID_FIELD
public static int SOURCE_ID_FIELD
CONDUCTOR_ID_FIELD
public static int CONDUCTOR_ID_FIELD
STATUS_FIELD
public static int STATUS_FIELD
LOG_PATHNAME_FIELD
public static int LOG_PATHNAME_FIELD
SOURCES_TABLE_NAME_SUFFIX
public static final String SOURCES_TABLE_NAME_SUFFIX
MIN_SOURCE_RECORDS_DEFAULT
public static final int MIN_SOURCE_RECORDS_DEFAULT
MIN_SOURCE_RECORDS_PARAMETER
. The default is MIN_SOURCE_RECORDS_DEFAULT
.
MAX_SOURCE_RECORDS_DEFAULT
public static final int MAX_SOURCE_RECORDS_DEFAULT
MAX_SOURCE_RECORDS_PARAMETER
. The default is MAX_SOURCE_RECORDS_DEFAULT
.
SOURCE_AVAILABLE_TRIES_DEFAULT
public static final int SOURCE_AVAILABLE_TRIES_DEFAULT
SOURCE_AVAILABLE_TRIES_PARAMETER
can be
used to override the default.
SOURCE_AVAILABLE_TRIES_MAX
public static final int SOURCE_AVAILABLE_TRIES_MAX
SOURCE_AVAILABLE_NO_CHECK
public static final int SOURCE_AVAILABLE_NO_CHECK
SOURCE_AVAILABLE_TRIES_PARAMETER
is this value
source file availability confirmation is disabled.
DEFAULT_POLL_INTERVAL
public static final int DEFAULT_POLL_INTERVAL
POLL_INTERVAL_PARAMETER
.
BATCH_POLL_INTERVAL
seconds will be used.
BATCH_POLL_INTERVAL
public static final int BATCH_POLL_INTERVAL
POLL_INTERVAL_PARAMETER
.
BATCH_POLL_INTERVAL
seconds will be used.
PROCEDURES_FIELD_NAMES
public static final String[] PROCEDURES_FIELD_NAMES
SEQUENCE_FIELD
public static final int SEQUENCE_FIELD
DESCRIPTION_FIELD
public static final int DESCRIPTION_FIELD
COMMAND_LINE_FIELD
public static final int COMMAND_LINE_FIELD
SUCCESS_STATUS_FIELD
public static final int SUCCESS_STATUS_FIELD
SUCCESS_MESSAGE_FIELD
public static final int SUCCESS_MESSAGE_FIELD
TIME_LIMIT_FIELD
public static final int TIME_LIMIT_FIELD
ON_FAILURE_FIELD
public static final int ON_FAILURE_FIELD
Procedures_Map
protected Fields_Map Procedures_Map
PROCEDURES_TABLE_NAME_SUFFIX
public static final String PROCEDURES_TABLE_NAME_SUFFIX
Procedures_Table
protected String Procedures_Table
Procedure_Records
protected Vector<Vector<String>> Procedure_Records
PROCEDURE_SUCCESS
public static final int PROCEDURE_SUCCESS
PROCEDURE_FAILURE
public static final int PROCEDURE_FAILURE
INACCESSIBLE_FILE
public static final int INACCESSIBLE_FILE
UNRESOLVABLE_REFERENCE
public static final int UNRESOLVABLE_REFERENCE
NO_PROCEDURE
public static final int NO_PROCEDURE
PROCEDURE_TIMEOUT
public static final int PROCEDURE_TIMEOUT
BAD_REGEX
public static final int BAD_REGEX
INVALID_DATABASE_ENTRY
public static final int INVALID_DATABASE_ENTRY
FAILURE_DESCRIPTION
public static final String[] FAILURE_DESCRIPTION
DEFAULT_EMPTY_SUCCESS_ANY
public static final boolean DEFAULT_EMPTY_SUCCESS_ANY
EMPTY_SUCCESS_ANY_PARAMETER
configuration parameter.
DEFAULT_STOP_ON_FAILURE
public static final int DEFAULT_STOP_ON_FAILURE
RUNNING
public static final int RUNNING
POLLING
public static final int POLLING
poll interval
for new records is
positive.
load new source
records
that have not been processed the processing thread will
sleep for the poll interval. Once additional unprocessed source
records have been obtained the RUNNING
state will be
entered, unless the RUN_TO_WAIT
state has been set.
RUN_TO_WAIT
public static final int RUN_TO_WAIT
WAITING
state will be entered unless a failure condition
caused the HALTED
state to occur.
stop
while the RUNNING
or POLLING
state is in effect.
WAITING
public static final int WAITING
start
request.
HALTED
public static final int HALTED
start
request is received.
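The relationships among the five processing states can be summarized with a small transition function. This is only a sketch under assumptions: the enum types and the event names are hypothetical stand-ins (the real class uses int constants whose values are not given here), a start request is assumed to resume processing from both WAITING and HALTED, and the poll interval is assumed to be positive.

```java
/** Hypothetical illustration of the processing state transitions. */
final class ProcessingStateSketch {

    enum State { RUNNING, POLLING, RUN_TO_WAIT, WAITING, HALTED }

    // Event names are invented for this sketch.
    enum Event { START, STOP, NO_RECORDS, RECORDS_LOADED, SOURCE_COMPLETED, FAILURE }

    static State next(State state, Event event) {
        switch (event) {
            case START:
                // A start request leaves the idle or halted condition.
                return (state == State.WAITING || state == State.HALTED)
                    ? State.RUNNING : state;
            case STOP:
                // A stop while RUNNING or POLLING defers to the end of the current source.
                return (state == State.RUNNING || state == State.POLLING)
                    ? State.RUN_TO_WAIT : state;
            case NO_RECORDS:
                // Assumes a positive poll interval.
                return state == State.RUNNING ? State.POLLING : state;
            case RECORDS_LOADED:
                // RUN_TO_WAIT is left untouched, matching "unless RUN_TO_WAIT has been set".
                return state == State.POLLING ? State.RUNNING : state;
            case SOURCE_COMPLETED:
                return state == State.RUN_TO_WAIT ? State.WAITING : state;
            case FAILURE:
                return state == State.WAITING ? state : State.HALTED;
            default:
                return state;
        }
    }
}
```

The sketch covers only the transitions named in the field descriptions; how a stop request issued while polling eventually reaches WAITING is not modeled.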
NL
protected static final String NL
STDOUT_NAME
public static final String STDOUT_NAME
STDERR_NAME
public static final String STDERR_NAME
SOURCE_FILE_LOG_DELIMITER
public static final String SOURCE_FILE_LOG_DELIMITER
PROCEDURE_LOG_DELIMITER
public static final String PROCEDURE_LOG_DELIMITER
ON_FAILURE_PROCEDURE_LOG_DELIMITER
public static final String ON_FAILURE_PROCEDURE_LOG_DELIMITER
EXIT_SUCCESS
public static final int EXIT_SUCCESS
EXIT_COMMAND_LINE_SYNTAX
public static final int EXIT_COMMAND_LINE_SYNTAX
EXIT_CONFIGURATION_PROBLEM
public static final int EXIT_CONFIGURATION_PROBLEM
EXIT_DATABASE_PROBLEM
public static final int EXIT_DATABASE_PROBLEM
EXIT_IO_FAILURE
public static final int EXIT_IO_FAILURE
EXIT_TOO_MANY_FAILURES
public static final int EXIT_TOO_MANY_FAILURES
EXIT_STAGE_MANAGER
public static final int EXIT_STAGE_MANAGER
required
Stage_Manager
connection could not be established.
EXIT_UNEXPECTED_EXCEPTION
public static final int EXIT_UNEXPECTED_EXCEPTION
Require_Stage_Manager
public static boolean Require_Stage_Manager
Constructor Detail
Conductor
public Conductor(String pipeline,
Configuration configuration,
String database_server_name)
throws Database_Exception,
Configuration_Exception,
IOException
pipeline
- The name of the pipeline to be managed.
configuration
- A Configuration object.
database_server_name
- The name of a Configuration Group that
will provide the database server access parameters.
Database_Exception
- if there is a problem connecting to the
database.
Configuration_Exception
- if there is a problem with the
configuration file.
IOException
- if a connection could not be made to the
Stage_Manager and Require_Stage_Manager
is true.
Conductor
public Conductor(String pipeline,
Configuration configuration)
throws Database_Exception,
Configuration_Exception,
IOException
pipeline
- The name of the pipeline to be managed.
configuration
- A Configuration object.
Database_Exception
- if there is a problem connecting to the
database.
Configuration_Exception
- if there is a problem with the
configuration file.
IOException
- if a connection could not be made to the
Stage_Manager and Require_Stage_Manager
is true.
Conductor
protected Conductor()
Method Detail
Preconfigure
protected Configuration Preconfigure(Configuration configuration)
throws Configuration_Exception
DEFAULT_CONFIGURATION_FILENAME
.
CONDUCTOR_GROUP
parameters of the Configuration:
Theater.CLASS_ID_PARAMETER_NAME
- Set
ID
of this class.
CONFIGURATION_SOURCE_PARAMETER
- Set
source
of the Configuration that
is being used.
HOSTNAME_PARAMETER
- Set
Host.FULL_HOSTNAME
of the host system.
CONDUCTOR_ID_PARAMETER
- Set
Host.SHORT_HOSTNAME
of the system. If the host system process ID
can be obtained it is appended to the hostname after a
colon (':') delimiter.
NOTIFY_PARAMETER
- Get
Notify
message if
Conductor stops because the STOP_ON_FAILURE_PARAMETER
limit
has been reached or an exception has been thrown. If this parameter
is not present or has an empty value no notification message will
be sent.
REQUIRE_STAGE_MANAGER_PARAMETER
- Get and Set
flag
that determines
if a connection to a Stage_Manager is required for this Conductor.
N.B.: This parameter is only read once when the Conductor
is first configured; when the Conductor is reconfigured on being
restarted after a Stop or Halt state the initial value is reset in
the internal Configuration regardless of any change to the external
configuration source. Default: Require_Stage_Manager
.
STAGE_MANAGER_PORT_PARAMETER
- Get and Set
the
default port for Theater Management
.
STAGE_MANAGER_TIMEOUT_PARAMETER
- Get and Set
STAGE_MANAGER_PASSWORD_PARAMETER
- Get
HELLO_PORT_PARAMETER_NAME
- Get and Set
HELLO_ADDRESS_PARAMETER_NAME
- Get and Set
SOURCE_SUCCESS_COUNT
- Set
SOURCE_FAILURE_COUNT
- Set
TOTAL_FAILURE_COUNT
- Set
configuration
- The Configuration to be used. If null the
DEFAULT_CONFIGURATION_FILENAME
will be tried. If that
fails the filename will be qualified relative to the location of
this class in the CLASSPATH.
Configuration_Exception
- If the specified configuration is
null and a default configuration source could not be found and
loaded; or there was a problem setting a configuration value.
Postconfigure
protected void Postconfigure(Configuration configuration)
throws Configuration_Exception
CONDUCTOR_GROUP
parameters of the
Configuration are affected:
DATABASE_SERVER_NAME_PARAMETER
- Set
connection
.
DATABASE_TYPE_PARAMETER
- Set
Data_Port
class used to access the
database server.
DATABASE_HOSTNAME_PARAMETER
- Set
CATALOG_PARAMETER
- Get and Set
CONDUCTOR_GROUP
only if it
could not be obtained as part of the user specified pipeline name
nor from the database access parameters of the DATABASE_SERVER_NAME_PARAMETER
Group. This is a required value. If
it can not be obtained a Configuration_Exception will be thrown. It
is always set in the specified configuration.
PIPELINE_PARAMETER
- Set
PROCEDURES_TABLE_PARAMETER
- Set
SOURCES_TABLE_PARAMETER
- Set
RECONNECT_TRIES_PARAMETER
- Get and Set
reconnect
tries to do if the database connection is lost.
UNRESOLVED_REFERENCE_PARAMETER
- Get and Set
Reference_Resolver
if a
reference can not be resolved. A parameter value of UNRESOLVED_REFERENCE_THROWS
(case insensitive) means that the
Reference_Resolver will throw an exception if a reference can not be
resolved. If the Configuration does not contain this parameter the
RESOLVER_DEFAULT_VALUE
will be used; a null default value
means that the Reference_Resolver will throw an exception if a
reference can not be resolved.
LOG_PATHNAME_PARAMETER
or LOG_DIRECTORY_PARAMETER
- Get
EMPTY_SUCCESS_ANY_PARAMETER
- Get and Set
flag
that
determines whether or not empty Success_Status and Success_Message
fields in a procedure definition may imply any completion of the
procedure is a success. If this parameter is not present the
DEFAULT_EMPTY_SUCCESS_ANY
value will be used.
MIN_SOURCE_RECORDS_PARAMETER
- Get and Set
polling interval
is zero) this value specifies the minimum number
of source records to be processed before processing will stop.
The minimum for this value is 1.
MAX_SOURCE_RECORDS_PARAMETER
- Get and Set
SOURCE_AVAILABLE_TRIES_PARAMETER
- Get and Set
SOURCE_AVAILABLE_TRIES_DEFAULT
will be used. The value is limited
to be no more than SOURCE_AVAILABLE_TRIES_MAX
. If the value
is negative it will be set to SOURCE_AVAILABLE_NO_CHECK and
source availability confirmation will be disabled. N.B. Source
availability confirmation ensures that procedures will not fail due
to a missing source file; disabling this confirmation would be
appropriate for a pipeline that does not process a Source_Pathname,
or for which the Source_Pathname is not a filename but some other
information used by a procedure.
POLL_INTERVAL_PARAMETER
- Get and Set
DEFAULT_POLL_INTERVAL
will be used.
STOP_ON_FAILURE_PARAMETER
- Get and Set
DEFAULT_STOP_ON_FAILURE
value will be used.
configuration
- The Configuration to be used. If null nothing
is done.
Configuration_Exception
- If there was a problem setting a
configuration value.
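The limits described above for the MIN_SOURCE_RECORDS_PARAMETER and SOURCE_AVAILABLE_TRIES_PARAMETER values can be illustrated with a minimal sketch. The class name, method names and constant values here are hypothetical and chosen only for illustration; the real constants are defined by the Conductor class.

```java
final class SourceLimitsSketch {
    // Hypothetical values; the actual constants are defined by Conductor.
    static final int SOURCE_AVAILABLE_TRIES_MAX = 10;
    static final int SOURCE_AVAILABLE_NO_CHECK = -1;

    // A negative value disables source availability confirmation;
    // any other value is limited to the maximum.
    static int sourceAvailableTries(int requested) {
        if (requested < 0) {
            return SOURCE_AVAILABLE_NO_CHECK;
        }
        return Math.min(requested, SOURCE_AVAILABLE_TRIES_MAX);
    }

    // The minimum number of source records to process is never less than 1.
    static int minSourceRecords(int requested) {
        return Math.max(1, requested);
    }
}
```

With these assumed constants a requested value of 25 tries would be limited to 10, and a requested minimum of 0 source records would become 1.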
Config_Value
protected String Config_Value(String name)
name
- The name of the Assignment parameter for which to get
a value. If not an absolute pathname an absolute pathname for the
name in the CONDUCTOR_GROUP
is used.
Config_Value
protected boolean Config_Value(String name,
Object value)
throws Configuration_Exception
CONDUCTOR_GROUP
.
name
- The name of the Assignment parameter to have its
value set. If not an absolute pathname an absolute pathname for the
name in the CONDUCTOR_GROUP
is used.
value
- An Object to use for the parameter's value.
Configuration_Exception
- If there was a problem setting
the parameter.
Configuration.Set(String, Object)
Config_Pathname
public static String Config_Pathname(String name)
CONDUCTOR_GROUP
is used as the root to make an absolute pathname
from the name.
name
- The parameter name to be made absolute.
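The name qualification described here might look like the following sketch. It assumes that parameter pathnames use a '/' delimiter and that the CONDUCTOR_GROUP name is "Conductor"; both values, and the class and method names, are assumptions made for illustration rather than the actual implementation.

```java
final class ConfigPathnameSketch {
    // Assumed values; the real ones come from the Configuration and Conductor classes.
    static final String PATH_DELIMITER = "/";
    static final String CONDUCTOR_GROUP = "Conductor";

    // An absolute name is returned unchanged; a relative name is
    // rooted in the Conductor group.
    static String configPathname(String name) {
        if (name == null || name.startsWith(PATH_DELIMITER)) {
            return name;
        }
        return PATH_DELIMITER + CONDUCTOR_GROUP + PATH_DELIMITER + name;
    }
}
```

Under these assumptions the name "Poll_Interval" would become "/Conductor/Poll_Interval", while "/Other/Poll_Interval" would be left as it is.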
Configuration
public Configuration Configuration()
Configuration
in interface Management
Preconfigure(Configuration)
,
Postconfigure(Configuration)
Connect_to_Database
protected Database_Exception Connect_to_Database()
Resolve
protected String Resolve(String reference)
throws Database_Exception,
ParseException,
Unresolved_Reference
connected
and the reference resolution is tried again.
reference
- The reference String to be resolved
.
Database_Exception
- If there was a problem accessing the
Database.
ParseException
- If the reference contained a mathematical
expression that could not be correctly parsed.
Unresolved_Reference
- If the reference could not be resolved
and a non-null default
value
had not been assigned.
Reference_Resolver
Resolver_Default_Value
public Management Resolver_Default_Value(String value)
exception
and enter the halted processing state
, or
exit if it is not connected to a
Stage_Manager
at the time. If, however, the Reference_Resolver default
value
is set to a non-null String that value will be used for the
unresolved_reference instead of throwing an exception.
UNRESOLVED_REFERENCE_PARAMETER
is set to this value (or UNRESOLVED_REFERENCE_THROWS
if the
value is null) and processing event notification with the updated
Configuration is sent to all listeners.
Resolver_Default_Value
in interface Management
value
- The default Reference_Resolver String value. If this
starts with UNRESOLVED_REFERENCE_THROWS
(case insensitive)
null will be used.
Reference_Resolver
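The handling of the value argument can be sketched as follows. The text of the UNRESOLVED_REFERENCE_THROWS constant is not given in this description, so the value used here is a hypothetical placeholder, as are the class and method names.

```java
final class ResolverDefaultSketch {
    // Hypothetical placeholder for the real constant value.
    static final String UNRESOLVED_REFERENCE_THROWS = "THROWS";

    // A value that starts with the marker (ignoring case) is treated as
    // null, meaning that an unresolved reference throws an exception.
    static String effectiveDefault(String value) {
        if (value == null) {
            return null;
        }
        boolean startsWithMarker = value.regionMatches(
            true, 0, UNRESOLVED_REFERENCE_THROWS, 0,
            UNRESOLVED_REFERENCE_THROWS.length());
        return startsWithMarker ? null : value;
    }

    // The value recorded in the Configuration: the marker stands in for null.
    static String parameterValue(String effectiveDefault) {
        return effectiveDefault == null
            ? UNRESOLVED_REFERENCE_THROWS
            : effectiveDefault;
    }
}
```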
Resolver_Default_Value
public String Resolver_Default_Value()
Resolver_Default_Value
in interface Management
Resolver_Default_Value(String)
Procedures
public Vector<Vector<String>> Procedures()
PROCEDURES_FIELD_NAMES
order, and with the records in
processing order.
Procedures
in interface Management
Get_Procedures_Table
protected Vector<Vector<String>> Get_Procedures_Table()
throws Database_Exception
Procedures_Table
from the Database.
If a disconnected Database_Exception is thrown, an attempt is made to reconnect
to the Database.
Database_Exception
- If the procedures table could not be
obtained.
Load_Procedure_Records
protected void Load_Procedure_Records()
throws Database_Exception,
Configuration_Exception
SEQUENCE_FIELD
using a String_Vector_Comparator
to do the comparisons.
TOTAL_PROCEDURE_RECORDS_PARAMETER
is set to the number of table
records (not including the field names record, which was removed to
construct the Fields_Map), the PROCEDURE_COUNT_PARAMETER
is
reset to zero, the PROCEDURE_SEQUENCE_PARAMETER
is reset to
the empty string, and the PROCEDURE_COMPLETION_NUMBER_PARAMETER
is reset to PROCEDURE_SUCCESS
. Finally a processing event report
is sent with procedures changed
set.
Database_Exception
- If the procedures table
could not be obtained from the Database,
it was empty, or any required PROCEDURES_FIELD_NAMES
are missing.
Configuration_Exception
- If the Configuration could not
be updated with the total number of procedures records.
Fields_Map
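The record handling described above, removing the field names record to build a name to index map and then sorting the remaining records by their sequence field, can be sketched as follows. This is not the String_Vector_Comparator implementation; it assumes that the sequence values are numeric text, and the class and method names are hypothetical.

```java
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Vector;

final class ProcedureRecordsSketch {
    // Removes the leading field names record from the table and maps
    // each field name to its column index.
    static Map<String, Integer> takeFieldsMap(Vector<Vector<String>> table) {
        Vector<String> names = table.remove(0);
        Map<String, Integer> fields = new LinkedHashMap<>();
        for (int index = 0; index < names.size(); index++) {
            fields.put(names.get(index), index);
        }
        return fields;
    }

    // Sorts the remaining records by the numeric value of one column
    // (an assumption of this sketch: sequence values parse as numbers).
    static void sortBySequence(Vector<Vector<String>> records, int column) {
        records.sort(Comparator.comparingDouble(
            (Vector<String> record) -> Double.parseDouble(record.get(column))));
    }
}
```

A numeric comparison is used in the sketch because a plain string comparison would place a sequence value of "10" before "2".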
Sources
public Vector<Vector<String>> Sources()
SOURCES_FIELD_NAMES
order, and with the records
sorted by increasing SEQUENCE_FIELD
order.
Sources
in interface Management
Load_Source_Records
protected boolean Load_Source_Records()
throws Database_Exception
Max_Source_Records
value, which can not be less than the
Min_Source_Records
value.
Database_Exception
- If the sources table records
could not be obtained from the Database
or any required SOURCES_FIELD_NAMES
are missing.
Fields_Map
Processing_State
public int Processing_State()
RUNNING
POLLING
polling
for source
records to process.
RUN_TO_WAIT
WAITING
HALTED
sequential failures
of source record
processing having occurred, a database access failure, or some other
system error
.
Processing_State
in interface Management
State
public Processing_Changes State()
Processing_Changes.Sources_Refreshed(boolean)
,
Processing_Changes.Procedures_Changed(boolean)
and
Processing_Changes.Exiting(boolean)
- will always be false.
State
in interface Management
Processing_Changes
Start
public void Start()
RUN_TO_WAIT
state is in effect it
is reset to the RUNNING
state.
Pipeline Processing:
poll interval
and
stop-on-failure limit
are not
reset if they were set to a non-negative value.
table of procedures
is
refreshed in case it was changed while processing was waiting.
polling interval
for newly available sources is
zero (i.e. batch mode) and at least Min_Source_Records have been
processed in the current batch; processing of the current source has
completed and further processing has been flagged to stop
; the number of sequential procedure failures has reached its
limit, or an unrecoverable exception
occurred during processing, in which case the processing exception
will be non-null. The
possible processing exception types are a Configuration_Exception,
Database_Exception or IOException; any other exception is due to a
programming error (any exception is caught and saved for possible
subsequent retrieval);
source
records cache
is processed first. An unprocessed source record is
acquired
from the cache and processed
. Source record acquisition is always
done in the order in which source records occur in the list of
records obtained from the database; there is no priority ordering.
Only after the cache is empty and the polling interval sleep time
has passed will it be refreshed. If there is
no polling interval the cache will not be refreshed and processing
will stop. The polling interval may be interrupted by stopping and
restarting
processing.
Start
in interface Management
Start()
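The cache processing order described above can be sketched as a simple loop: the cache is drained in first-in, first-out order, and it is only refreshed after it is empty and the polling interval has passed. This is a simplified illustration, not the Conductor implementation; it ignores Min_Source_Records, failure limits and state changes. The refresh and process callbacks, and the maxRefreshes bound that keeps the sketch from running forever, are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Supplier;

final class ProcessingLoopSketch {
    // Returns the number of source records that were processed.
    static int run(Supplier<List<String>> refresh, Consumer<String> process,
                   int pollSeconds, int maxRefreshes) {
        Deque<String> cache = new ArrayDeque<>(refresh.get());
        int processed = 0;
        int refreshes = 0;
        while (true) {
            // Records are always taken in the order they were obtained.
            while (!cache.isEmpty()) {
                process.accept(cache.removeFirst());
                processed++;
            }
            // No polling interval (batch mode): stop when the cache is empty.
            if (pollSeconds <= 0 || refreshes >= maxRefreshes) {
                break;
            }
            try {
                Thread.sleep(pollSeconds * 1000L);
            } catch (InterruptedException interrupted) {
                // An interrupted polling sleep ends processing in this sketch.
                Thread.currentThread().interrupt();
                break;
            }
            cache.addAll(refresh.get());
            refreshes++;
        }
        return processed;
    }
}
```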
Poll_Interval
public Management Poll_Interval(int seconds)
reconfigured
after being
stopped. However, if the value is negative then the value from the
reconfiguration will be used. In this case a zero value will still be
set for the current poll interval; this will cause the Conductor to
stop if it is currently polling or when no unprocessed source records
can be obtained from the database.
Poll_Interval
in interface Management
seconds
- The number of seconds between querying the database
for unprocessed source records. If negative zero will be used.
Poll_Interval(int)
Poll_Interval
public int Poll_Interval()
Poll_Interval
in interface Management
Poll_Interval(int)
Stop_on_Failure
public Management Stop_on_Failure(int failure_count)
reconfigured
after being
stopped. However, if the value is negative then the value from the
reconfiguration will be used. In this case the current value will not
be changed.
Stop_on_Failure
in interface Management
failure_count
- The number of sequential failures at which to
halt processing. Zero means never halt. If negative the current
value is not changed.
Management.Sequential_Failures()
,
Stop_on_Failure(int)
Stop_on_Failure
public int Stop_on_Failure()
Stop_on_Failure
in interface Management
Stop_on_Failure(int)
Sequential_Failures
public int Sequential_Failures()
Sequential_Failures
in interface Management
Stop_on_Failure(int)
,
Reset_Sequential_Failures()
Reset_Sequential_Failures
public Management Reset_Sequential_Failures()
processing state
is HALTED
it is reset to WAITING
.
Reset_Sequential_Failures
in interface Management
Management.Stop_on_Failure(int)
,
Reset_Sequential_Failures()
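The relationship between the stop-on-failure limit and the sequential failures count can be sketched as follows. The sketch assumes that a successfully processed source record resets the count, which is what "sequential" suggests but is not stated explicitly here; the class and method names are hypothetical, and the processing state change made by Reset_Sequential_Failures is not modelled.

```java
final class FailureLimitSketch {
    private int limit;
    private int sequentialFailures;

    FailureLimitSketch(int limit) {
        this.limit = Math.max(0, limit);
    }

    // A negative value leaves the current limit unchanged.
    void stopOnFailure(int failureCount) {
        if (failureCount >= 0) {
            limit = failureCount;
        }
    }

    // Records the result of processing one source record and returns true
    // if processing should halt. A limit of zero means never halt.
    boolean record(boolean success) {
        sequentialFailures = success ? 0 : sequentialFailures + 1;
        return limit > 0 && sequentialFailures >= limit;
    }

    void resetSequentialFailures() {
        sequentialFailures = 0;
    }

    int sequentialFailures() {
        return sequentialFailures;
    }
}
```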
Processing_Exception
public Exception Processing_Exception()
started
the previous processing exception is cleared.
Processing_Exception
in interface Management
Stop
public void Stop()
processing state
it will enter the run-to-wait
state, in which it will enter the waiting
state when the current source record completes processing.
If the Conductor is in the polling
state
it will immediately stop polling for new source records. There will
be no effect for any negative state.
Stop
in interface Management
Stop()
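The state changes described for Stop and Reset_Sequential_Failures can be summarized in a small sketch. The integer values below are assumptions made for illustration; the only property taken from the description is that the waiting and halted states are negative. The sketch also assumes that stopping while polling leads to the waiting state.

```java
final class ProcessingStateSketch {
    // Hypothetical values; the actual constants are defined by Conductor.
    static final int RUNNING = 1;
    static final int POLLING = 2;
    static final int RUN_TO_WAIT = 3;
    static final int WAITING = -1;
    static final int HALTED = -2;

    // Stop: a running Conductor finishes the current source record first;
    // a polling Conductor stops polling at once (assumed to wait);
    // a negative state is not affected.
    static int onStop(int state) {
        if (state == RUNNING) {
            return RUN_TO_WAIT;
        }
        if (state == POLLING) {
            return WAITING;
        }
        return state;
    }

    // The current source record has completed processing.
    static int onSourceCompleted(int state) {
        return state == RUN_TO_WAIT ? WAITING : state;
    }

    // Reset_Sequential_Failures: a halted Conductor is reset to waiting.
    static int onResetSequentialFailures(int state) {
        return state == HALTED ? WAITING : state;
    }
}
```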
Quit
public void Quit()
exiting
processing event
is sent to all processing listeners
. The application exits with a success status
.
running
it is aborted.
Quit
in interface Management
Quit()
Add_Processing_Listener
public Management Add_Processing_Listener(Processing_Listener listener)
processing event
notifications to all registered listeners.
Add_Processing_Listener
in interface Management
listener
- A Processing_Listener.
Processing_Listener
Remove_Processing_Listener
public boolean Remove_Processing_Listener(Processing_Listener listener)
Remove_Processing_Listener
in interface Management
listener
- The Processing_Listener to be removed from the
Management list of registered listeners.
Add_Processing_Listener(Processing_Listener)
Parse_Command_Line
public static String[] Parse_Command_Line(String command_line)
command_line
- A String to be parsed.
Status_Indicators
public static Vector<String> Status_Indicators(String status_field)
<PID> | <Conductor code>[(<procedure exit status>)]
status_field
- The String from a Source table Status field
value (may be null).
Status_Conductor_Code
public static int Status_Conductor_Code(String status)
throws NumberFormatException
status
- A procedure status indicator String.
NumberFormatException
- if a value could not be formed.
Status_Indicators(String)
Status_Conductor_Code_Description
public static String Status_Conductor_Code_Description(int code)
code
- The code value.
Status_Conductor_Code(String)
Status_Procedure_Exit_Value
public static int Status_Procedure_Exit_Value(String status)
throws NumberFormatException
status
- A procedure status indicator String.
NumberFormatException
- if a value could not be formed.
Status_Indicators(String)
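Extracting the two numbers from a status indicator of the form <Conductor code>[(<procedure exit status>)] might look like the following sketch. It is an illustration based only on the indicator syntax shown for Status_Indicators, not the actual implementation; the class and method names are hypothetical, and an indicator that holds a PID is not distinguished from one that holds a Conductor code.

```java
final class StatusIndicatorParseSketch {
    // The Conductor code is the number before any parenthesized exit status.
    static int conductorCode(String status) {
        int open = status.indexOf('(');
        String code = open < 0 ? status : status.substring(0, open);
        return Integer.parseInt(code.trim());
    }

    // The procedure exit status is the number inside the parentheses.
    static int procedureExitValue(String status) {
        int open = status.indexOf('(');
        int close = status.indexOf(')', open + 1);
        if (open < 0 || close < 0) {
            throw new NumberFormatException(
                "No procedure exit status in: " + status);
        }
        return Integer.parseInt(status.substring(open + 1, close).trim());
    }
}
```

For the indicator "-2(127)" this sketch yields a Conductor code of -2 and a procedure exit value of 127.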
Status_Field_Value
public static String Status_Field_Value(Vector<String> status)
Status_Indicators
method.
status
- A Vector of procedure status indicator Strings.
Status_Indicator
public static String Status_Indicator(int conductor_status,
int procedure_status)
conductor_status
- A conductor procedure completion code.
procedure_status
- A procedure exit status value.
Status_Indicator
public static String Status_Indicator(int conductor_status)
conductor_status
- A conductor procedure completion code.
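Composing status indicators and assembling them into a Status field value can be sketched as a round trip. The indicator syntax follows the form shown for Status_Indicators; the comma used to separate indicators in the field value is an assumption of this sketch, as are the class and method names and the empty result for a null field.

```java
import java.util.Arrays;
import java.util.Vector;

final class StatusFieldSketch {
    // Assumed delimiter between the indicators of a Status field value.
    static final String DELIMITER = ",";

    static String statusIndicator(int conductorStatus, int procedureStatus) {
        return conductorStatus + "(" + procedureStatus + ")";
    }

    static String statusIndicator(int conductorStatus) {
        return String.valueOf(conductorStatus);
    }

    // Joins the indicators into a single field value.
    static String statusFieldValue(Vector<String> indicators) {
        return String.join(DELIMITER, indicators);
    }

    // Splits a field value into its indicators; a null or empty
    // field yields an empty list in this sketch.
    static Vector<String> statusIndicators(String statusField) {
        if (statusField == null || statusField.isEmpty()) {
            return new Vector<>();
        }
        return new Vector<>(Arrays.asList(statusField.split(DELIMITER)));
    }
}
```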
Add_Log_Writer
public Management Add_Log_Writer(Writer writer)
Add_Log_Writer
in interface Management
writer
- A Writer object.
Enable_Log_Writer(Writer, boolean)
,
Remove_Log_Writer(Writer)
Remove_Log_Writer
public boolean Remove_Log_Writer(Writer writer)
Remove_Log_Writer
in interface Management
writer
- A Writer object.
Add_Log_Writer(Writer)
Enable_Log_Writer
public Management Enable_Log_Writer(Writer writer,
boolean enable)
registered log stream Writer
.
Enable_Log_Writer
in interface Management
writer
- A Writer that has been registered to receive
Conductor log stream output. If the writer is not registered
to receive the Conductor log
stream nothing is done.
enable
- If false, Conductor log stream output to the Writer is
suspended without having to unregister the Writer. If true, a
Writer that has had its log stream output suspended will begin
receiving it again.
Log_Message
protected void Log_Message(String message,
AttributeSet style)
throws IOException
message
- The message String to write to the Logger.
style
- An AttributeSet style to apply to the message. This
may be null to use the default text style.
IOException
- if the Log_File_Writer could not be written. If a
Writer other than the Log_File_Writer throws an exception it is
closed and removed
from the
Logger.
Log_Message
protected void Log_Message(String message)
throws IOException
message
- The message String to write to the Logger.
IOException
- if the Log_File_Writer could not be written.
Log_Message(String, AttributeSet)
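The log Writer registration behaviour described for Add_Log_Writer, Remove_Log_Writer, Enable_Log_Writer and Log_Message can be sketched as follows. This is a simplified illustration with hypothetical class and method names: it models only the registered Writers (a Writer that throws an exception is closed and removed) and leaves out the log file Writer, whose failure is reported with an IOException, as well as text styles.

```java
import java.io.IOException;
import java.io.Writer;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

final class LogWritersSketch {
    // Registered Writers mapped to their enabled flag, in registration order.
    private final Map<Writer, Boolean> writers = new LinkedHashMap<>();

    void add(Writer writer) {
        writers.putIfAbsent(writer, Boolean.TRUE);
    }

    boolean remove(Writer writer) {
        return writers.remove(writer) != null;
    }

    // Nothing is done if the Writer is not registered.
    void enable(Writer writer, boolean enable) {
        if (writers.containsKey(writer)) {
            writers.put(writer, enable);
        }
    }

    // Writes the message to every enabled Writer; a Writer that fails
    // is closed and removed.
    void log(String message) {
        Iterator<Map.Entry<Writer, Boolean>> entries =
            writers.entrySet().iterator();
        while (entries.hasNext()) {
            Map.Entry<Writer, Boolean> entry = entries.next();
            if (!entry.getValue()) {
                continue;
            }
            try {
                entry.getKey().write(message);
                entry.getKey().flush();
            } catch (IOException failure) {
                try {
                    entry.getKey().close();
                } catch (IOException ignored) {
                    // The Writer is being discarded anyway.
                }
                entries.remove();
            }
        }
    }
}
```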
Connected_to_Stage_Manager
public boolean Connected_to_Stage_Manager()
Connected_to_Stage_Manager
in interface Management
Local_Theater
Identity
public Message Identity()
Message.ACTION_PARAMETER_NAME
Message.IDENTITY_ACTION
value.
Message.NAME_PARAMETER_NAME
CONDUCTOR_GROUP
.
HOSTNAME_PARAMETER
Host.FULL_HOSTNAME
of the host system.
CONDUCTOR_ID_PARAMETER
Host.SHORT_HOSTNAME
followed by a
colon (':') and the system process ID of this Conductor. If the
process ID can not be obtained only the short hostname is included.
CATALOG_PARAMETER
PIPELINE_PARAMETER
CONFIGURATION_SOURCE_PARAMETER
source
of the Configuration that
is being used.
Message.CLASS_ID_PARAMETER_NAME
ID
of this Conductor class.
DATABASE_SERVER_PARAMETER
DATABASE_SERVER_PARAMETER
. This group contains the following
parameters:
Database.TYPE
database
Configuration
parameter of the same name.
Configuration.HOST
database
Configuration
parameter of the same name. However, if the value is
"localhost" then the Host.FULL_HOSTNAME
is used instead.
Configuration.USER
database
Configuration
parameter of the same name.
Identity
in interface Management
Identity()
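Two of the value rules in the Identity description, the CONDUCTOR_ID_PARAMETER form and the "localhost" substitution for the database host, can be sketched as follows. The class and method names are hypothetical, and the hostnames and process ID are passed in rather than looked up so that the rules themselves stay visible.

```java
import java.util.OptionalLong;

final class IdentitySketch {
    // Short hostname, then a colon and the process ID when one is available.
    static String conductorId(String shortHostname, OptionalLong processId) {
        return processId.isPresent()
            ? shortHostname + ":" + processId.getAsLong()
            : shortHostname;
    }

    // A database host of "localhost" is replaced with the full hostname.
    static String databaseHost(String configuredHost, String fullHostname) {
        return "localhost".equals(configuredHost)
            ? fullHostname
            : configuredHost;
    }
}
```

On Java 9 or later the process ID for such a sketch could be supplied with OptionalLong.of(ProcessHandle.current().pid()).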
main
public static void main(String[] args)
args
- The Conductor command line
arguments.
Usage
public static void Usage()
Usage: Conductor <Switches>
Switches -
[-Pipeline] <pipeline>
[-Configuration <source>]
[-Database|-Server <server name>]
[-CAtalog <catalog>]
[-Monitor]
[-Wait-to-start]
[-Version]
[-Help]
[<catalog>.]<pipeline>
<pipeline>_Sources
<pipeline>_Procedures
Database
constructor
and its Connect
method).
The database "Type" parameter must be provided that specifies the
type of database server (e.g. "MySQL") that will be accessed.
Additional database access parameters typically provided are the
server "Host" name and database "User" and "Password" access values.
Depending on the type of database server and the driver and its
Data_Port implementation (e.g. MySQL_Data_Port
) there may be other required and
optional parameters that can be included, such as a "Port" parameter
to specify the database host system's network port to use for server
communications.
Reference_Resolver
. Other parameters that will be
used if present in the Conductor configuration group are described in
the Conductor control parameters section of the Conductor class
description.
EXIT_COMMAND_LINE_SYNTAX
status value.
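The [<catalog>.]<pipeline> argument form and the table names derived from it can be sketched as follows. The class and method names are hypothetical, and splitting at the first '.' is an assumption of this sketch; a catalog that is not included in the argument is left as null here, to be obtained from the other sources described above.

```java
final class PipelineNamesSketch {
    final String catalog;   // null if not included in the argument
    final String pipeline;

    private PipelineNamesSketch(String catalog, String pipeline) {
        this.catalog = catalog;
        this.pipeline = pipeline;
    }

    // Splits "[<catalog>.]<pipeline>" at the first '.' character.
    static PipelineNamesSketch parse(String argument) {
        int dot = argument.indexOf('.');
        if (dot < 0) {
            return new PipelineNamesSketch(null, argument);
        }
        return new PipelineNamesSketch(
            argument.substring(0, dot), argument.substring(dot + 1));
    }

    String sourcesTable() {
        return pipeline + "_Sources";
    }

    String proceduresTable() {
        return pipeline + "_Procedures";
    }
}
```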
Copyright (C) 2003-2009 Bradford Castalia, University of Arizona