Planetary Image Research Laboratory

Department of Planetary Sciences
University of Arizona
Tucson, Arizona

PIRL Parameter Value Logic
Introduction


Design

The Parameter Value Language

Perhaps one of the most common concepts in scientific research is the named parameter that refers to a value of some type:

Name = Value

The Name is a single textual token that provides the referential handle to the value. The choice of parameter names for values is completely arbitrary, though it may be guided by various usage conventions. Thus there is no guarantee that a parameter of any given name will refer to a value of any particular type. While this provides for complete freedom in the use of parameter names, it also allows for considerable confusion.

The Value may be quite simple - for example, a single decimal integer - or very complex - for example the complete characterization of an instrument - yet the type of value must be specifically defined to avoid any ambiguity in its use. Thus the value definition mechanism must be broadly encompassing to allow for the needed freedom in the use of values, while providing an interpretation that allows for no misinterpretation.

A Parameter Value Language (PVL) provides a text syntax for associating a named parameter with a specifically defined type of value. While there are various forms of parameter value language to choose from, the needs of the Planetary Image Research Lab (PIRL) to handle files distributed by the Planetary Data System (PDS) led directly to the PVL specified for use in PDS products.

The image files (and others) distributed on PDS CDs contain an initial label section that provides descriptive information about the image, including parameters necessary for the correct processing of the binary data in the file. The syntax of the PVL used by PDS products has been specified by the Consultative Committee for Space Data Systems (CCSDS) in two documents: the Blue Book "Parameter Value Language Specification (CCSDS0006,8)", June 2000 [CCSDS 641.0-B-2], and the Green Book "Parameter Value Language - A Tutorial", May 1992 [CCSDS 641.0-G-2]. PVL has been accepted by the International Organization for Standardization (ISO) as a Draft Standard (ISO/CD 14961:1997). The Object Definition Language (ODL) used to describe the metadata in Earth Observing System (EOS) image files is also an implementation of PDS PVL, though it is embedded in the HDF-EOS extension to the NCSA HDF format.

The actual form of the PDS PVL, however, has evolved somewhat since it was first used in early products, such as imagery from the Voyager and Viking missions. This resulted in inconsistencies that made it difficult to write software that could handle all PDS products. Before the PPVL was implemented, no single application program interface (API) library that interprets PVL seemed able to read the PVL directly from every PDS product.

PPVL Design

The primary design goal for the PIRL Parameter Value Logic (PPVL) was to provide a single C-language API that would correctly handle all PVL in all PDS products. To be an effective solution, the API would need to be simple to use and easy to manage. It should also be flexible enough to go beyond the necessary core capability of reading and writing PVL to offer convenience features and be extensible to meet new demands and special circumstances.


Structure

Parameters

Every parameter is described by a simple structure:

typedef struct PPVL_parameter
    {
    PPVL_Class                   classification;
    char                         *name;
    union
        {
        struct PPVL_value        *value;
        struct PPVL_parameter    **parameters;
        }
                                 content;
    char                         *comments;
    void                         *user_data;
    }
    PPVL_Parameter;

The classification is a bit-field code that indicates the class of parameter. There are three basic types: a token, an assignment, and an aggregate. A token is a parameter with no value (other than its existence). An assignment is a parameter that has been assigned a value. An aggregate is a parameter composed of additional parameters. PPVL_Class_Is_<class> macros are provided to determine the parameter classification before accessing the parameter content. The content of a parameter is NULL for a token, a pointer to a value structure for an assignment, or a pointer to a list of parameter structure pointers for an aggregate. The name of a parameter is the string of characters that gives the reference name to the parameter. Note that the meaning of the name, if any, is completely the responsibility of the application software (e.g. in certain contexts a special initial character implies that the parameter's value has special units). If the PVL included one or more lines of non-interpreted comments before the definition of the parameter, they are collected in the comments text. The user_data pointer is provided so the software developer can link any other information to the parameter that may be useful to the application.
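
As an illustration of testing the classification before touching the content, the following minimal sketch counts the assignments at or below a parameter. The Sketch_ names, the bit values of the class codes, and the Sketch_Class_Is_ macros are assumptions made for this example (stand-ins for PPVL_Class and the PPVL_Class_Is_<class> macros, whose definitions are not shown here); only the structure layout follows the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical class codes: the real PPVL_Class bit values are not shown here. */
typedef unsigned int Sketch_Class;
#define SKETCH_CLASS_TOKEN       0x1u
#define SKETCH_CLASS_ASSIGNMENT  0x2u
#define SKETCH_CLASS_AGGREGATE   0x4u

/* Hypothetical stand-ins for the PPVL_Class_Is_<class> macros. */
#define Sketch_Class_Is_Assignment(c)  (((c) & SKETCH_CLASS_ASSIGNMENT) != 0)
#define Sketch_Class_Is_Aggregate(c)   (((c) & SKETCH_CLASS_AGGREGATE) != 0)

struct Sketch_value;    /* Value details are not needed for this example. */

typedef struct Sketch_parameter
    {
    Sketch_Class                 classification;
    char                         *name;
    union
        {
        struct Sketch_value      *value;
        struct Sketch_parameter  **parameters;
        }
                                 content;
    }
    Sketch_Parameter;

/* Count the assignments at or below a parameter. The classification is
   tested first so that the correct member of the content union is read. */
size_t sketch_count_assignments (const Sketch_Parameter *parameter)
    {
    struct Sketch_parameter  **entry;
    size_t                   count = 0;

    assert (parameter != NULL);
    if (Sketch_Class_Is_Assignment (parameter->classification))
        return 1;
    if (! Sketch_Class_Is_Aggregate (parameter->classification) ||
        parameter->content.parameters == NULL)
        return 0;    /* A token, or an empty aggregate. */
    /* An aggregate holds a NULL terminated list of parameter pointers. */
    for (entry = parameter->content.parameters; *entry != NULL; entry++)
        count += sketch_count_assignments (*entry);
    return count;
    }
```

Because the content is a union, reading the wrong member gives a meaningless pointer; the class test is what makes the access safe.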

Values

Every value is described by a simple structure:

typedef struct PPVL_value
    {
    PPVL_Type                    type;
    union
        {
        unsigned long            integer;
        double                   real;
        char                     *string;
        struct PPVL_value        **array;
        }
                                 data;
    int                          base;
    char                         *units;
    }
    PPVL_Value;

The type of a value is a bit-field code that indicates the type of the data value. There are three general types of value: numeric, string, or array. Each of these will be one of the specific types: numeric data may be integer or real; a string of characters may be an identifier, a symbol, or text; and an array may be a set or sequence. While the specific type of string or array may be of interest to the application (the details of the PVL syntax differ in each case), they are treated identically by the PPVL. As for a parameter classification, PPVL_Type_Is_<type> macros are provided to determine the type of value before the data is accessed. Normally integers are represented in decimal form in PVL statements. But it may be more convenient to represent integer values using some other base - for example binary for bit fields - so PVL offers a notation for specifying an alternate base which, if used, is stored in the base variable. This is only used by PPVL to generate the appropriate PVL notation from the value. The units string is a feature of the PVL which may be optionally applied to any value and is the responsibility of the application software to use or not. It's worth noting that the data of an array type value is a pointer to a list of value structure pointers, which allows each specific value of an array to be a different type of value, including another array.
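
The mixed-type, nested nature of array values can be sketched as follows. This example sums every integer found in a value, descending into any arrays it contains and skipping values of other types. The Sketch_ names and the bit values of the type codes are assumptions made for this example; only the structure layout follows the description above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type codes: the real PPVL_Type bit values are not shown here. */
typedef unsigned int Sketch_Type;
#define SKETCH_TYPE_INTEGER  0x1u
#define SKETCH_TYPE_REAL     0x2u
#define SKETCH_TYPE_STRING   0x4u
#define SKETCH_TYPE_ARRAY    0x8u

typedef struct Sketch_value
    {
    Sketch_Type                  type;
    union
        {
        unsigned long            integer;
        double                   real;
        char                     *string;
        struct Sketch_value      **array;
        }
                                 data;
    int                          base;
    char                         *units;
    }
    Sketch_Value;

/* Sum every integer found in a value, descending into nested arrays.
   Values of any other type are skipped. */
unsigned long sketch_sum_integers (const Sketch_Value *value)
    {
    struct Sketch_value  **entry;
    unsigned long        sum = 0;

    assert (value != NULL);
    if (value->type & SKETCH_TYPE_INTEGER)
        return value->data.integer;
    if (! (value->type & SKETCH_TYPE_ARRAY) || value->data.array == NULL)
        return 0;    /* A real, a string, or an empty array. */
    /* An array holds a NULL terminated list of value pointers. */
    for (entry = value->data.array; *entry != NULL; entry++)
        sum += sketch_sum_integers (*entry);
    return sum;
    }
```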


Function: Flexible

Tolerance

The philosophy guiding the development of the PPVL is that the application software developer wants to get the information contained in the PVL and is not so concerned with its syntax. While this may seem obvious, it has resulted in a policy of being very tolerant in the implementation of the language parser, rather than requiring strict adherence to a rigid specification; i.e. the intent is to "get the information" if possible, and let the user decide what to do with it. For example, PPVL will successfully parse this text you are reading now. All the information in even the most complex PVL is still obtained, including syntax details such as the distinction between symbols (in single quotes) and text (in double quotes), and between sets (within curly braces) and sequences (within parentheses). Some interpretations are left to the application or other API libraries (e.g. the parsing of date and time strings). As long as the PVL syntax is reasonable, correct results will be provided; unreasonable syntax will usually still return results, but ones that are easily recognizable as unreasonable. Strict conformance to the CCSDS specification can be turned on, however, if that is needed.

Input data formats

Beyond tolerance of the PVL syntax used, PPVL is able to handle the "variable" record format data of the Voyager files produced on VMS systems, which contain embedded binary record size data amongst the ASCII text. Some files do not contain a PVL END statement or any other marker delimiting the image label from the image data. Instead the image label contains a PVL parameter that provides the size of the label. Since the PPVL avoids any knowledge of specific product parameters, it offers several tactics to deal with difficult image data formats. While ingesting data from a file for PVL parsing, interpretation of the file stops if no parameter can be formed. This is usually sufficient. However, the PPVL parser can also be given a limit on how much file data to read. On the assumption that the necessary label size parameter is near the front of the file, a relatively small limit can be set for a first pass, and then PPVL can be given the label size as the limit for a second pass that interprets all of the PVL. Using application knowledge, data can also be read from the file by some other means and passed to PPVL for interpretation (this is how, for example, EOS metadata would be handled).

Obtaining PVL parameters from a file usually requires only a single call to the PPVL_read_aggregate function:

PPVL_Parameter      *PPVL_read_aggregate
    (
    FILE            *File,
    unsigned long   Read_Limit,
    unsigned long   *Total_Scanned
    );

The File is a data stream that has been opened by the application. If the Read_Limit is zero then the PPVL will obtain as many parameters as possible from the file, starting at its current input location. If the Total_Scanned argument is not NULL then the variable it points to will receive the total number of data bytes scanned by the parser. The function returns a pointer to an aggregate class parameter named "The_Container" which contains all of the parameters found in the file. The file is positioned (if it is seekable) after the last PVL parameter found, which will be the same as the Total_Scanned byte offset if the file was read from the beginning.

A string of data may be passed to the PPVL_scan_aggregate function for parsing:

PPVL_Parameter       *PPVL_scan_aggregate
    (
    char            *The_String,
    char            **Next_Character
    );

This function also returns an aggregate class parameter named "The_Container" which contains all of the parameters found in The_String. The character pointer variable pointed to by Next_Character, if not NULL, refers to the character of the string immediately following the last parameter that was formed (which may be the end of string marker).

The PPVL functions that assemble individual parameters, values and their components are not usually used directly, though they may be for special needs. All of these functions follow the same pattern of being passed a string of data to parse and returning their appropriate component along with the location in the string where they left off.
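
The "parse a string and report where you left off" pattern can be illustrated with a minimal sketch built on the standard C strtoul function, which follows the same convention. The function sketch_scan_integer is hypothetical and is not part of the PPVL; it accepts whatever strtoul accepts (including leading white space and a sign), which is not the same as the PVL number syntax.

```c
#include <assert.h>
#include <stdlib.h>

/* Try to form one decimal integer from the front of The_String.
   Returns 1 and stores the number in *Result if one was formed, else 0.
   If Next_Character is not NULL the variable it points to receives the
   location where scanning left off, which is the start of The_String
   when nothing could be formed. */
int sketch_scan_integer
    (
    const char      *The_String,
    char            **Next_Character,
    unsigned long   *Result
    )
    {
    char            *end;
    unsigned long   number;

    assert (The_String != NULL && Result != NULL);
    number = strtoul (The_String, &end, 10);
    if (Next_Character != NULL)
        *Next_Character = end;
    if (end == The_String)
        return 0;    /* No digits: nothing could be formed. */
    *Result = number;
    return 1;
    }
```

Repeated calls, each starting at the location returned by the previous one, step through the string until a call fails to form a component. This is the same idea as stopping interpretation of a file when no parameter can be formed.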

Output data format

The PPVL_write_parameter function is used to generate formatted PVL output:

PPVL_Error_Code     PPVL_write_parameter
    (
    PPVL_Parameter  *The_Parameter,
    FILE            *The_File,
    int             Indent_Level
    );

The_Parameter may be a pointer to any parameter structure, though it is typically an aggregate class parameter that contains all the parameters to be written. If the name of the aggregate is "The_Container" then only the parameters it contains will be written. The PVL is written to The_File, a data stream which has been opened by the application, which may be NULL to indicate output to stdout. PVL parameters are written one to a line, with Indent_Level indicating the number of tab characters to put at the beginning of each line. The Indent_Level is increased by one for each recursively processed aggregate parameter encountered. If the Indent_Level is negative, however, no indenting is done.

The PVL produced by PPVL strictly conforms to the CCSDS specification. However, a text representation of a real number may not carry the full precision of the internal value. By default real values are output using the "%#G" format. The PPVL_real_number_format global variable may be assigned a pointer to an alternate format string; for example, "%#.15E" might be used to provide scientific notation with 15 digits after the decimal point. The format string must be chosen cautiously or invalid PVL output could result.
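
The effect of the format choice on precision can be checked with standard C alone. In this minimal sketch the variable sketch_real_number_format is a hypothetical stand-in for the PPVL_real_number_format global, and the two functions are illustrations only; they show whether a value written as text converts back to exactly the same double.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for the PPVL_real_number_format global variable. */
const char *sketch_real_number_format = "%#G";

/* Write a real value as text using the current format.
   Returns the snprintf result (the length that the text needs). */
int sketch_format_real (char *buffer, size_t size, double value)
    {
    assert (buffer != NULL && size > 0);
    return snprintf (buffer, size, sketch_real_number_format, value);
    }

/* Returns 1 if the text form converts back to exactly the same value. */
int sketch_round_trips (double value)
    {
    char    text[64];
    int     length = sketch_format_real (text, sizeof text, value);

    if (length < 0 || (size_t)length >= sizeof text)
        return 0;    /* Formatting failed or the text was truncated. */
    return strtod (text, NULL) == value;
    }
```

With the default "%#G" format the value 1.0 / 3.0 is written with six significant digits and does not convert back to the same double; with "%#.15E" it does. Note that 17 significant digits (for example "%#.16E") are needed to guarantee an exact round trip for every double.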

Searching

Finding a specific parameter or value is a necessary task. A function has been provided to easily accomplish this:

PPVL_Object         *PPVL_find
    (
    PPVL_Object     *The_Container,
    PPVL_Object     *Last_Object,
    int             Select,
    PPVL_Object     *The_Selection,
    PPVL_Object     **The_Parent
    );

The_Container is either an aggregate parameter or an array value that will be searched for The_Selection following the Last_Object specified, which may be the symbol PPVL_SEARCH_FROM_THE_TOP to begin with the first object in The_Container list. A pointer to the first object found in the list that matches The_Selection is returned, and if The_Parent argument is not NULL a pointer to the found object's parent object is placed in the referenced variable.

Select is a symbol code that indicates what The_Selection is. For aggregate parameters the Select codes are:

PPVL_SELECT_ANY - The next parameter after the Last_Object (or the first parameter if it is PPVL_SEARCH_FROM_THE_TOP) is selected. This allows a parameter hierarchy to be stepped through in the order in which the corresponding PVL statements appear externally.

PPVL_SELECT_NAME - The_Selection is a string that is compared with the parameter name. The_Selection string may be a simple name which will match with the first occurrence of a parameter with the same (but case insensitive) name. It may also be a pathname of the form:

[/]Name[/Name[...]]

A pathname beginning with a forward slash is "absolute": the first Name must be found in the top aggregate's parameter list, the next Name must then be found in the parameter list of that aggregate, and so on. This is the same as file pathnames in Unix. A pathname that does not begin with a forward slash is "relative": the first Name is searched for like a simple name, and then the remainder of the path is searched for relative to the location of the first parameter found.

PPVL_SELECT_CLASS - The_Selection is a parameter classification code (the value itself, not a pointer to the value). Any classification code is valid.

PPVL_SELECT_PARAMETER - The_Selection is a pointer to a specific parameter structure to be found. This is typically used to traverse back up a parameter hierarchy by searching for the parent parameter returned from a successful search, which returns its parent, until the parent returned is the original aggregate.

PPVL_SELECT_USER_DATA - The_Selection is a generic pointer (void *). The first parameter that has user_data with the identical value is selected.

For array values the Select code is any type value and The_Selection is a pointer to a variable of the corresponding type; for string type codes, however, The_Selection is a pointer to the string (char *), not a pointer to a pointer variable (char **). The special symbol PPVL_TYPE_ANY works like PPVL_SELECT_ANY (they are interchangeable). Also, if The_Selection is PPVL_SELECT_ANY then no data match is done to find a value, only the type match.
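
The absolute and relative pathname rules described for PPVL_SELECT_NAME can be sketched over a simplified parameter tree. Everything here is hypothetical: Sketch_Node keeps only a name and a NULL terminated list of contained parameters, and it is an assumption of this sketch that a simple name is searched for depth-first in the order the parameters are listed. It is not the PPVL_find implementation.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef struct Sketch_node
    {
    const char          *name;
    struct Sketch_node  **parameters;   /* NULL unless an aggregate. */
    }
    Sketch_Node;

/* Compare a name with one path component, ignoring case. */
static int component_matches (const char *name, const char *component, size_t length)
    {
    size_t  i;

    if (strlen (name) != length)
        return 0;
    for (i = 0; i < length; i++)
        if (tolower ((unsigned char)name[i]) != tolower ((unsigned char)component[i]))
            return 0;
    return 1;
    }

/* Find a component in an aggregate's own list; optionally search deeper. */
static Sketch_Node *find_component
    (Sketch_Node *aggregate, const char *component, size_t length, int deep)
    {
    Sketch_Node  **entry, *found;

    if (aggregate == NULL || aggregate->parameters == NULL)
        return NULL;
    for (entry = aggregate->parameters; *entry != NULL; entry++)
        {
        if (component_matches ((*entry)->name, component, length))
            return *entry;
        if (deep && (found = find_component (*entry, component, length, 1)) != NULL)
            return found;
        }
    return NULL;
    }

/* Resolve a pathname of the form [/]Name[/Name[...]] from the top aggregate. */
Sketch_Node *sketch_find_path (Sketch_Node *top, const char *path)
    {
    Sketch_Node  *current = top;
    int          deep;

    assert (top != NULL && path != NULL);
    deep = (*path != '/');      /* Only a relative path starts with a deep search. */
    if (! deep)
        path++;
    while (current != NULL && *path != '\0')
        {
        size_t  length = strcspn (path, "/");
        current = find_component (current, path, length, deep);
        deep = 0;               /* The remaining names must match level by level. */
        path += length;
        if (*path == '/')
            path++;
        }
    return current;
    }
```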


Function: Manageable

Structure manipulation

Both parameter and value structures can be created directly from application software variables using PPVL functions:

PPVL_Parameter      *PPVL_new_parameter
    (
    char            *name,
    PPVL_Class      class,
    PPVL_Object     *content,
    char            *comments,
    void            *user_data
    );

PPVL_Value          *PPVL_new_value
    (
    PPVL_Type       type,
    PPVL_Object     *data,
    int             base,
    char            *units
    );

Each of these functions takes arguments that correspond to the entries in the structures they create and to which they return a pointer. A specific, rather than general, class or type code must be provided. For a parameter, however, the classification code may be NULL if the class of parameter can be determined from the content: if a pointer to a parameter structure is provided an aggregate is created, but if a pointer to a value structure is provided an assignment is created. The data used to create a value structure is treated according to the specified type.

Functions that duplicate existing parameter structures simply take a pointer to the structure to be duplicated and return a pointer to the copy.

Functions are also provided to add or remove parameters or values from other parameters (aggregates) or values (arrays). They each have the same form:

PPVL_Object      *PPVL_{add|remove}
    (
    PPVL_Object  *The_Container,
    PPVL_Object  *The_Object
    );

The Parameter or Value is added to or removed from the Aggregate or Array, and a pointer to the modified structure is returned.

These structure manipulation functions are often used in combination to assemble PVL descriptions or modify those read in from files before they are written out to another file.

While it is possible to construct parameter or value structures directly in static or automatic (stack) memory controlled by the application software, this is not generally recommended. The PPVL library uses dynamically allocated memory throughout. All PPVL objects - parameter structures and aggregate lists, value structures and array lists, and any string contents - are stored in dynamic memory. Though the PPVL library does not reallocate the memory for parameter and value structures, the parameter and value lists for aggregates and arrays may be reallocated whenever a PPVL function needs to resize them. Functions to free all of the memory associated with a parameter or value, including all of its contents, are provided.
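
Why list reallocation matters to the application can be seen in a minimal sketch of appending to a NULL terminated pointer list. The function sketch_list_add is hypothetical and only illustrates the general technique using the standard realloc function; it is not the PPVL add function.

```c
#include <assert.h>
#include <stdlib.h>

/* Append an item to a NULL terminated pointer list held in dynamic memory.
   Returns the list, which may have moved, or NULL if memory could not be
   obtained (in which case the original list is left untouched).
   A NULL list is treated as an empty list. */
void **sketch_list_add (void **list, void *item)
    {
    size_t  count = 0;
    void    **grown;

    assert (item != NULL);    /* NULL is reserved as the list terminator. */
    if (list != NULL)
        while (list[count] != NULL)
            count++;
    /* Room for the existing entries, the new item and the terminator. */
    grown = realloc (list, (count + 2) * sizeof *grown);
    if (grown == NULL)
        return NULL;
    grown[count] = item;
    grown[count + 1] = NULL;
    return grown;
    }
```

Two consequences follow from this technique. A list pointer saved by the application before the call may no longer be valid afterwards, so the returned pointer should always be used. A list that lives in static or automatic memory cannot be passed to realloc at all, which is one reason to let the library allocate its own objects.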

Counting and indexing

The PPVL_count function is provided to count the number of parameters or values in an aggregate or array. The PPVL_count_all function will recursively count any aggregate parameters or array values.

The parameter and value lists of aggregates and arrays are NULL terminated pointer lists. When the application software wishes to treat these lists as pointer arrays it may be desirable to obtain the index of a structure pointer - e.g. as returned from PPVL_find - in this pointer array:

int                 PPVL_index
    (
    PPVL_Object     *The_Container,
    PPVL_Object     *The_Object
    );

The PPVL_index function will find the index of The_Object in The_Container.
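
Counting and indexing a NULL terminated pointer list can be sketched in a few lines. The functions below are hypothetical illustrations that operate on a bare pointer list rather than on a PPVL container, and returning -1 for an object that is not in the list is an assumption of this sketch, not documented PPVL_index behavior.

```c
#include <assert.h>
#include <stddef.h>

/* Number of entries in a NULL terminated pointer list. */
int sketch_count (void **list)
    {
    int     count = 0;

    if (list != NULL)
        while (list[count] != NULL)
            count++;
    return count;
    }

/* Index of an object's pointer in the list, or -1 if it is not there. */
int sketch_index (void **list, const void *object)
    {
    int     index;

    assert (object != NULL);    /* NULL is the terminator, never an entry. */
    if (list != NULL)
        for (index = 0; list[index] != NULL; index++)
            if (list[index] == object)
                return index;
    return -1;
    }
```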

Warnings and errors

Every PPVL function sets the global integer variable PPVL_errno. Initially it is set to PPVL_SUCCESS (0) on entering the function. If an error condition is detected, it is set to an appropriate value before the function cleans up and returns an error status value. The condition, however, may be considered to be only a warning (this may depend on PPVL_strict being FALSE) in which case PPVL_errno is set accordingly but no error return is taken (though some recovery action may be taken). Only the first warning condition in a function is registered. If an error condition occurs after a warning condition has been encountered the error condition takes precedence over the warning.

There are three levels of error codes (four if you count success): fatal errors, syntax errors, and abstract (high level) usage problems. Fatal errors are non-recoverable and always result in a failure status value from the function. Syntax errors are usually recoverable and are treated as warnings if PPVL_strict is FALSE. Abstract errors are treated like syntax errors.

Every error code has a corresponding brief (one short line) descriptive error message contained in the global PPVL_errmsg strings array. The macro PPVL_ERROR_MESSAGE provides the description string for the current value of PPVL_errno (this is commonly used as a printf function argument).
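
The warning and error rules described above can be modeled in a small sketch. All names and code values here are hypothetical (stand-ins for PPVL_errno, PPVL_strict and the real error codes, whose values are not shown here), and sketch_function simply replays a list of conditions as if they had been detected in order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical codes: the real PPVL error code values are not shown here. */
enum
    {
    SKETCH_SUCCESS       = 0,
    SKETCH_FATAL_MEMORY  = 1,      /* Fatal level: below SKETCH_SYNTAX_BASE. */
    SKETCH_SYNTAX_BASE   = 100,
    SKETCH_SYNTAX_QUOTE  = 100,    /* Syntax level. */
    SKETCH_SYNTAX_UNITS  = 101
    };

int sketch_errno  = SKETCH_SUCCESS;   /* Stand-in for PPVL_errno. */
int sketch_strict = 0;                /* Stand-in for PPVL_strict. */

/* Register a detected condition.
   Returns 1 if it is an error (the caller must clean up and return),
   or 0 if it is only a warning (the caller may carry on). */
int sketch_register (int code)
    {
    if (code < SKETCH_SYNTAX_BASE || sketch_strict)
        {
        sketch_errno = code;    /* An error replaces any earlier warning. */
        return 1;
        }
    if (sketch_errno == SKETCH_SUCCESS)
        sketch_errno = code;    /* Only the first warning is registered. */
    return 0;
    }

/* A simulated library function: returns 0 on success (possibly with a
   warning left in sketch_errno) or -1 on an error. */
int sketch_function (const int *conditions, size_t total)
    {
    size_t  i;

    assert (conditions != NULL || total == 0);
    sketch_errno = SKETCH_SUCCESS;    /* Reset on entering the function. */
    for (i = 0; i < total; i++)
        if (sketch_register (conditions[i]))
            return -1;
    return 0;
    }
```

An application following this model checks the return status first and then, on success, inspects the error variable to see whether a warning was left behind.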


Function: Extensible

Selection tables

While the PPVL_find function provides the capabilities to easily find parameters in aggregates, managing the relationship between external parameter names and variable names internal to the application software can be an annoying chore, especially when files from many different projects, each with their own naming scheme for the same parameter values, must be handled. To solve this problem a new function was added to the PPVL library as an extension to the core capabilities:

PPVL_Error_Code                PPVL_selections
    (
    PPVL_Parameter             *The_Aggregate,
    PPVL_Parameter_Selections  *The_Selections
    );

The_Aggregate is a pointer to an aggregate parameter that will be processed using The_Selections table, which is an array of PPVL_Parameter_Selections:

typedef struct
    {
    char        *name;
    PPVL_Type   type;
    int         count;
    void        *value;
    }
    PPVL_Parameter_Selections;

The PPVL_Parameter_Selections table provides a generic way to specify what to look for in an aggregate, and the program variables that will store what is found. The parameter name may be in absolute or relative pathname form. The parameter, if found, must contain values of the specified type, which should be one of the primary types. If an array type is specified it will be ignored and processing will continue, but PPVL_errno will be set to PPVL_ERROR_ILLEGAL_SYNTAX. This is just a warning and can be ignored if desired.

Up to count values of the specified type will be copied from the parameter to sequential variable locations starting at the value address; this should be a pointer to an appropriate variable (or array, or array member). For an integer type an unsigned long will be copied, for a real type a double, and for a string type a pointer to the string. The characters of a string will not be copied. Since arrays of parameter values may have different types for each value, only the values of the selected type are used.

Parameters that are not found result in no change to the value storage for that parameter. So it may be appropriate to initialize value storage with a default value. However, the same value storage may be referred to by more than one parameter name, in which case the value of the last parameter name found, if any, will be placed in the value storage. It may be desirable to initialize value storage with a special value that can be used to indicate that no parameter was found for that variable.

Parameters that are found to have the selected name but do not have the selected type will be silently ignored. However, the same parameter name may be listed more than once with a different selected type (and probably a different value storage reference).

The_Selections list must be NULL terminated. This is easily done by using PPVL_PARAMETER_SELECTIONS_END as the last entry in the list.
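
The copying rules of a selections table can be modeled with a simplified sketch. Everything here is hypothetical: the parameters are a flat array rather than a PPVL aggregate, names are compared exactly rather than as pathnames, and the Sketch_ types and SKETCH_SELECTIONS_END only mirror the layout of PPVL_Parameter_Selections and the role of PPVL_PARAMETER_SELECTIONS_END. It is not the PPVL_selections implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef enum
    {SKETCH_NONE, SKETCH_INTEGER, SKETCH_REAL, SKETCH_STRING} Sketch_Type;

typedef struct
    {
    Sketch_Type     type;
    union {unsigned long integer; double real; char *string;} data;
    }
    Sketch_Value;

typedef struct              /* A flat stand-in for one named parameter. */
    {
    const char          *name;
    const Sketch_Value  *values;
    int                 size;
    }
    Sketch_Parameter;

typedef struct              /* Mirrors PPVL_Parameter_Selections. */
    {
    const char  *name;
    Sketch_Type type;
    int         count;
    void        *value;
    }
    Sketch_Selection;

#define SKETCH_SELECTIONS_END  {NULL, SKETCH_NONE, 0, NULL}

/* Copy up to count values of the selected type into each selection's storage.
   Storage for parameters that are not found is left unchanged. */
void sketch_selections
    (const Sketch_Parameter *parameters, int total, const Sketch_Selection *selections)
    {
    const Sketch_Selection  *select;
    int                     p, v, copied;

    assert (parameters != NULL && selections != NULL);
    for (select = selections; select->name != NULL; select++)
        for (p = 0; p < total; p++)
            {
            if (strcmp (parameters[p].name, select->name) != 0)
                continue;
            copied = 0;
            for (v = 0; v < parameters[p].size && copied < select->count; v++)
                {
                const Sketch_Value  *value = &parameters[p].values[v];
                if (value->type != select->type)
                    continue;    /* Values of other types are not used. */
                if (select->type == SKETCH_INTEGER)
                    ((unsigned long *)select->value)[copied++] = value->data.integer;
                else if (select->type == SKETCH_REAL)
                    ((double *)select->value)[copied++] = value->data.real;
                else if (select->type == SKETCH_STRING)
                    /* The pointer is copied, not the characters. */
                    ((char **)select->value)[copied++] = value->data.string;
                }
            }
    }
```

Initializing the storage before the call, as the text above suggests, lets the application tell a parameter that was not found from one that supplied a value.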

PDS EOL labels

Some PDS image files were found to contain a special description label appended to the end of the file. A special function was provided to handle this case:

PPVL_Parameter      *PPVL_get_PDS_EOL
    (
    FILE            *The_File,
    PPVL_Parameter  *The_Aggregate
    );

If the special EOL parameter is present in The_Aggregate, as well as all the other parameters necessary to determine the location of the EOL label after the end of the image data, then The_File is repositioned at the beginning of the EOL label and it is read in as a new aggregate. This new aggregate is renamed "EOL" and added to the end of The_Aggregate.