Perhaps one of the most common concepts in scientific research is the named parameter that refers to a value of some type:
Name = Value
The Name is a single textual token that provides the referential handle to the value. The choice of parameter names for values is completely arbitrary, though it may be guided by various usage conventions. Thus there is no guarantee that a parameter of any given name will refer to a value of any particular type. While this provides for complete freedom in the use of parameter names, it also allows for considerable confusion.
The Value may be quite simple - for example, a single decimal integer - or very complex - for example, the complete characterization of an instrument - yet the type of value must be specifically defined to avoid any ambiguity in its use. Thus the value definition mechanism must be broad enough to allow the needed freedom in the use of values, while being precise enough that a value cannot be misinterpreted.
A Parameter Value Language (PVL) provides a text syntax for associating a named parameter with a specifically defined type of value. While there are various forms of parameter value language to choose from, the needs of the Planetary Image Research Lab (PIRL) to handle files distributed by the Planetary Data System (PDS) led directly to the PVL specified for use in PDS products.
The image files (and others) distributed on PDS CDs contain an initial label section that provides descriptive information about the image, including parameters necessary for the correct processing of the binary data in the file. The syntax of the PVL used by PDS products has been specified by the Consultative Committee for Space Data Systems in the Blue Book "Parameter Value Language Specification (CCSDS0006,8)", June 2000 [CCSDS 641.0-B-2] and Green Book "Parameter Value Language - A Tutorial", May 1992 [CCSDS 641.0-G-2] documents. PVL has been accepted by the International Standards Organization (ISO) as a Draft Standard (ISO/CD 14961:1997). The Object Definition Language (ODL) used to describe the metadata in Earth Observing System (EOS) image files is also an implementation of PDS PVL, though it is embedded in the HDF-EOS extension to the NCSA HDF format.
The actual form of the PDS PVL, however, has evolved somewhat since it was first used in early products, such as imagery from the Voyager and Viking missions. This resulted in inconsistencies that made it difficult to write software which could handle all PDS products. Before the PPVL was implemented, no single application program interface (API) library that interprets PVL seemed able to read the PVL directly from every PDS product.
The primary design goal for the PIRL Parameter Value Logic (PPVL) was to provide a single C-language API that would correctly handle all PVL in all PDS products. To be an effective solution, the API would need to be simple to use and easy to manage. It should also be flexible enough to go beyond the necessary core capability of reading and writing PVL to offer convenience features and be extensible to meet new demands and special circumstances.
Every parameter is described by a simple structure:
typedef struct PPVL_parameter {
    PPVL_Class classification;
    char *name;
    union {
        struct PPVL_value *value;
        struct PPVL_parameter **parameters;
    } content;
    char *comments;
    void *user_data;
} PPVL_Parameter;
The classification is a bit-field code that indicates the class of
parameter. There are three basic types: a token, an assignment, and
an aggregate. A token is a parameter with no value (other than its
existence). An assignment is a parameter that has been assigned a
value. An aggregate is a parameter composed of additional parameters.
PPVL_Class_Is_<class> macros are provided to determine the parameter
classification before accessing the parameter content. The content of
a parameter is NULL for a token, a pointer to a value structure for
an assignment, or a pointer to a list of parameter structure pointers
for an aggregate. The name of a parameter is the string of characters
that gives the reference name to the parameter. Note that the meaning
of the name, if any, is completely the responsibility of the
application software (e.g. in certain contexts a special initial
character implies that the parameter's value has special units). If
the PVL included one or more lines of non-interpreted comments before
the definition of the parameter, they are collected in the comments
text. The user_data pointer is provided so the software developer can
link any other information to the parameter that may be useful to the
application.
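As a minimal sketch of checking the classification before touching the content, the fragment below counts the assignments at any depth of a parameter hierarchy. The structure layout is taken from the definition above; the PPVL_Class typedef, the DEMO_CLASS_* bit values and the DEMO_IS_* macros are hypothetical stand-ins for the real classification codes and PPVL_Class_Is_<class> macros, and the value structure is reduced to a single integer.

```c
#include <stddef.h>

/* Hypothetical bit codes; the real PPVL_Class values are not shown here. */
typedef unsigned int PPVL_Class;
#define DEMO_CLASS_TOKEN      0x1u
#define DEMO_CLASS_ASSIGNMENT 0x2u
#define DEMO_CLASS_AGGREGATE  0x4u
#define DEMO_IS_ASSIGNMENT(c) (((c) & DEMO_CLASS_ASSIGNMENT) != 0)
#define DEMO_IS_AGGREGATE(c)  (((c) & DEMO_CLASS_AGGREGATE) != 0)

/* Reduced stand-in for the value structure. */
struct PPVL_value { unsigned long integer; };

typedef struct PPVL_parameter {
    PPVL_Class classification;
    char *name;
    union {
        struct PPVL_value *value;
        struct PPVL_parameter **parameters;   /* NULL-terminated list */
    } content;
    char *comments;
    void *user_data;
} PPVL_Parameter;

/* Count the assignments at or below a parameter, at any depth. */
size_t demo_count_assignments(const PPVL_Parameter *p)
{
    size_t total = 0;
    if (p == NULL)
        return 0;
    if (DEMO_IS_ASSIGNMENT(p->classification))
        return 1;
    if (DEMO_IS_AGGREGATE(p->classification) &&
        p->content.parameters != NULL) {
        PPVL_Parameter **entry;
        for (entry = p->content.parameters; *entry != NULL; entry++)
            total += demo_count_assignments(*entry);
    }
    return total;   /* a token has no content and contributes nothing */
}
```

The union is only read after the classification has selected which member is meaningful, which is the access pattern the classification macros are meant to support.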
Values
Every value is described by a simple structure:
typedef struct PPVL_value {
    PPVL_Type type;
    union {
        unsigned long integer;
        double real;
        char *string;
        struct PPVL_value **array;
    } data;
    int base;
    char *units;
} PPVL_Value;
The type of a value is a bit-field code that indicates the type of
the data value. There are three general types of value: numeric,
string, or array. Each of these will be one of the specific types:
numeric data may be integer or real; a string of characters may be an
identifier, a symbol, or text; and an array may be a set or sequence.
While the specific type of string or array may be of interest to the
application (the details of the PVL syntax differ in each case),
they are treated identically by the PPVL. As for a parameter
classification, PPVL_Type_Is_<type> macros are provided to determine
the type of value before the data is accessed.
Normally integers are represented in decimal form in PVL statements.
But it may be more convenient to represent integer values using
some other base - for example binary for bit fields - so PVL
offers a notation for specifying an alternate base which, if used,
is stored in the base variable. This is only used by PPVL to
generate the appropriate PVL notation from the value. The units
string is a feature of the PVL which may be optionally applied to
any value and is the responsibility of the application software to
use or not. It's worth noting that the data of an array type value
is a pointer to a list of value structure pointers, which allows
each specific value of an array to be a different type of value,
including another array.
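The way the type, base and units fields work together can be sketched with a small formatter that turns a value structure back into PVL-style text. The structure is the one defined above; the PPVL_Type typedef and the DEMO_TYPE_* bit values are hypothetical, as are all demo_* function names, and the output notation (radix#digits# for based integers, units in angle brackets) follows the CCSDS syntax as understood here rather than PPVL's actual writer.

```c
#include <stdio.h>
#include <string.h>

typedef unsigned int PPVL_Type;            /* hypothetical bit codes */
#define DEMO_TYPE_INTEGER    0x01u
#define DEMO_TYPE_REAL       0x02u
#define DEMO_TYPE_IDENTIFIER 0x04u
#define DEMO_TYPE_SYMBOL     0x08u
#define DEMO_TYPE_TEXT       0x10u
#define DEMO_TYPE_SET        0x20u
#define DEMO_TYPE_SEQUENCE   0x40u
#define DEMO_TYPE_STRING (DEMO_TYPE_IDENTIFIER | DEMO_TYPE_SYMBOL | DEMO_TYPE_TEXT)
#define DEMO_TYPE_ARRAY  (DEMO_TYPE_SET | DEMO_TYPE_SEQUENCE)

typedef struct PPVL_value {
    PPVL_Type type;
    union {
        unsigned long integer;
        double real;
        char *string;
        struct PPVL_value **array;         /* NULL-terminated list */
    } data;
    int base;
    char *units;
} PPVL_Value;

/* Append text to a NUL-terminated buffer without overflowing it. */
void demo_append(char *out, size_t size, const char *text)
{
    size_t used = strlen(out);
    if (used + 1 < size)
        strncat(out, text, size - used - 1);
}

/* Write the digits of n in the given base (2..16), most significant first. */
void demo_digits(unsigned long n, int base, char *digits)
{
    char reversed[sizeof(unsigned long) * 8 + 1];
    size_t count = 0;
    do {
        reversed[count++] = "0123456789ABCDEF"[n % (unsigned long)base];
        n /= (unsigned long)base;
    } while (n != 0);
    while (count > 0)
        *digits++ = reversed[--count];
    *digits = '\0';
}

/* Append the PVL text for a value to out, which must already hold a string. */
void demo_format_value(const PPVL_Value *v, char *out, size_t size)
{
    char piece[96];
    if (v->type & DEMO_TYPE_INTEGER) {
        if (v->base >= 2 && v->base <= 16 && v->base != 10) {
            char digits[sizeof(unsigned long) * 8 + 1];
            demo_digits(v->data.integer, v->base, digits);
            snprintf(piece, sizeof piece, "%d#%s#", v->base, digits);
        } else
            snprintf(piece, sizeof piece, "%lu", v->data.integer);
        demo_append(out, size, piece);
    } else if (v->type & DEMO_TYPE_REAL) {
        snprintf(piece, sizeof piece, "%#G", v->data.real);
        demo_append(out, size, piece);
    } else if (v->type & DEMO_TYPE_STRING) {
        const char *quote = (v->type & DEMO_TYPE_SYMBOL) ? "'"
                          : (v->type & DEMO_TYPE_TEXT) ? "\"" : "";
        demo_append(out, size, quote);
        demo_append(out, size, v->data.string);
        demo_append(out, size, quote);
    } else if (v->type & DEMO_TYPE_ARRAY) {
        int is_set = (v->type & DEMO_TYPE_SET) != 0;
        PPVL_Value **entry;
        demo_append(out, size, is_set ? "{" : "(");
        for (entry = v->data.array; entry != NULL && *entry != NULL; entry++) {
            if (entry != v->data.array)
                demo_append(out, size, ", ");
            demo_format_value(*entry, out, size);   /* entries may differ in type */
        }
        demo_append(out, size, is_set ? "}" : ")");
    }
    if (v->units != NULL) {
        demo_append(out, size, " <");
        demo_append(out, size, v->units);
        demo_append(out, size, ">");
    }
}
```

Because each array entry carries its own type code, a single sequence can mix a based integer, a real with units and a quoted symbol, and the recursion handles nested arrays without any special case.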
The philosophy guiding the development of the PPVL is that the
application software developer wants to get the information
contained in the PVL and is not so concerned with its syntax.
While this may seem obvious, it has resulted in a policy of being
very tolerant in the implementation of the language parser, rather
than requiring strict adherence to a rigid specification; i.e. the
intent is to "get the information" if possible, and let the user
decide what to do with it. For example, PPVL will successfully
parse this text you are reading now. All the information in even
the most complex PVL is still obtained - including any syntax
details such as the distinction between symbols (in single quotes)
and text (in double quotes) and sets (within curly braces) and
sequences (within parentheses) - though some interpretations are
left to the application or other API libraries (e.g. the parsing
of date and time strings). As long as the PVL syntax is
reasonable, correct results will be provided, while unreasonable
syntax will usually still return results that are easily
recognizable as unreasonable. Strict conformance to the CCSDS
specification can be turned on, however, if that is needed.
Input data formats
Beyond tolerance of the PVL syntax used, PPVL is able to handle
the "variable" record format data of the Voyager files produced on
VMS systems, which contain embedded binary record size data amongst
the ASCII text. Some files do not contain a PVL END statement or
any other marker delimiting the image label from the image data.
Instead the image label contains a PVL parameter that provides the
size of the label. Since the PPVL avoids any knowledge of specific
product parameters, it offers several tactics to deal with difficult
image data formats. While ingesting data from a file for PVL
parsing, if no parameter can be formed, interpretation of the file
stops. This is usually sufficient. However, the PPVL parser can
also be given a limit on how much file data to read. On the
assumption that the necessary label size parameter is near the
front of the file, a relatively small limit can be set for a first
pass; once the application has obtained the label size, PPVL can be
given that size as the limit to interpret all of the PVL.
Using application knowledge, data can also be read from the file
by some other means and passed to PPVL for interpretation (this
is how, for example, EOS metadata would be handled).
Obtaining PVL parameters from a file usually requires only a
single call to the PPVL_read_aggregate function:
PPVL_Parameter *PPVL_read_aggregate ( FILE *File, unsigned long Read_Limit, unsigned long *Total_Scanned );
The File is a data stream that has been opened by the application.
If the Read_Limit is zero (NULL) then the PPVL will obtain as many
parameters from the file, starting at its current input location, as
possible. If the Total_Scanned argument is not NULL then the variable
it points to will receive the total number of data bytes scanned by
the parser. The function returns a pointer to an aggregate class
parameter named "The_Container" which contains all of the parameters
found in the file. The file is positioned (if it is seekable) after
the last PVL parameter found, which will be the same as the
Total_Scanned byte offset if the file was read from the beginning.
A string of data may be passed to the PPVL_scan_aggregate function
for parsing:
PPVL_Parameter *PPVL_scan_aggregate ( char *The_String, char **Next_Character );
This function also returns an aggregate class parameter named
"The_Container" which contains all of the parameters found in
The_String. The character pointer variable pointed to by
Next_Character, if not NULL, is set to refer to the character of the
string immediately following the last parameter that was formed
(which may be the end of string marker).
The PPVL functions that assemble individual parameters, values
and their components are not usually used directly, though they
may be for special needs. All of these functions follow the same
pattern of being passed a string of data to parse and returning
their appropriate component along with the location in the string
where they left off.
Output data format
The PPVL_write_parameter function is used to generate formatted PVL
output:
PPVL_Error_Code PPVL_write_parameter ( PPVL_Parameter *The_Parameter, FILE *The_File, int Indent_Level );
The_Parameter may be a pointer to any parameter structure, though it
is typically an aggregate class parameter that contains all the
parameters to be written. If the name of the aggregate is
"The_Container" then only the parameters it contains will be written.
The PVL is written to The_File, a data stream which has been opened
by the application, which may be NULL to indicate output to stdout.
PVL parameters are written one to a line, with Indent_Level
indicating the number of tab characters to put at the beginning of
each line. The Indent_Level is increased by one for each recursively
processed aggregate parameter encountered. If the Indent_Level is
negative, however, no indenting is done.
The PVL produced by PPVL strictly conforms to the CCSDS
specification. However, a text representation of a real number may
not carry the full precision of the internal value. By default real
values are output using the "%#G" format. However, the
PPVL_real_number_format global variable may be assigned a pointer to
an alternate format string. For example, "%#.15E" might be used to
provide scientific notation with 15 digits of precision. The format
string must be used cautiously or invalid PVL output could result.
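The effect of the two format strings mentioned above can be checked with standard C alone. The helper below is a hypothetical illustration (demo_format_real is not a PPVL function); it simply applies a printf-style format to a double.

```c
#include <stdio.h>

/* Hypothetical helper: render a real number with the given printf format. */
int demo_format_real(char *out, size_t size, const char *format, double x)
{
    return snprintf(out, size, format, x);
}
```

With the value 1.0 / 3.0, "%#G" yields 0.333333 while "%#.15E" yields 3.333333333333333E-01, so the default keeps only six significant digits. The # flag matters for whole numbers: 2.0 is written as 2.00000 with "%#G" but as 2 with plain "%G", which would read back as an integer.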
Searching
Finding a specific parameter or value is a necessary task. A function has been provided to easily accomplish this:
PPVL_Object *PPVL_find ( PPVL_Object *The_Container, PPVL_Object *Last_Object, int Select, PPVL_Object *The_Selection, PPVL_Object **The_Parent );
The_Container is either an aggregate parameter or an array value
that will be searched for The_Selection following the Last_Object
specified, which may be the symbol PPVL_SEARCH_FROM_THE_TOP to begin
with the first object in The_Container list. A pointer to the first
object found in the list that matches The_Selection is returned, and
if The_Parent argument is not NULL a pointer to the found object's
parent object is placed in the referenced variable.
Select is a symbol code that indicates what The_Selection is. For
aggregate parameters the Select codes are:
PPVL_SELECT_ANY - The next parameter after the Last_Object (or the
first parameter if it is PPVL_SEARCH_FROM_THE_TOP) is selected. This
allows a parameter hierarchy to be stepped through in the order in
which the corresponding PVL statements appear externally.
PPVL_SELECT_NAME - The_Selection is a string that is compared with
the parameter name. The_Selection string may be a simple name, which
will match the first occurrence of a parameter with the same name
(the comparison is case insensitive). It may also be a pathname of
the form:
[/]Name[/Name[...]]
A pathname beginning with a forward slash is "absolute": the first Name must be found in the top aggregate's parameter list, the next Name must then be found in the parameter list of that aggregate, and so on. This is the same as file pathnames in Unix. A pathname that does not begin with a forward slash is "relative": the first Name is searched for like a simple name, and the remainder of the path is then searched for relative to the location of the first parameter found.
PPVL_SELECT_CLASS - The_Selection is a parameter classification code
(the value itself, not a pointer to the value). Any classification
code is valid.
PPVL_SELECT_PARAMETER - The_Selection is a pointer to a specific
parameter structure to be found. This is typically used to traverse
back up a parameter hierarchy by searching for the parent parameter
returned from a successful search, which returns its parent, until
the parent returned is the original aggregate.
PPVL_SELECT_USER_DATA - The_Selection is a generic pointer (void *).
The first parameter that has user_data with the identical value is
selected.
For array values the Select code is any type value and The_Selection
is a pointer to a variable of the corresponding type; for string type
codes, however, The_Selection is a pointer to the string (char *),
not a pointer to a pointer variable (char **). The special symbol
PPVL_TYPE_ANY works like PPVL_SELECT_ANY (they are interchangeable).
Also, if The_Selection is PPVL_SELECT_ANY then no data match is done
to find a value, only the type match.
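The absolute and relative pathname rules described for PPVL_SELECT_NAME can be sketched on a toy tree. Everything here is hypothetical: Demo_Node stands in for the parameter structure, the demo_* functions are not PPVL functions, and the choice to continue with later parameters when the remainder of a relative path fails to resolve is an assumption, since the text does not say what PPVL does in that case.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

typedef struct demo_node {
    const char *name;
    struct demo_node **children;    /* NULL-terminated; NULL if not an aggregate */
} Demo_Node;

/* Compare a node name with one path component of the given length, ignoring case. */
int demo_name_matches(const char *name, const char *part, size_t length)
{
    size_t i;
    for (i = 0; i < length; i++)
        if (name[i] == '\0' ||
            tolower((unsigned char)name[i]) != tolower((unsigned char)part[i]))
            return 0;
    return name[length] == '\0';
}

/* Follow each component of path through successive child lists. */
Demo_Node *demo_resolve(Demo_Node *from, const char *path)
{
    while (from != NULL && *path != '\0') {
        size_t length = strcspn(path, "/");
        Demo_Node **child, *match = NULL;
        for (child = from->children; child != NULL && *child != NULL; child++)
            if (demo_name_matches((*child)->name, path, length)) {
                match = *child;
                break;
            }
        from = match;
        path += length;
        if (*path == '/')
            path++;
    }
    return from;
}

/* Absolute paths start at top; relative paths start at the first
   parameter, in depth-first order, whose name matches the first component. */
Demo_Node *demo_find(Demo_Node *top, const char *path)
{
    Demo_Node **child, *found;
    size_t length;
    if (*path == '/')
        return demo_resolve(top, path + 1);
    length = strcspn(path, "/");
    for (child = top->children; child != NULL && *child != NULL; child++) {
        if (demo_name_matches((*child)->name, path, length)) {
            const char *rest = path + length;
            if (*rest == '/')
                rest++;
            found = demo_resolve(*child, rest);
        } else
            found = demo_find(*child, path);
        if (found != NULL)
            return found;
    }
    return NULL;
}
```

For a container holding HEADER (with LINES) followed by IMAGE (with LINES and SAMPLES), "lines" finds the LINES inside HEADER, "/image/lines" finds the one inside IMAGE, and "/LINES" finds nothing because no parameter of that name sits directly in the top list.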
Function: Manageable
Structure manipulation
Both parameter and value structures can be created directly from application software variables using PPVL functions:
PPVL_Parameter *PPVL_new_parameter ( char *name, PPVL_Class class, PPVL_Object *content, char *comments, void *user_data );
PPVL_Value *PPVL_new_value ( PPVL_Type type, PPVL_Object *data, int base, char *units );
Each of these functions takes arguments that correspond to the
entries in the structure it creates, and returns a pointer to that
structure. A specific, rather than general, class or type code must
be provided. For a parameter, however, the classification code may
be NULL if the class of parameter can be determined from the
content: if a pointer to a parameter structure is provided an
aggregate is created, but if a pointer to a value structure is
provided an assignment is created. The data used to create a value
structure is treated according to the specified type.
Functions that duplicate existing parameter structures simply take a pointer to the structure to be duplicated and return a pointer to the copy.
Functions are also provided to add or remove parameters or values from other parameters (aggregates) or values (arrays). They each have the same form:
PPVL_Object *PPVL_{add|remove} ( PPVL_Object *The_Container, PPVL_Object *The_Object );
The Parameter or Value is added or removed from the Aggregate or Array and a pointer to the modified structure is returned.
These structure manipulation functions are often used in combination to assemble PVL descriptions or modify those read in from files before they are written out to another file.
While it is possible to construct parameter or value structures
directly in static or automatic (stack) memory controlled by the
application software, this is not generally recommended. The PPVL
library uses dynamically allocated memory throughout. All PPVL
objects - parameter structures and aggregate lists, value
structures and array lists, and any string contents - are stored
in dynamic memory. Though the PPVL library does not reallocate
the memory for parameter and value structures, parameter and value
lists for aggregates and arrays may be reallocated whenever a PPVL
function needs to resize them. Functions to free all of the memory
associated with a parameter or value, including all of its
contents, are provided.
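Why lists held in static or stack memory are a problem can be seen from a generic sketch of appending to a NULL-terminated pointer list with realloc. This is the general technique, not PPVL's own code, and demo_list_add is a hypothetical name.

```c
#include <stdlib.h>

/* Append item to a NULL-terminated pointer list, growing it as needed.
   Returns the (possibly moved) list, or NULL if memory is exhausted,
   in which case the original list is left untouched. */
void **demo_list_add(void **list, void *item)
{
    size_t count = 0;
    void **grown;
    if (list != NULL)
        while (list[count] != NULL)
            count++;
    /* Room for the existing entries, the new one and the terminator. */
    grown = realloc(list, (count + 2) * sizeof *grown);
    if (grown == NULL)
        return NULL;
    grown[count] = item;
    grown[count + 1] = NULL;
    return grown;
}
```

Since realloc may move the block, any copy of the old list pointer kept by the application can become invalid after an add, and passing a list that was not obtained from malloc or realloc would be undefined behavior. The items the list points to are not moved, which matches the statement that the structures themselves are not reallocated.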
Counting and indexing
The PPVL_count function is provided to count the number of
parameters or values in an aggregate or array. The PPVL_count_all
function will recursively count any aggregate parameters or array
values.
The parameter and value lists of aggregates and arrays are NULL
terminated pointer lists. When the application software wishes to
treat these lists as pointer arrays it may be desirable to obtain the
index of a structure pointer - e.g. as returned from PPVL_find - in
this pointer array:
int PPVL_index ( PPVL_Object *The_Container, PPVL_Object *The_Object );
The PPVL_index function will find the index of The_Object in
The_Container.
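Counting and indexing a NULL-terminated pointer list amounts to the following minimal sketch. The demo_* names are hypothetical, and returning -1 when the object is absent is an assumption; the text does not say what PPVL_index returns in that case.

```c
#include <stddef.h>

/* Number of entries before the NULL terminator. */
size_t demo_count(void *const *list)
{
    size_t count = 0;
    if (list != NULL)
        while (list[count] != NULL)
            count++;
    return count;
}

/* Index of item in the list, or -1 if it is not present (assumed convention). */
int demo_index(void *const *list, const void *item)
{
    int index;
    if (list == NULL)
        return -1;
    for (index = 0; list[index] != NULL; index++)
        if (list[index] == item)
            return index;
    return -1;
}
```

The comparison is on pointer identity, not on contents, so two distinct structures holding equal data have different indexes.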
Warnings and errors
Every PPVL function sets the global integer variable PPVL_errno. It
is set to PPVL_SUCCESS (0) on entering the function. If an error
condition is detected, it is set to an appropriate value before the
function cleans up and returns an error status value. The condition,
however, may be considered to be only a warning (this may depend on
PPVL_strict being FALSE), in which case PPVL_errno is set accordingly
but no error return is taken (though some recovery action may be
taken). Only the first warning condition in a function is registered.
If an error condition occurs after a warning condition has been
encountered, the error condition takes precedence over the warning.
There are three levels of error codes (four if you count
success): fatal errors, syntax errors, and abstract (high level)
usage problems. Fatal errors are non-recoverable and always result
in a failure status value from the function. Syntax errors are
usually recoverable and are treated as warnings if PPVL_strict is
FALSE. Abstract errors are treated like syntax errors.
Every error code has a corresponding brief (one short line)
descriptive error message contained in the global PPVL_errmsg array
of strings. The macro PPVL_ERROR_MESSAGE provides the description
string for the current value of PPVL_errno (this is commonly used as
a printf function argument).
Function: Extensible
Selection tables
While the PPVL_find function provides the capabilities to easily
find parameters in aggregates, managing the relationship between
external parameter names and variable names internal to the
application software can be an annoying chore, especially when files
from many different projects, each with their own naming scheme for
the same parameter values, must be handled. To solve this problem a
new function was added to the PPVL library as an extension to the
core capabilities:
PPVL_Error_Code PPVL_selections ( PPVL_Parameter *The_Aggregate, PPVL_Parameter_Selections *The_Selections );
The_Aggregate is a pointer to an aggregate parameter that will be
processed using The_Selections table, which is an array of
PPVL_Parameter_Selections:
typedef struct {
    char *name;
    PPVL_Type type;
    int count;
    void *value;
} PPVL_Parameter_Selections;
The PPVL_Parameter_Selections table provides a generic way to
specify what to look for in an aggregate, and the program variables
that will store what is found. The parameter name may be in absolute
or relative pathname form. The parameter, if found, must contain
values of the specified type, which should be one of the primary
types. If an array type is specified it will be ignored and
processing will continue, but PPVL_errno will be set to
PPVL_ERROR_ILLEGAL_SYNTAX. This is just a warning and can be ignored
if desired.
Up to count values of the specified type will be copied from the
parameter to sequential variable locations starting at the value
address; this should be a pointer to an appropriate variable (or
array, or array member). For an integer type an unsigned long will be
copied, for a real type a double, and for a string type a pointer to
the string. The characters of a string will not be copied. Since
arrays of parameter values may have different types for each value,
only the values of the selected type are used.
Parameters that are not found result in no change to the value storage for that parameter. So it may be appropriate to initialize value storage with a default value. However, the same value storage may be referred to by more than one parameter name, in which case the value of the last parameter name found, if any, will be placed in the value storage. It may be desirable to initialize value storage with a special value that can be used to indicate that no parameter was found for that variable.
Parameters that are found to have the selected name but do not have the selected type will be silently ignored. However, the same parameter name may be listed more than once with a different selected type (and probably a different value storage reference).
The_Selections list must be NULL terminated. This is easily done by
using PPVL_PARAMETER_SELECTIONS_END as the last entry in the list.
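The behavior of a selection table can be sketched with a toy implementation over a flat list of parameters. The PPVL_Parameter_Selections layout is the one given above; the PPVL_Type typedef, the DEMO_TYPE_* codes, the Demo_Value and Demo_Parameter types and demo_selections are all hypothetical. For brevity this sketch compares names exactly (no pathnames, no case folding) and treats an entry with a NULL name as the terminator, which is only an assumption about what PPVL_PARAMETER_SELECTIONS_END provides.

```c
#include <stddef.h>
#include <string.h>

typedef unsigned int PPVL_Type;            /* hypothetical bit codes */
#define DEMO_TYPE_INTEGER 0x1u
#define DEMO_TYPE_REAL    0x2u
#define DEMO_TYPE_STRING  0x4u

typedef struct {
    char *name;
    PPVL_Type type;
    int count;
    void *value;
} PPVL_Parameter_Selections;

typedef struct {
    PPVL_Type type;
    union { unsigned long integer; double real; char *string; } data;
} Demo_Value;

typedef struct {
    const char *name;
    const Demo_Value *values;              /* possibly of mixed types */
    int n_values;
} Demo_Parameter;

/* Copy matching values into the variables named by the table. */
void demo_selections(const Demo_Parameter *params, int n_params,
                     const PPVL_Parameter_Selections *table)
{
    for (; table->name != NULL; table++) {
        int p;
        for (p = 0; p < n_params; p++) {
            int v, copied = 0;
            if (strcmp(params[p].name, table->name) != 0)
                continue;
            for (v = 0; v < params[p].n_values && copied < table->count; v++) {
                const Demo_Value *val = &params[p].values[v];
                if (!(val->type & table->type))
                    continue;              /* other types are skipped silently */
                if (table->type & DEMO_TYPE_INTEGER)
                    ((unsigned long *)table->value)[copied] = val->data.integer;
                else if (table->type & DEMO_TYPE_REAL)
                    ((double *)table->value)[copied] = val->data.real;
                else                       /* the pointer is copied, not the characters */
                    ((char **)table->value)[copied] = val->data.string;
                copied++;
            }
        }
    }
}
```

A variable whose parameter is never found keeps whatever it held before the call, so initializing it with a default or a sentinel value lets the application tell afterwards whether anything was stored.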
PDS EOL labels
Some PDS image files were found to contain a special description label appended to the end of the file. A special function was provided to handle this case:
PPVL_Parameter *PPVL_get_PDS_EOL ( FILE *The_File, PPVL_Parameter *The_Aggregate );
If the special EOL parameter is present in The_Aggregate, as well as
all the other parameters necessary to determine the location of the
EOL label after the end of the image data, then The_File is
repositioned at the beginning of the EOL label and it is read in as a
new aggregate. This new aggregate is renamed "EOL" and added to the
end of The_Aggregate.