PIRL

PIRL.PVL
Class Parser

java.lang.Object
  extended by PIRL.Strings.String_Buffer
      extended by PIRL.Strings.String_Buffer_Reader
          extended by PIRL.PVL.Parser

public class Parser
extends String_Buffer_Reader

The Parser extends the String_Buffer_Reader to interpret the characters as a sequence of Parameter Value Language (PVL) syntax statements.

This Parser implements the syntax of the PVL used by the Planetary Data System (PDS) as specified by the Consultative Committee for Space Data Systems in the Blue Book "Parameter Value Language Specification (CCSDS0006,8)", June 2000 [CCSDS 641.0-B-2] and Green Book "Parameter Value Language - A Tutorial", May 1992 [CCSDS 641.0-G-1] documents. PVL has been accepted by the International Organization for Standardization (ISO) as a Draft Standard (ISO/CD 14961:1997). The PVL syntax defines a Parameter with this basic format:

[Comments]
Name [= Value][;]

The optional Comments are enclosed in C-style delimiters, or optionally preceded by a crosshatch ('#') character on each line. The PVL syntax for a Value follows this format:

[(|{]Datum [<Units>][, Datum [...]][)|} [<Units>]]

The purpose of a Parser object is to assemble Parameter and Value objects using the PVL statements obtained from the associated String_Buffer_Reader.

The class methods that perform the parsing of the character source are organized into a hierarchy in which higher level methods utilize lower level methods to assemble their constituent parts.

At the top level an aggregate of all Parameters that can be interpreted from the input is collected by getting as many Parameters as possible. A Parameter is produced from the input stream by getting any comments, a name String, and a Value. A Value includes as many datum and optional units descriptions as can be sequentially found in the input stream. A datum is composed from primitive syntactic elements, including integer or real number representations or character strings, which may be quoted (a Parameter name may also be a quoted string).

Typically applications will only use the top level method(s). Applications needing finer grained control over input stream parsing may, of course, use the lower level methods directly; however, it is much easier to just get all of the Parameters from an input source and then manipulate the Parameter-Value object hierarchy.

Each method that parses the input stream interprets the contents of the logical String_Buffer_Reader, which presents the entire virtual contents of the stream from the current location onwards. Except for the top level, which does not interpret the character source directly, these methods first seek forward from the current location to the beginning of a potentially relevant syntactic character sequence. If the sequence is recognized as suitable for the item the method is responsible for interpreting, then the appropriate end of the sequence is found and the characters it contains are translated into the corresponding internal object form. If the translation is successful, then the logical Next_Location of the String_Buffer_Reader is moved forward to the end of the sequence before the iterative interpretation of the stream continues. If, however, the beginning of a recognizable syntactic sequence is not found, or the translation of a sequence fails, then the method returns empty handed, without having advanced the logical String_Buffer_Reader, to the invoking method, which may invoke a different method in an attempt to get a different item or itself discontinue its efforts to assemble an item. Thus each parsing method either gets its item and advances the current location in the character stream, or neither gets its item nor advances the stream; i.e. the PVL statements encountered in the character source are sequentially translated at the same time that the input stream is incrementally moved forward.
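The "get the item and advance, or get nothing and stay put" contract described above can be illustrated with a minimal, self-contained sketch. The CursorSketch class and its method names are hypothetical stand-ins written only against the JDK; they are not the String_Buffer_Reader or Parser implementation, and the integer and quoted-string rules are deliberately simplified.

```java
/** Hypothetical cursor over a fixed string; not the PIRL String_Buffer_Reader. */
class CursorSketch {
    private final String source;
    private int next = 0; // plays the role of a "next location"

    CursorSketch(String source) { this.source = source; }

    int location() { return next; }

    /** Seeks past whitespace from the given index without committing anything. */
    private int skipWhitespace(int index) {
        while (index < source.length()
                && Character.isWhitespace(source.charAt(index))) index++;
        return index;
    }

    /** Returns the next integer and advances, or returns null and stays put. */
    Long getInteger() {
        int start = skipWhitespace(next);
        int end = start;
        if (end < source.length()
                && (source.charAt(end) == '-' || source.charAt(end) == '+')) end++;
        int firstDigit = end;
        while (end < source.length()
                && source.charAt(end) >= '0' && source.charAt(end) <= '9') end++;
        if (end == firstDigit) return null;   // no recognizable sequence: do not advance
        long value;
        try {
            value = Long.parseLong(source.substring(start, end));
        } catch (NumberFormatException e) {
            return null;                      // translation failed: do not advance
        }
        next = end;                           // success: commit the new location
        return value;
    }

    /** Returns the next double-quoted string and advances, or returns null and stays put. */
    String getQuoted() {
        int start = skipWhitespace(next);
        if (start >= source.length() || source.charAt(start) != '"') return null;
        int close = source.indexOf('"', start + 1);
        if (close < 0) return null;           // unterminated: do not advance
        next = close + 1;
        return source.substring(start + 1, close);
    }

    public static void main(String[] args) {
        CursorSketch cursor = new CursorSketch("  42 \"hello\" x");
        System.out.println(cursor.getQuoted());   // fails first: an integer is next
        System.out.println(cursor.getInteger());
        System.out.println(cursor.getQuoted());
    }
}
```

Because a failed attempt leaves the location untouched, a caller can try one kind of item after another at the same position, which is the behavior the paragraph above relies on.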

Version:
1.35
Author:
Bradford Castalia, UA/PIRL
See Also:
String_Buffer_Reader, Parameter, Value

Field Summary
static boolean All_Values_Strings_Default
          The default for treating all values as strings.
static String CHARACTER_ENCODING
          The PVL character encoding: "US-ASCII".
static String COMMENT_END_DELIMITERS
          Marks the end of a comment string: '*' and '/'.
static String COMMENT_START_DELIMITERS
          Marks the start of a comment string: '/' and '*'.
static String CONTAINER_NAME
          The default name of the aggregate Parameter to contain all Parameters when a Parser Get finds more than one Parameter: "The Container".
static char CROSSHATCH
          Begins a "crosshatch comment" that extends to the end of the line: '#'.
static boolean Crosshatch_Comments_Default
          The default for allowing crosshatched-to-EOL comments.
static String DATE_TIME_DELIMITERS
          Set of characters that suggests a DATE_TIME type of STRING Value: "-:".
static String ID
          Class name and version identification.
static String LINE_BREAK
          Character sequence that separates PVL statement lines: "\r\n" (CR-NL).
static char NUMBER_BASE_DELIMITER
          Encloses the datum of a Value in radix base notation: '#'.
static char PARAMETER_NAME_DELIMITER
          Delimits a Parameter name from its Value: '='.
static char PARAMETER_VALUE_DELIMITER
          Delimits elements of an ARRAY Value: ','.
static String RESERVED_CHARACTERS
          Characters reserved by the PVL syntax.
static char SEQUENCE_END_DELIMITER
          Marks the end of a SEQUENCE ARRAY Value: ')'.
static char SEQUENCE_START_DELIMITER
          Marks the start of a SEQUENCE ARRAY Value: '('.
static char SET_END_DELIMITER
          Marks the end of a SET ARRAY Value: '}'.
static char SET_START_DELIMITER
          Marks the start of a SET ARRAY Value: '{'.
static char STATEMENT_CONTINUATION_DELIMITER
          Indicates that the statement continues in the next record: '&'.
static char STATEMENT_END_DELIMITER
          Marks the end of a PVL statement: ';'.
static boolean Strict_Default
          The default for enforcing strict PVL syntax rules.
static boolean String_Continuation_Default
          String continuation default.
static char STRING_CONTINUATION_DELIMITER
          Indicates that the quoted string continues unbroken in the next record: '-'.
static char SYMBOL_DELIMITER
          Encloses a SYMBOL STRING Value: '\''.
static char TEXT_DELIMITER
          Encloses a TEXT STRING Value: '"'.
static char UNITS_END_DELIMITER
          Marks the end of a Value units description: '>'.
static char UNITS_START_DELIMITER
          Marks the start of a Value units description: '<'.
static String VERBATIM_STRING_DELIMITERS
          Encloses a verbatim (uninterpreted) string: "\\v".
static boolean Verbatim_Strings_Default
          Verbatim strings default.
static String WHITESPACE
          Set of "whitespace" characters between PVL tokens: " \t\r\n\f\013" (SP, HT, CR, NL, FF, and VT).
 
Fields inherited from class PIRL.Strings.String_Buffer_Reader
DEFAULT_READ_LIMIT, DEFAULT_SIZE_INCREMENT, INVALID_CHARACTER, NO_READ_LIMIT
 
Fields inherited from class PIRL.Strings.String_Buffer
QUESTIONABLE_CHARACTER
 
Constructor Summary
Parser()
          Creates a Parser with no source of PVL statements.
Parser(char[] char_array)
          Creates a Parser using a character array as the source of PVL statements.
Parser(char[] char_array, int offset, int length)
          Creates a Parser using a character array subset as the source of PVL statements.
Parser(File file)
          Creates a Parser using a File to create a new Reader as the source of PVL statements, with no limit on the amount to read.
Parser(File file, long read_limit)
          Creates a Parser using a File to create a new Reader as the source of PVL statements, and sets a limit on the amount to read.
Parser(InputStream input_stream)
          Creates a Parser using an InputStream to create a new Reader as the source of PVL statements, with no limit on the amount to read.
Parser(InputStream input_stream, long read_limit)
          Creates a Parser using an InputStream to create a new Reader as the source of PVL statements, and sets a limit on the amount to read.
Parser(Reader reader)
          Creates a Parser using a Reader as the source of PVL statements, with no limit on the amount to read.
Parser(Reader reader, long read_limit)
          Creates a Parser using a Reader as the source of PVL statements, and sets a limit on the amount to read.
Parser(String string)
          Creates a Parser using a String as the source of PVL statements.
 
Method Summary
 Parameter Add_To(Parameter The_Aggregate)
          Adds to a Parameter all Parameters found from the input source.
 boolean All_Values_Strings()
          Tests if the Parser will treat all values as strings.
 Parser All_Values_Strings(boolean all_values_strings)
          Enable or disable treating all values as strings.
static int Bad_Character(String string)
          Checks a String for any bad character.
 boolean Crosshatch_Comments()
          Tests if crosshatch comments will be recognized.
 Parser Crosshatch_Comments(boolean crosshatch_comments)
          Enable or disable recognition of crosshatch comments.
 PVL_Exception First_Warning()
          Returns the first warning since the last Reset_Warning.
 Parser First_Warning(boolean first)
          Enables or disables returning the first warning that occurs as the current warning status.
 String Get_Comments()
          Gets the next sequence of comments from the source of PVL statements as a single comment String.
 Value Get_Datum(Value The_Value)
          Gets a datum from the source of PVL statements.
 Parameter Get_Parameter()
          Gets a Parameter from the source of PVL statements.
 String Get_Quoted_String()
          Gets a quoted String from the source of PVL statements.
 String Get_Units()
          Gets a units description String from the source of PVL statements.
 Value Get_Value()
          Gets a Value from the source of PVL statements.
 Parameter Get()
          Gets as many Parameters as possible.
static boolean isprint(char character)
          Tests if a character is printable: in the ASCII range from the space character (' ') to the tilde character ('~') inclusive.
 PVL_Exception Last_Warning()
          Returns the last warning since a Reset_Warning.
 Parser Last_Warning(boolean last)
          Enables or disables returning the last warning that occurs as the current warning status.
 Parser Reset_Warning()
          Clears any warning status so that the Warning method will return null until the next warning condition occurs.
 Parser Set_Reader(File file, long read_limit)
          Sets the Reader where the Parser will obtain characters by constructing a FileInputStream from the specified File and passing this to the Set_Reader (InputStream, long) method.
 Parser Set_Reader(InputStream input_stream, long read_limit)
          Sets the Reader where the Parser will obtain characters by constructing an InputStreamReader from the specified InputStream and wrapping this in a BufferedReader, the same size as the Size_Increment of the String_Buffer_Reader, for efficiency.
 Parser Set_Reader(Reader reader, long read_limit)
          Sets the Reader where the Parser will obtain characters.
 long skip_whitespace_and_comments(long location)
          Gets the next location in the PVL source stream following any sequence of whitespace and/or comments.
static int Special_Classification(String name)
          Gets the Parameter classification code corresponding to the specified special Parameter name String.
static String Special_Name(int classification)
          Gets the special Parameter name String for the specified Parameter classification code.
 boolean Strict()
          Tests if the Parser will enforce strict PVL syntax rules.
 Parser Strict(boolean strict)
          Enables or disables strict PVL syntax rules in the Parser.
 boolean String_Continuation()
          Tests if the Parser will recognize multi-line string continuation.
 Parser String_Continuation(boolean continuation)
          Enable or disable string continuation.
static String_Buffer translate_from_escape_sequences(String_Buffer string)
          Translates escape sequences in a String_Buffer to their corresponding special characters.
 boolean Verbatim_Strings()
          Tests if the Parser will handle quoted strings verbatim.
 Parser Verbatim_Strings(boolean verbatim)
          Enable or disable verbatim quoted strings.
 PVL_Exception Warning()
          Gets the current warning status.
 
Methods inherited from class PIRL.Strings.String_Buffer_Reader
Buffer_Location, Char_At, End_Index, End_Location, Ended, Equals, Extend, Filter_Input, Filter_Input, Filter_Input, Get_Reader, Index, Is_Empty, Is_End, Is_Text, Location_Of, Location_Of, Location, Next_Index, Next_Index, Next_Location, Next_Location, No_Read_Limit, Non_Text_Limit, Non_Text_Limit, Read_Limit, Read_Limit, Reader_Source, Record_Size, Record_Size, Reset_Location, Reset, Set_Reader, Size_Increment, Size_Increment, Skip_Over, Skip_Until, String_Source, Substring, Total_Read
 
Methods inherited from class PIRL.Strings.String_Buffer
append, append, append, append, append, append, append, append, append, append, append, append, append, capacity, charAt, clean, clear, delete, deleteCharAt, ensureCapacity, equals_ignore_case, equals, equalsIgnoreCase, escape_to_special, escape_to_special, from_character_references, from_character_references, getChars, index_of, index_of, indexOf, indexOf, insert, insert, insert, insert, insert, insert, insert, insert, insert, insert, length, replace_span, replace, replace, replaceSpan, reverse, setCharAt, setLength, skip_back_over, skip_back_until, skip_over, skip_until, skipBackOver, skipBackUntil, skipOver, skipUntil, special_to_escape, special_to_escape, substring, substring, to_character_references, to_character_references, to_printable_ASCII, to_printable_ASCII, toString, trim_all, trim_beginning, trim_end, trim, trim, trim
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

ID

public static final String ID
Class name and version identification.

See Also:
Constant Field Values

Strict_Default

public static boolean Strict_Default
The default for enforcing strict PVL syntax rules.


Verbatim_Strings_Default

public static boolean Verbatim_Strings_Default
Verbatim strings default.


String_Continuation_Default

public static boolean String_Continuation_Default
String continuation default.


Crosshatch_Comments_Default

public static boolean Crosshatch_Comments_Default
The default for allowing crosshatched-to-EOL comments.


All_Values_Strings_Default

public static boolean All_Values_Strings_Default
The default for treating all values as strings.


CROSSHATCH

public static final char CROSSHATCH
Begins a "crosshatch comment" that extends to the end of the line: '#'.

See Also:
Constant Field Values

CHARACTER_ENCODING

public static final String CHARACTER_ENCODING
The PVL character encoding: "US-ASCII".

See Also:
Constant Field Values

RESERVED_CHARACTERS

public static final String RESERVED_CHARACTERS
Characters reserved by the PVL syntax.

{}()[]<>&\"',=;#%~|+! \t\r\n\f\013

Some of these characters have special meanings in specific contexts as delimiters of PVL items.

See Also:
Constant Field Values
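As a rough illustration of how a reserved character set of this kind can be used, the sketch below scans a string for the first reserved character. The class and method names are hypothetical, the set is copied from the listing above as a Java literal, and the idea that a name containing a reserved character must be quoted is an assumption made for the example rather than a statement of what the Parser does.

```java
class ReservedCharsSketch {
    // Copied from the RESERVED_CHARACTERS listing above as a Java string literal.
    static final String RESERVED = "{}()[]<>&\"',=;#%~|+! \t\r\n\f\013";

    /** Index of the first reserved character in text, or -1 if there is none. */
    static int firstReserved(String text) {
        for (int i = 0; i < text.length(); i++) {
            if (RESERVED.indexOf(text.charAt(i)) >= 0) return i;
        }
        return -1;
    }

    /** Assumed policy: an empty name, or one with a reserved character, needs quoting. */
    static boolean needsQuoting(String name) {
        return name.isEmpty() || firstReserved(name) >= 0;
    }

    public static void main(String[] args) {
        System.out.println(needsQuoting("SPACECRAFT_NAME"));
        System.out.println(needsQuoting("Target Name"));
    }
}
```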

PARAMETER_NAME_DELIMITER

public static final char PARAMETER_NAME_DELIMITER
Delimits a Parameter name from its Value: '='.

See Also:
Constant Field Values

PARAMETER_VALUE_DELIMITER

public static final char PARAMETER_VALUE_DELIMITER
Delimits elements of an ARRAY Value: ','.

See Also:
Constant Field Values

TEXT_DELIMITER

public static final char TEXT_DELIMITER
Encloses a TEXT STRING Value: '"'.

See Also:
Constant Field Values

SYMBOL_DELIMITER

public static final char SYMBOL_DELIMITER
Encloses a SYMBOL STRING Value: '\''.

See Also:
Constant Field Values

SET_START_DELIMITER

public static final char SET_START_DELIMITER
Marks the start of a SET ARRAY Value: '{'.

See Also:
Constant Field Values

SET_END_DELIMITER

public static final char SET_END_DELIMITER
Marks the end of a SET ARRAY Value: '}'.

See Also:
Constant Field Values

SEQUENCE_START_DELIMITER

public static final char SEQUENCE_START_DELIMITER
Marks the start of a SEQUENCE ARRAY Value: '('.

See Also:
Constant Field Values

SEQUENCE_END_DELIMITER

public static final char SEQUENCE_END_DELIMITER
Marks the end of a SEQUENCE ARRAY Value: ')'.

See Also:
Constant Field Values

UNITS_START_DELIMITER

public static final char UNITS_START_DELIMITER
Marks the start of a Value units description: '<'.

See Also:
Constant Field Values

UNITS_END_DELIMITER

public static final char UNITS_END_DELIMITER
Marks the end of a Value units description: '>'.

See Also:
Constant Field Values

NUMBER_BASE_DELIMITER

public static final char NUMBER_BASE_DELIMITER
Encloses the datum of a Value in radix base notation: '#'.

See Also:
Constant Field Values
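To make the radix base notation concrete, here is a minimal sketch that converts text of the form radix#digits# (for example 16#FF#) to a number. The BasedNumberSketch class is hypothetical and covers only the unsigned form in which the delimiter encloses the digits; how the Parser itself handles signs or malformed input is not described here and is not modeled.

```java
class BasedNumberSketch {
    static final char NUMBER_BASE_DELIMITER = '#';

    /** Parses "radix#digits#"; throws NumberFormatException for anything else. */
    static long parse(String text) {
        int first = text.indexOf(NUMBER_BASE_DELIMITER);
        int last = text.lastIndexOf(NUMBER_BASE_DELIMITER);
        // Require a radix before the first delimiter and a closing delimiter at the end.
        if (first <= 0 || last == first || last != text.length() - 1) {
            throw new NumberFormatException("Not radix#digits#: " + text);
        }
        int radix = Integer.parseInt(text.substring(0, first));
        // Long.parseLong rejects radix values outside Character.MIN_RADIX..MAX_RADIX.
        return Long.parseLong(text.substring(first + 1, last), radix);
    }

    public static void main(String[] args) {
        System.out.println(parse("16#FF#"));
        System.out.println(parse("2#1010#"));
    }
}
```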

STATEMENT_END_DELIMITER

public static final char STATEMENT_END_DELIMITER
Marks the end of a PVL statement: ';'.

See Also:
Constant Field Values

STATEMENT_CONTINUATION_DELIMITER

public static final char STATEMENT_CONTINUATION_DELIMITER
Indicates that the statement continues in the next record: '&'.

See Also:
Constant Field Values

STRING_CONTINUATION_DELIMITER

public static final char STRING_CONTINUATION_DELIMITER
Indicates that the quoted string continues unbroken in the next record: '-'.

See Also:
Constant Field Values

LINE_BREAK

public static final String LINE_BREAK
Character sequence that separates PVL statement lines: "\r\n" (CR-NL).

See Also:
Constant Field Values

WHITESPACE

public static final String WHITESPACE
Set of "whitespace" characters between PVL tokens: " \t\r\n\f\013" (SP, HT, CR, NL, FF, and VT).

See Also:
Constant Field Values

COMMENT_START_DELIMITERS

public static final String COMMENT_START_DELIMITERS
Marks the start of a comment string: '/' and '*'.

See Also:
Constant Field Values

COMMENT_END_DELIMITERS

public static final String COMMENT_END_DELIMITERS
Marks the end of a comment string: '*' and '/'.

See Also:
Constant Field Values
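The whitespace set and the comment delimiters are typically used together to find the start of the next meaningful token. The following sketch shows one way to do that over a plain String; it is a hypothetical stand-in and not the Parser's skip_whitespace_and_comments method, and stopping at an unterminated comment is an assumed policy.

```java
class SkipSketch {
    static final String WHITESPACE = " \t\r\n\f\013";
    static final String COMMENT_START = "/*";
    static final String COMMENT_END = "*/";

    /**
     * Returns the index of the first character at or after 'from' that is
     * neither whitespace nor part of a complete comment.
     */
    static int skipWhitespaceAndComments(String text, int from) {
        int index = from;
        while (index < text.length()) {
            if (WHITESPACE.indexOf(text.charAt(index)) >= 0) {
                index++;
            } else if (text.startsWith(COMMENT_START, index)) {
                int end = text.indexOf(COMMENT_END, index + COMMENT_START.length());
                if (end < 0) break;            // unterminated comment: stop (assumed policy)
                index = end + COMMENT_END.length();
            } else {
                break;                          // start of something else
            }
        }
        return index;
    }

    public static void main(String[] args) {
        String text = "  /* a */\r\n /* b */ Name = 1";
        System.out.println(text.substring(skipWhitespaceAndComments(text, 0)));
    }
}
```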

DATE_TIME_DELIMITERS

public static final String DATE_TIME_DELIMITERS
Set of characters that suggests a DATE_TIME type of STRING Value: "-:".

See Also:
Constant Field Values

VERBATIM_STRING_DELIMITERS

public static final String VERBATIM_STRING_DELIMITERS
Encloses a verbatim (uninterpreted) string: "\\v".

See Also:
Constant Field Values

CONTAINER_NAME

public static final String CONTAINER_NAME
The default name of the aggregate Parameter to contain all Parameters when a Parser Get finds more than one Parameter: "The Container".

See Also:
Constant Field Values
Constructor Detail

Parser

public Parser(Reader reader,
              long read_limit)
       throws PVL_Exception
Creates a Parser using a Reader as the source of PVL statements, and sets a limit on the amount to read.

Parameters:
reader - The Reader to use as the source of characters.
read_limit - The maximum amount to read.
Throws:
PVL_Exception - From Set_Reader.
See Also:
Set_Reader(Reader, long)

Parser

public Parser(Reader reader)
       throws PVL_Exception
Creates a Parser using a Reader as the source of PVL statements, with no limit on the amount to read.

Parameters:
reader - The Reader to use as the source of characters.
Throws:
PVL_Exception - From Set_Reader.
See Also:
Set_Reader(Reader, long)

Parser

public Parser(File file,
              long read_limit)
       throws PVL_Exception
Creates a Parser using a File to create a new Reader as the source of PVL statements, and sets a limit on the amount to read.

Parameters:
file - The File to be the basis for a new Reader.
read_limit - The maximum amount to read.
Throws:
PVL_Exception - From Set_Reader.
See Also:
Set_Reader(File, long)

Parser

public Parser(File file)
       throws PVL_Exception
Creates a Parser using a File to create a new Reader as the source of PVL statements, with no limit on the amount to read.

Parameters:
file - The File to be the basis for a new Reader.
Throws:
PVL_Exception - From Set_Reader.
See Also:
Set_Reader(File, long)

Parser

public Parser(InputStream input_stream,
              long read_limit)
       throws PVL_Exception
Creates a Parser using an InputStream to create a new Reader as the source of PVL statements, and sets a limit on the amount to read.

Parameters:
input_stream - The InputStream to be the basis for a new Reader.
read_limit - The maximum amount to read.
Throws:
PVL_Exception - From Set_Reader.
See Also:
Set_Reader(InputStream, long)

Parser

public Parser(InputStream input_stream)
       throws PVL_Exception
Creates a Parser using an InputStream to create a new Reader as the source of PVL statements, with no limit on the amount to read.

Parameters:
input_stream - The InputStream to be the basis for a new Reader.
Throws:
PVL_Exception - From Set_Reader.
See Also:
Set_Reader(InputStream, long)

Parser

public Parser()
Creates a Parser with no source of PVL statements.


Parser

public Parser(String string)
Creates a Parser using a String as the source of PVL statements.

Parameters:
string - The String to use as the source of characters.

Parser

public Parser(char[] char_array)
Creates a Parser using a character array as the source of PVL statements.

Parameters:
char_array - The source of characters.

Parser

public Parser(char[] char_array,
              int offset,
              int length)
Creates a Parser using a character array subset as the source of PVL statements.

Parameters:
char_array - The source of characters.
offset - The array index where the character source starts.
length - The number of characters to source.
Method Detail

Set_Reader

public Parser Set_Reader(Reader reader,
                         long read_limit)
                  throws PVL_Exception
Sets the Reader where the Parser will obtain characters.

If Filter_Input is true, then a SIZED_RECORDS Warning is registered.

N.B.: When Filter_Input is used it will read the first two characters from the reader to test them for non-printable values, unless the test has already been done on the reader or input filtering has been disabled. If no characters are yet available from the reader - e.g. if the reader is backed by a pipe or network socket - then the read will block until two characters become available (or an IOException occurs). The base String_Buffer_Reader does its own Filter_Input test every time its internal buffer content is extended. Therefore, to prevent the Filter_Input() test from blocking when the reader is set, defer setting the reader until blocking is acceptable, it is known that the required input will be available, or filtering has been disabled. For example:

Parser parser = new Parser ();
// The reader is certain not to need filtering.
parser.Filter_Input (false);
// Any of the Set_Reader methods may be used.
parser.Set_Reader (reader, read_limit);

Parameters:
reader - The Reader source for Parser input. If null, then there is no PVL source and thus nothing to be parsed.
read_limit - The maximum amount to read.
Returns:
This Parser.
Throws:
PVL_Exception - From Set_Reader. A FILE_IO form is thrown if there is an IOException from Filter_Input.
See Also:
String_Buffer_Reader.Set_Reader(Reader), String_Buffer_Reader.Filter_Input(), String_Buffer_Reader.Read_Limit(long)

Set_Reader

public Parser Set_Reader(InputStream input_stream,
                         long read_limit)
                  throws PVL_Exception
Sets the Reader where the Parser will obtain characters by constructing an InputStreamReader from the specified InputStream and wrapping this in a BufferedReader, the same size as the Size_Increment of the String_Buffer_Reader, for efficiency. The Parser's CHARACTER_ENCODING is used.

Parameters:
input_stream - The InputStream source for Parser input. If null, then there is nothing to read.
read_limit - The maximum amount to read.
Returns:
This Parser.
Throws:
PVL_Exception - From Set_Reader. A FILE_IO form is thrown if there is an UnsupportedEncodingException when constructing the InputStreamReader.
See Also:
String_Buffer_Reader.Size_Increment(int)
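The wrapping described for this method can be sketched with standard java.io classes only. The ReaderSketch class is hypothetical, the buffer size is passed in as a parameter because the actual Size_Increment value is not stated here, and rethrowing as an unchecked exception stands in for the conversion to a PVL_Exception.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.UnsupportedEncodingException;

class ReaderSketch {
    static final String CHARACTER_ENCODING = "US-ASCII";

    /** Wraps the stream in an InputStreamReader and then a BufferedReader. */
    static Reader wrap(InputStream input, int bufferSize) {
        if (input == null) return null;   // nothing to read
        try {
            return new BufferedReader(
                new InputStreamReader(input, CHARACTER_ENCODING), bufferSize);
        } catch (UnsupportedEncodingException e) {
            // Stand-in for reporting the problem as a FILE_IO style exception.
            throw new UncheckedIOException(e);
        }
    }

    /** Reads everything the reader provides; used here only to demonstrate wrap. */
    static String readAll(Reader reader) {
        StringBuilder text = new StringBuilder();
        char[] chunk = new char[256];
        try {
            int count;
            while ((count = reader.read(chunk)) >= 0) text.append(chunk, 0, count);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }
}
```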

Set_Reader

public Parser Set_Reader(File file,
                         long read_limit)
                  throws PVL_Exception
Sets the Reader where the Parser will obtain characters by constructing a FileInputStream from the specified File and passing this to the Set_Reader (InputStream, long) method.

Parameters:
file - The File source for Parser input. If null, there is nothing to read.
read_limit - The maximum amount to read.
Returns:
This Parser.
Throws:
PVL_Exception - A FILE_IO error condition will occur if the File refers to a directory or a file for which read permission is not provided. Any other IOException that occurs while constructing the FileInputStream will also be converted into a PVL_Exception.FILE_IO exception.
See Also:
Set_Reader(InputStream, long)

Strict

public Parser Strict(boolean strict)
Enables or disables strict PVL syntax rules in the Parser.

Normally the Parser is tolerant. However, the default is controlled by the Strict_Default value.

N.B.: Enabling strict syntax rules will prevent treating all Values as Strings.

Parameters:
strict - true if strict rules are applied; false otherwise.
Returns:
This Parser.

Strict

public boolean Strict()
Tests if the Parser will enforce strict PVL syntax rules.

Returns:
true if the Parser will enforce strict syntax rules; false otherwise.

Verbatim_Strings

public Parser Verbatim_Strings(boolean verbatim)
Enable or disable verbatim quoted strings.

With format control (Verbatim_Strings disabled) multi-line quoted strings in PVL statements have white space surrounding the line breaks compressed to a single space character - except when String_Continuation is enabled and the last non-white space character on the line is a dash ("-"), in which case no space is included. This is because output formatting is expected to be controlled by embedded format characters which are processed by the Write method:

\n - line break.
\t - horizontal tab.
\f - form feed (page break).
\\ - backslash character.
\v - verbatim (no formatting) till the next \v.

Without format control (Verbatim_Strings enabled) all STRING Values are taken as-is.

By default verbatim strings are disabled. However, the default is controlled by the Verbatim_Strings_Default value.

Parameters:
verbatim - true if quoted strings in PVL statements are to be taken verbatim, without format control.
Returns:
This Parser.
See Also:
Get_Parameter(), Get_Quoted_String(), String_Continuation(boolean), Get_Comments(), Get_Datum(Value), Get_Units()
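The line break handling described above for multi-line quoted strings can be sketched as follows. The QuotedStringFormatSketch class and its format method are hypothetical and written only against the JDK; they model the whitespace compression and the dash continuation rule, but not the embedded format characters or any other Parser behavior.

```java
class QuotedStringFormatSketch {
    static final char STRING_CONTINUATION_DELIMITER = '-';

    private static boolean isSpace(char c) {
        return " \t\r\n\f\013".indexOf(c) >= 0;
    }

    /** Applies the line break rules unless verbatim is true. */
    static String format(String text, boolean verbatim, boolean continuation) {
        if (verbatim) return text;                       // taken as-is
        String[] lines = text.split("\r\n|\r|\n", -1);
        StringBuilder out = new StringBuilder(lines[0]);
        for (int i = 1; i < lines.length; i++) {
            // Drop the whitespace that precedes the line break.
            while (out.length() > 0 && isSpace(out.charAt(out.length() - 1))) {
                out.setLength(out.length() - 1);
            }
            boolean continued = continuation && out.length() > 0
                && out.charAt(out.length() - 1) == STRING_CONTINUATION_DELIMITER;
            if (continued) {
                out.setLength(out.length() - 1);         // remove the dash, add no space
            } else {
                out.append(' ');                         // one space replaces the break
            }
            // Drop the whitespace that follows the line break.
            int start = 0;
            while (start < lines[i].length() && isSpace(lines[i].charAt(start))) start++;
            out.append(lines[i], start, lines[i].length());
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(format("alpha  \r\n   beta", false, true));
        System.out.println(format("conti-\n   nued", false, true));
    }
}
```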

Verbatim_Strings

public boolean Verbatim_Strings()
Tests if the Parser will handle quoted strings verbatim.

Returns:
true if quoted strings are taken verbatim; false otherwise.
See Also:
Verbatim_Strings(boolean)

String_Continuation

public Parser String_Continuation(boolean continuation)
Enable or disable string continuation.

When string continuation is enabled (the default) and verbatim strings are disabled (the default), the occurrence in a quoted string of a STRING_CONTINUATION_DELIMITER as the last character before the new line sequence causes the string continuation delimiter and all characters up to the next non-whitespace character to be removed from the string; i.e. the string continues on the next line after any whitespace.

By default string continuation is enabled. However, the default is controlled by the String_Continuation_Default value.

Parameters:
continuation - true if string continuation is to be enabled; false otherwise.
Returns:
This Parser.
See Also:
Get_Quoted_String(), Verbatim_Strings(boolean)

String_Continuation

public boolean String_Continuation()
Tests if the Parser will recognize multi-line string continuation.

Returns:
true if quoted string continuation will be recognized; false otherwise.
See Also:
String_Continuation(boolean)

Crosshatch_Comments

public Parser Crosshatch_Comments(boolean crosshatch_comments)
Enable or disable recognition of crosshatch comments. A Crosshatch comment begins with the CROSSHATCH character and extends to the end of the current line (marked by a LINE_BREAK character). Crosshatch comments are never recognized in Strict mode.

Note: Crosshatch comments are not part of the PVL syntax specification. Because of their common use in configuration files, this special extension is provided to accommodate such applications.

The default for recognizing crosshatch comments is controlled by the Crosshatch_Comments_Default value.

Parameters:
crosshatch_comments - true if crosshatch comments are to be recognized; false otherwise.
Returns:
This Parser.

Crosshatch_Comments

public boolean Crosshatch_Comments()
Tests if crosshatch comments will be recognized.

Regardless of this test, crosshatch comments are never recognized in Strict mode.

Returns:
true if crosshatch comments will be recognized; false otherwise.
See Also:
Crosshatch_Comments(boolean)
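The interaction between this option and Strict mode can be shown with a small sketch. The CrosshatchSketch class is hypothetical; it assumes the caller only asks about a crosshatch comment at a position where a comment could start, since the same character also serves as the number base delimiter, and it treats the first CR or NL as the end of the line.

```java
class CrosshatchSketch {
    static final char CROSSHATCH = '#';

    /** Effective setting: the option has no effect when strict rules apply. */
    static boolean recognized(boolean crosshatchComments, boolean strict) {
        return crosshatchComments && !strict;
    }

    /**
     * If a recognized crosshatch comment starts at 'from', returns the index of
     * the end of that line (the first CR or NL, or the end of the text);
     * otherwise returns 'from' unchanged.
     */
    static int skipCrosshatchComment(String text, int from,
                                     boolean crosshatchComments, boolean strict) {
        if (!recognized(crosshatchComments, strict)
                || from >= text.length() || text.charAt(from) != CROSSHATCH) {
            return from;
        }
        int index = from;
        while (index < text.length()
                && text.charAt(index) != '\r' && text.charAt(index) != '\n') {
            index++;
        }
        return index;
    }
}
```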

All_Values_Strings

public Parser All_Values_Strings(boolean all_values_strings)
Enable or disable treating all values as strings.

When enabled, and strict syntax rules are not enabled, all PVL parameter values will be parsed as Strings; no other Value Type will be generated. This may be useful in cases where, for example, a numeric Type Value would be generated from a PVL value representation that cannot be converted to a binary value with sufficient precision, or where unquoted PVL value representations that would otherwise be expected to be identifiers happen to consist entirely of numeric digit characters.

By default treating all values as strings is disabled. However, the default is controlled by the All_Values_Strings_Default value.

Parameters:
all_values_strings - true if all Values are to be generated as Value.STRING Types; false otherwise.
Returns:
This Parser.

All_Values_Strings

public boolean All_Values_Strings()
Tests if the Parser will treat all values as strings.

Regardless of this test, treating all values as strings will not be enforced if Strict mode is enabled.

Returns:
true if the Parser will treat all values as strings; false otherwise.
See Also:
All_Values_Strings(boolean)
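The two motivating cases, loss of precision and all-digit identifiers, can be seen in a short sketch. The ValueTypeSketch class is hypothetical and uses the JDK number parsers as a simplified stand-in for PVL number recognition (Double.valueOf, for instance, accepts forms that PVL does not); only the effect of the two flags is the point.

```java
class ValueTypeSketch {
    /**
     * Returns the token itself when all values are strings and strict rules are
     * off; otherwise returns a Long, a Double, or the token as a fallback.
     */
    static Object interpret(String token, boolean allValuesStrings, boolean strict) {
        if (allValuesStrings && !strict) return token;
        try {
            return Long.valueOf(token);
        } catch (NumberFormatException notAnInteger) {
            // Fall through and try a real number.
        }
        try {
            return Double.valueOf(token);   // simplified stand-in for PVL reals
        } catch (NumberFormatException notAReal) {
            return token;
        }
    }

    public static void main(String[] args) {
        // An identifier made of digits loses its leading zeros as a number.
        System.out.println(interpret("0012345", false, false));
        System.out.println(interpret("0012345", true, false));
        // Too large for a long, so it becomes a double and loses precision.
        System.out.println(interpret("12345678901234567890", false, false));
    }
}
```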

Warning

public PVL_Exception Warning()
Gets the current warning status.

A warning condition is registered when conditions are encountered that are unusual enough to warrant attention but are not errors that would prevent successful processing and cause an exception to be thrown. The warning is in the form of a PVL_Exception that was not thrown. The current warning status is either the First_Warning or the Last_Warning since a Reset_Warning.

Returns:
The current warning status as a PVL_Exception object, or null if no warning condition is registered.
See Also:
PVL_Exception, First_Warning(boolean), Last_Warning(boolean), Reset_Warning()

Reset_Warning

public Parser Reset_Warning()
Clears any warning status so that the Warning method will return null until the next warning condition occurs.

Returns:
This Parser.
See Also:
Warning()

First_Warning

public Parser First_Warning(boolean first)
Enables or disables returning the first warning that occurs as the current warning status. The first warning is one that occurs when the current warning status is null.

Parameters:
first - true to enable returning the first warning status; false to return the last warning that occurred as the current warning status.
Returns:
This Parser.
See Also:
Warning(), First_Warning(), Reset_Warning()

First_Warning

public PVL_Exception First_Warning()
Returns the first warning since the last Reset_Warning.

Returns:
The first warning status as a PVL_Exception object, or null if no warning condition is registered.
See Also:
PVL_Exception, Warning(), First_Warning(boolean), Reset_Warning()

Last_Warning

public Parser Last_Warning(boolean last)
Enables or disables returning the last warning that occurs as the current warning status. The last warning is the most recent one regardless of any previous warning conditions that may have occurred without an intervening Reset_Warning.

Parameters:
last - true to enable returning the last warning status; false to return the first warning condition that occurred as the current warning status.
Returns:
This Parser.
See Also:
Warning(), Last_Warning(), Reset_Warning()

Last_Warning

public PVL_Exception Last_Warning()
Returns the last warning since a Reset_Warning.

Returns:
The last warning status as a PVL_Exception object, or null if no warning condition is registered.
See Also:
PVL_Exception, Warning(), Last_Warning(boolean), Reset_Warning()
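
The first/last warning policy described for Warning, First_Warning, Last_Warning, and Reset_Warning can be pictured with a small standalone sketch. The WarningTracker class and all of its member names are hypothetical, warnings are plain Strings rather than PVL_Exception objects, and keeping both the first and the last warning in separate fields is an assumption about the bookkeeping, not a description of the Parser's internals.

```java
/** Hypothetical stand-in for the Parser's warning bookkeeping; not PIRL code. */
final class WarningTracker {
    private boolean preferFirst; // true: warning() reports the first warning
    private String first;        // first warning since the last reset, or null
    private String last;         // most recent warning since the last reset, or null

    WarningTracker(boolean preferFirst) {
        this.preferFirst = preferFirst;
    }

    /** Selects which warning is reported as the current status. */
    WarningTracker preferFirst(boolean enabled) {
        preferFirst = enabled;
        return this;
    }

    /** Registers a warning condition without throwing anything. */
    void register(String warning) {
        if (first == null) {
            first = warning; // only the first warning after a reset is kept here
        }
        last = warning;      // every warning replaces the previous "last"
    }

    /** Current warning status, or null if nothing is registered. */
    String warning() {
        return preferFirst ? first : last;
    }

    String firstWarning() {
        return first;
    }

    String lastWarning() {
        return last;
    }

    /** Clears the warning status so warning() returns null again. */
    WarningTracker reset() {
        first = null;
        last = null;
        return this;
    }

    public static void main(String[] args) {
        WarningTracker tracker = new WarningTracker(true);
        tracker.register("quoted parameter name");
        tracker.register("unclosed comment");
        System.out.println(tracker.warning());
        System.out.println(tracker.preferFirst(false).warning());
    }
}
```

Returning the tracker from the setter and from reset mirrors the "Returns: This Parser" convention of the methods above, which allows calls to be chained.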

Get

public Parameter Get()
              throws PVL_Exception
Gets as many Parameters as possible.

When the source of PVL statements is a Reader, an aggregate Parameter named CONTAINER_NAME will be provided to contain all the Parameters found (zero or more). If this Parser was created with a String as the source of PVL statements, then a container Parameter will only be provided if more than one Parameter is found; otherwise the single Parameter found is returned, or an empty UNKNOWN Parameter if nothing is found.

Returns:
A Parameter containing everything found from the input source.
Throws:
PVL_Exception - If an unrecoverable problem occurred while parsing the input source.
See Also:
Add_To(Parameter), String_Buffer_Reader.String_Source()

Add_To

public Parameter Add_To(Parameter The_Aggregate)
                 throws PVL_Exception
Adds to a Parameter all Parameters found from the input source. Any Parameter added that is itself an AGGREGATE classification will recursively invoke this method on the new aggregate Parameter.

While the source of PVL statements is not empty and Get_Parameter can assemble a Parameter from the source, each new Parameter is Added to the specified Parameter's aggregate list. Note: Parameters having END classifications are not added, instead they stop the addition of Parameters to the current aggregate list; additions will continue with the parent of the current aggregate on returning from recursive method invocations. However, if an END_PVL Parameter is encountered no more Parameters will be added regardless of the recursion level.

Parameters:
The_Aggregate - The aggregate Parameter to which to add new Parameters from the input source.
Returns:
The_Aggregate Parameter.
Throws:
PVL_Exception -
BAD_ARGUMENT
The_Aggregate Parameter is actually an ASSIGNMENT with a non-null Value.
AGGREGATE_CLOSURE_MISMATCH
In Strict mode, when an END_AGGREGATE Parameter does not match the specific classification of the Parameter containing the aggregate list. This is only a Warning when Strict mode is not enabled.
Others are possible during parsing.
See Also:
Parameter.AGGREGATE, String_Buffer_Reader.Is_Empty(), Get_Parameter(), Parameter.Add(Parameter), Parameter.END, Parameter.END_AGGREGATE, Parameter.END_PVL
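
The recursion and the two kinds of END handling can be sketched independently of the PIRL classes. In this simplified illustration each statement is already a separate String, the Node class and the addTo method are hypothetical, and only the Group, Object, End_Group, End_Object, and End names are recognized; classification matching and Strict mode are not modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

final class AggregateSketch {
    /** Minimal stand-in for a Parameter: a name plus an aggregate list. */
    static final class Node {
        final String name;
        final List<Node> children = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    /**
     * Adds statements to the aggregate until an END statement is seen.
     * Returns true if "End" (END_PVL) was encountered, which stops
     * additions at every recursion level.
     */
    static boolean addTo(Node aggregate, Iterator<String> statements) {
        while (statements.hasNext()) {
            String statement = statements.next().trim();
            String[] parts = statement.split("=", 2);
            String keyword = parts[0].trim();
            if (keyword.equalsIgnoreCase("End")) {
                return true;  // END_PVL: nothing more is added anywhere
            }
            if (keyword.equalsIgnoreCase("End_Group")
                    || keyword.equalsIgnoreCase("End_Object")) {
                return false; // END_AGGREGATE: continue with the parent
            }
            if (keyword.equalsIgnoreCase("Group")
                    || keyword.equalsIgnoreCase("Object")) {
                // The aggregate takes its name from the statement's value.
                Node child = new Node(parts.length > 1 ? parts[1].trim() : keyword);
                aggregate.children.add(child);
                if (addTo(child, statements)) {
                    return true; // propagate END_PVL upward
                }
            } else {
                aggregate.children.add(new Node(statement));
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> source = Arrays.asList(
                "A = 1", "Group = G", "B = 2", "End_Group", "C = 3", "End", "D = 4");
        Node root = new Node("root");
        addTo(root, source.iterator());
        for (Node node : root.children) {
            System.out.println(node.name + " (" + node.children.size() + " children)");
        }
    }
}
```

The boolean return value is one way to make an END_PVL statement unwind every level of recursion; the statement after End is never consumed.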

Get_Parameter

public Parameter Get_Parameter()
                        throws PVL_Exception
Gets a Parameter from the source of PVL statements.

First, Get_Comments is used to collect any and all comment sequences preceding the parameter proper as the Parameter's Comments.

The next sequence of non-WHITESPACE characters is taken to be the Parameter's Name. If the sequence is quoted (i.e. starts with a TEXT_DELIMITER or SYMBOL_DELIMITER), then the name is all characters within the quotes. Otherwise it is all characters up to but not including the next PARAMETER_NAME_DELIMITER, STATEMENT_END_DELIMITER, whitespace character, or comment sequence. The name is checked for any Bad_Character and a Warning will be registered (exception thrown in Strict mode) if one is found. If Verbatim_Strings is not enabled, then all character escape sequences in the name are replaced with their corresponding special characters.

If the name is associated with a Special_Classification, then the Parameter is given that classification; otherwise it is tentatively classified as a TOKEN.

If, after any whitespace or comments following the Parameter name, a PARAMETER_NAME_DELIMITER is found, then the Get_Value method is used to obtain the expected Parameter Value. If the Parameter had been given the TOKEN classification, then the classification is upgraded to ASSIGNMENT. If, however, the Parameter had been given an AGGREGATE classification as a result of its special name, and the Value obtained is a STRING type, then the Parameter's name is changed to the Value String; if, in this case, the Value found is not a STRING type a Warning will be registered (exception thrown in Strict mode).

Having assembled a valid Parameter, the Next_Location in the input stream is moved forward past any whitespace or STATEMENT_END_DELIMITER.

Returns:
The next Parameter assembled from the input stream, or null if no Parameter can be assembled because the input stream is empty.
Throws:
PVL_Exception -
ILLEGAL_SYNTAX
Strict mode is enabled and the Parameter name was quoted.
RESERVED_CHARACTER
The Parameter name contained a bad character and Strict mode is enabled.
GROUP_VALUE
The Value obtained for an AGGREGATE Parameter is not a STRING type.
FILE_IO
If an IOException occurred in the String_Buffer_Reader.
See Also:
Get_Comments(), Parameter.Comments(), Parameter.Name(), translate_from_escape_sequences(String_Buffer), Special_Classification(String), Get_Value()
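
The name extraction rule can be illustrated with a short standalone sketch. The NameSketch class and parameterName method are hypothetical, and the delimiter characters are assumptions: '=' for PARAMETER_NAME_DELIMITER, ';' for STATEMENT_END_DELIMITER, double and single quotes for TEXT_DELIMITER and SYMBOL_DELIMITER, and "/*" as the start of a comment. Bad character checks, escape sequence translation, and classification are left out.

```java
final class NameSketch {
    /** Returns the parameter name at the start of the statement, or null if there is none. */
    static String parameterName(String statement) {
        int start = 0;
        while (start < statement.length()
                && Character.isWhitespace(statement.charAt(start))) {
            start++;
        }
        if (start == statement.length()) {
            return null; // nothing but whitespace
        }

        char first = statement.charAt(start);
        if (first == '"' || first == '\'') {
            // Quoted name: all characters within the quotes.
            int close = statement.indexOf(first, start + 1);
            return close < 0
                    ? statement.substring(start + 1)
                    : statement.substring(start + 1, close);
        }

        // Unquoted name: up to the next delimiter, whitespace, or comment.
        int end = start;
        while (end < statement.length()) {
            char c = statement.charAt(end);
            if (c == '=' || c == ';' || Character.isWhitespace(c)
                    || statement.startsWith("/*", end)) {
                break;
            }
            end++;
        }
        return statement.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(parameterName("  Lines = 1024;"));
        System.out.println(parameterName("\"My Name\" = 3"));
    }
}
```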

Get_Comments

public String Get_Comments()
                    throws PVL_Exception
Gets the next sequence of comments from the source of PVL statements as a single comment String. After skipping over any whitespace, the next characters must start a comment sequence or nothing (null) is returned.

A PVL comment uses C-style conventions: It starts after the COMMENT_START_DELIMITERS and ends before the COMMENT_END_DELIMITERS. A comment without the closing COMMENT_END_DELIMITERS will result in a MISSING_COMMENT_END exception in Strict mode; otherwise a Warning is registered and the next line break, STATEMENT_END_DELIMITER, or the end of the input stream is taken as the end of the comment. Note: Though an effort is made to recover from encountering an unending comment, this will only be effective when no other normally closed comment occurs in the input stream (if a normally closed comment does occur after an unclosed comment, the latter will be taken as the end of the former), and in this case the input stream will have been read into memory until it is empty.

Sequential comments, with nothing but whitespace intervening, are accumulated with a single new-line ('\n') character separating them in the resulting String that is returned. In Strict mode comments that wrap across line breaks cause an exception. When Verbatim_Strings are not enabled whitespace is trimmed from the end of each comment (but not the beginning), and escape sequences are translated into their corresponding special characters.

If any comments are found the Next_Location of the input stream is moved to the position immediately following the last comment.

Returns:
A String containing the sequence of comments found, or null if no comment occurs before the next PVL item or the end of input.
Throws:
PVL_Exception -
ILLEGAL_SYNTAX
Strict mode is enabled and a comment continues on more than one line.
MISSING_COMMENT_END
A comment does not have COMMENT_END_DELIMITERS and Strict mode is enabled. If Strict is not enabled and a line or statement delimiter can not be found, then the exception is thrown.
FILE_IO
If an IOException occurred in the String_Buffer_Reader.
See Also:
Verbatim_Strings(boolean)
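
A minimal sketch of how sequential comments might be accumulated, assuming "/*" and "*/" for COMMENT_START_DELIMITERS and COMMENT_END_DELIMITERS. The CommentSketch class is hypothetical; recovery from an unclosed comment, Strict mode, and escape sequence translation are not modeled.

```java
final class CommentSketch {
    /**
     * Collects the comments that begin the source, joined by '\n'.
     * Returns null if the first non-whitespace text is not a comment.
     */
    static String comments(String source) {
        StringBuilder result = null;
        int position = 0;
        while (true) {
            // Only whitespace may separate sequential comments.
            while (position < source.length()
                    && Character.isWhitespace(source.charAt(position))) {
                position++;
            }
            if (!source.startsWith("/*", position)) {
                break;
            }
            int end = source.indexOf("*/", position + 2);
            if (end < 0) {
                break; // unclosed comment: recovery is not modeled in this sketch
            }
            // Trailing whitespace is trimmed; leading whitespace is kept.
            String text = source.substring(position + 2, end).replaceAll("\\s+$", "");
            if (result == null) {
                result = new StringBuilder();
            } else {
                result.append('\n');
            }
            result.append(text);
            position = end + 2;
        }
        return result == null ? null : result.toString();
    }

    public static void main(String[] args) {
        System.out.println(comments("/* one */\n  /* two*/ Name = 1"));
    }
}
```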

Get_Value

public Value Get_Value()
                throws PVL_Exception
Gets a Value from the source of PVL statements.

If the next character, after skipping any whitespace and comments, is a SET_START_DELIMITER or SEQUENCE_START_DELIMITER the Value being assembled is typed as a SET or SEQUENCE, respectively, and the Next_Location in the input stream is moved over the character; otherwise the Value is tentatively typed as a SET, and the Next_Location is not moved.

Now a cycle is entered to obtain as many sequential datum values as are available. The first step is to skip any whitespace and comments and test the character that is found. If the character is a SET_END_DELIMITER or SEQUENCE_END_DELIMITER, then the Next_Location in the input stream is moved over the character and Get_Units is used to set the ARRAY Value's units description before ending the datum cycle. If the character is a STATEMENT_END_DELIMITER the datum cycle ends. A reserved PARAMETER_NAME_DELIMITER, PARAMETER_VALUE_DELIMITER, UNITS_START_DELIMITER, UNITS_END_DELIMITER, or NUMBER_BASE_DELIMITER character here is an ILLEGAL_SYNTAX exception. For any other character the Next_Location is moved forward to its position as a possible datum.

When the character at this location is a SET_START_DELIMITER or SEQUENCE_START_DELIMITER this method is called recursively to get a subarray as the datum. Otherwise the Get_Datum method is used to get a basic value followed by the Get_Units method to get any units description for the new datum. If no datum is obtained the datum cycle ends, otherwise the new datum is added to the Vector of Values being accumulated for this new Value.

After skipping any whitespace and comments the next character is checked. A reserved PARAMETER_NAME_DELIMITER, UNITS_START_DELIMITER, UNITS_END_DELIMITER, or NUMBER_BASE_DELIMITER character here is an ILLEGAL_SYNTAX exception. A SET_START_DELIMITER or SEQUENCE_START_DELIMITER character will also generate an ILLEGAL_SYNTAX exception, but if Strict mode is not enabled this will just be registered as a Warning and the Next_Location will be moved to the character's position before continuing the datum cycle from the beginning. If the character is a PARAMETER_VALUE_DELIMITER the Next_Location is moved over the character and the datum cycle returns to the beginning. For a SET_END_DELIMITER or SEQUENCE_END_DELIMITER, if the character does not correspond to the delimiter that began the array an ARRAY_CLOSURE_MISMATCH Warning is registered (this exception is thrown in Strict mode). Then the Next_Location is moved over the character and the Get_Units method is used to set this Value's units description before ending the datum cycle. Any other character also causes the datum cycle to end.

After the datum cycle has collected as many Values as possible, if this new Value was not begun with a SET_START_DELIMITER or SEQUENCE_START_DELIMITER and the accumulated Values Vector contains fewer than two Values then the initial tentative SET type does not apply. In this case an empty accumulated Values Vector results in this new Value being an UNKNOWN type (i.e. it is empty). When only one Value was collected it is the new Value that is returned. When two or more Values were collected the Vector containing them is set as the data of this ARRAY Value.

Returns:
The next Value assembled from the input stream, or null if no Value can be assembled because the input stream is empty.
Throws:
PVL_Exception -
ILLEGAL_SYNTAX
A misplaced reserved character was found.
ARRAY_CLOSURE_MISMATCH
The delimiter character ending an array of Values does not correspond to the one that began the array.
FILE_IO
If an IOException occurred in the String_Buffer_Reader.
See Also:
Value.set_type(int), Value.Units(String), Get_Datum(Value)
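
The recursive handling of subarrays and the closure mismatch check can be sketched as a small recursive-descent parser. The ValueSketch class is hypothetical: a datum is kept as a plain String, an array as a List, and a mismatch is recorded in a warnings list instead of a PVL_Exception. Units descriptions, comments, Strict mode, and the rule that turns an undelimited run of two or more datums into an array are omitted.

```java
import java.util.ArrayList;
import java.util.List;

final class ValueSketch {
    private final String source;
    private int position;
    final List<String> warnings = new ArrayList<>();

    ValueSketch(String source) {
        this.source = source;
    }

    /** Parses one value: a bare datum (String) or a nested array (List). */
    Object value() {
        skipWhitespace();
        if (position >= source.length()) {
            return null;
        }
        char open = source.charAt(position);
        if (open != '(' && open != '{') {
            return datum();
        }
        position++; // move over the start delimiter
        List<Object> items = new ArrayList<>();
        while (true) {
            skipWhitespace();
            if (position >= source.length()) {
                break; // input ended before the array was closed
            }
            char c = source.charAt(position);
            if (c == ')' || c == '}') {
                char expected = (open == '(') ? ')' : '}';
                if (c != expected) {
                    warnings.add("ARRAY_CLOSURE_MISMATCH at " + position);
                }
                position++;
                break;
            }
            if (c == ',') {
                position++; // value delimiter: go around for the next datum
                continue;
            }
            if (c == ';') {
                break;      // statement end closes the cycle
            }
            items.add(value()); // recursion handles subarrays
        }
        return items;
    }

    private String datum() {
        int start = position;
        while (position < source.length()) {
            char c = source.charAt(position);
            if (Character.isWhitespace(c) || ",(){};".indexOf(c) >= 0) {
                break;
            }
            position++;
        }
        return source.substring(start, position);
    }

    private void skipWhitespace() {
        while (position < source.length()
                && Character.isWhitespace(source.charAt(position))) {
            position++;
        }
    }

    public static void main(String[] args) {
        ValueSketch parser = new ValueSketch("(1, {a, b}, 3)");
        System.out.println(parser.value());
    }
}
```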

Get_Datum

public Value Get_Datum(Value The_Value)
                throws PVL_Exception
Gets a datum from the source of PVL statements.

After skipping any whitespace or comments, the next character is checked to determine the type of datum to parse. For a STATEMENT_END_DELIMITER nothing happens. A reserved PARAMETER_NAME_DELIMITER, PARAMETER_VALUE_DELIMITER, SET_START_DELIMITER, SET_END_DELIMITER, SEQUENCE_START_DELIMITER, SEQUENCE_END_DELIMITER, UNITS_START_DELIMITER, UNITS_END_DELIMITER, or NUMBER_BASE_DELIMITER character here is an ILLEGAL_SYNTAX exception. For a TEXT_DELIMITER or SYMBOL_DELIMITER the Get_Quoted_String method is used to set the datum of the Value and its type is set to TEXT or SYMBOL respectively.

For any ordinary character the substring up to, but not including, the next WHITESPACE, PARAMETER_VALUE_DELIMITER, STATEMENT_END_DELIMITER, COMMENT_START_DELIMITERS, any of the SET/SEQUENCE/UNITS START/END delimiters, or the end of the input stream is used for parsing a datum. If Verbatim_Strings is not enabled, then all escape sequences in the substring are converted to their special character equivalents.

The datum substring is first assumed to represent a number. If the substring contains a NUMBER_BASE_DELIMITER ('#') the number is presumed to be in radix base notation:

[sign]base#value#

In this case the initial base integer is obtained using the Integer.parseInt method and becomes the Value's Base, and the value number is obtained using the Long.parseLong method with the base argument specified and becomes the Value's datum. The sign is applied to the value number. The Value becomes type INTEGER. Without the NUMBER_BASE_DELIMITER the datum substring is taken to be in decimal notation. If this number conversion fails, then the Double.valueOf method is tried on the datum substring to produce a type REAL Value.

N.B.: If treating all values as strings has been enabled, and Strict mode is not enabled, then the value is not assumed to be a number; it is always taken as a string.

If parsing the datum substring as a number fails, then the Value is a STRING and its datum is the substring. If the substring contains one of the DATE_TIME_DELIMITERS the Value is given the DATE_TIME type. Otherwise it is the IDENTIFIER type. The datum substring is also checked for a Bad_Character with a Warning being registered (exception thrown in Strict mode) if one is found.

Once a datum has been given to the Value the Next_Location in the input stream is moved to the position immediately following the datum substring.

Parameters:
The_Value - The Value to which the next datum is to be applied.
Returns:
The_Value, or null if the input stream is empty.
Throws:
PVL_Exception -
ILLEGAL_SYNTAX
A misplaced reserved character was found.
RESERVED_CHARACTER
A STRING Value contains a reserved character.
FILE_IO
If an IOException occurred in the String_Buffer_Reader.
See Also:
Integer.parseInt(String), Long.parseLong(String, int), Double.valueOf(String), Value.set_data(Object), Value.Base(int), Value.set_type(int), translate_from_escape_sequences(String_Buffer)
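
The order of number conversions can be illustrated with the same JDK methods the description names. The DatumSketch class, its classify method, and the label strings it returns are hypothetical; using Long.parseLong for the decimal case is an assumption, and the DATE_TIME versus IDENTIFIER distinction and the Bad_Character check are not modeled.

```java
final class DatumSketch {
    /** Returns a label describing how the datum substring would be typed. */
    static String classify(String datum) {
        int hash = datum.indexOf('#');
        try {
            if (hash >= 0) {
                // Radix base notation: [sign]base#value#
                int close = datum.indexOf('#', hash + 1);
                if (close < 0) {
                    close = datum.length();
                }
                String base = datum.substring(0, hash);
                boolean negative = base.startsWith("-");
                if (negative || base.startsWith("+")) {
                    base = base.substring(1);
                }
                int radix = Integer.parseInt(base);
                long value = Long.parseLong(datum.substring(hash + 1, close), radix);
                // The sign written before the base applies to the value.
                return "INTEGER base " + radix + ": " + (negative ? -value : value);
            }
            return "INTEGER: " + Long.parseLong(datum);
        } catch (NumberFormatException notAnInteger) {
            // Fall through and try a real number next.
        }
        try {
            return "REAL: " + Double.valueOf(datum);
        } catch (NumberFormatException notANumber) {
            return "STRING: " + datum;
        }
    }

    public static void main(String[] args) {
        for (String datum : new String[] {"16#FF#", "-2#101#", "42", "3.5", "LINES"}) {
            System.out.println(datum + " -> " + classify(datum));
        }
    }
}
```

Because Double.valueOf accepts forms such as NaN and Infinity, a sketch like this would type those substrings as REAL; whether the Parser does the same is not stated here.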

Get_Units

public String Get_Units()
                 throws PVL_Exception
Gets a units description String from the source of PVL statements.

After skipping over any whitespace, the next character must start a units description sequence or nothing (null) is returned. A units description sequence starts after a UNITS_START_DELIMITER and ends before a UNITS_END_DELIMITER. A units description sequence without the closing UNITS_END_DELIMITER will result in a MISSING_UNITS_END exception in Strict mode; otherwise a Warning is registered and the next PARAMETER_VALUE_DELIMITER, any of the SET/SEQUENCE/UNITS START/END delimiters, STATEMENT_END_DELIMITER, or the end of the input stream is taken as the end of the units description. Note: Though an effort is made to recover from encountering an unending units description, this will only be effective when no other normally closed units description occurs in the input stream (if a normally closed units description does occur after an unclosed units description, the latter will be taken as the end of the former), and in this case the input stream will have been read into memory until it is empty.

If a units description is found the Next_Location of the input stream is moved to the position immediately following it. The units description is trimmed of leading and trailing whitespace. If Verbatim_Strings is not enabled, then all comment sequences are removed from the units description String, all whitespace sequences are collapsed to a single space (' ') character, and escape sequences are replaced with their corresponding special characters.

Returns:
A units description String, or null if no units description occurs before the next PVL item or the end of input.
Throws:
PVL_Exception -
MISSING_UNITS_END
A units description does not have a UNITS_END_DELIMITER.
FILE_IO
If an IOException occurred in the String_Buffer_Reader.
See Also:
Verbatim_Strings(boolean), translate_from_escape_sequences(String_Buffer)
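
A minimal sketch of the trimming and whitespace collapsing, assuming '<' and '>' for UNITS_START_DELIMITER and UNITS_END_DELIMITER as in the Value syntax format. The UnitsSketch class is hypothetical; comment removal, escape sequence translation, and recovery from an unclosed units description are not modeled.

```java
final class UnitsSketch {
    /** Returns the units description that begins the text, or null if none begins there. */
    static String units(String text) {
        int start = 0;
        while (start < text.length() && Character.isWhitespace(text.charAt(start))) {
            start++;
        }
        if (start >= text.length() || text.charAt(start) != '<') {
            return null; // the next character does not start a units description
        }
        int end = text.indexOf('>', start + 1);
        if (end < 0) {
            return null; // unclosed units: recovery is not modeled in this sketch
        }
        // Trim both ends, then collapse each whitespace run to one space.
        return text.substring(start + 1, end).trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        System.out.println(units("  < km /\n  s >, 5"));
    }
}
```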

Get_Quoted_String

public String Get_Quoted_String()
                         throws PVL_Exception
Gets a quoted String from the source of PVL statements.

The next non-whitespace character is taken to be the "quote" character. The characters following the first quote character up to but not including the next, non-escaped (not preceded by a backslash, '\') quote character are the quoted string. If the closing quote character cannot be found, then a MISSING_QUOTE_END Warning will be registered (the exception will be thrown in Strict mode) and the quoted string will end at the end of the input stream. Note: The lack of a closing quote character will cause the entire input stream to be read into memory until it is empty. The Next_Location is moved to the position immediately following the last quote character.

If Verbatim_Strings is not enabled then line break sequences (one or more sequential line breaks) and any surrounding whitespace (whitespace ending the last line and beginning the next line) are replaced with a single space (' ') character. If, however, String_Continuation is enabled and the last non-whitespace character before the line break sequence is a STRING_CONTINUATION_DELIMITER then no space remains (i.e. the string ending with the last non-whitespace character on the last line is continued with the first non-whitespace character on the next line). In addition, escape sequences are translated to their corresponding special characters. Sequences of characters bracketed by VERBATIM_STRING_DELIMITERS are taken verbatim; they are subject to neither end of line treatment nor escape sequence translation.

Returns:
A String, or null if no non-whitespace character occurs before the end of input.
Throws:
PVL_Exception -
MISSING_QUOTE_END
A closing quote was not found in the input stream.
FILE_IO
If an IOException occurred in the String_Buffer_Reader.
See Also:
Verbatim_Strings(boolean), String_Continuation(boolean), String_Buffer.escape_to_special()
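
The search for a non-escaped closing quote and the folding of line breaks can be sketched as follows. The QuotedSketch class is hypothetical and covers only the behavior with Verbatim_Strings disabled; String_Continuation, VERBATIM_STRING_DELIMITERS, escape sequence translation, and the MISSING_QUOTE_END warning are not modeled, so an escaped quote is left in its escaped form.

```java
final class QuotedSketch {
    /** Returns the quoted string that begins the source, or null if only whitespace remains. */
    static String quoted(String source) {
        int start = 0;
        while (start < source.length() && Character.isWhitespace(source.charAt(start))) {
            start++;
        }
        if (start >= source.length()) {
            return null;
        }
        char quote = source.charAt(start); // whatever comes first is the quote character
        int end = start + 1;
        while (end < source.length()) {
            char c = source.charAt(end);
            if (c == '\\') {
                end += 2; // skip the backslash and the character it escapes
                continue;
            }
            if (c == quote) {
                break;    // non-escaped closing quote
            }
            end++;
        }
        // Without a closing quote the string runs to the end of the input.
        end = Math.min(end, source.length());
        String body = source.substring(start + 1, end);
        // Fold each line break sequence and its surrounding whitespace into one space.
        return body.replaceAll("\\s*\\n\\s*", " ");
    }

    public static void main(String[] args) {
        System.out.println(quoted("  \"first line  \n   second\" tail"));
    }
}
```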

skip_whitespace_and_comments

public long skip_whitespace_and_comments(long location)
                                  throws PVL_Exception
Gets the next location in the PVL source stream following any sequence of whitespace and/or comments. A STATEMENT_CONTINUATION_DELIMITER and any Crosshatch_Comments (if enabled) are included in the whitespace category. Note: As with the Get_Comments method, a comment without a closing sequence is taken to end at the next line break, STATEMENT_END_DELIMITER, or the end of the input stream; but this condition will cause the entire input stream to be read into memory.

Parameters:
location - The starting location from which to skip over whitespace and comments.
Returns:
The location of the next character after any whitespace or comments.
Throws:
PVL_Exception -
MISSING_COMMENT_END
A comment does not have COMMENT_END_DELIMITERS and Strict mode is enabled.
FILE_IO
If an IOException occurred in the String_Buffer_Reader.

translate_from_escape_sequences

public static String_Buffer translate_from_escape_sequences(String_Buffer string)
Translates escape sequences in a String_Buffer to their corresponding special characters. The escape_to_special method is used to translate escape sequences to special characters. However the occurrence of VERBATIM_STRING_DELIMITERS starts a sequence of characters that are taken verbatim (they are not translated) up to the next VERBATIM_STRING_DELIMITERS or the end of the string (the VERBATIM_STRING_DELIMITERS are dropped).

Parameters:
string - The String_Buffer to be translated.
Returns:
The translated String_Buffer.
See Also:
String_Buffer.escape_to_special()

Bad_Character

public static int Bad_Character(String string)
Checks a String for any bad character. A bad character is one of the RESERVED_CHARACTERS or a non-printable character.

Parameters:
string - The String to check.
Returns:
The index of the first bad character found in the string, or -1 if no bad characters were found.

isprint

public static boolean isprint(char character)
Tests if a character is printable: in the ASCII range from the space character (' ') to the tilde character ('~') inclusive.

Parameters:
character - The char to test.
Returns:
true if the character is printable; false otherwise.
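
The Bad_Character and isprint checks can be sketched together. The CharCheckSketch class and its method names are hypothetical, and because the contents of RESERVED_CHARACTERS are not listed in this section the reserved set is passed in as a parameter rather than assumed.

```java
final class CharCheckSketch {
    /** True for ASCII characters from space (' ') to tilde ('~') inclusive. */
    static boolean isPrint(char character) {
        return character >= ' ' && character <= '~';
    }

    /**
     * Returns the index of the first character that is either non-printable
     * or present in the reserved set, or -1 if there is none.
     */
    static int badCharacter(String string, String reserved) {
        for (int index = 0; index < string.length(); index++) {
            char c = string.charAt(index);
            if (!isPrint(c) || reserved.indexOf(c) >= 0) {
                return index;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // "{}" is only an example reserved set for this demonstration.
        System.out.println(badCharacter("ab{c", "{}"));
        System.out.println(badCharacter("abc", "{}"));
    }
}
```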

Special_Classification

public static int Special_Classification(String name)
Gets the Parameter classification code corresponding to the specified special Parameter name String. The special Parameter names and their classification codes are:

Begin_Object or BeginObject - BEGIN_OBJECT
Object - OBJECT
End_Object or EndObject - END_OBJECT
Begin_Group or BeginGroup - BEGIN_GROUP
Group - GROUP
End_Group or EndGroup - END_GROUP
End - END_PVL

The names are not case sensitive.

Parameters:
name - A String that may be a special Parameter name.
Returns:
The Parameter classification code associated with the special Parameter name; or -1 if the name is not a special Parameter name.
See Also:
Parameter
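
The case-insensitive lookup can be sketched with a map keyed on lower-cased names. The SpecialNameSketch class is hypothetical, and because the numeric classification codes are not given in this section the sketch returns the classification's name as a String, with null (rather than -1) for a name that is not special.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

final class SpecialNameSketch {
    private static final Map<String, String> SPECIAL = new HashMap<>();

    static {
        register("BEGIN_OBJECT", "Begin_Object", "BeginObject");
        register("OBJECT", "Object");
        register("END_OBJECT", "End_Object", "EndObject");
        register("BEGIN_GROUP", "Begin_Group", "BeginGroup");
        register("GROUP", "Group");
        register("END_GROUP", "End_Group", "EndGroup");
        register("END_PVL", "End");
    }

    private static void register(String classification, String... names) {
        for (String name : names) {
            // Keys are stored lower-cased so lookups ignore case.
            SPECIAL.put(name.toLowerCase(Locale.ROOT), classification);
        }
    }

    /** Returns the classification name for a special Parameter name, or null. */
    static String classification(String name) {
        return SPECIAL.get(name.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(classification("END_GROUP"));
        System.out.println(classification("Lines"));
    }
}
```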

Special_Name

public static String Special_Name(int classification)
Gets the special Parameter name String for the specified Parameter classification code.

Parameters:
classification - A Parameter classification code int.
Returns:
The special Parameter name String associated with the classification code; or null if the classification code is not associated with a special Parameter name.
See Also:
Special_Classification(String)

Copyright (C) 2003-2009 Bradford Castalia, University of Arizona