PIRL

PIRL.Strings
Class String_Buffer_Reader

java.lang.Object
  extended by PIRL.Strings.String_Buffer
      extended by PIRL.Strings.String_Buffer_Reader
Direct Known Subclasses:
Parser

public class String_Buffer_Reader
extends String_Buffer

A String_Buffer_Reader provides methods to manipulate a character stream as if it were a String_Buffer by backing it with a Reader to provide the characters for the String_Buffer.

The intended use is to consume a character stream, in a semi-sequential manner, processing statements as they are encountered by moving along the stream from statement to statement. A statement is composed of a contiguous sequence of characters of variable length.

The character stream may be sourced from a Reader or a String. A String source is a convenience that allows a String_Buffer to be used transparently in the same context where a Reader might also be used, but instead a "pre-read" source of characters is provided directly (a StringReader could also be used).

For a Reader source the character stream is buffered using a sliding window. The source contents are referenced by virtual Location values that act as indexes into the entire character stream; the first character of the stream is at location 0. The buffer window is automatically extended to contain characters from any location available in the input stream.

To avoid a large number of read operations on the input stream, the buffer is always extended by a specified Size_Increment. In addition, a Read_Limit may be specified to force input termination at any location in the character stream. Input may also be terminated when a run of sequential non-text bytes reaches a specified limit, as would typically occur after the initial label area of an image file.

When statements from the character stream have been processed and are no longer needed, the Next_Location is updated. The part of the character stream in the buffer before the Next_Location is considered disposable: the next time the buffer needs to be extended to a new stream location, the portion of the buffer containing characters from before the Next_Location is deleted, which slides the buffer forward. Thus, instead of simply extending indefinitely, the buffer is reused by moving the contents to cover the section of the character stream being processed and removing contents no longer needed. In this way character streams of indefinite length may be processed using a relatively small buffer size.
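The sliding behavior described above can be pictured with a small stand-alone sketch. This is not the PIRL implementation: the class SlidingWindowSketch and all of its member names are hypothetical, it uses only java.io, it assumes a positive size increment, and it leaves out read limits, non-text checks and input filtering.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

/** Hypothetical sliding-window buffer over a Reader; not the PIRL implementation. */
final class SlidingWindowSketch {
    private final Reader reader;
    private final StringBuilder buffer = new StringBuilder();
    private final char[] chunk;          // holds one size increment of input
    private long bufferLocation = 0;     // stream location of buffer index 0
    private long nextLocation = 0;       // characters before this are disposable
    private boolean ended = false;

    /** The size increment is assumed to be greater than zero. */
    SlidingWindowSketch(Reader reader, int sizeIncrement) {
        this.reader = reader;
        this.chunk = new char[sizeIncrement];
    }

    /** Slides the window past consumed characters, then appends more input. */
    boolean extend() {
        buffer.delete(0, (int) (nextLocation - bufferLocation));
        bufferLocation = nextLocation;
        if (ended) return false;
        try {
            int count = reader.read(chunk, 0, chunk.length);
            if (count < 0) { ended = true; return false; }
            buffer.append(chunk, 0, count);
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Gets the character at a stream location, extending the window as needed. */
    char charAt(long location) {
        if (location < bufferLocation)
            throw new IndexOutOfBoundsException("already consumed: " + location);
        while (location >= bufferLocation + buffer.length())
            if (!extend())
                throw new IndexOutOfBoundsException("beyond end of input: " + location);
        return buffer.charAt((int) (location - bufferLocation));
    }

    /** Marks everything before the location (within the window) as consumed. */
    void nextLocation(long location) {
        if (location < bufferLocation || location > bufferLocation + buffer.length())
            throw new IndexOutOfBoundsException("outside the window: " + location);
        nextLocation = location;
    }

    long bufferLocation() { return bufferLocation; }
    int bufferLength() { return buffer.length(); }

    public static void main(String[] args) {
        SlidingWindowSketch window = new SlidingWindowSketch(new StringReader("abcdefghij"), 4);
        System.out.println(window.charAt(5));         // forces two increments to be read
        window.nextLocation(6);                       // "abcdef" becomes disposable
        System.out.println(window.charAt(9));         // slides the window, then reads the rest
        System.out.println(window.bufferLocation());  // where the window starts now
    }
}
```

Note that the consumed characters are only dropped when the window is next extended, which mirrors the description: updating the next location by itself deletes nothing.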

Modifications may be made to the contents of the buffer using any of the String_Buffer methods (which includes all of the StringBuffer methods). When the buffer needs to be extended, because a reference is made to a location beyond the end of the current buffer contents, input characters are always appended to the end of the existing buffer contents regardless of changes that have been made that might have altered the number of characters it contains. However, the Next_Location will not change unless specifically updated by the user, and all characters before the Next_Location will be deleted whenever the buffer contents are extended, so characters before the Next_Location should be considered consumed. Thus the String_Buffer_Reader effectively offers String_Buffer (StringBuffer) access to all characters obtained from the Reader starting at the Next_Location and extending on. If the Next_Location is never moved forward (it can be moved backwards, but not before the current Buffer_Location) then all characters from the Reader are always available (this is always the case for a String source of characters), but sliding the buffer forward past consumed statements is what provides efficient use of buffer memory for indefinitely long input streams.

Note: Those String_Buffer methods which search forward for character patterns will extend the buffer to the end of input if the pattern is not found. To avoid unnecessarily large buffer extensions the Read_Limit can be set to some location downstream that the application considers to be beyond where the pattern could reasonably be found.

When the character source is a String, nothing is ever read. Nevertheless, the contents of the buffer may still be appended with additional characters and the Next_Location may still be moved forward. When buffer manipulation methods attempt to extend the buffer, it will still be slid forward by deleting characters before the Next_Location. In this case virtual location values may be thought of as the number of characters consumed (slid over) plus the number of characters in the buffer.

A special feature of this String_Buffer_Reader is its ability to handle character streams with binary sized records (as produced by DEC VMS systems). Each record in such a stream has the form:

Size Characters [ Pad ]

Where the Size is a two byte, LSB first, binary integer value specifying the number of bytes of Characters to follow. And Pad is a byte of value 0 that is present when the Size is odd. The Size bytes will be replaced with a LINE_BREAK (CR-LF) sequence, and any Pad byte will be replaced with a space character, thus providing a consistent character stream. Note: This special filtering will only be applied when the source of characters is a Reader.
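The record layout above can be illustrated with a short sketch. It is only an approximation under stated assumptions: the class SizedRecordSketch and its methods are hypothetical, each char is assumed to carry one unmodified byte value, and the whole input is assumed to be available at once rather than arriving through a sliding buffer. The sketch also does not verify that a pad byte is actually zero.

```java
/** Hypothetical filter for binary sized records; not the PIRL implementation. */
final class SizedRecordSketch {
    /** Combines two byte-valued chars, least significant first, into a size. */
    static int recordSize(char lsb, char msb) {
        return (lsb & 0xFF) | ((msb & 0xFF) << 8);
    }

    /** Replaces each Size field with CR-LF and each Pad byte with a space. */
    static String filter(char[] raw) {
        StringBuilder text = new StringBuilder(raw.length);
        int index = 0;
        while (index + 1 < raw.length) {
            int size = recordSize(raw[index], raw[index + 1]);
            text.append("\r\n");                      // the two Size bytes become a line break
            index += 2;
            int end = Math.min(index + size, raw.length);
            text.append(raw, index, end - index);     // record characters pass through unchanged
            index = end;
            if ((size & 1) == 1 && index < raw.length) {
                text.append(' ');                     // the Pad byte becomes a space
                index++;
            }
        }
        return text.toString();
    }

    public static void main(String[] args) {
        // Two records: "ABC" (odd size, so padded) and "DE".
        char[] raw = {3, 0, 'A', 'B', 'C', 0, 2, 0, 'D', 'E'};
        System.out.print(filter(raw));
    }
}
```

Because two Size bytes are replaced by the two characters of a CR-LF and one Pad byte by one space, the filtered text has the same length as the raw input, so stream locations are unaffected by the replacement.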

Version:
1.16
Author:
Bradford Castalia, UA/PIRL
See Also:
String_Buffer

Field Summary
static long DEFAULT_READ_LIMIT
          The default read limit.
static int DEFAULT_SIZE_INCREMENT
          The amount to increase the size of the string buffer when it needs to be enlarged while reading a file.
static String ID
           
static char INVALID_CHARACTER
          A character that should not occur in any valid String.
static long NO_READ_LIMIT
          The Read_Limit size value that means what it says.
 
Fields inherited from class PIRL.Strings.String_Buffer
QUESTIONABLE_CHARACTER
 
Constructor Summary
String_Buffer_Reader()
          Creates a String_Buffer_Reader with no character source.
String_Buffer_Reader(Reader reader)
          Creates a String_Buffer_Reader with the reader as the unlimited source of characters.
String_Buffer_Reader(Reader reader, long limit)
          Creates a String_Buffer_Reader with the reader as the source of characters up to the specified limit.
String_Buffer_Reader(String string)
          Creates a String_Buffer_Reader with a String source of characters.
 
Method Summary
 long Buffer_Location()
          Gets the current location of the beginning of the buffer in the virtual stream.
 char Char_At(long location)
          Gets the character at the specified location in the virtual stream.
 int End_Index()
          Gets the buffer index of the last character in the buffer.
 long End_Location()
          Gets the location immediately following the last character in the buffer.
 boolean Ended()
          Tests if the end of input has been reached.
 boolean Equals(long location, String pattern)
          Tests if the pattern equals the substring starting at the location.
 boolean Extend()
          Extend the string buffer with additional input characters.
 boolean Filter_Input()
          Tests if the character source will be filtered on input.
 String_Buffer_Reader Filter_Input(boolean allow)
          Enable or disable input filtering.
protected  void Filter_Input(int index)
          Filters the current contents of the character buffer, starting at the specified index and continuing to the end of the buffer contents.
 Reader Get_Reader()
          Gets the Reader used as the source of characters.
 int Index(long location)
          Gets the index in the buffer for the location in the input stream.
 boolean Is_Empty()
          Tests if there are no more characters at the Next_Location; the source of characters is effectively empty.
 boolean Is_End(long location)
          Tests if the specified location is at or beyond the End_Location.
protected  boolean Is_Text(char character)
          Tests if a character is considered to be text.
 long Location_Of(long location, char character)
          Finds the next location of the character in the virtual stream.
 long Location_Of(long location, String pattern)
          Finds the next location of the pattern String in the virtual stream.
 long Location(int index)
          Gets the location in the input stream of the index in the buffer.
 int Next_Index()
          Gets the index in the buffer of the Next_Location.
 int Next_Index(int index)
          Sets the Next_Location using a buffer index value.
 long Next_Location()
          Gets the current value of the next location.
 long Next_Location(long location)
          Sets the next location to position the buffer when sliding it forward.
 String_Buffer_Reader No_Read_Limit()
          Sets the read limit to NO_READ_LIMIT.
 int Non_Text_Limit()
          Get the current limit for non-text data input.
 String_Buffer_Reader Non_Text_Limit(int limit)
          Sets the length of a non-text data sequence that will cause input from the Reader's character stream to the buffer to stop.
 long Read_Limit()
          Gets the current limit where reading characters is to stop; the maximum number of characters to input from the Reader.
 String_Buffer_Reader Read_Limit(long limit)
          Sets the location where character input is to stop; the maximum number of characters to be obtained from the source.
 boolean Reader_Source()
          Tests if the source of characters is a Reader.
protected static int Record_Size(char LSB, char MSB)
          Converts two char values to a single int value, assuming that the first char value is the LSB and the second the MSB of a 16-bit integer.
protected  int Record_Size(int index)
          Converts a 16-bit (in 2 sequential chars), LSB-first value at the buffer index to a record size value.
 String_Buffer_Reader Reset_Location()
          Resets the virtual stream location values.
 String_Buffer_Reader Reset()
          Reset as if (almost) nothing happened.
 String_Buffer_Reader Set_Reader(Reader reader)
          Sets the Reader to use as the source of characters.
 int Size_Increment()
          Gets the size increment by which the input buffer will be extended when it is slid forward in the Reader's input stream.
 String_Buffer_Reader Size_Increment(int amount)
          Sets the size increment by which the input buffer will be extended when it is slid forward in the Reader's input stream.
 long Skip_Over(long location, String skip)
          Skips over a character set in the virtual stream.
 long Skip_Until(long location, String find)
          Skips until a member of the character set is found in the virtual stream.
 boolean String_Source()
          Tests if the source of characters is a String (there is no Reader).
 String Substring(long start, long end)
          Gets the substring including the characters from the start location up to, but not including, the end location in the virtual stream.
 long Total_Read()
          Gets the total number of characters read so far.
 
Methods inherited from class PIRL.Strings.String_Buffer
append, append, append, append, append, append, append, append, append, append, append, append, append, capacity, charAt, clean, clear, delete, deleteCharAt, ensureCapacity, equals_ignore_case, equals, equalsIgnoreCase, escape_to_special, escape_to_special, from_character_references, from_character_references, getChars, index_of, index_of, indexOf, indexOf, insert, insert, insert, insert, insert, insert, insert, insert, insert, insert, length, replace_span, replace, replace, replaceSpan, reverse, setCharAt, setLength, skip_back_over, skip_back_until, skip_over, skip_until, skipBackOver, skipBackUntil, skipOver, skipUntil, special_to_escape, special_to_escape, substring, substring, to_character_references, to_character_references, to_printable_ASCII, to_printable_ASCII, toString, trim_all, trim_beginning, trim_end, trim, trim, trim
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait
 

Field Detail

ID

public static final String ID
See Also:
Constant Field Values

DEFAULT_SIZE_INCREMENT

public static final int DEFAULT_SIZE_INCREMENT
The amount to increase the size of the string buffer when it needs to be enlarged while reading a file.

It should be at least 8k to allow for the largest possible sized record on first read.

See Also:
Constant Field Values

DEFAULT_READ_LIMIT

public static final long DEFAULT_READ_LIMIT
The default read limit.

This class was designed for use with the PVL Parser (which is a subclass) for processing PVL syntax statements. The read limit sets the maximum amount of a file to read when working through a file label. When the limit is reached it is presumed that there are no more statements to be processed. Set it to -1 for no limit.

N.B.: Without a recognizable end-of-label marker it is quite possible for non-label file data to be interpreted as PVL statements. Thus it is advisable to set a reasonable limit on the amount of file data to read. It is typical for a PVL label to contain as one of its parameters the size of the label. This suggests a strategy of using a reasonably small limit (enough to ensure that the label size parameter, usually near the start of the PVL statements, will be read), finding this parameter, rewinding the file, and ingesting the PVL statements again using the parameter value as the limiting value. A less "intelligent" (but likely to be easier) approach is to use the default limit and simply check the validity of parameters, since applications are likely to know what parameters are valid for them.

See Also:
Constant Field Values

NO_READ_LIMIT

public static final long NO_READ_LIMIT
The Read_Limit size value that means what it says.

See Also:
Read_Limit(long), Constant Field Values

INVALID_CHARACTER

public static final char INVALID_CHARACTER
A character that should not occur in any valid String.

See Also:
CharacterIterator.DONE, Constant Field Values
Constructor Detail

String_Buffer_Reader

public String_Buffer_Reader(Reader reader,
                            long limit)
Creates a String_Buffer_Reader with the reader as the source of characters up to the specified limit.

Parameters:
reader - The Reader to use as the source of input.
limit - The maximum number of character bytes to read.
See Also:
Read_Limit(long)

String_Buffer_Reader

public String_Buffer_Reader(Reader reader)
Creates a String_Buffer_Reader with the reader as the unlimited source of characters.

Parameters:
reader - The Reader to use as the source of input.

String_Buffer_Reader

public String_Buffer_Reader()
Creates a String_Buffer_Reader with no character source.

See Also:
Set_Reader(Reader)

String_Buffer_Reader

public String_Buffer_Reader(String string)
Creates a String_Buffer_Reader with a String source of characters.

Parameters:
string - The String to use as the source of input.
Method Detail

Set_Reader

public String_Buffer_Reader Set_Reader(Reader reader)
Sets the Reader to use as the source of characters.

Changing the Reader in mid stream may have unexpected side effects. Any data still being held in the internal character array pending further processing before transfer to the character buffer will remain in the virtual input stream before data read from the new Reader (note that for a String source of characters the internal character array is not used). The Total_Read will not be reset by a change of Reader; i.e. a single virtual character input stream is seen.

Note: If the reader is set to null, then input has, by definition, ended, but the current contents of the buffer remain available. However, any data still pending processing in the internal character array is dropped. If this object was created with a String character source a Reader may be set to supplement the initial String. In this case the Total_Read will include the length of the initial String.

Parameters:
reader - The reader to associate with this object.
Returns:
This String_Buffer_Reader.
See Also:
Total_Read(), String_Source(), Extend()

Get_Reader

public Reader Get_Reader()
Gets the Reader used as the source of characters.

Returns:
The Reader associated with this object; will be null if the object currently has no Reader (e.g. it was created with a String character source).
See Also:
Set_Reader(Reader)

Reader_Source

public boolean Reader_Source()
Tests if the source of characters is a Reader.

Returns:
true if the source of characters is a Reader.

String_Source

public boolean String_Source()
Tests if the source of characters is a String (there is no Reader).

Returns:
true if the source of characters is a String.

Reset

public String_Buffer_Reader Reset()
Reset as if (almost) nothing happened.

The buffer location and next location in the stream are reset to zero, and the internal character buffer is marked as empty. If a Reader source (rather than a String source) is being used, the total read is set to zero, the read limit is set to NO_READ_LIMIT, and the base String_Buffer is cleared. N.B.: The internal character buffer size increment and non-text data threshold are not changed.

If input filtering is enabled it is reset so the source stream will be tested again.


Location

public long Location(int index)
Gets the location in the input stream of the index in the buffer.

Location values must be used when manipulating the contents of the buffer using String_Buffer_Reader methods.

Location values are virtual with respect to the entire character input stream. Location 0 corresponds to the first character consumed after the buffer has slid forward, or the first character currently in the buffer if nothing has yet been consumed. If the number of characters in the buffer is only changed when it is Extended, then location values are relative to all characters in the input stream. However, since the number of characters in the buffer may be changed by various methods, location values are actually relative to all characters that the buffer has slid over (that have been consumed) plus the current contents of the buffer; i.e. the virtual input stream.

Note: Location values are long; index values are int.

Parameters:
index - A buffer index value.
Returns:
The corresponding input stream location.
See Also:
Extend()

Index

public int Index(long location)
Gets the index in the buffer for the location in the input stream.

Index values are relative to the current contents of the buffer; index 0 corresponds to the character currently at the beginning of the buffer. Index values must be used when manipulating the contents of the buffer with String_Buffer (StringBuffer) methods.

Note: Location values are long; index values are int.
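The relationship between locations and indexes can be written out as a minimal sketch, assuming only what is stated here: an index is a location taken relative to the location of the beginning of the buffer. The class LocationIndexSketch and its member names are hypothetical and are not part of PIRL.

```java
/** Hypothetical location/index bookkeeping; names are illustrative only. */
final class LocationIndexSketch {
    private final long bufferLocation;   // stream location of buffer index 0

    LocationIndexSketch(long bufferLocation) {
        this.bufferLocation = bufferLocation;
    }

    /** Converts a buffer index (int) to a virtual stream location (long). */
    long location(int index) {
        return bufferLocation + index;
    }

    /** Converts a virtual stream location (long) to a buffer index (int). */
    int index(long location) {
        // toIntExact throws ArithmeticException if the offset does not fit in an int.
        return Math.toIntExact(location - bufferLocation);
    }

    public static void main(String[] args) {
        // A buffer that has slid well past the range of an int.
        LocationIndexSketch sketch = new LocationIndexSketch(5_000_000_000L);
        long location = sketch.location(24);
        System.out.println(location + " <-> " + sketch.index(location));
    }
}
```

The example places the buffer beyond the int range to show why locations are long while indexes, which only span the current buffer contents, can remain int.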

Parameters:
location - A location in the input stream.
Returns:
The corresponding buffer index relative to the current location of the beginning of the buffer in the input stream.

Buffer_Location

public long Buffer_Location()
Gets the current location of the beginning of the buffer in the virtual stream.

Returns:
The location of the first character in the buffer.

Reset_Location

public String_Buffer_Reader Reset_Location()
Resets the virtual stream location values.

The current character buffer location in the source of characters is subtracted from the next location and the buffer location is set to zero. The total read is also set to zero if the source of characters is a Reader, not a String.

A reset has the effect of making it appear as if the character buffer had been loaded for the first time, but without affecting the current relative location of the next character to be read.

Because the location values are long integers they can be expected to remain valid even for a reader that is a continuously generated stream (such as a network socket). Nevertheless, a reset when the next location exceeds some (very large) threshold will ensure that a stream of unlimited length can be continuously processed.
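The arithmetic of a location reset, as described above, might look like the following sketch. The class ResetLocationSketch, its fields and the threshold value are all hypothetical; only the three assignments in resetLocation follow the description.

```java
/** Hypothetical model of the location reset arithmetic; not the PIRL implementation. */
final class ResetLocationSketch {
    /** Illustrative threshold only; the source does not name a value. */
    static final long RESET_THRESHOLD = 1L << 40;

    long bufferLocation;
    long nextLocation;
    long totalRead;
    final boolean readerSource;

    ResetLocationSketch(long bufferLocation, long nextLocation,
                        long totalRead, boolean readerSource) {
        this.bufferLocation = bufferLocation;
        this.nextLocation = nextLocation;
        this.totalRead = totalRead;
        this.readerSource = readerSource;
    }

    /** Rebases the locations so the buffer appears to start at location 0. */
    ResetLocationSketch resetLocation() {
        nextLocation -= bufferLocation;   // the offset of next within the buffer is preserved
        bufferLocation = 0;
        if (readerSource) totalRead = 0;  // a String source has nothing to count
        return this;
    }

    /** Resets only when the next location has grown past the threshold. */
    void resetIfLarge() {
        if (nextLocation > RESET_THRESHOLD) resetLocation();
    }

    public static void main(String[] args) {
        ResetLocationSketch state = new ResetLocationSketch(5000, 5120, 8192, true);
        state.resetLocation();
        System.out.println(state.bufferLocation + " " + state.nextLocation + " " + state.totalRead);
    }
}
```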


Next_Location

public long Next_Location(long location)
                   throws IndexOutOfBoundsException,
                          IOException
Sets the next location to position the buffer when sliding it forward.

If the location is beyond the end of the current buffer contents the buffer is extended to include the location, if possible.

Parameters:
location - The next location to use for the beginning of the buffer when the buffer is slid forward.
Returns:
The new value of the next location. The only case where this will be different from the specified location is when the end of input has been reached before the specified location could be reached; in which case the return value is the final end location.
Throws:
IndexOutOfBoundsException - If the location is before the beginning of the buffer.
IOException - If there was a problem extending the buffer.
See Also:
End_Location(), Extend()

Next_Location

public long Next_Location()
Gets the current value of the next location.

Returns:
The current value of the next location.
See Also:
Next_Location(long)

Next_Index

public int Next_Index(int index)
               throws IndexOutOfBoundsException,
                      IOException
Sets the Next_Location using a buffer index value.

Parameters:
index - The buffer index value for the next location.
Returns:
The index value of the new next location.
Throws:
IndexOutOfBoundsException - If the index is before the beginning of the buffer.
IOException - If there was a problem extending the buffer.
See Also:
Next_Location(long)

Next_Index

public int Next_Index()
Gets the index in the buffer of the Next_Location.

Returns:
The index in the buffer of the next location.
See Also:
Next_Location()

End_Location

public long End_Location()
Gets the location immediately following the last character in the buffer.

Returns:
The location immediately following the last character in the buffer.

End_Index

public int End_Index()
Gets the buffer index of the last character in the buffer. This is the same as the number of characters currently in the buffer.

Returns:
The buffer index of the last character in the buffer.

Read_Limit

public String_Buffer_Reader Read_Limit(long limit)
Sets the location where character input is to stop; the maximum number of characters to be obtained from the source.

If the limit is 0, then the DEFAULT_READ_LIMIT (256 KB) is used. If the limit is less than 0, then NO_READ_LIMIT will be used, so no upper limit will be imposed on the number of characters to obtain from the source.

Note: The read limit is reset to the total number of characters read if a Reader encounters the end of the input stream.

Note: The read limit will have no effect when the character source is a String. In this case nothing is ever read so the read limit is never changed.

Note: When the threshold for sequential non-text data has been reached while extending the buffer, then the read limit is automatically set to the number of characters read before the non-text sequence. Once this condition has been encountered, the read limit can not be changed without first resetting the non-text data threshold up to a higher value.
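The handling of zero and negative limit arguments can be sketched as below. The constant values are assumptions: the default is taken as 256 * 1024 from the "256 KB" stated above, and the NO_READ_LIMIT sentinel is assumed to be Long.MAX_VALUE because its actual value is not given here. The class ReadLimitSketch is hypothetical, and the non-text and end-of-input interactions described in the notes are not modeled.

```java
/** Hypothetical normalization of a read limit argument; constant values are assumed. */
final class ReadLimitSketch {
    /** Assumed sentinel; the actual NO_READ_LIMIT value is not stated in this page. */
    static final long NO_READ_LIMIT = Long.MAX_VALUE;

    /** Assumed to be 256 * 1024, from the "256 KB" given in the description. */
    static final long DEFAULT_READ_LIMIT = 256L * 1024L;

    /** Maps the argument to the limit that would take effect. */
    static long normalize(long limit) {
        if (limit == 0) return DEFAULT_READ_LIMIT;   // zero selects the default
        if (limit < 0) return NO_READ_LIMIT;         // negative means unlimited
        return limit;
    }

    public static void main(String[] args) {
        System.out.println(normalize(0));
        System.out.println(normalize(-1) == NO_READ_LIMIT);
        System.out.println(normalize(4096));
    }
}
```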

Parameters:
limit - The location where character input is to stop.
Returns:
This String_Buffer_Reader.
See Also:
NO_READ_LIMIT

No_Read_Limit

public String_Buffer_Reader No_Read_Limit()
Sets the read limit to NO_READ_LIMIT.

Returns:
This String_Buffer_Reader.
See Also:
Read_Limit(long)

Read_Limit

public long Read_Limit()
Gets the current limit where reading characters is to stop; the maximum number of characters to input from the Reader.

This will be NO_READ_LIMIT if there is no limit to the number of characters to read from the source.

Returns:
The current read limit.

Total_Read

public long Total_Read()
Gets the total number of characters read so far. This value is the actual number of characters input from the Reader regardless of any subsequent use of the data.

Returns:
The total number of characters read.

Ended

public boolean Ended()
Tests if the end of input has been reached.

The end of input occurs when the read limit has been reached. When the input source is a String, then it has always ended (i.e. there is nothing to read).

Returns:
true if the character source has ended.

Is_Empty

public boolean Is_Empty()
Tests if there are no more characters at the Next_Location; the source of characters is effectively empty.

There will be no more characters at the Next_Location if the input has ended and the Next_Location is at, or beyond, the End_Location.

Returns:
true if there are no more statements available.
See Also:
Ended(), End_Location()

Is_End

public boolean Is_End(long location)
Tests if the specified location is at or beyond the End_Location.

Note: Unlike Ended, Is_End only indicates that the location is beyond the current buffer contents. There may still be more source characters available.

Parameters:
location - The location to test.
Returns:
true if the location is at or beyond the End_Location.
See Also:
End_Location()

Size_Increment

public String_Buffer_Reader Size_Increment(int amount)
Sets the size increment by which the input buffer will be extended when it is slid forward in the Reader's input stream.

If the amount is less than or equal to 0, then the default size increment (16 KB) will be applied. If the amount is less than or equal to the Non_Text_Limit threshold, then it will be increased by that amount. If the current internal character array contains more input data than the size increment, then it will be set to the amount of data being held. If the new size increment is different from the length of the current internal character array, then a new array will be allocated and any data in the old array will be copied into it.
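One possible reading of these adjustment rules is sketched below. Two points are assumptions rather than facts from this page: "increased by that amount" is read as adding the Non_Text_Limit threshold to the requested amount, and the default is taken as 16 * 1024 from the "16 KB" stated above. The class SizeIncrementSketch and its parameter names are hypothetical, and the array reallocation step is not modeled.

```java
/** Hypothetical size increment adjustment; one reading of the documented rules. */
final class SizeIncrementSketch {
    /** Assumed to be 16 * 1024, from the "16 KB" given in the description. */
    static final int DEFAULT_SIZE_INCREMENT = 16 * 1024;

    /**
     * @param amount       the requested size increment
     * @param nonTextLimit the current non-text threshold
     * @param held         the amount of data pending in the internal array
     */
    static int adjust(int amount, int nonTextLimit, int held) {
        if (amount <= 0) amount = DEFAULT_SIZE_INCREMENT;
        if (amount <= nonTextLimit) amount += nonTextLimit;   // assumed meaning of "that amount"
        if (held > amount) amount = held;                     // never smaller than the held data
        return amount;
    }

    public static void main(String[] args) {
        System.out.println(adjust(0, 4, 0));
        System.out.println(adjust(3, 4, 0));
        System.out.println(adjust(100, 4, 250));
    }
}
```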

Parameters:
amount - The new size increment.
Returns:
This String_Buffer_Reader.
See Also:
Non_Text_Limit(int), Extend()

Size_Increment

public int Size_Increment()
Gets the size increment by which the input buffer will be extended when it is slid forward in the Reader's input stream.

Returns:
The current size increment.
See Also:
Size_Increment(int)

Non_Text_Limit

public String_Buffer_Reader Non_Text_Limit(int limit)
Sets the length of a non-text data sequence that will cause input from the Reader's character stream to the buffer to stop.

Text characters are recognized by the Is_Text method; characters that fail this test are non-text.

Since this class is managing a String buffer, it is assumed that non-text data from the input should not be transferred into the buffer; i.e. the data are not valid String characters. However, to allow for possible input filter requirements (e.g. binary record size values), a sequence of non-text data less than the limit is allowed into the character buffer. When a sequence of non-text data in the input stream reaches the limit the Read_Limit is set to the amount read up to, but not including, the sequence of non-text data. A limit of 0 has the same effect as a limit of 1 (so a limit of zero will be forced to 1); i.e. no non-text data are acceptable. Specifying a negative limit disables non-text data checking; i.e. any non-text data is acceptable.

Note: No sequence of non-text data as long as the limit amount is allowed into the object's character buffer; shorter sequences are allowed to pass. And since the Read_Limit is set to the location in the input stream immediately preceding a limiting non-text sequence, the buffer will not be Extended beyond this location. However, the Non_Text_Limit may subsequently be increased and the Read_Limit lifted (in that order; the Read_Limit can not be changed while a Non_Text_Limit block is in effect) to allow further processing of the input stream (data read from the Reader but not transferred to the character buffer is never lost). Nevertheless, any non-text data sequence outstanding will remain in effect and will be included in counting the length of the next non-text sequence; the non-text sequence length is only reset to 0 when a text character is seen in the input stream.

Warning: The Non_Text_Limit may affect the operation of an input filter. The input filter provided with this class requires, if input filtering is enabled, a Non_Text_Limit of at least 4.
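The counting of sequential non-text data can be illustrated with a small scanner. This is a sketch, not the PIRL code: the class NonTextRunSketch is hypothetical, the isText test uses the ASCII ranges that the Is_Text method is documented to accept (0x20 to 0x7E and 0x9 to 0xD), and the returned position stands in for the value the Read_Limit would be set to.

```java
/** Hypothetical scanner for limiting runs of non-text data; not the PIRL implementation. */
final class NonTextRunSketch {
    private final int limit;    // run length that stops input; negative disables checking
    private int run = 0;        // current non-text run length, carried across chunks
    private long scanned = 0;   // total characters scanned so far

    NonTextRunSketch(int limit) {
        this.limit = (limit == 0) ? 1 : limit;   // a limit of zero is forced to 1
    }

    /** Printable ASCII plus HT, LF, VT, FF and CR count as text. */
    static boolean isText(char c) {
        return (c >= 0x20 && c <= 0x7E) || (c >= 0x09 && c <= 0x0D);
    }

    /**
     * Scans the next chunk of input.
     * @return the stream position where a limiting run began, or -1 if none was reached
     */
    long scan(char[] chunk, int length) {
        for (int i = 0; i < length; i++) {
            scanned++;
            if (limit < 0 || isText(chunk[i])) {
                run = 0;                          // only a text character resets the count
                continue;
            }
            if (++run >= limit) return scanned - run;
        }
        return -1;
    }

    public static void main(String[] args) {
        NonTextRunSketch scanner = new NonTextRunSketch(3);
        System.out.println(scanner.scan("ab\0\0".toCharArray(), 4));   // a run of two is pending
        System.out.println(scanner.scan("\0cd".toCharArray(), 3));     // the run reaches the limit
    }
}
```

The run that starts at the end of the first chunk is completed by the second chunk, which is why the count has to be kept between scans.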

Parameters:
limit - The maximum allowed sequence of non-text input data.
Returns:
This String_Buffer_Reader.
See Also:
Is_Text(char), Read_Limit(long), Extend()

Non_Text_Limit

public int Non_Text_Limit()
Get the current limit for non-text data input.

Returns:
The non-text data limit.
See Also:
Non_Text_Limit(int)

Extend

public boolean Extend()
               throws IndexOutOfBoundsException,
                      IOException
Extend the string buffer with additional input characters.

The procedure to extend the buffer has several steps:

Test if input filtering is enabled.
The Filter_Input test is done before reading any data so the Reader input position is not altered. The test method may need to read characters from the source at its current position.
Slide the buffer forward.
If the Next_Location is beyond the Buffer_Location then the consumed contents of the buffer - i.e. from the beginning up to, but not including, the Next_Index - are deleted.
Check for end of input.
The end of input occurs when the end of the Reader's data stream has been reached, or the amount of input has reached the Read_Limit. When this object was constructed from a String the end of input has been reached by definition. When the end of input has been reached the buffer can not be extended so the method returns false.
Determine how much to extend the buffer.
The buffer will be extended by the lesser of the amount of free space in the internal character array or the amount from the current Total_Read up to the Read_Limit. Of course, if there is no read limit then the former is always used. The internal character array, where characters read from the Reader are stored for checking before being transferred to the object's buffer, has a length of Size_Increment. However, during each read cycle less than the entire character array contents may be transferred to the buffer; the remainder is carried over in the array where new input data is added in the next read cycle. Thus the buffer will be extended by no more than Size_Increment characters.
Read characters.
Characters are read from the Reader into an internal character array. Here they may be scanned for non-text data before being appended to the object's character buffer. This cycle continues until the amount to extend the buffer has been read or the end of the input stream is encountered.
Check for non-text data.
Characters read into the internal storage array may be scanned for a sequence of non-text data of the current maximum length. This check is only done if the non-text data threshold is non-negative. Counting of sequential non-text data is continuous across buffer extensions and is only reset to zero when a text character is found. If the non-text data threshold is reached, then only the data preceding the sequence is appended to the object's character buffer and the Read_Limit is reset to the corresponding position in the data stream. Shorter sequences of non-text data at the end of scanned input are also omitted from the transfer to the buffer, against the possibility of more non-text data from the next read, but the Read_Limit is not changed. Data not transferred to the buffer is retained in the internal array where it is added to in the next read cycle.
Transfer characters to the buffer.
The data in the character array, from the beginning up to but not including any trailing non-text data (if non-text data checking is enabled), is appended to the object's character buffer.
Filter the new characters.
If input filtering is enabled, the Filter_Input method is invoked on the new characters.
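The "determine how much to extend the buffer" step can be written as a short calculation. This is a sketch under assumptions: the class ExtendAmountSketch and its parameter names are hypothetical, the NO_READ_LIMIT sentinel is assumed to be Long.MAX_VALUE, and free space is taken to be the size increment less the data carried over in the internal array.

```java
/** Hypothetical calculation of the amount to read in one extension. */
final class ExtendAmountSketch {
    /** Assumed sentinel; the actual NO_READ_LIMIT value is not stated in this page. */
    static final long NO_READ_LIMIT = Long.MAX_VALUE;

    /**
     * @param sizeIncrement the length of the internal character array
     * @param held          characters carried over in the array from the last cycle
     * @param totalRead     characters read from the Reader so far
     * @param readLimit     the read limit, or NO_READ_LIMIT
     * @return the lesser of the free array space and the amount left before the limit
     */
    static int amountToRead(int sizeIncrement, int held, long totalRead, long readLimit) {
        int free = sizeIncrement - held;
        if (readLimit == NO_READ_LIMIT) return free;      // without a limit, free space decides
        long remaining = Math.max(0L, readLimit - totalRead);
        return (int) Math.min((long) free, remaining);
    }

    public static void main(String[] args) {
        System.out.println(amountToRead(16384, 0, 0, NO_READ_LIMIT));
        System.out.println(amountToRead(16384, 384, 1000, 5000));
        System.out.println(amountToRead(16384, 0, 5000, 5000));
    }
}
```

A result of zero corresponds to the end-of-input case in which the buffer cannot be extended.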

Note: Altering the Reader's input position in the character stream may have unpredictable consequences. The impact on the operation of the Filter_Input method in particular could be fatal. Without input filtering, however, it is quite possible to move the Reader's input position (e.g. by reset, skip, or read operations) to suit application needs. It is important to keep in mind that any such alterations will go undetected, and the value of the Total_Read and source stream locations will continue to indicate the amount of data actually read and consumed from a virtually sequential input stream.

Returns:
true if more input is available; false if the end of input was reached.
Throws:
IOException - From the read method of the Reader.
IndexOutOfBoundsException - If the buffer indexes are invalid.
See Also:
Filter_Input(), Next_Location(long), Buffer_Location(), Next_Index(), Ended(), Read_Limit(long), Total_Read(), Size_Increment(int), Is_Text(char), Non_Text_Limit(int), Filter_Input(int)

Is_Text

protected boolean Is_Text(char character)
Tests if a character is considered to be text.

Text is evaluated against the ASCII code set and includes all printable characters - values from 0x20 (' ') to 0x7E ('~') inclusive - plus the usual format control characters - HT, LF, VT, FF, and CR (0x9 to 0xD inclusive).

Parameters:
character - The character to be tested.
Returns:
true if the character is text; false otherwise.

Filter_Input

public boolean Filter_Input()
                     throws IOException
Tests if the character source will be filtered on input.

Some files - notably binary data files produced by DEC VMS systems - use a record structure composed of a leading binary size value (16 bits, LSB first) followed by size bytes of data. This is like Pascal strings (and just as cumbersome). In addition, if the size is odd then a zero-valued pad byte will be appended to the record (to force word alignment for all size values). Binary record size values are detected by testing the second (MSB) character obtained from the Reader (String character sources are not tested) for a value less than 32 (' ') but not 9 (HT), 10 (NL) or 13 (CR). This test for the presence of the binary size value depends on a few assumptions: 1) the Reader is positioned at the beginning of a potential binary size value (this is probably the beginning of the file or stream), 2) the second character of normal (unfiltered) input will not be unprintable, and 3) the first binary size value is less than 8k and outside the ranges (2304-2815) and (3328-3583). N.B.: This last assumption is rather risky, of course; but these records are very likely to be less than 2k in size. Of course, this also requires that any character encoding preserve the values of bytes in the stream.
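The detection test and the size ranges it implies can be checked with a short sketch. The class and method names here are hypothetical and are not part of PIRL; only the comparison values (32, 9, 10 and 13) come from the description above.

```java
// Hypothetical sketch of the binary size value test; not part of PIRL.
final class SizedRecordTestSketch {
    // Applies the described test to the second (MSB) character.
    static boolean looksLikeRecordSize(char second) {
        return second < 32 && second != 9 && second != 10 && second != 13;
    }

    // Smallest and largest 16-bit size values that share the given MSB.
    static int[] sizeRange(int msb) {
        return new int[] { msb << 8, (msb << 8) | 0xFF };
    }

    public static void main(String[] args) {
        // MSB values 9, 10 and 13 are excluded by the test; 32 is the
        // first MSB value that is too large to be detected.
        for (int msb : new int[] { 9, 10, 13, 32 }) {
            int[] range = sizeRange(msb);
            System.out.println("MSB " + msb + ": sizes " + range[0] + "-" + range[1]
                + ", detected=" + looksLikeRecordSize((char) msb));
        }
    }
}
```

An MSB of 9 or 10 covers sizes 2304 through 2815, an MSB of 13 covers 3328 through 3583, and an MSB of 32 starts at 8192, which matches the excluded ranges and the 8k bound stated in the third assumption.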

During character input a source that has sized records will be filtered: The record size bytes will be replaced with a LINE_BREAK (CR-LF) sequence, and any pad bytes will be replaced with a space character, thus providing a consistent character stream.
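The replacement of size and pad bytes can be sketched on a plain char array. This is an illustration under stated assumptions, not the PIRL code: the class and method names are hypothetical, the array is assumed to begin at a record boundary, and stopping at the first incomplete record is a choice made for this sketch.

```java
// Hypothetical sketch of sized-record filtering; not part of PIRL.
final class RecordFilterSketch {
    // Rewrites complete sized records in place and returns the index
    // of the first character that was not processed.
    static int filter(char[] buffer, int length) {
        int index = 0;
        while (index + 2 <= length) {
            // 16-bit size value, LSB first.
            int size = ((buffer[index + 1] & 0xFF) << 8) | (buffer[index] & 0xFF);
            int padded = size + (size & 1); // odd sizes carry one pad byte
            if (index + 2 + padded > length) {
                break; // incomplete record: leave it for a later pass
            }
            buffer[index] = '\r';     // the size bytes become CR-LF
            buffer[index + 1] = '\n';
            if ((size & 1) == 1) {
                buffer[index + 2 + size] = ' '; // the pad byte becomes a space
            }
            index += 2 + padded;
        }
        return index;
    }
}
```

A three-character record "abc" with its pad byte, followed by a two-character record "de", becomes CR-LF, "abc", a space, CR-LF, "de".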

When a String_Buffer_Reader is created its input filtering status is untested. If the input filtering status is untested when this method is used then the next two characters are read (if the source is a Reader) unconditionally - i.e. regardless of whether the source has already been read or characters have otherwise been put in the buffer - and they are tested. The results of the input source test update the input filtering status to either filtered or unfiltered, so subsequent use of this method will not repeat the input source test unless the input filtering status is reset to the initial untested state (which may be done by the Filter_Input(boolean) method).

Note: To accommodate binary record size values the Non_Text_Limit threshold must be greater than 3 (for a possible pad byte followed by two record size bytes).

Returns:
true if input filtering will be applied, false otherwise.
Throws:
IOException - If the Reader could not be read during the initial check.
See Also:
Filter_Input(boolean), Filter_Input(int), Non_Text_Limit(int)

Filter_Input

public String_Buffer_Reader Filter_Input(boolean allow)
Enable or disable input filtering.

If input filtering is to be allowed and it is currently not enabled, the input filtering status is reset to untested. If input filtering is not to be allowed it is unconditionally disabled.

Parameters:
allow - true if input filtering is allowed; false otherwise.
Returns:
This String_Buffer_Reader.
See Also:
Filter_Input(), Filter_Input(int)

Filter_Input

protected void Filter_Input(int index)
                     throws IndexOutOfBoundsException,
                            IOException
Filters the current contents of the character buffer, starting at the specified index and continuing to the end of the buffer contents.

The details of the filtering implemented here are described by the Filter_Input test method.

Parameters:
index - The index in the character buffer where filtering is to start. Nothing is done if the index is invalid (i.e. < 0 or >= the end of characters in the buffer).
Throws:
IndexOutOfBoundsException - An index for record size bytes or padding is invalid. Assuming that the filtering algorithm is bug free (8^]), then the input stream has been corrupted, possibly due to inappropriate changes to its position by the application. It is very likely in this case that previous character buffer contents have been mangled as well.
IOException - This implementation doesn't do anything that could generate an IOException. However, the method is marked this way so subclass implementations that need to use the Reader may throw this exception.
See Also:
Filter_Input(), End_Index(), Record_Size(char, char)

Record_Size

protected int Record_Size(int index)
Converts a 16-bit (in 2 sequential chars), LSB-first value at the buffer index to a record size value.

This method is used by the input filter.

Parameters:
index - The index of the binary record size bytes in the buffer.
Returns:
A record size value.
See Also:
Record_Size(char, char)

Record_Size

protected static int Record_Size(char LSB,
                                 char MSB)
Converts two char values to a single int value, assuming that the first char value is the LSB and the second the MSB of a 16-bit integer.

Bits 0-7 of the MSB are shifted left 8 bits and ORed with bits 0-7 of the LSB to form the integer value.

Parameters:
LSB - The Least Significant Byte of the 16-bit record size value.
MSB - The Most Significant Byte of the 16-bit record size value.
Returns:
A record size value.
See Also:
Filter_Input()
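The bit arithmetic described for Record_Size(char, char) can be written out directly. The class name below is hypothetical; the expression follows the description above (bits 0-7 of the MSB shifted left 8 bits and ORed with bits 0-7 of the LSB).

```java
// Hypothetical sketch of the LSB-first conversion; not part of PIRL.
final class RecordSizeSketch {
    static int recordSize(char lsb, char msb) {
        // Only bits 0-7 of each char contribute to the result.
        return ((msb & 0xFF) << 8) | (lsb & 0xFF);
    }
}
```

For example, an LSB of 0x34 and an MSB of 0x12 produce 0x1234 (4660). Any bits above bit 7 in either char are ignored.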

Char_At

public char Char_At(long location)
Gets the character at the specified location in the virtual stream.

The current contents of the character buffer will be automatically extended to the specified location.

Parameters:
location - The location from which to get a character.
Returns:
The char found at the location. If the location is invalid for any reason then the INVALID_CHARACTER is returned.

Substring

public String Substring(long start,
                        long end)
                 throws IndexOutOfBoundsException,
                        IOException
Gets the substring including the characters from the start location up to, but not including, the end location in the virtual stream.

The current contents of the character buffer will be automatically extended to include the specified locations. If the substring extends beyond the end of input location, then that portion up to the end of input will be returned. If both start and end locations are beyond the end of input, then an empty String will be returned.

Parameters:
start - The location of the first character of the substring.
end - The location of the end of the substring (the location immediately following the last character in the substring).
Returns:
The String from the start location, inclusive, to the end location, exclusive.
Throws:
IndexOutOfBoundsException - A location is before the Buffer_Location, or start is greater than end.
IOException - While reading characters to extend the buffer.
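The end-of-input behavior of Substring can be illustrated with a plain String standing in for the virtual stream. This sketch is not the PIRL implementation: the class name is hypothetical, the whole stream is assumed to be in memory, and the Buffer_Location is assumed to be 0.

```java
// Hypothetical sketch of the clamping rules; not part of PIRL.
final class SubstringSketch {
    static String substring(String stream, long start, long end) {
        if (start < 0 || end < 0 || start > end) {
            throw new IndexOutOfBoundsException("start " + start + ", end " + end);
        }
        // Locations past the end of input are clamped to the end of input.
        int from = (int) Math.min(start, stream.length());
        int to = (int) Math.min(end, stream.length());
        return stream.substring(from, to);
    }
}
```

With the stream "GROUP = X", the range 0 to 5 gives "GROUP", the range 8 to 100 gives "X", and the range 50 to 60 gives an empty String.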

Skip_Over

public long Skip_Over(long location,
                      String skip)
               throws IndexOutOfBoundsException,
                      IOException
Skips over a character set in the virtual stream.

Starting at the specified location find the location of the next character that is not in the skip string. The skip string is not a pattern; i.e. a character in the virtual stream that matches any character in the skip string is skipped during the search. The return location will be for a character that does not occur in the skip string.

The character buffer will be extended until a non-skip character is found. Note: The character buffer will be extended to include all available input if all characters from the beginning location on are in the skip String.

Parameters:
location - The location from which to start the search.
skip - The String containing characters to be skipped.
Returns:
The location of the next character not in the skip String. This will be the End_Location if the end of input data is reached (i.e. all characters from the start location on are in the skip String).
Throws:
IndexOutOfBoundsException - The location is before the Buffer_Location.
IOException - While reading characters to extend the buffer.
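The character set semantics of Skip_Over can be sketched against a plain String. The class name is hypothetical, the whole stream is assumed to be in memory with a Buffer_Location of 0, and the String length stands in for the End_Location.

```java
// Hypothetical sketch of skipping a character set; not part of PIRL.
final class SkipOverSketch {
    static long skipOver(String stream, long location, String skip) {
        if (location < 0) {
            throw new IndexOutOfBoundsException("location " + location);
        }
        int index = (int) Math.min(location, stream.length());
        // Advance while the current character is any member of the skip set.
        while (index < stream.length() && skip.indexOf(stream.charAt(index)) >= 0) {
            index++;
        }
        return index; // the end location if only skip characters remain
    }
}
```

A typical use is stepping over leading whitespace: with a skip String of space and tab, the location returned is that of the first character of the next token.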

Skip_Until

public long Skip_Until(long location,
                       String find)
                throws IndexOutOfBoundsException,
                       IOException
Skips until a member of the character set is found in the virtual stream.

Starting at the specified location find the location of the next character that is in the find string. The find string is not a pattern; i.e. a character in the virtual stream that matches any character in the find string satisfies the search. The return location will be for a character that occurs in the find string.

The character buffer will be extended until a find character is found. Note: The character buffer will be extended to include all available input if all characters from the beginning location on are not in the find String.

Parameters:
location - The location from which to start the search.
find - The String containing characters to be found.
Returns:
The location of the next character also in the find String. If the end of input data is reached (i.e. all characters from the start location on are not in the find String), then -1 is returned.
Throws:
IndexOutOfBoundsException - The location is before the Buffer_Location.
IOException - While reading characters to extend the buffer.
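Skip_Until is the complement of Skip_Over, and it differs in what it returns at the end of input. A sketch against a plain String follows; the class name is hypothetical, and the whole stream is assumed to be in memory with a Buffer_Location of 0.

```java
// Hypothetical sketch of searching for a character set; not part of PIRL.
final class SkipUntilSketch {
    static long skipUntil(String stream, long location, String find) {
        if (location < 0) {
            throw new IndexOutOfBoundsException("location " + location);
        }
        for (long i = location; i < stream.length(); i++) {
            // Any single member of the find set satisfies the search.
            if (find.indexOf(stream.charAt((int) i)) >= 0) {
                return i;
            }
        }
        return -1; // end of input reached without finding a member
    }
}
```

With the stream "name = 1;" and a find String of "=;", a search from location 0 stops at the '=' character (location 5), and a search from location 6 stops at the ';' character (location 8).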

Location_Of

public long Location_Of(long location,
                        String pattern)
                 throws IndexOutOfBoundsException,
                        IOException
Finds the next location of the pattern String in the virtual stream.

Starting at the specified location find the location of the next occurrence of the substring that matches the pattern String.

The character buffer will be extended until the pattern substring is found. Note: The character buffer will be extended to include all available input if the pattern cannot be found.

Parameters:
location - The location from which to start the search.
pattern - The String to be found as a substring of the virtual stream.
Returns:
The location of the beginning of the pattern substring. If the end of input data is reached without finding the pattern String, then -1 is returned.
Throws:
IndexOutOfBoundsException - The location is before the Buffer_Location.
IOException - While reading characters to extend the buffer.
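For a stream that is fully in memory, the two Location_Of methods behave much like String.indexOf with a starting index. The sketch below makes that comparison concrete; the class name is hypothetical and the Buffer_Location is assumed to be 0.

```java
// Hypothetical sketch of pattern and character search; not part of PIRL.
final class LocationOfSketch {
    // The whole pattern must match, in order, as a substring.
    static long locationOf(String stream, long location, String pattern) {
        if (location < 0) {
            throw new IndexOutOfBoundsException("location " + location);
        }
        if (location > stream.length()) {
            return -1;
        }
        return stream.indexOf(pattern, (int) location);
    }

    // Single character variant.
    static long locationOf(String stream, long location, char character) {
        if (location < 0) {
            throw new IndexOutOfBoundsException("location " + location);
        }
        if (location > stream.length()) {
            return -1;
        }
        return stream.indexOf(character, (int) location);
    }
}
```

In the stream "a /* b */ c" the pattern "*/" is found at location 7, whereas a character set search for the same two characters would stop at the '/' at location 2. This is the difference between a pattern and a character set.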

Location_Of

public long Location_Of(long location,
                        char character)
                 throws IndexOutOfBoundsException,
                        IOException
Finds the next location of the character in the virtual stream.

Starting at the specified location find the next location of the specified character.

The character buffer will be extended until the specified character is found. Note: The character buffer will be extended to include all available input if the character cannot be found.

Parameters:
location - The location from which to start the search.
character - The character to be found.
Returns:
The location of the next occurrence of the character in the virtual stream. If the end of input data is reached without finding the character, then -1 is returned.
Throws:
IndexOutOfBoundsException - The location is before the Buffer_Location.
IOException - While reading characters to extend the buffer.

Equals

public boolean Equals(long location,
                      String pattern)
               throws IndexOutOfBoundsException,
                      IOException
Tests if the pattern equals the substring starting at the location.

The character buffer will be extended as needed to include a substring starting at the specified location that is as long as the pattern String.

Parameters:
location - The location of the substring to compare.
pattern - The String to compare against the substring.
Returns:
true if the substring equals the pattern String; false otherwise.
Throws:
IndexOutOfBoundsException - The location is before the Buffer_Location.
IOException - While reading characters to extend the buffer.
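For a stream held entirely in memory, the comparison made by Equals corresponds to String.startsWith with an offset. A minimal sketch follows, with a hypothetical class name and an assumed Buffer_Location of 0.

```java
// Hypothetical sketch of comparing a pattern at a location; not part of PIRL.
final class EqualsSketch {
    static boolean equalsAt(String stream, long location, String pattern) {
        if (location < 0) {
            throw new IndexOutOfBoundsException("location " + location);
        }
        if (location > stream.length()) {
            return false; // nothing left to compare
        }
        // False when fewer than pattern.length() characters remain.
        return stream.startsWith(pattern, (int) location);
    }
}
```

This kind of test is useful for checking whether a statement begins with a known keyword before committing to parse it.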

Copyright (C) 2003-2009 Bradford Castalia, University of Arizona