Strings
VMS record filtering for a Sliding_String.
#include <VMS_Records_Filter.hh>
Public Member Functions | |
virtual bool | enabled (bool enable) |
Enables or disables filtering. | |
virtual bool | enabled () const |
Tests if the filter is enabled or disabled. | |
virtual void | filter (std::string &a_string, Index start=0, Index end=std::string::npos) |
Filters out VMS binary record size values from current contents of the string. | |
virtual std::string | identification () const |
Gets the class ID string. | |
VMS_Records_Filter () | |
Constructs a VMS_Records_Filter in a newly enabled condition. | |
Static Public Member Functions | |
static int | record_size (char LSB, char MSB) |
Converts two char values to a single index value, assuming that the first char value is the LSB and the second the MSB of a 16-bit integer. | |
Static Public Attributes | |
static const char *const | ID |
Class identification. | |
static const char | LINE_BREAK [] |
Used to plug VMS binary record size value holes. | |
static const int | RECORD_SIZE_LIMIT |
Threshold for invalid record_size value. |
VMS record filtering for a Sliding_String.
Binary data files produced by DEC VMS systems use a record structure composed of a leading binary size value (16-bits, LSB first) followed by size bytes of data. This is like Pascal strings (and just as cumbersome). In addition, if the size is odd then a zero-valued pad byte will be appended to the record (to force word alignment for all size values).
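The record layout described above can be illustrated with a minimal sketch. The helper name `split_vms_records` is hypothetical and is not part of VMS_Records_Filter; it assumes the buffer begins on a size value and simply stops at a truncated record.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not part of the documented class): walks a buffer of
// VMS records and returns the data payload of each complete record.
std::vector<std::string> split_vms_records(const std::string& data)
{
    std::vector<std::string> records;
    std::size_t pos = 0;
    while (pos + 2 <= data.size()) {
        // 16-bit size value, least significant byte first.
        const std::size_t size =
            static_cast<unsigned char>(data[pos]) |
            (static_cast<std::size_t>(
                 static_cast<unsigned char>(data[pos + 1])) << 8);
        pos += 2;
        if (pos + size > data.size())
            break;  // truncated record: stop rather than guess
        records.push_back(data.substr(pos, size));
        pos += size;
        if (size % 2 != 0)
            ++pos;  // skip the zero-valued pad byte after an odd-sized record
    }
    return records;
}
```

For example, the ten bytes `03 00 'a' 'b' 'c' 00 02 00 'd' 'e'` hold two records, "abc" (odd size, so one pad byte follows) and "de".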
Binary record size values are detected by testing the first two characters of the string, starting at the start index, for a value less than 8k. The test is done when the algorithm initializes: on the first use of the filter method and on the first use after the filter is re-enabled. The test depends on three assumptions: 1) the start index is at the beginning of a potential binary size value (this is probably the beginning of the stream), 2) the second character of normal (unfiltered) input will be printable (have an ASCII value >= ' '), and 3) the binary size values are less than 8k. It also requires that any character encoding preserve the values of bytes in the stream. If the filter method ever detects a record size value >= 8k it assumes that the VMS binary record structure is no longer present in the string and turns off further filtering. Filtering can be restarted with the enabled(bool) method.
WARNING: Strings whose second character (after the start index on first use) is a print control character such as a tab or new-line will be incorrectly sensed as containing VMS binary record size values.
WARNING: If the string length after the start index is less than 2 when an initialization test is to be done, then the test is presumed to fail and filtering will be disabled.
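The initialization test and both warnings can be sketched as follows. This is an illustration only: the function name `looks_like_vms_records` is hypothetical, and the limit is assumed to be 8192 (the "8k" of the text); the actual value of RECORD_SIZE_LIMIT is not stated here. Note that a second character with a value of at least ' ' (0x20) always yields a size of at least 0x2000 = 8192, which is why printable text fails the test.

```cpp
#include <cstddef>
#include <string>

// Hypothetical sketch of the initialization test; not the library's code.
bool looks_like_vms_records(const std::string& s, std::size_t start = 0)
{
    const int kAssumedLimit = 8192;  // assumption: "8k" taken as 8192
    // Fewer than two characters after start: the test is presumed to fail.
    if (start >= s.size() || s.size() - start < 2)
        return false;
    // First character is the LSB, second the MSB of the candidate size.
    const int size = ((s[start + 1] & 0xFF) << 8) | (s[start] & 0xFF);
    return size < kAssumedLimit;
}
```

Under these assumptions, "Hello" is rejected (its second character is printable), a buffer starting with the bytes `05 00` is accepted, and "A\tB" is accepted by mistake because the tab in second position looks like a small MSB.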
The string is filtered by replacing the record size bytes with a LINE_BREAK (CR-LF) sequence and replacing any pad byte with a space character, thus providing a consistent character stream.
Constructs a VMS_Records_Filter in a newly enabled condition.
virtual std::string identification | ( | ) | const [virtual] |
Gets the class ID string.
Reimplemented from String_Filter.
virtual void filter | ( | std::string & | a_string, |
Index | start = 0 , |
||
Index | end = std::string::npos |
||
) | [virtual] |
Filters out VMS binary record size values from current contents of the string.
This filter is expected to be applied to a Sliding_String. The start Index must be where the new data was added at the last slide forward. The end Index is expected to be the end of the string.
Each VMS record is preceded by a 16-bit, LSB first, binary record size value. The record size value is the number of characters in the next record. This value does not include the two chars of the size value itself, nor a possible zero-valued byte that is appended to the record whenever its size is odd. The record size bytes are replaced with LINE_BREAK (CR-LF) characters. Since the pad byte is a non-text datum it is plugged with a space character whenever it occurs.
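The in-place replacement can be sketched for the simplest case, a buffer that starts on a size value and is filtered in one call. The function name `filter_whole_buffer` and the 8192 limit are assumptions made for illustration; the real filter method also carries state between calls, which this sketch omits.

```cpp
#include <cstddef>
#include <string>

// Hypothetical single-call sketch: size bytes become CR-LF, pad bytes become
// a space. The string length never changes and no characters are moved.
void filter_whole_buffer(std::string& s)
{
    const std::size_t kAssumedLimit = 8192;  // assumption: "8k" taken as 8192
    std::size_t pos = 0;
    while (pos + 2 <= s.size()) {
        const std::size_t size =
            static_cast<unsigned char>(s[pos]) |
            (static_cast<std::size_t>(
                 static_cast<unsigned char>(s[pos + 1])) << 8);
        if (size >= kAssumedLimit)
            break;  // no longer looks like VMS records: stop filtering
        s[pos] = '\r';      // plug the two size bytes with CR-LF
        s[pos + 1] = '\n';
        pos += 2 + size;    // step over the record data
        if (size % 2 != 0 && pos < s.size()) {
            s[pos] = ' ';   // plug the pad byte of an odd-sized record
            ++pos;
        }
    }
}
```

Applied to the bytes `03 00 'a' 'b' 'c' 00 02 00 'd' 'e'`, the sketch leaves the ten-character string "\r\nabc \r\nde".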
To keep track of where the record size values are located in the string, the offset from the last data byte processed to the next record size value is recorded in the Record_Size value. It's possible for a size value to be split between filtering operations (its first byte falls at the end of one operation). In this case the Record_Size is set to -1, the first (LSB) byte of the size value is saved in the LSB value, and the byte at the start position of the next filtering operation will be the next (MSB) byte of the record size value; except if the last record was padded, in which case it follows the pad byte in the start position.
To keep track of odd-sized, and thus padded, records between Sliding_String slide operations, a Padded value is set at the end of each filtering operation: 1 if a pad byte is expected; 0 otherwise. This allows the Record_Size value to be adjusted accordingly.
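The need to carry state between filtering operations can be illustrated with a byte-at-a-time state machine. This is a deliberately different and simpler technique than the offset bookkeeping (Record_Size, LSB, Padded) described above, and the type `ChunkedFilterSketch` and its members are hypothetical. It does show the same cases: a size value split across two calls and a pad byte that arrives in a later call.

```cpp
#include <cstddef>
#include <string>

// Hypothetical sketch, not the library's algorithm: remembers between calls
// whether the next byte is a size LSB, a size MSB, record data, or a pad byte.
struct ChunkedFilterSketch {
    enum State { SIZE_LSB, SIZE_MSB, DATA, PAD };
    State state = SIZE_LSB;
    unsigned char lsb = 0;      // saved first byte of a split size value
    std::size_t remaining = 0;  // data bytes left in the current record
    bool odd = false;           // true if the current record needs a pad byte

    // Filters s in place from start (where new data was added) to its end.
    void filter(std::string& s, std::size_t start = 0)
    {
        for (std::size_t i = start; i < s.size(); ++i) {
            switch (state) {
            case SIZE_LSB:
                lsb = static_cast<unsigned char>(s[i]);
                s[i] = '\r';
                state = SIZE_MSB;
                break;
            case SIZE_MSB:
                remaining = lsb | (static_cast<std::size_t>(
                                static_cast<unsigned char>(s[i])) << 8);
                odd = (remaining % 2 != 0);
                s[i] = '\n';
                state = (remaining > 0) ? DATA : SIZE_LSB;
                break;
            case DATA:
                if (--remaining == 0)
                    state = odd ? PAD : SIZE_LSB;
                break;
            case PAD:
                s[i] = ' ';
                state = SIZE_LSB;
                break;
            }
        }
    }
};
```

Feeding the stream in three pieces, with the first piece ending between the LSB and MSB of a size value, produces the same characters as filtering it in one pass.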
Note: The size of the string is not changed and no characters are moved. Only record size and padding values are replaced. If the position of the input stream backing the Sliding_String is changed, or the string is modified in a way that changes character positions or the value of the record size bytes, the filtering algorithm will lose track of the VMS record structure, resulting in undefined, and probably inappropriate, effects.
a_string | A string to be filtered. |
start | The position in the string where filtering is to start. Nothing is done if the position is not within the length of the string [default = 0]. |
end | The end of the filtering range. This is expected to be the end of the string [default = std::string::npos] |
Reimplemented from String_Filter.
virtual bool enabled | ( | bool | enable | ) | [virtual] |
Enables or disables filtering.
If the filter is being re-enabled after being disabled - either as a result of calling enabled (false) or because the algorithm detected an invalid record size - then the algorithm will reinitialize the next time the filter method is used. If the filter is currently in the same state as the enable argument, nothing is changed.
enable | If true filtering will be enabled; otherwise it is disabled. |
Reimplemented from String_Filter.
virtual bool enabled | ( | ) | const [virtual] |
Tests if the filter is enabled or disabled.
Reimplemented from String_Filter.
static int record_size | ( | char | LSB, |
char | MSB | ||
) | [static] |
Converts two char values to a single index value, assuming that the first char value is the LSB and the second the MSB of a 16-bit integer.
Bits 0-7 of the MSB are shifted left 8 bits and ORed with bits 0-7 of the LSB to form the integer value.
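The conversion can be sketched as below. The name `record_size_sketch` is hypothetical and the body is an illustration of the description, not the library's source. Masking with 0xFF matters because char may be signed, so a byte such as 0xFF would otherwise sign-extend.

```cpp
// Hypothetical sketch of the conversion described above.
int record_size_sketch(char LSB, char MSB)
{
    // Keep only bits 0-7 of each char, then place the MSB above the LSB.
    return ((static_cast<int>(MSB) & 0xFF) << 8) |
            (static_cast<int>(LSB) & 0xFF);
}
```

With these definitions, an LSB of 0x34 and an MSB of 0x12 give 0x1234, and an MSB of 0x20 with a zero LSB gives 8192.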
LSB | The Least Significant Byte of the 16-bit record size value. |
MSB | The Most Significant Byte of the 16-bit record size value. |
const char* const ID [static] |
Class identification.
const char LINE_BREAK[] [static] |
Used to plug VMS binary record size value holes.
const int RECORD_SIZE_LIMIT [static] |
Threshold for invalid record_size value.