jj2000.j2k.entropy.encoder
Class MQCoder

java.lang.Object
  |
  +--jj2000.j2k.entropy.encoder.MQCoder

public class MQCoder
extends java.lang.Object

This class implements the MQ arithmetic coder. When initialized a specific state can be specified for each context, which may be adapted to the probability distribution that is expected for that context.

The type of length calculation and termination can be chosen at construction time. ---- Tricks that have been tried to improve speed ----

1) Merging Qe and mPS and doubling the lookup tables
Merge the mPS into Qe, as the sign bit (if Qe>=0 the sense of MPS is 0, if Qe<0 the sense is 1), and double the lookup tables. The first half of the lookup tables correspond to Qe>=0 (i.e. the sense of MPS is 0) and the second half to Qe<0 (i.e. the sense of MPS is 1). The nLPS lookup table is modified to incorporate the changes in the sense of MPS, by making it jump from the first to the second half and vice-versa, when a change is specified by the swicthLM lookup table. See JPEG book, section 13.2, page 225.
There is NO speed improvement in doing this, actually there is a slight decrease, probably due to the fact that often Q has to be negated. Also the fact that a brach of the type "if (bit==mPS[li])" is replaced by two simpler braches of the type "if (bit==0)" and "if (q<0)" may contribute to that.

2) Removing cT
It is possible to remove the cT counter by setting a flag bit in the high bits of the C register. This bit will be automatically shifted left whenever a renormalization shift occurs, which is equivalent to decreasing cT. When the flag bit reaches the sign bit (leftmost bit), which is equivalenet to cT==0, the byteOut() procedure is called. This test can be done efficiently with "c<0" since C is a signed quantity. Care must be taken in byteOut() to reset the bit in order to not interfere with other bits in the C register. See JPEG book, page 228.
There is NO speed improvement in doing this. I don't really know why since the number of operations whenever a renormalization occurs is decreased. Maybe it is due to the number of extra operations in the byteOut(), terminate() and getNumCodedBytes() procedures.

3) Change the convention of MPS and LPS.
Making the LPS interval be above the MPS interval (MQ coder convention is the opposite) can reduce the number of operations along the MPS path. In order to generate the same bit stream as with the MQ convention the output bytes need to be modified accordingly. The basic rule for this is that C = (C'^0xFF...FF)-A, where C is the codestream for the MQ convention and C' is the codestream generated by this other convention. Note that this affects bit-stuffing as well.
This has not been tested yet.

4) Removing normalization while loop on MPS path
Since in the MPS path Q is guaranteed to be always greater than 0x4000 (decimal 0.375) it is never necessary to do more than 1 renormalization shift. Therefore the test of the while loop, and the loop itself, can be removed.

5) Simplifying test on A register
Since A is always less than or equal to 0xFFFF, the test "(a & 0x8000)==0" can be replaced by the simplete test "a < 0x8000". This test is simpler in Java since it involves only 1 operation (although the original test can be converted to only one operation by smart Just-In-Time compilers)
This change has been integrated in the decoding procedures.

6) Speedup mode
Implemented a method that uses the speedup mode of the MQ-coder if possible. This should greately improve performance when coding long runs of MPS symbols that have high probability. However, to take advantage of this, the entropy coder implementation has to explicetely use it. The generated bit stream is the same as if no speedup mode would have been used.
Implemented but performance not tested yet.

7) Multiple-symbol coding
Since the time spent in a method call is non-negligable, coding several symbols with one method call reduces the overhead per coded symbol. The decodeSymbols() method implements this. However, to take advantage of it, the implementation of the entropy coder has to explicitely use it.
Implemented but performance not tested yet.


Field Summary
(package private)  int a
          The current interval
(package private)  int b
          The last encoded byte of data
(package private)  int c
          The current bit code
(package private)  int cT
          The bit code counter
(package private)  boolean delFF
          If a 0xFF byte has been delayed and not yet been written to the output (in the MQ we can never have more than 1 0xFF byte in a row).
(package private)  int[] I
          The current index of each context
(package private)  int[] initStates
          The initial state of each context
static int LENGTH_LAZY
          Identifier for the lazy length calculation.
static int LENGTH_LAZY_GOOD
          Identifier for a very simple length calculation.
static int LENGTH_NEAR_OPT
          Identifier for the near optimal length calculation.
(package private)  int ltype
          The length calculation type to use.
(package private)  int[] mPS
          The current most probable signal for each context
(package private) static int[] nLPS
          The indexes of the next LPS
(package private) static int[] nMPS
          The indexes of the next MPS
(package private)  int nrOfWrittenBytes
          The number of written bytes so far, excluding any delayed 0xFF bytes.
(package private)  int nSaved
          Number of saved states.
(package private)  ByteOutputBuffer out
          The ByteOutputBuffer used to write the compressed bit stream.
(package private) static int[] qe
          The data structures containing the probabilities for the LPS
(package private) static int SAVED_INC
          The increase in length for the arrays to save states
(package private) static int SAVED_LEN
          The initial length of the arrays to save sates
(package private)  int[] savedA
          Saved values of the A register.
(package private)  int[] savedB
          Saved values of the B byte buffer.
(package private)  int[] savedC
          Saved values of the C register.
(package private)  int[] savedCT
          Saved values of CT counter.
(package private)  boolean[] savedDelFF
          Saved values of the delFF (i.e.
(package private) static int[] switchLM
          Whether LPS and MPS should be switched
static int TERM_EASY
          The identifier for the easy termination that is simpler than the 'TERM_NEAR_OPT' one but slightly less efficient.
static int TERM_FULL
          The identifier fort the termination that uses a full flush.
static int TERM_NEAR_OPT
          The identifier for the termination that uses the near optimal length calculation to terminate the arithmetic codewrod
static int TERM_PRED_ER
          The identifier for the predictable termination policy for error resilience.
(package private)  int ttype
          The termination type to use.
 
Constructor Summary
MQCoder(ByteOutputBuffer oStream, int nrOfContexts, int[] init)
          Instantiates a new MQ-coder, with the specified number of contexts and initial states.
 
Method Summary
private  void byteOut()
          This function puts one byte of compressed bits in the output stream.
 void codeSymbol(int bit, int context)
          This function performs the arithmetic encoding of one symbol.
 void codeSymbols(int[] bits, int[] cX, int n)
          This function performs the arithmetic encoding of several symbols together.
 void fastCodeSymbols(int bit, int ctxt, int n)
          This method performs the coding of the symbol 'bit', using context 'ctxt', 'n' times, using the MQ-coder speedup mode if possible.
 void finishLengthCalculation(int[] rates, int n)
          Terminates the calculation of the required length for each coding pass.
 int getNumCodedBytes()
          Returns the number of bytes that are necessary from the compressed output stream to decode all the symbols that have been coded this far.
 int getNumCtxts()
          Returns the number of contexts in the arithmetic coder.
 void reset()
          Reinitializes the MQ coder and the underlying 'ByteOutputBuffer' buffer as if a new object was instantaited.
 void resetCtxt(int c)
          Resets a context to the original probability distribution, and sets its more probable symbol to 0.
 void resetCtxts()
          Resets all contexts to their original probability distribution and sets all more probable symbols to 0.
private  void saveState()
          Saves the current state of the MQ coder (just the registers, not the contexts) so that a near optimal length calculation can be performed later.
 void setLenCalcType(int ltype)
          Set the length calculation type to the specified type.
 void setTermType(int ttype)
          Set termination type to the specified type.
 int terminate()
          This function flushes the remaining encoded bits and makes sure that enough information is written to the bit stream to be able to finish decoding, and then it reinitializes the internal state of the MQ coder but without modifying the context states.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

LENGTH_LAZY

public static final int LENGTH_LAZY
Identifier for the lazy length calculation. The lazy length calculation is not optimal but is extremely simple.

See Also:
Constant Field Values

LENGTH_LAZY_GOOD

public static final int LENGTH_LAZY_GOOD
Identifier for a very simple length calculation. This provides better results than the 'LENGTH_LAZY' computation. This is the old length calculation that was implemented in this class.

See Also:
Constant Field Values

LENGTH_NEAR_OPT

public static final int LENGTH_NEAR_OPT
Identifier for the near optimal length calculation. This calculation is more complex than the lazy one but provides an almost optimal length calculation.

See Also:
Constant Field Values

TERM_FULL

public static final int TERM_FULL
The identifier fort the termination that uses a full flush. This is the less efficient termination.

See Also:
Constant Field Values

TERM_NEAR_OPT

public static final int TERM_NEAR_OPT
The identifier for the termination that uses the near optimal length calculation to terminate the arithmetic codewrod

See Also:
Constant Field Values

TERM_EASY

public static final int TERM_EASY
The identifier for the easy termination that is simpler than the 'TERM_NEAR_OPT' one but slightly less efficient.

See Also:
Constant Field Values

TERM_PRED_ER

public static final int TERM_PRED_ER
The identifier for the predictable termination policy for error resilience. This is the same as the 'TERM_EASY' one but an special sequence of bits is embodied in the spare bits for error resilience purposes.

See Also:
Constant Field Values

qe

static final int[] qe
The data structures containing the probabilities for the LPS


nMPS

static final int[] nMPS
The indexes of the next MPS


nLPS

static final int[] nLPS
The indexes of the next LPS


switchLM

static final int[] switchLM
Whether LPS and MPS should be switched


out

ByteOutputBuffer out
The ByteOutputBuffer used to write the compressed bit stream.


mPS

int[] mPS
The current most probable signal for each context


I

int[] I
The current index of each context


c

int c
The current bit code


cT

int cT
The bit code counter


a

int a
The current interval


b

int b
The last encoded byte of data


delFF

boolean delFF
If a 0xFF byte has been delayed and not yet been written to the output (in the MQ we can never have more than 1 0xFF byte in a row).


nrOfWrittenBytes

int nrOfWrittenBytes
The number of written bytes so far, excluding any delayed 0xFF bytes. Upon initialization it is -1 to indicated that the byte buffer 'b' is empty as well.


initStates

int[] initStates
The initial state of each context


ttype

int ttype
The termination type to use. One of 'TERM_FULL', 'TERM_NEAR_OPT', 'TERM_EASY' or 'TERM_PRED_ER'.


ltype

int ltype
The length calculation type to use. One of 'LENGTH_LAZY', 'LENGTH_LAZY_GOOD', 'LENGTH_NEAR_OPT'.


savedC

int[] savedC
Saved values of the C register. Used for the LENGTH_NEAR_OPT length calculation.


savedCT

int[] savedCT
Saved values of CT counter. Used for the LENGTH_NEAR_OPT length calculation.


savedA

int[] savedA
Saved values of the A register. Used for the LENGTH_NEAR_OPT length calculation.


savedB

int[] savedB
Saved values of the B byte buffer. Used for the LENGTH_NEAR_OPT length calculation.


savedDelFF

boolean[] savedDelFF
Saved values of the delFF (i.e. delayed 0xFF) state. Used for the LENGTH_NEAR_OPT length calculation.


nSaved

int nSaved
Number of saved states. Used for the LENGTH_NEAR_OPT length calculation.


SAVED_LEN

static final int SAVED_LEN
The initial length of the arrays to save sates

See Also:
Constant Field Values

SAVED_INC

static final int SAVED_INC
The increase in length for the arrays to save states

See Also:
Constant Field Values
Constructor Detail

MQCoder

public MQCoder(ByteOutputBuffer oStream,
               int nrOfContexts,
               int[] init)
Instantiates a new MQ-coder, with the specified number of contexts and initial states. The compressed bytestream is written to the 'oStream' object.

Parameters:
oStream - where to output the compressed data.
nrOfContexts - The number of contexts used by the MQ coder.
init - The initial state for each context. A reference is kept to this array to reinitialize the contexts whenever 'reset()' or 'resetCtxts()' is called.
Method Detail

setLenCalcType

public void setLenCalcType(int ltype)
Set the length calculation type to the specified type.

Parameters:
ltype - The type of length calculation to use. One of 'LENGTH_LAZY', 'LENGTH_LAZY_GOOD' or 'LENGTH_NEAR_OPT'.

setTermType

public void setTermType(int ttype)
Set termination type to the specified type.

Parameters:
ttype - The type of termination to use. One of 'TERM_FULL', 'TERM_NEAR_OPT', 'TERM_EASY' or 'TERM_PRED_ER'.

fastCodeSymbols

public final void fastCodeSymbols(int bit,
                                  int ctxt,
                                  int n)
This method performs the coding of the symbol 'bit', using context 'ctxt', 'n' times, using the MQ-coder speedup mode if possible.

If the symbol 'bit' is the current more probable symbol (MPS) and qe[ctxt]<=0x4000, and (A-0x8000)>=qe[ctxt], speedup mode will be used. Otherwise the normal mode will be used. The speedup mode can significantly improve the speed of arithmetic coding when several MPS symbols, with a high probability distribution, must be coded with the same context. The generated bit stream is the same as if the normal mode was used.

This method is also faster than the 'codeSymbols()' and 'codeSymbol()' ones, for coding the same symbols with the same context several times, when speedup mode can not be used, although not significantly.

Parameters:
bit - The symbol do code, 0 or 1.
ctxt - The context to us in coding the symbol.
n - The number of times that the symbol must be coded.

codeSymbols

public final void codeSymbols(int[] bits,
                              int[] cX,
                              int n)
This function performs the arithmetic encoding of several symbols together. The function receives an array of symbols that are to be encoded and an array containing the contexts with which to encode them.

The advantage of using this function is that the cost of the method call is amortized by the number of coded symbols per method call.

Each context has a current MPS and an index describing what the current probability is for the LPS. Each bit is encoded and if the probability of the LPS exceeds .5, the MPS and LPS are switched.

Parameters:
bits - An array containing the symbols to be encoded. Valid symbols are 0 and 1.
cX - The context for each of the symbols to be encoded.
n - The number of symbols to encode.

codeSymbol

public final void codeSymbol(int bit,
                             int context)
This function performs the arithmetic encoding of one symbol. The function receives a bit that is to be encoded and a context with which to encode it.

Each context has a current MPS and an index describing what the current probability is for the LPS. Each bit is encoded and if the probability of the LPS exceeds .5, the MPS and LPS are switched.

Parameters:
bit - The symbol to be encoded, must be 0 or 1.
context - the context with which to encode the symbol.

byteOut

private void byteOut()
This function puts one byte of compressed bits in the output stream. The highest 8 bits of c are then put in b to be the next byte to write. This method delays the output of any 0xFF bytes until a non 0xFF byte has to be written to the output bit stream (the 'delFF' variable signals if there is a delayed 0xff byte).


terminate

public int terminate()
This function flushes the remaining encoded bits and makes sure that enough information is written to the bit stream to be able to finish decoding, and then it reinitializes the internal state of the MQ coder but without modifying the context states.

After calling this method the 'finishLengthCalculation()' method should be called, after compensating the returned length for the length of previous coded segments, so that the length calculation is finalized.

The type of termination used depends on the one specified at the constructor.

Returns:
The length of the arithmetic codeword after termination, in bytes.

getNumCtxts

public final int getNumCtxts()
Returns the number of contexts in the arithmetic coder.

Returns:
The number of contexts

resetCtxt

public final void resetCtxt(int c)
Resets a context to the original probability distribution, and sets its more probable symbol to 0.

Parameters:
c - The number of the context (it starts at 0).

resetCtxts

public final void resetCtxts()
Resets all contexts to their original probability distribution and sets all more probable symbols to 0.


getNumCodedBytes

public final int getNumCodedBytes()
Returns the number of bytes that are necessary from the compressed output stream to decode all the symbols that have been coded this far. The number of returned bytes does not include anything coded previous to the last time the 'terminate()' or 'reset()' methods where called.

The values returned by this method are then to be used in finishing the length calculation with the 'finishLengthCalculation()' method, after compensation of the offset in the number of bytes due to previous terminated segments.

This method should not be called if the current coding pass is to be terminated. The 'terminate()' method should be called instead.

The calculation is done based on the type of length calculation specified at the constructor.

Returns:
The number of bytes in the compressed output stream necessary to decode all the information coded this far.

reset

public final void reset()
Reinitializes the MQ coder and the underlying 'ByteOutputBuffer' buffer as if a new object was instantaited. All the data in the 'ByteOutputBuffer' buffer is erased and the state and contexts of the MQ coder are reinitialized). Additionally any saved MQ states are discarded.


saveState

private void saveState()
Saves the current state of the MQ coder (just the registers, not the contexts) so that a near optimal length calculation can be performed later.


finishLengthCalculation

public void finishLengthCalculation(int[] rates,
                                    int n)
Terminates the calculation of the required length for each coding pass. This method must be called just after the 'terminate()' one has been called for each terminated MQ segment.

The values in 'rates' must have been compensated for any offset due to previous terminated segments, so that the correct index to the stored coded data is used.

Parameters:
rates - The array containing the values returned by 'getNumCodedBytes()' for each coding pass.
n - The index in the 'rates' array of the last terminated length.