/*
 * $RCSfile: MQCoder.java,v $
 * $Revision: 1.1 $
 * $Date: 2005/02/11 05:02:09 $
 * $State: Exp $
 *
 * Class:                   MQCoder
 *
 * Description:             Class that encodes a number of bits using the
 *                          MQ arithmetic coder
 *
 *
 *                          Diego SANTA CRUZ, Jul-26-1999 (improved speed)
 *
 * COPYRIGHT:
 *
 * This software module was originally developed by Raphaël Grosbois and
 * Diego Santa Cruz (Swiss Federal Institute of Technology-EPFL); Joel
 * Askelöf (Ericsson Radio Systems AB); and Bertrand Berthelot, David
 * Bouchard, Félix Henry, Gerard Mozelle and Patrice Onno (Canon Research
 * Centre France S.A) in the course of development of the JPEG2000
 * standard as specified by ISO/IEC 15444 (JPEG 2000 Standard). This
 * software module is an implementation of a part of the JPEG 2000
 * Standard. Swiss Federal Institute of Technology-EPFL, Ericsson Radio
 * Systems AB and Canon Research Centre France S.A (collectively JJ2000
 * Partners) agree not to assert against ISO/IEC and users of the JPEG
 * 2000 Standard (Users) any of their rights under the copyright, not
 * including other intellectual property rights, for this software module
 * with respect to the usage by ISO/IEC and Users of this software module
 * or modifications thereof for use in hardware or software products
 * claiming conformance to the JPEG 2000 Standard. Those intending to use
 * this software module in hardware or software products are advised that
 * their use may infringe existing patents. The original developers of
 * this software module, JJ2000 Partners and ISO/IEC assume no liability
 * for use of this software module or modifications thereof. No license
 * or right to this software module is granted for non JPEG 2000 Standard
 * conforming products. JJ2000 Partners have full right to use this
 * software module for his/her own purpose, assign or donate this
 * software module to any third party and to inhibit third parties from
 * using this software module for non JPEG 2000 Standard conforming
 * products. This copyright notice must be included in all copies or
 * derivative works of this software module.
 *
 * Copyright (c) 1999/2000 JJ2000 Partners.
 * */
package jj2000.j2k.entropy.encoder;

import jj2000.j2k.entropy.StdEntropyCoderOptions;
import jj2000.j2k.util.ArrayUtil;

/**
 * This class implements the MQ arithmetic coder. When initialized a specific
 * state can be specified for each context, which may be adapted to the
 * probability distribution that is expected for that context.
 *
 * <P>The type of length calculation and termination can be chosen with the
 * 'setLenCalcType()' and 'setTermType()' methods.
 *
 * ---- Tricks that have been tried to improve speed ----
 *
 * 1) Merging Qe and mPS and doubling the lookup tables
 *
 * Merge the mPS into Qe, as the sign bit (if Qe>=0 the sense of MPS is 0, if
 * Qe<0 the sense is 1), and double the lookup tables. The first half of the
 * lookup tables corresponds to Qe>=0 (i.e. the sense of MPS is 0) and the
 * second half to Qe<0 (i.e. the sense of MPS is 1). The nLPS lookup table is
 * modified to incorporate the changes in the sense of MPS, by making it jump
 * from the first to the second half and vice-versa, when a change is
 * specified by the switchLM lookup table. See JPEG book, section 13.2, page
 * 225.
 *
 * There is NO speed improvement in doing this, actually there is a slight
 * decrease, probably due to the fact that often Q has to be negated. Also the
 * fact that a branch of the type "if (bit==mPS[li])" is replaced by two
 * simpler branches of the type "if (bit==0)" and "if (q<0)" may contribute to
 * that.
 *
 * 2) Removing cT
 *
 * It is possible to remove the cT counter by setting a flag bit in the high
 * bits of the C register. This bit will be automatically shifted left
 * whenever a renormalization shift occurs, which is equivalent to decreasing
 * cT. When the flag bit reaches the sign bit (leftmost bit), which is
 * equivalent to cT==0, the byteOut() procedure is called. This test can be
 * done efficiently with "c<0" since C is a signed quantity. Care must be
 * taken in byteOut() to reset the bit in order to not interfere with other
 * bits in the C register. See JPEG book, page 228.
 *
 * There is NO speed improvement in doing this. I don't really know why since
 * the number of operations whenever a renormalization occurs is
 * decreased. Maybe it is due to the number of extra operations in the
 * byteOut(), terminate() and getNumCodedBytes() procedures.
 *
 *
 * 3) Change the convention of MPS and LPS.
 *
 * Making the LPS interval be above the MPS interval (MQ coder convention is
 * the opposite) can reduce the number of operations along the MPS path. In
 * order to generate the same bit stream as with the MQ convention the output
 * bytes need to be modified accordingly. The basic rule for this is that C =
 * (C'^0xFF...FF)-A, where C is the codestream for the MQ convention and C' is
 * the codestream generated by this other convention. Note that this affects
 * bit-stuffing as well.
 *
 * This has not been tested yet.
 *
 * 4) Removing normalization while loop on MPS path
 *
 * Since in the MPS path A is guaranteed to be always greater than 0x4000
 * (decimal 0.375) it is never necessary to do more than 1 renormalization
 * shift. Therefore the test of the while loop, and the loop itself, can be
 * removed.
 *
 * 5) Simplifying test on A register
 *
 * Since A is always less than or equal to 0xFFFF, the test "(a & 0x8000)==0"
 * can be replaced by the simpler test "a < 0x8000". This test is simpler in
 * Java since it involves only 1 operation (although the original test can be
 * converted to only one operation by smart Just-In-Time compilers).
 *
 * This change has been integrated in the coding procedures.
 *
 * 6) Speedup mode
 *
 * Implemented a method that uses the speedup mode of the MQ-coder if
 * possible. This should greatly improve performance when coding long runs of
 * MPS symbols that have high probability. However, to take advantage of this,
 * the entropy coder implementation has to explicitly use it. The generated
 * bit stream is the same as if no speedup mode had been used.
 *
 * Implemented but performance not tested yet.
 *
 * 7) Multiple-symbol coding
 *
 * Since the time spent in a method call is non-negligible, coding several
 * symbols with one method call reduces the overhead per coded symbol. The
 * codeSymbols() method implements this. However, to take advantage of it,
 * the implementation of the entropy coder has to explicitly use it.
 *
 * Implemented but performance not tested yet.
 *  */
public class MQCoder {

    /** Identifier for the lazy length calculation. The lazy length
     * calculation is not optimal but is extremely simple. */
    public static final int LENGTH_LAZY = 0;

    /** Identifier for a very simple length calculation. This provides better
     * results than the 'LENGTH_LAZY' computation. This is the old length
     * calculation that was implemented in this class. */
    public static final int LENGTH_LAZY_GOOD = 1;

    /** Identifier for the near optimal length calculation. This calculation
     * is more complex than the lazy one but provides an almost optimal length
     * calculation. */
    public static final int LENGTH_NEAR_OPT = 2;

    /** The identifier for the termination that uses a full flush. This is
     * the least efficient termination. */
    public static final int TERM_FULL = 0;

    /** The identifier for the termination that uses the near optimal length
     * calculation to terminate the arithmetic codeword. */
    public static final int TERM_NEAR_OPT = 1;

    /** The identifier for the easy termination that is simpler than the
     * 'TERM_NEAR_OPT' one but slightly less efficient. */
    public static final int TERM_EASY = 2;

    /** The identifier for the predictable termination policy for error
     * resilience. This is the same as the 'TERM_EASY' one but a special
     * sequence of bits is embedded in the spare bits for error resilience
     * purposes. */
    public static final int TERM_PRED_ER = 3;

    /** The data structures containing the probabilities for the LPS */
    final static
        int qe[]={0x5601, 0x3401, 0x1801, 0x0ac1, 0x0521, 0x0221, 0x5601,
                  0x5401, 0x4801, 0x3801, 0x3001, 0x2401, 0x1c01, 0x1601,
                  0x5601, 0x5401, 0x5101, 0x4801, 0x3801, 0x3401, 0x3001,
                  0x2801, 0x2401, 0x2201, 0x1c01, 0x1801, 0x1601, 0x1401,
                  0x1201, 0x1101, 0x0ac1, 0x09c1, 0x08a1, 0x0521, 0x0441,
                  0x02a1, 0x0221, 0x0141, 0x0111, 0x0085, 0x0049, 0x0025,
                  0x0015, 0x0009, 0x0005, 0x0001, 0x5601 };

    /** The indexes of the next MPS */
    final static
        int nMPS[]={ 1 , 2, 3, 4, 5,38, 7, 8, 9,10,11,12,13,29,15,16,17,
                     18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,
                     35,36,37,38,39,40,41,42,43,44,45,45,46 };

    /** The indexes of the next LPS */
    final static
        int nLPS[]={ 1 , 6, 9,12,29,33, 6,14,14,14,17,18,20,21,14,14,15,
                     16,17,18,19,19,20,21,22,23,24,25,26,27,28,29,30,31,
                     32,33,34,35,36,37,38,39,40,41,42,43,46 };

    /** Whether LPS and MPS should be switched */
    final static        // at indices 0, 6, and 14 we switch
        int switchLM[]={ 1,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,
                         0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 };
    // Having ints proved to be more efficient than booleans
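
    /*
    Illustrative sketch, not part of the original JJ2000 source. It shows how
    the nMPS, nLPS and switchLM tables above drive one context: the state
    index follows nMPS when an MPS is coded with a renormalization, follows
    nLPS when an LPS is coded, and the MPS sense flips when switchLM is set
    for the state being left. The class name 'MqStateWalkSketch', the method
    'walk' and the 'M'/'L' event string are assumptions made for this example.

```java
final class MqStateWalkSketch {
    // Transition tables copied from the MQCoder tables (47 states, 0..46).
    static final int[] N_MPS = {
         1, 2, 3, 4, 5,38, 7, 8, 9,10,11,12,13,29,15,16,17,
        18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,
        35,36,37,38,39,40,41,42,43,44,45,45,46 };
    static final int[] N_LPS = {
         1, 6, 9,12,29,33, 6,14,14,14,17,18,20,21,14,14,15,
        16,17,18,19,19,20,21,22,23,24,25,26,27,28,29,30,31,
        32,33,34,35,36,37,38,39,40,41,42,43,46 };
    static final int[] SWITCH_LM = {
        1,0,0,0,0,0,1,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,
        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0 };

    // Applies a sequence of events to one context.
    // 'M' = an MPS was coded and caused a renormalization,
    // any other character = an LPS was coded.
    // Returns { state index, MPS sense }.
    static int[] walk(int state, int mps, String events) {
        for (int k = 0; k < events.length(); k++) {
            if (events.charAt(k) == 'M') {
                state = N_MPS[state];
            } else {
                if (SWITCH_LM[state] != 0) {
                    mps = 1 - mps; // MPS and LPS exchange roles
                }
                state = N_LPS[state];
            }
        }
        return new int[] { state, mps };
    }

    public static void main(String[] args) {
        int[] r = walk(0, 0, "LLL");
        System.out.println("state=" + r[0] + " mps=" + r[1]);
    }
}
```

    Note that in the coder itself an MPS only changes the state when it
    triggers a renormalization, which is why the sketch treats 'M' as
    "MPS with renormalization" rather than as every coded MPS.
    */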

    /** The ByteOutputBuffer used to write the compressed bit stream. */
    ByteOutputBuffer out;

    /** The current most probable symbol for each context */
    int[] mPS;

    /** The current index of each context */
    int[] I;

    /** The current bit code */
    int c;

    /** The bit code counter */
    int cT;

    /** The current interval */
    int a;

    /** The last encoded byte of data */
    int b;

    /** If a 0xFF byte has been delayed and not yet been written to the output
     * (in the MQ we can never have more than 1 0xFF byte in a row). */
    boolean delFF;

    /** The number of written bytes so far, excluding any delayed 0xFF
     * bytes. Upon initialization it is -1 to indicate that the byte buffer
     * 'b' is empty as well. */
    int nrOfWrittenBytes = -1;

    /** The initial state of each context */
    int initStates[];

    /** The termination type to use. One of 'TERM_FULL', 'TERM_NEAR_OPT',
     * 'TERM_EASY' or 'TERM_PRED_ER'. */
    int ttype;

    /** The length calculation type to use. One of 'LENGTH_LAZY',
     * 'LENGTH_LAZY_GOOD', 'LENGTH_NEAR_OPT'. */
    int ltype;

    /** Saved values of the C register. Used for the LENGTH_NEAR_OPT length
     * calculation. */
    int savedC[];

    /** Saved values of CT counter. Used for the LENGTH_NEAR_OPT length
     * calculation. */
    int savedCT[];

    /** Saved values of the A register. Used for the LENGTH_NEAR_OPT length
     * calculation. */
    int savedA[];

    /** Saved values of the B byte buffer. Used for the LENGTH_NEAR_OPT length
     * calculation. */
    int savedB[];

    /** Saved values of the delFF (i.e. delayed 0xFF) state. Used for the
     * LENGTH_NEAR_OPT length calculation. */
    boolean savedDelFF[];

    /** Number of saved states. Used for the LENGTH_NEAR_OPT length
     * calculation. */
    int nSaved;

    /** The initial length of the arrays to save states */
    final static int SAVED_LEN = 32*StdEntropyCoderOptions.NUM_PASSES;

    /** The increase in length for the arrays to save states */
    final static int SAVED_INC = 4*StdEntropyCoderOptions.NUM_PASSES;

    /**
     * Set the length calculation type to the specified type
     *
     * @param ltype The type of length calculation to use. One of
     * 'LENGTH_LAZY', 'LENGTH_LAZY_GOOD' or 'LENGTH_NEAR_OPT'.
     * */
    public void setLenCalcType(int ltype){
        // Verify the ltype
        if (ltype != LENGTH_LAZY && ltype != LENGTH_LAZY_GOOD &&
            ltype != LENGTH_NEAR_OPT) {
            throw new IllegalArgumentException("Unrecognized length "+
                                               "calculation type code: "+ltype);
        }

        // The near optimal calculation needs arrays to save coder states;
        // allocate them only once
        if(ltype == LENGTH_NEAR_OPT){
            if(savedC==null)
                savedC = new int[SAVED_LEN];
            if(savedCT==null)
                savedCT = new int[SAVED_LEN];
            if(savedA==null)
                savedA = new int[SAVED_LEN];
            if(savedB==null)
                savedB = new int[SAVED_LEN];
            if(savedDelFF==null)
                savedDelFF = new boolean[SAVED_LEN];
        }

        this.ltype = ltype;
    }

    /**
     * Set termination type to the specified type
     *
     * @param ttype The type of termination to use. One of 'TERM_FULL',
     * 'TERM_NEAR_OPT', 'TERM_EASY' or 'TERM_PRED_ER'.
     * */
    public void setTermType(int ttype){
        if (ttype != TERM_FULL && ttype != TERM_NEAR_OPT &&
            ttype != TERM_EASY && ttype != TERM_PRED_ER ) {
            throw new IllegalArgumentException("Unrecognized termination type "+
                                               "code: "+ttype);
        }

        this.ttype = ttype;
    }

    /**
     * Instantiates a new MQ-coder, with the specified number of contexts and
     * initial states. The compressed bytestream is written to the 'oStream'
     * object.
     *
     * @param oStream where to output the compressed data
     *
     * @param nrOfContexts The number of contexts used
     *
     * @param init The initial state for each context. A reference is kept to
     * this array to reinitialize the contexts whenever 'reset()' or
     * 'resetCtxts()' is called.
     * */
    public MQCoder(ByteOutputBuffer oStream, int nrOfContexts, int init[]){
        out = oStream;

        // --- INITENC

        // Default initialization of the statistics bins is MPS=0 and
        // I=0
        I=new int[nrOfContexts];
        mPS=new int[nrOfContexts];
        initStates = init;

        a=0x8000;
        c=0;
        // NOTE: 'b' still holds its default value (0) at this point, so
        // this test always selects cT=12 in the constructor
        if(b==0xFF)
            cT=13;
        else
            cT=12;

        resetCtxts();

        // End of INITENC ---

        b=0;
    }

    /**
     * This method performs the coding of the symbol 'bit', using context
     * 'ctxt', 'n' times, using the MQ-coder speedup mode if possible.
     *
     * <P>If the symbol 'bit' is the current most probable symbol (MPS) and
     * qe[I[ctxt]]<=0x4000, and (A-0x8000)>=qe[I[ctxt]], speedup mode will be
     * used. Otherwise the normal mode will be used. The speedup mode can
     * significantly improve the speed of arithmetic coding when several MPS
     * symbols, with a high probability distribution, must be coded with the
     * same context. The generated bit stream is the same as if the normal mode
     * was used.
     *
     * <P>This method is also faster than the 'codeSymbols()' and
     * 'codeSymbol()' ones, for coding the same symbols with the same context
     * several times, when speedup mode can not be used, although not
     * significantly.
     *
     * @param bit The symbol to code, 0 or 1.
     *
     * @param ctxt The context to use in coding the symbol
     *
     * @param n The number of times that the symbol must be coded.
     * */
    public final void fastCodeSymbols(int bit, int ctxt, int n) {
        int q;  // cache for context's Qe
        int la; // cache for A register
        int nc; // counter for renormalization shifts
        int ns; // the maximum length of a speedup mode run
        int li; // cache for I[ctxt]

        li = I[ctxt]; // cache current index
        q=qe[li];     // retrieve current LPS prob.

        if ((q <= 0x4000) && (bit == mPS[ctxt])
            && ((ns = (a-0x8000)/q+1) > 1)) { // Do speed up mode
            // coding MPS, no conditional exchange can occur and
            // speedup mode is possible for more than 1 symbol
            do { // do as many speedup runs as necessary
                if (n <= ns) { // All symbols in this run
                    // code 'n' symbols
                    la = n*q; // accumulated Q
                    a -= la;
                    c += la;
                    if (a >= 0x8000) { // no renormalization
                        I[ctxt] = li;  // save the current state
                        return; // done
                    }
                    I[ctxt] = nMPS[li]; // goto next state and save it
                    // -- Renormalization (MPS: no need for while loop)
                    a <<= 1; // a is doubled
                    c <<= 1; // c is doubled
                    cT--;
                    if(cT==0) {
                        byteOut();
                    }
                    // -- End of renormalization
                    return; // done
                }
                else { // Not all symbols in this run
                    // code 'ns' symbols
                    la = ns*q; // accumulated Q
                    c += la;
                    a -= la;
                    // cache li and q for next iteration
                    li = nMPS[li];
                    q = qe[li]; // New q is always less than current one
                    // new I[ctxt] is stored in last run
                    // Renormalization always occurs since we exceed 'ns'
                    // -- Renormalization (MPS: no need for while loop)
                    a <<= 1; // a is doubled
                    c <<= 1; // c is doubled
                    cT--;
                    if(cT==0) {
                        byteOut();
                    }
                    // -- End of renormalization
                    n -= ns; // symbols left to code
                    ns = (a-0x8000)/q+1; // max length of next speedup run
                    continue; // goto next iteration
                }
            } while (n>0);
        } // end speed up mode
        else { // No speedup mode
            // Either speedup mode is not possible or not worth doing it
            // because of probable conditional exchange
            // Code everything as in normal mode
            la = a;       // cache A register in local variable
            do {
                if (bit == mPS[ctxt]) { // -- code MPS
                    la -= q; // Interval division associated with MPS coding
                    if(la>=0x8000){ // Interval big enough
                        c += q;
                    }
                    else { // Interval too short
                        if(la<q) // Probabilities are inverted
                            la = q;
                        else
                            c += q;
                        // cache new li and q for next iteration
                        li = nMPS[li];
                        q = qe[li];
                        // new I[ctxt] is stored after end of loop
                        // -- Renormalization (MPS: no need for while loop)
                        la <<= 1; // a is doubled
                        c <<= 1;  // c is doubled
                        cT--;
                        if(cT==0) {
                            byteOut();
                        }
                        // -- End of renormalization
                    }
                }
                else { // -- code LPS
                    la -= q; // Interval division according to LPS coding
                    if(la<q)
                        c += q;
                    else
                        la = q;
                    if(switchLM[li]!=0){
                        mPS[ctxt]=1-mPS[ctxt];
                    }
                    // cache new li and q for next iteration
                    li = nLPS[li];
                    q = qe[li];
                    // new I[ctxt] is stored after end of loop
                    // -- Renormalization
                    // slightly better than normal loop
                    nc = 0;
                    do {
                        la <<= 1;
                        nc++; // count number of necessary shifts
                    } while (la<0x8000);
                    if (cT > nc) {
                        c <<= nc;
                        cT -= nc;
                    }
                    else {
                        do {
                            c <<= cT;
                            nc -= cT;
                            // cT = 0; // not necessary
                            byteOut();
                        } while (cT <= nc);
                        c <<= nc;
                        cT -= nc;
                    }
                    // -- End of renormalization
                }
                n--;
            } while (n>0);
            I[ctxt] = li; // store new I[ctxt]
            a = la; // save cached A register
        }
    }
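
    /*
    Illustrative sketch, not part of the original JJ2000 source. It isolates
    the run-length expression used by fastCodeSymbols(), ns = (a-0x8000)/q+1,
    and checks it against a plain loop that subtracts q from A one MPS at a
    time until A drops below 0x8000. The class and method names are
    assumptions made for this example, and the sketch assumes
    0x8000 <= a <= 0xFFFF and q > 0.

```java
final class MqSpeedupRunSketch {
    // Length of one speedup run, as computed in fastCodeSymbols(): the
    // last symbol of the run is the first one that brings A below 0x8000.
    static int runLength(int a, int q) {
        return (a - 0x8000) / q + 1;
    }

    // Reference computation: code MPS symbols one by one (A register
    // only) and count how many are needed before A falls below 0x8000.
    static int runLengthBySteps(int a, int q) {
        int n = 0;
        do {
            a -= q;
            n++;
        } while (a >= 0x8000);
        return n;
    }

    public static void main(String[] args) {
        int a = 0xE000;
        int q = 0x1801;
        System.out.println(runLength(a, q) + " " + runLengthBySteps(a, q));
    }
}
```

    A run length of 1 means that the very next MPS already forces a
    renormalization, which is why the method only enters speedup mode when
    ns is greater than 1.
    */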

    /**
     * This function performs the arithmetic encoding of several symbols
     * together. The function receives an array of symbols that are to be
     * encoded and an array containing the contexts with which to encode them.
     *
     * <P>The advantage of using this function is that the cost of the method
     * call is amortized over the number of coded symbols per method call.
     *
     * <P>Each context has a current MPS and an index describing what the
     * current probability is for the LPS. Each bit is encoded and if the
     * probability of the LPS exceeds .5, the MPS and LPS are switched.
     *
     * @param bits An array containing the symbols to be encoded. Valid
     * symbols are 0 and 1.
     *
     * @param cX The context for each of the symbols to be encoded
     *
     * @param n The number of symbols to encode.
     * */
    public final void codeSymbols(int[] bits, int[] cX, int n){
        int q;
        int li; // local cache of I[context]
        int la;
        int nc;
        int ctxt; // context of current symbol
        int i; // counter

        // NOTE: here we could use symbol aggregation to speed things up.
        // It remains to be studied.

        la = a; // cache A register in local variable
        for(i=0;i<n;i++){
            // NOTE: (a < 0x8000) is equivalent to ((a & 0x8000)==0)
            // since 'a' is always less than or equal to 0xFFFF

            // NOTE: conditional exchange guarantees that A for MPS is
            // always greater than 0x4000 (i.e. 0.375)
            // => one renormalization shift is enough for MPS
            // => no need to do a renormalization while loop for MPS

            ctxt = cX[i];
            li = I[ctxt];
            q=qe[li]; // Retrieve current LPS prob.

            if(bits[i]==mPS[ctxt]){ // -- Code MPS

                la -= q; // Interval division associated with MPS coding

                if(la>=0x8000){ // Interval big enough
                    c += q;
                }
                else { // Interval too short
                    if(la<q) // Probabilities are inverted
                        la = q;
                    else
                        c += q;

                    I[ctxt]=nMPS[li];

                    // -- Renormalization (MPS: no need for while loop)
                    la <<= 1; // a is doubled
                    c <<= 1; // c is doubled
                    cT--;
                    if(cT==0) {
                        byteOut();
                    }
                    // -- End of renormalization
                }
            }// End Code MPS --
            else{ // -- Code LPS
                la -= q; // Interval division according to LPS coding

                if(la<q)
                    c += q;
                else
                    la = q;
                if(switchLM[li]!=0){
                    mPS[ctxt]=1-mPS[ctxt];
                }
                I[ctxt]=nLPS[li];

                // -- Renormalization

                // slightly better than normal loop
                nc = 0;
                do {
                    la <<= 1;
                    nc++; // count number of necessary shifts
                } while (la<0x8000);
                if (cT > nc) {
                    c <<= nc;
                    cT -= nc;
                }
                else {
                    do {
                        c <<= cT;
                        nc -= cT;
                        // cT = 0; // not necessary
                        byteOut();
                    } while (cT <= nc);
                    c <<= nc;
                    cT -= nc;
                }

                // -- End of renormalization
            }
        }
        a = la; // save cached A register
    }


    /**
     * This function performs the arithmetic encoding of one symbol. The
     * function receives a bit that is to be encoded and a context with which
     * to encode it.
     *
     * <P>Each context has a current MPS and an index describing what the
     * current probability is for the LPS. Each bit is encoded and if the
     * probability of the LPS exceeds .5, the MPS and LPS are switched.
     *
     * @param bit The symbol to be encoded, must be 0 or 1.
     *
     * @param context the context with which to encode the symbol.
     * */
    public final void codeSymbol(int bit, int context){
        int q;
        int li; // local cache of I[context]
        int la;
        int n;

        // NOTE: (a < 0x8000) is equivalent to ((a & 0x8000)==0)
        // since 'a' is always less than or equal to 0xFFFF

        // NOTE: conditional exchange guarantees that A for MPS is
        // always greater than 0x4000 (i.e. 0.375)
        // => one renormalization shift is enough for MPS
        // => no need to do a renormalization while loop for MPS

        li = I[context];
        q=qe[li]; // Retrieve current LPS prob.

        if(bit==mPS[context]){// -- Code MPS

            a -= q; // Interval division associated with MPS coding

            if(a>=0x8000){ // Interval big enough
                c += q;
            }
            else { // Interval too short
                if(a<q) // Probabilities are inverted
                    a = q;
                else
                    c += q;

                I[context]=nMPS[li];

                // -- Renormalization (MPS: no need for while loop)
                a <<= 1; // a is doubled
                c <<= 1; // c is doubled
                cT--;
                if(cT==0) {
                    byteOut();
                }
                // -- End of renormalization
            }
        }// End Code MPS --
        else{ // -- Code LPS

            la = a; // cache A register in local variable
            la -= q; // Interval division according to LPS coding

            if(la<q)
                c += q;
            else
                la = q;
            if(switchLM[li]!=0){
                mPS[context]=1-mPS[context];
            }
            I[context]=nLPS[li];

            // -- Renormalization

            // slightly better than normal loop
            n = 0;
            do {
                la <<= 1;
                n++; // count number of necessary shifts
            } while (la<0x8000);
            if (cT > n) {
                c <<= n;
                cT -= n;
            }
            else {
                do {
                    c <<= cT;
                    n -= cT;
                    // cT = 0; // not necessary
                    byteOut();
                } while (cT <= n);
                c <<= n;
                cT -= n;
            }

            // -- End of renormalization
            a = la; // save cached A register
        }
    }
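
    /*
    Illustrative sketch, not part of the original JJ2000 source. On the LPS
    path the coder counts how many left shifts bring A back to 0x8000 or
    above. The first method below repeats that counting loop; the second is
    an alternative that this class does not use, based on the standard
    Integer.numberOfLeadingZeros() call. Both assume 0 < la < 0x8000, which
    holds after LPS coding because A is then either q or a-q with a-q < q.
    The class and method names are assumptions made for this example.

```java
final class MqRenormShiftSketch {
    // Shift count as computed by the do-while loop in codeSymbol().
    static int shiftsByLoop(int la) {
        int n = 0;
        do {
            la <<= 1;
            n++; // count number of necessary shifts
        } while (la < 0x8000);
        return n;
    }

    // Same count without a loop: bit 15 must become the highest set bit,
    // and a 32-bit int whose highest set bit is bit 15 has 16 leading zeros.
    static int shiftsByLeadingZeros(int la) {
        return Integer.numberOfLeadingZeros(la) - 16;
    }

    public static void main(String[] args) {
        int la = 0x0ac1;
        System.out.println(shiftsByLoop(la) + " " + shiftsByLeadingZeros(la));
    }
}
```

    Whether the loop-free form is actually faster would have to be measured;
    the source makes no claim about it.
    */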

    /**
     * This function puts one byte of compressed bits in the output stream.
     * The highest 8 bits of c are then put in b to be the next byte to
     * write. This method delays the output of any 0xFF bytes until a non 0xFF
     * byte has to be written to the output bit stream (the 'delFF' variable
     * signals if there is a delayed 0xFF byte).
     * */
    private void byteOut(){
        if(nrOfWrittenBytes >= 0){
            if(b==0xFF){
                // Delay 0xFF byte
                delFF = true;
                b=c>>>20;
                c &= 0xFFFFF;
                cT=7;
            }
            else if(c < 0x8000000){
                // Write delayed 0xFF bytes
                if (delFF) {
                    out.write(0xFF);
                    delFF = false;
                    nrOfWrittenBytes++;
                }
                out.write(b);
                nrOfWrittenBytes++;
                b=c>>>19;
                c &= 0x7FFFF;
                cT=8;
            }
            else{
                b++;
                if(b==0xFF){
                    // Delay 0xFF byte
                    delFF = true;
                    c &= 0x7FFFFFF;
                    b=c>>>20;
                    c &= 0xFFFFF;
                    cT=7;
                }
                else{
                    // Write delayed 0xFF bytes
                    if (delFF) {
                        out.write(0xFF);
                        delFF = false;
                        nrOfWrittenBytes++;
                    }
                    out.write(b);
                    nrOfWrittenBytes++;
                    b=((c>>>19)&0xFF);
                    c &= 0x7FFFF;
                    cT=8;
                }
            }
        }
        else {
            // NOTE: carry bit can never be set if the byte buffer was empty
            b= (c>>>19);
            c &= 0x7FFFF;
            cT=8;
            nrOfWrittenBytes++;
        }
    }
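
    /*
    Illustrative sketch, not part of the original JJ2000 source. It mirrors
    the two branches of byteOut() that involve no carry, to show which bits
    of the C register become the next buffered byte: bits 19 and up when the
    previous byte was not 0xFF (8 new shifts until the next byte, cT=8), and
    bits 20 and up when it was 0xFF (7 new shifts, cT=7). The class name, the
    method names and the { byte, remaining c, cT } return layout are
    assumptions made for this example; the carry branch and the output
    buffer are left out.

```java
final class MqByteOutSketch {
    // Previous byte was not 0xFF and c < 0x8000000 (no carry).
    // Returns { next buffered byte, remaining c, new cT }.
    static int[] extractNormal(int c) {
        return new int[] { c >>> 19, c & 0x7FFFF, 8 };
    }

    // Previous byte was 0xFF: the next byte is taken one bit higher and
    // only 7 shifts are counted before the following byteOut().
    static int[] extractAfterFF(int c) {
        return new int[] { c >>> 20, c & 0xFFFFF, 7 };
    }

    public static void main(String[] args) {
        int[] r = extractNormal(0x1234567);
        System.out.println(Integer.toHexString(r[0]) + " "
                           + Integer.toHexString(r[1]) + " " + r[2]);
    }
}
```
    */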

    /**
     * This function flushes the remaining encoded bits and makes sure that
     * enough information is written to the bit stream to be able to finish
     * decoding, and then it reinitializes the internal state of the MQ coder
     * but without modifying the context states.
     *
     * <P>After calling this method the 'finishLengthCalculation()' method
     * should be called, after compensating the returned length for the length
     * of previous coded segments, so that the length calculation is finalized.
     *
     * <P>The type of termination used depends on the one specified at the
     * constructor.
     *
     * @return The length of the arithmetic codeword after termination, in
     * bytes.
     * */
    public int terminate(){
        switch (ttype) {
        case TERM_FULL:
            // Sets the remaining bits of the last byte of the coded bits.
            int tempc=c+a;
            c=c|0xFFFF;
            if(c>=tempc)
                c=c-0x8000;

            int remainingBits = 27-cT;

            // Flushes remainingBits
            do{
                c <<= cT;
                if(b != 0xFF)
                    remainingBits -= 8;
                else
                    remainingBits -= 7;
                byteOut();
            }
            while(remainingBits > 0);

            b |= (1<<(-remainingBits))-1;
            if (b==0xFF) { // Delay 0xFF bytes
                delFF = true;
            }
            else {
                // Write delayed 0xFF bytes
                if (delFF) {
                    out.write(0xFF);
                    delFF = false;
                    nrOfWrittenBytes++;
                }
                out.write(b);
                nrOfWrittenBytes++;
            }
            break;
        case TERM_PRED_ER:
        case TERM_EASY:
            // The predictable error resilient and easy termination are the
            // same, except for the fact that the easy one can modify the
            // spare bits in the last byte to maximize the likelihood of
            // having a 0xFF, while the error resilient one can not touch
            // these bits.

            // In the predictable error resilient case the spare bits will be
            // recalculated by the decoder and it will check if they are the
            // same as in the codestream and then deduce an error
            // probability from there.

            int k; // number of bits to push out

            k = (11-cT)+1;

            c <<= cT;
            for (; k > 0; k-=cT, c<<=cT){
              byteOut();
            }

            // Make any spare bits 1s if in easy termination
            if (k < 0 && ttype == TERM_EASY) {
                // At this stage there is never a carry bit in C, so we can
                // freely modify the (-k) least significant bits.
                b |= (1<<(-k))-1;
            }

            byteOut(); // Push contents of byte buffer
            break;
        case TERM_NEAR_OPT:

            // This algorithm terminates in the shortest possible way, except
            // that any previous 0xFF 0x7F sequences are not eliminated. The
            // probability of having those sequences is extremely low.

            // The calculation of the length is based on the fact that the
            // decoder will pad the codestream with an endless string of
            // (binary) 1s. If the codestream, padded with 1s, is within the
            // bounds of the current interval then correct decoding is
            // guaranteed. The lower inclusive bound of the current interval
            // is the value of C (i.e. if only lower intervals would be coded
            // in the future). The upper exclusive bound of the current
            // interval is C+A (i.e. if only upper intervals would be coded in
            // the future). We therefore calculate the minimum length that
            // would be needed so that padding with 1s gives a codestream
            // within the interval.

            // In general, such a calculation needs the value of the next byte
            // that appears in the codestream. Here, since we are terminating,
            // the next value can be anything we want that lies within the
            // interval; we use the lower bound since this minimizes the
            // length. To calculate the necessary length at any other place
            // than the termination it is necessary to know the next bytes
            // that will appear in the codestream, which involves storing the
            // codestream and the state of the MQCoder at various points (a
            // worst case approach can be used, but it is much more
            // complicated and the calculated length would be only marginally
            // better than much simpler calculations, if not the same).

            int cLow;
            int cUp;
            int bLow;
            int bUp;

            // Initialize the upper (exclusive) and lower bound (inclusive) of
            // the valid interval (the actual interval is the concatenation of
            // bUp and cUp, and bLow and cLow).
            cLow = c;
            cUp = c+a;
            bLow = bUp = b;

            // We start by normalizing the C register to the state cT = 0
            // (i.e., just before byteOut() is called)
            cLow <<= cT;
            cUp  <<= cT;
            // Propagate any carry bits and reset them in cLow, cUp. NOTE:
            // carry bit can never be set if the byte buffer was empty so no
            // problem with propagating a carry into an empty byte buffer.
            if ((cLow & (1<<27)) != 0) { // Carry bit in cLow
                if (bLow == 0xFF) {
                    // We can not propagate carry bit, do bit stuffing
                    delFF = true; // delay 0xFF
                    // Get next byte buffer
                    bLow = cLow>>>20;
                    bUp = cUp>>>20;
                    cLow &= 0xFFFFF;
                    cUp &= 0xFFFFF;
                    // Normalize to cT = 0
                    cLow <<= 7;
                    cUp <<= 7;
                }
                else { // we can propagate carry bit
                    bLow++; // propagate
                    cLow &= ~(1<<27); // reset carry in cLow
                }
            }
            if ((cUp & (1<<27)) != 0) {
                bUp++; // propagate
                cUp &= ~(1<<27); // reset carry
            }

            // From now on there can never be a carry bit on cLow, since we
            // always output bLow.

            // Loop testing for the condition and doing byte output if they
            // are not met.
            while(true){
                // If decoder's codestream is within interval stop
                // If preceding byte is 0xFF only values [0,127] are valid
                if(delFF){ // If delayed 0xFF
                    if (bLow <= 127 && bUp > 127) break;
                    // We will write more bytes so output delayed 0xFF now
                    out.write(0xFF);
                    nrOfWrittenBytes++;
                    delFF = false;
                }
                else{ // No delayed 0xFF
                    if (bLow <= 255 && bUp > 255) break;
                }

                // Output next byte
                // We could output anything within the interval, but using
                // bLow simplifies things a lot.

                // We should not have any carry bit here

                // Output bLow
                if (bLow < 255) {
                    // Transfer byte bits from C to B
                    // (if the byte buffer was empty output nothing)
                    if (nrOfWrittenBytes >= 0) out.write(bLow);
                    nrOfWrittenBytes++;
                    bUp -= bLow;
                    bUp <<= 8;
                    // Here bLow would be 0
                    bUp |= (cUp >>> 19) & 0xFF;
                    bLow = (cLow>>> 19) & 0xFF;
                    // Clear upper bits (just pushed out) from cUp and cLow.
                    cLow &= 0x7FFFF;
                    cUp  &= 0x7FFFF;
                    // Go to next state where cT is 0
                    cLow <<= 8;
                    cUp  <<= 8;
                    // Here there can be no carry on cUp, cLow
                }
                else { // bLow = 0xFF
                    // Transfer byte bits from C to B
                    // Since the byte to output is 0xFF we can delay it
                    delFF = true;
                    bUp -= bLow;
                    bUp <<= 7;
                    // Here bLow would be 0
                    bUp |= (cUp>>20) & 0x7F;
                    bLow = (cLow>>20) & 0x7F;
                    // Clear upper bits (just pushed out) from cUp and cLow.
                    cLow &= 0xFFFFF;
                    cUp  &= 0xFFFFF;
                    // Go to next state where cT is 0
                    cLow <<= 7;
                    cUp  <<= 7;
                    // Here there can be no carry on cUp, cLow
                }
            }
            break;
        default:
            throw new Error("Illegal termination type code");
        }

        // Reinitialize the state (without modifying the contexts)
        int len;

        len = nrOfWrittenBytes;
        a = 0x8000;
        c = 0;
        b = 0;
        cT = 12;
        delFF = false;
        nrOfWrittenBytes = -1;

        // Return the terminated length
        return len;
    }

    /**
     * Returns the number of contexts in the arithmetic coder.
     *
     * @return The number of contexts
     * */
    public final int getNumCtxts(){
        return I.length;
    }

    /**
     * Resets a context to the original probability distribution, and sets its
     * more probable symbol to 0.
     *
     * @param c The number of the context (it starts at 0).
     * */
    public final void resetCtxt(int c){
        I[c]=initStates[c];
        mPS[c] = 0;
    }

    /**
     * Resets all contexts to their original probability distribution and sets
     * all more probable symbols to 0.
     * */
    public final void resetCtxts(){
        System.arraycopy(initStates,0,I,0,I.length);
        ArrayUtil.intArraySet(mPS,0);
    }

    /**
     * Returns the number of bytes that are necessary from the compressed
     * output stream to decode all the symbols that have been coded so
     * far. The number of returned bytes does not include anything coded
     * before the last time the 'terminate()' or 'reset()' methods were
     * called.
     *
     * <P>The values returned by this method are then to be used in finishing
     * the length calculation with the 'finishLengthCalculation()' method,
     * after compensation of the offset in the number of bytes due to previous
     * terminated segments.
     *
     * <P>This method should not be called if the current coding pass is to be
     * terminated. The 'terminate()' method should be called instead.
     *
     * <P>The calculation is done based on the type of length calculation
     * specified at the constructor.
     *
     * @return The number of bytes in the compressed output stream necessary
     * to decode all the information coded so far.
     * */
    public final int getNumCodedBytes(){
        // NOTE: testing these algorithms for correctness is quite
        // difficult. One way is to modify the rate allocator so that not all
        // bit-planes are output if the distortion estimate for last passes is
        // the same as for the previous ones.

        switch (ltype) {
        case LENGTH_LAZY_GOOD:
            // This one is a bit better than LENGTH_LAZY.
            int bitsInN3Bytes; // The minimum amount of bits that can be stored
                               // in the 3 bytes following the current byte
                               // buffer 'b'.
            if (b >= 0xFE) {
                // The byte after b can have a bit stuffed so there could be
                // one less bit available
                bitsInN3Bytes = 22; // 7 + 8 + 7
            }
            else {
                // We are sure that next byte after current byte buffer has no
                // bit stuffing
                bitsInN3Bytes = 23; // 8 + 7 + 8
            }
            if ((11-cT+16) <= bitsInN3Bytes) {
                return nrOfWrittenBytes+(delFF ? 1 : 0)+1+3;
            }
            else {
                return nrOfWrittenBytes+(delFF ? 1 : 0)+1+4;
            }
        case LENGTH_LAZY:
            // This is the very basic one that appears in the VM text
            if ((27-cT) <= 22) {
                return nrOfWrittenBytes+(delFF ? 1 : 0)+1+3;
            }
            else {
                return nrOfWrittenBytes+(delFF ? 1 : 0)+1+4;
            }
        case LENGTH_NEAR_OPT:
            // This is the best length calculation implemented in this class.
            // It is almost always optimal. In order to calculate the length
            // it is necessary to know which bytes will follow in the MQ
            // bit stream, so we need to wait until termination to perform it.
            // Save the state to perform the calculation later, in
            // finishLengthCalculation()
            saveState();
            // Return current number of output bytes to use it later in
            // finishLengthCalculation()
            return nrOfWrittenBytes;
        default:
            throw new Error("Illegal length calculation type code");
        }
    }
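
    /*
     * Illustrative sketch, not part of the original JJ2000 sources. It
     * restates the two "lazy" estimates of getNumCodedBytes() as pure
     * functions so the arithmetic can be checked in isolation. The class
     * 'LazyLengthSketch' and its method and parameter names are
     * hypothetical; the parameters stand in for the coder fields
     * 'nrOfWrittenBytes', 'delFF', 'b' and 'cT'.

```java
// Hypothetical helper (an assumption for illustration, not MQCoder code).
final class LazyLengthSketch {

    // Mirrors the LENGTH_LAZY branch: bytes already written, plus a
    // delayed 0xFF if any, plus the byte buffer, plus 3 or 4 more bytes.
    static int lazy(int written, boolean delFF, int cT) {
        int base = written + (delFF ? 1 : 0) + 1;
        return base + (((27 - cT) <= 22) ? 3 : 4);
    }

    // Mirrors the LENGTH_LAZY_GOOD branch: the bit budget of the next
    // 3 bytes is 22 if the byte after 'b' may be bit-stuffed, else 23.
    static int lazyGood(int written, boolean delFF, int b, int cT) {
        int bitsInN3Bytes = (b >= 0xFE) ? 22 : 23;
        int base = written + (delFF ? 1 : 0) + 1;
        return base + (((11 - cT + 16) <= bitsInN3Bytes) ? 3 : 4);
    }
}
```

     * With 10 bytes written, no delayed 0xFF and cT = 4, lazy() gives 15
     * while lazyGood() with b = 0x10 gives 14, which shows the one-byte
     * saving the "good" variant can achieve.
     */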

    /**
     * Reinitializes the MQ coder and the underlying 'ByteOutputBuffer' buffer
     * as if a new object was instantiated. All the data in the
     * 'ByteOutputBuffer' buffer is erased and the state and contexts of the
     * MQ coder are reinitialized. Additionally any saved MQ states are
     * discarded.
     * */
    public final void reset() {

        // Reset the output buffer
        out.reset();

        a=0x8000;
        c=0;
        b=0;
        // b was just set to 0, so it can never be 0xFF here and no stuffed
        // bit has to be accounted for
        cT=12;
        resetCtxts();
        nrOfWrittenBytes = -1;
        delFF = false;

        nSaved = 0;
    }

    /**
     * Saves the current state of the MQ coder (just the registers, not the
     * contexts) so that a near optimal length calculation can be performed
     * later.
     * */
    private void saveState() {
        // Increase capacity if necessary
        if (nSaved == savedC.length) {
            Object tmp;
            tmp = savedC;
            savedC = new int[nSaved+SAVED_INC];
            System.arraycopy(tmp,0,savedC,0,nSaved);
            tmp = savedCT;
            savedCT = new int[nSaved+SAVED_INC];
            System.arraycopy(tmp,0,savedCT,0,nSaved);
            tmp = savedA;
            savedA = new int[nSaved+SAVED_INC];
            System.arraycopy(tmp,0,savedA,0,nSaved);
            tmp = savedB;
            savedB = new int[nSaved+SAVED_INC];
            System.arraycopy(tmp,0,savedB,0,nSaved);
            tmp = savedDelFF;
            savedDelFF = new boolean[nSaved+SAVED_INC];
            System.arraycopy(tmp,0,savedDelFF,0,nSaved);
        }
        // Save the current state
        savedC[nSaved] = c;
        savedCT[nSaved] = cT;
        savedA[nSaved] = a;
        savedB[nSaved] = b;
        savedDelFF[nSaved] = delFF;
        nSaved++;
    }

    /**
     * Terminates the calculation of the required length for each coding
     * pass. This method must be called just after the 'terminate()' one has
     * been called for each terminated MQ segment.
     *
     * <P>The values in 'rates' must have been compensated for any offset due
     * to previous terminated segments, so that the correct index to the
     * stored coded data is used.
     *
     * @param rates The array containing the values returned by
     * 'getNumCodedBytes()' for each coding pass.
     *
     * @param n The index in the 'rates' array of the last terminated length.
     * */
    public void finishLengthCalculation(int rates[], int n) {
        if (ltype != LENGTH_NEAR_OPT) {
            // For the simple calculations the only thing we need to do is to
            // ensure that the calculated lengths are no greater than the
            // terminated one
            if (n > 0 && rates[n-1] > rates[n]) {
                // We need correction
                int tl = rates[n]; // The terminated length
                n--;
                do {
                    rates[n--] = tl;
                } while (n >= 0 && rates[n] > tl);
            }
        }
        else {
            // We need to perform the more sophisticated near optimal
            // calculation.

            // The calculation of the length is based on the fact that the
            // decoder will pad the codestream with an endless string of
            // (binary) 1s after termination. If the codestream, padded with
            // 1s, is within the bounds of the current interval then correct
            // decoding is guaranteed. The lower inclusive bound of the
            // current interval is the value of C (i.e. if only lower
            // intervals would be coded in the future). The upper exclusive
            // bound of the current interval is C+A (i.e. if only upper
            // intervals would be coded in the future). We therefore calculate
            // the minimum length that would be needed so that padding with 1s
            // gives a codestream within the interval.

            // In order to know what will be appended to the current base of
            // the interval we need to know what is in the MQ bit stream after
            // the current last output byte until the termination. This is why
            // this calculation has to be performed after the MQ segment has
            // been entirely coded and terminated.

            int cLow; // lower bound on the C register for correct decoding
            int cUp;  // upper bound on the C register for correct decoding
            int bLow; // lower bound on the byte buffer for correct decoding
            int bUp;  // upper bound on the byte buffer for correct decoding
            int ridx; // index in the rates array of the pass we are
                      // calculating
            int sidx; // index in the saved state array
            int clen; // current calculated length
            boolean cdFF; // the current delayed FF state
            int nb;   // the next byte of output
            int minlen; // minimum possible length
            int maxlen; // maximum possible length

            // Start on the first pass of this segment
            ridx = n-nSaved;
            // Minimum allowable length is length of previous termination
            minlen = (ridx-1>=0) ? rates[ridx-1] : 0;
            // Maximum possible length is the terminated length
            maxlen = rates[n];
            for (sidx = 0; ridx < n; ridx++, sidx++) {
                // Load the initial values of the bounds
                cLow = savedC[sidx];
                cUp = savedC[sidx]+savedA[sidx];
                bLow = savedB[sidx];
                bUp = savedB[sidx];
                // Normalize to CT = 0 and propagate and reset any carry bits
                cLow <<= savedCT[sidx];
                if ((cLow & 0x8000000) != 0) {
                    bLow++;
                    cLow &= 0x7FFFFFF;
                }
                cUp <<= savedCT[sidx];
                if ((cUp & 0x8000000) != 0) {
                    bUp++;
                    cUp &= 0x7FFFFFF;
                }
                // Initialize current calculated length
                cdFF = savedDelFF[sidx];
                // rates[ridx] contains the number of bytes already output
                // when the state was saved, compensated for the offset in the
                // output stream.
                clen = rates[ridx]+(cdFF? 1 : 0);
                while (true) {
                    // If we are at end of coded data then this is the length
                    if (clen >= maxlen) {
                        clen = maxlen;
                        break;
                    }
                    // Check for sufficiency of coded data
                    if (cdFF) {
                        if (bLow < 128 && bUp >= 128) {
                            // We are done for this pass
                            clen--; // Don't need delayed FF
                            break;
                        }
                    }
                    else {
                        if (bLow < 256 && bUp >= 256) {
                            // We are done for this pass
                            break;
                        }
                    }
                    // Update bounds with next byte of coded data and
                    // normalize to CT = 0 again.
                    nb =  (clen >= minlen) ? out.getByte(clen) : 0;
                    bLow -= nb;
                    bUp -= nb;
                    clen++;
                    if (nb == 0xFF) {
                        bLow <<= 7;
                        bLow |= (cLow >> 20) & 0x7F;
                        cLow &= 0xFFFFF;
                        cLow <<= 7;
                        bUp <<= 7;
                        bUp |= (cUp >> 20) & 0x7F;
                        cUp &= 0xFFFFF;
                        cUp <<= 7;
                        cdFF = true;
                    }
                    else {
                        bLow <<= 8;
                        bLow |= (cLow >> 19) & 0xFF;
                        cLow &= 0x7FFFF;
                        cLow <<= 8;
                        bUp <<= 8;
                        bUp |= (cUp >> 19) & 0xFF;
                        cUp &= 0x7FFFF;
                        cUp <<= 8;
                        cdFF = false;
                    }
                    // Test again
                }
                // Store the rate found
                rates[ridx] = (clen>=minlen) ? clen : minlen;
            }
            // Reset the saved states
            nSaved = 0;
        }
    }
}
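
/*
 * Illustrative sketch, not part of the original JJ2000 sources. It lifts
 * the correction that finishLengthCalculation() applies for the simple
 * length types (anything other than LENGTH_NEAR_OPT) into a standalone
 * function: walking back from the terminated pass, every estimate larger
 * than the terminated length is lowered to it. The class
 * 'RateClampSketch' and the method name 'clampToTerminated()' are
 * hypothetical.

```java
// Hypothetical helper (an assumption for illustration, not MQCoder code).
final class RateClampSketch {

    // 'rates[n]' is the terminated length; earlier entries are estimates
    // that must not exceed it. The walk stops at the first entry that is
    // already no greater than the terminated length.
    static void clampToTerminated(int[] rates, int n) {
        if (n > 0 && rates[n - 1] > rates[n]) {
            int tl = rates[n]; // the terminated length
            n--;
            do {
                rates[n--] = tl;
            } while (n >= 0 && rates[n] > tl);
        }
    }
}
```

 * For rates {3, 9, 8, 5} and n = 3 the array becomes {3, 5, 5, 5}: the
 * two over-estimates are lowered to 5 and the first entry is untouched.
 */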