/*
 * Copyright (C) 2014 The Android Open Source Project
 * Copyright (c) 2000, 2013, Oracle and/or its affiliates. All rights reserved.
 * DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER.
 *
 * This code is free software; you can redistribute it and/or modify it
 * under the terms of the GNU General Public License version 2 only, as
 * published by the Free Software Foundation. Oracle designates this
 * particular file as subject to the "Classpath" exception as provided
 * by Oracle in the LICENSE file that accompanied this code.
 *
 * This code is distributed in the hope that it will be useful, but WITHOUT
 * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
 * FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License
 * version 2 for more details (a copy is included in the LICENSE file that
 * accompanied this code).
 *
 * You should have received a copy of the GNU General Public License version
 * 2 along with this work; if not, write to the Free Software Foundation,
 * Inc., 51 Franklin St, Fifth Floor, Boston, MA 02110-1301 USA.
 *
 * Please contact Oracle, 500 Oracle Parkway, Redwood Shores, CA 94065 USA
 * or visit www.oracle.com if you need additional information or have any
 * questions.
 */

// -- This file was mechanically generated: Do not edit! -- //

package java.nio.charset;

import java.nio.Buffer;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.BufferOverflowException;
import java.nio.BufferUnderflowException;
import java.lang.ref.WeakReference;
import java.nio.charset.CoderMalfunctionError;                  // javadoc
import java.util.Arrays;


/**
 * An engine that can transform a sequence of bytes in a specific charset into a sequence of
 * sixteen-bit Unicode characters.
 *
 * <a name="steps"></a>
 *
 * <p> The input byte sequence is provided in a byte buffer or a series
 * of such buffers.
 * The output character sequence is written to a character buffer
 * or a series of such buffers. A decoder should always be used by making
 * the following sequence of method invocations, hereinafter referred to as a
 * <i>decoding operation</i>:
 *
 * <ol>
 *
 *   <li><p> Reset the decoder via the {@link #reset reset} method, unless it
 *   has not been used before; </p></li>
 *
 *   <li><p> Invoke the {@link #decode decode} method zero or more times, as
 *   long as additional input may be available, passing <tt>false</tt> for the
 *   <tt>endOfInput</tt> argument and filling the input buffer and flushing the
 *   output buffer between invocations; </p></li>
 *
 *   <li><p> Invoke the {@link #decode decode} method one final time, passing
 *   <tt>true</tt> for the <tt>endOfInput</tt> argument; and then </p></li>
 *
 *   <li><p> Invoke the {@link #flush flush} method so that the decoder can
 *   flush any internal state to the output buffer. </p></li>
 *
 * </ol>
 *
 * Each invocation of the {@link #decode decode} method will decode as many
 * bytes as possible from the input buffer, writing the resulting characters
 * to the output buffer. The {@link #decode decode} method returns when more
 * input is required, when there is not enough room in the output buffer, or
 * when a decoding error has occurred. In each case a {@link CoderResult}
 * object is returned to describe the reason for termination. An invoker can
 * examine this object and fill the input buffer, flush the output buffer, or
 * attempt to recover from a decoding error, as appropriate, and try again.
 *
 * <a name="ce"></a>
 *
 * <p> There are two general types of decoding errors. If the input byte
 * sequence is not legal for this charset then the input is considered <i>malformed</i>. If
 * the input byte sequence is legal but cannot be mapped to a valid
 * Unicode character then an <i>unmappable character</i> has been encountered.
 *
 * <a name="cae"></a>
 *
 * <p> How a decoding error is handled depends upon the action requested for
 * that type of error, which is described by an instance of the {@linkplain
 * CodingErrorAction} class. The possible error actions are to {@linkplain
 * CodingErrorAction#IGNORE ignore} the erroneous input, {@link
 * CodingErrorAction#REPORT report} the error to the invoker via
 * the returned {@link CoderResult} object, or {@linkplain CodingErrorAction#REPLACE
 * replace} the erroneous input with the current value of the
 * replacement string. The replacement has the initial value
 * <tt>"\uFFFD"</tt>; its value may be changed via the
 * {@link #replaceWith(java.lang.String) replaceWith} method.
 *
 * <p> The default action for malformed-input and unmappable-character errors
 * is to {@linkplain CodingErrorAction#REPORT report} them. The
 * malformed-input error action may be changed via the {@link
 * #onMalformedInput(CodingErrorAction) onMalformedInput} method; the
 * unmappable-character action may be changed via the {@link
 * #onUnmappableCharacter(CodingErrorAction) onUnmappableCharacter} method.
 *
 * <p> This class is designed to handle many of the details of the decoding
 * process, including the implementation of error actions. A decoder for a
 * specific charset, which is a concrete subclass of this class, need only
 * implement the abstract {@link #decodeLoop decodeLoop} method, which
 * encapsulates the basic decoding loop. A subclass that maintains internal
 * state should, additionally, override the {@link #implFlush implFlush} and
 * {@link #implReset implReset} methods.
 *
 * <p> Instances of this class are not safe for use by multiple concurrent
 * threads.
 * </p>
 *
 *
 * @author Mark Reinhold
 * @author JSR-51 Expert Group
 * @since 1.4
 *
 * @see ByteBuffer
 * @see CharBuffer
 * @see Charset
 * @see CharsetEncoder
 */

public abstract class CharsetDecoder {

    private final Charset charset;
    private final float averageCharsPerByte;
    private final float maxCharsPerByte;

    private String replacement;
    private CodingErrorAction malformedInputAction
        = CodingErrorAction.REPORT;
    private CodingErrorAction unmappableCharacterAction
        = CodingErrorAction.REPORT;

    // Internal states
    //
    private static final int ST_RESET   = 0;
    private static final int ST_CODING  = 1;
    private static final int ST_END     = 2;
    private static final int ST_FLUSHED = 3;

    private int state = ST_RESET;

    private static String stateNames[]
        = { "RESET", "CODING", "CODING_END", "FLUSHED" };


    /**
     * Initializes a new decoder. The new decoder will have the given
     * chars-per-byte and replacement values.
     *
     * @param  cs
     *         The charset that created this decoder
     *
     * @param  averageCharsPerByte
     *         A positive float value indicating the expected number of
     *         characters that will be produced for each input byte
     *
     * @param  maxCharsPerByte
     *         A positive float value indicating the maximum number of
     *         characters that will be produced for each input byte
     *
     * @param  replacement
     *         The initial replacement; must not be <tt>null</tt>, must have
     *         non-zero length, must not be longer than maxCharsPerByte,
     *         and must be {@linkplain #isLegalReplacement legal}
     *
     * @throws  IllegalArgumentException
     *          If the preconditions on the parameters do not hold
     */
    private
    CharsetDecoder(Charset cs,
                   float averageCharsPerByte,
                   float maxCharsPerByte,
                   String replacement)
    {
        this.charset = cs;
        if (averageCharsPerByte <= 0.0f)
            throw new IllegalArgumentException("Non-positive "
                                               + "averageCharsPerByte");
        if (maxCharsPerByte <= 0.0f)
            throw new IllegalArgumentException("Non-positive "
                                               + "maxCharsPerByte");
        if (!Charset.atBugLevel("1.4")) {
            if (averageCharsPerByte > maxCharsPerByte)
                throw new IllegalArgumentException("averageCharsPerByte"
                                                   + " exceeds "
                                                   + "maxCharsPerByte");
        }
        this.replacement = replacement;
        this.averageCharsPerByte = averageCharsPerByte;
        this.maxCharsPerByte = maxCharsPerByte;
        // Android-removed
        // replaceWith(replacement);
    }

    /**
     * Initializes a new decoder. The new decoder will have the given
     * chars-per-byte values and its replacement will be the
     * string <tt>"\uFFFD"</tt>.
     *
     * @param  cs
     *         The charset that created this decoder
     *
     * @param  averageCharsPerByte
     *         A positive float value indicating the expected number of
     *         characters that will be produced for each input byte
     *
     * @param  maxCharsPerByte
     *         A positive float value indicating the maximum number of
     *         characters that will be produced for each input byte
     *
     * @throws  IllegalArgumentException
     *          If the preconditions on the parameters do not hold
     */
    protected CharsetDecoder(Charset cs,
                             float averageCharsPerByte,
                             float maxCharsPerByte)
    {
        this(cs,
             averageCharsPerByte, maxCharsPerByte,
             "\uFFFD");
    }

    /**
     * Returns the charset that created this decoder.
     *
     * @return  This decoder's charset
     */
    public final Charset charset() {
        return charset;
    }

    /**
     * Returns this decoder's replacement value.
     *
     * @return  This decoder's current replacement,
     *          which is never <tt>null</tt> and is never empty
     */
    public final String replacement() {
        return replacement;
    }

    /**
     * Changes this decoder's replacement value.
     *
     * <p> This method invokes the {@link #implReplaceWith implReplaceWith}
     * method, passing the new replacement, after checking that the new
     * replacement is acceptable.
     * </p>
     *
     * @param  newReplacement
     *         The new replacement; must not be <tt>null</tt>,
     *         must have non-zero length, and must not be longer than
     *         the value returned by the {@link #maxCharsPerByte()
     *         maxCharsPerByte} method
     *
     * @return  This decoder
     *
     * @throws  IllegalArgumentException
     *          If the preconditions on the parameter do not hold
     */
    public final CharsetDecoder replaceWith(String newReplacement) {
        if (newReplacement == null)
            throw new IllegalArgumentException("Null replacement");
        int len = newReplacement.length();
        if (len == 0)
            throw new IllegalArgumentException("Empty replacement");
        if (len > maxCharsPerByte)
            throw new IllegalArgumentException("Replacement too long");

        this.replacement = newReplacement;

        implReplaceWith(this.replacement);
        return this;
    }

    /**
     * Reports a change to this decoder's replacement value.
     *
     * <p> The default implementation of this method does nothing. This method
     * should be overridden by decoders that require notification of changes to
     * the replacement. </p>
     *
     * @param  newReplacement    The replacement value
     */
    protected void implReplaceWith(String newReplacement) {
    }

    /**
     * Returns this decoder's current action for malformed-input errors.
     *
     * @return The current malformed-input action, which is never <tt>null</tt>
     */
    public CodingErrorAction malformedInputAction() {
        return malformedInputAction;
    }

    /**
     * Changes this decoder's action for malformed-input errors.
     *
     * <p> This method invokes the {@link #implOnMalformedInput
     * implOnMalformedInput} method, passing the new action.
     * </p>
     *
     * @param  newAction  The new action; must not be <tt>null</tt>
     *
     * @return  This decoder
     *
     * @throws IllegalArgumentException
     *         If the precondition on the parameter does not hold
     */
    public final CharsetDecoder onMalformedInput(CodingErrorAction newAction) {
        if (newAction == null)
            throw new IllegalArgumentException("Null action");
        malformedInputAction = newAction;
        implOnMalformedInput(newAction);
        return this;
    }

    /**
     * Reports a change to this decoder's malformed-input action.
     *
     * <p> The default implementation of this method does nothing. This method
     * should be overridden by decoders that require notification of changes to
     * the malformed-input action. </p>
     *
     * @param  newAction  The new action
     */
    protected void implOnMalformedInput(CodingErrorAction newAction) { }

    /**
     * Returns this decoder's current action for unmappable-character errors.
     *
     * @return The current unmappable-character action, which is never
     *         <tt>null</tt>
     */
    public CodingErrorAction unmappableCharacterAction() {
        return unmappableCharacterAction;
    }

    /**
     * Changes this decoder's action for unmappable-character errors.
     *
     * <p> This method invokes the {@link #implOnUnmappableCharacter
     * implOnUnmappableCharacter} method, passing the new action.
     * </p>
     *
     * @param  newAction  The new action; must not be <tt>null</tt>
     *
     * @return  This decoder
     *
     * @throws IllegalArgumentException
     *         If the precondition on the parameter does not hold
     */
    public final CharsetDecoder onUnmappableCharacter(CodingErrorAction
                                                      newAction)
    {
        if (newAction == null)
            throw new IllegalArgumentException("Null action");
        unmappableCharacterAction = newAction;
        implOnUnmappableCharacter(newAction);
        return this;
    }

    /**
     * Reports a change to this decoder's unmappable-character action.
     *
     * <p> The default implementation of this method does nothing. This method
     * should be overridden by decoders that require notification of changes to
     * the unmappable-character action. </p>
     *
     * @param  newAction  The new action
     */
    protected void implOnUnmappableCharacter(CodingErrorAction newAction) { }

    /**
     * Returns the average number of characters that will be produced for each
     * byte of input. This heuristic value may be used to estimate the size
     * of the output buffer required for a given input sequence.
     *
     * @return  The average number of characters produced
     *          per byte of input
     */
    public final float averageCharsPerByte() {
        return averageCharsPerByte;
    }

    /**
     * Returns the maximum number of characters that will be produced for each
     * byte of input. This value may be used to compute the worst-case size
     * of the output buffer required for a given input sequence.
     *
     * @return  The maximum number of characters that will be produced per
     *          byte of input
     */
    public final float maxCharsPerByte() {
        return maxCharsPerByte;
    }

    /**
     * Decodes as many bytes as possible from the given input buffer,
     * writing the results to the given output buffer.
     *
     * <p> The buffers are read from, and written to, starting at their current
     * positions. At most {@link Buffer#remaining in.remaining()} bytes
     * will be read and at most {@link Buffer#remaining out.remaining()}
     * characters will be written. The buffers' positions will be advanced to
     * reflect the bytes read and the characters written, but their marks and
     * limits will not be modified.
     *
     * <p> In addition to reading bytes from the input buffer and writing
     * characters to the output buffer, this method returns a {@link CoderResult}
     * object to describe its reason for termination:
     *
     * <ul>
     *
     *   <li><p> {@link CoderResult#UNDERFLOW} indicates that as much of the
     *   input buffer as possible has been decoded. If there is no further
     *   input then the invoker can proceed to the next step of the
     *   <a href="#steps">decoding operation</a>. Otherwise this method
     *   should be invoked again with further input. </p></li>
     *
     *   <li><p> {@link CoderResult#OVERFLOW} indicates that there is
     *   insufficient space in the output buffer to decode any more bytes.
     *   This method should be invoked again with an output buffer that has
     *   more {@linkplain Buffer#remaining remaining} characters. This is
     *   typically done by draining any decoded characters from the output
     *   buffer. </p></li>
     *
     *   <li><p> A {@linkplain CoderResult#malformedForLength
     *   malformed-input} result indicates that a malformed-input
     *   error has been detected.
     *   The malformed bytes begin at the input
     *   buffer's (possibly incremented) position; the number of malformed
     *   bytes may be determined by invoking the result object's {@link
     *   CoderResult#length() length} method. This case applies only if the
     *   {@linkplain #onMalformedInput malformed action} of this decoder
     *   is {@link CodingErrorAction#REPORT}; otherwise the malformed input
     *   will be ignored or replaced, as requested. </p></li>
     *
     *   <li><p> An {@linkplain CoderResult#unmappableForLength
     *   unmappable-character} result indicates that an
     *   unmappable-character error has been detected. The bytes that
     *   decode the unmappable character begin at the input buffer's (possibly
     *   incremented) position; the number of such bytes may be determined
     *   by invoking the result object's {@link CoderResult#length() length}
     *   method. This case applies only if the {@linkplain #onUnmappableCharacter
     *   unmappable action} of this decoder is {@link
     *   CodingErrorAction#REPORT}; otherwise the unmappable character will be
     *   ignored or replaced, as requested. </p></li>
     *
     * </ul>
     *
     * In any case, if this method is to be reinvoked in the same decoding
     * operation then care should be taken to preserve any bytes remaining
     * in the input buffer so that they are available to the next invocation.
     *
     * <p> The <tt>endOfInput</tt> parameter advises this method as to whether
     * the invoker can provide further input beyond that contained in the given
     * input buffer. If there is a possibility of providing additional input
     * then the invoker should pass <tt>false</tt> for this parameter; if there
     * is no possibility of providing further input then the invoker should
     * pass <tt>true</tt>. It is not erroneous, and in fact it is quite
     * common, to pass <tt>false</tt> in one invocation and later discover that
     * no further input was actually available.
     * It is critical, however, that
     * the final invocation of this method in a sequence of invocations always
     * pass <tt>true</tt> so that any remaining undecoded input will be treated
     * as being malformed.
     *
     * <p> This method works by invoking the {@link #decodeLoop decodeLoop}
     * method, interpreting its results, handling error conditions, and
     * reinvoking it as necessary. </p>
     *
     *
     * @param  in
     *         The input byte buffer
     *
     * @param  out
     *         The output character buffer
     *
     * @param  endOfInput
     *         <tt>true</tt> if, and only if, the invoker can provide no
     *         additional input bytes beyond those in the given buffer
     *
     * @return  A coder-result object describing the reason for termination
     *
     * @throws  IllegalStateException
     *          If a decoding operation is already in progress and the previous
     *          step was an invocation neither of the {@link #reset reset}
     *          method, nor of this method with a value of <tt>false</tt> for
     *          the <tt>endOfInput</tt> parameter, nor of this method with a
     *          value of <tt>true</tt> for the <tt>endOfInput</tt> parameter
     *          but a return value indicating an incomplete decoding operation
     *
     * @throws  CoderMalfunctionError
     *          If an invocation of the decodeLoop method threw
     *          an unexpected exception
     */
    public final CoderResult decode(ByteBuffer in, CharBuffer out,
                                    boolean endOfInput)
    {
        int newState = endOfInput ? ST_END : ST_CODING;
        if ((state != ST_RESET) && (state != ST_CODING)
            && !(endOfInput && (state == ST_END)))
            throwIllegalStateException(state, newState);
        state = newState;

        for (;;) {

            CoderResult cr;
            try {
                cr = decodeLoop(in, out);
            } catch (BufferUnderflowException x) {
                throw new CoderMalfunctionError(x);
            } catch (BufferOverflowException x) {
                throw new CoderMalfunctionError(x);
            }

            if (cr.isOverflow())
                return cr;

            if (cr.isUnderflow()) {
                if (endOfInput && in.hasRemaining()) {
                    cr = CoderResult.malformedForLength(in.remaining());
                    // Fall through to malformed-input case
                } else {
                    return cr;
                }
            }

            CodingErrorAction action = null;
            if (cr.isMalformed())
                action = malformedInputAction;
            else if (cr.isUnmappable())
                action = unmappableCharacterAction;
            else
                assert false : cr.toString();

            if (action == CodingErrorAction.REPORT)
                return cr;

            if (action == CodingErrorAction.REPLACE) {
                if (out.remaining() < replacement.length())
                    return CoderResult.OVERFLOW;
                out.put(replacement);
            }

            if ((action == CodingErrorAction.IGNORE)
                || (action == CodingErrorAction.REPLACE)) {
                // Skip erroneous input either way
                in.position(in.position() + cr.length());
                continue;
            }

            assert false;
        }

    }

    /**
     * Flushes this decoder.
     *
     * <p> Some decoders maintain internal state and may need to write some
     * final characters to the output buffer once the overall input sequence has
     * been read.
     *
     * <p> Any additional output is written to the output buffer beginning at
     * its current position. At most {@link Buffer#remaining out.remaining()}
     * characters will be written. The buffer's position will be advanced
     * appropriately, but its mark and limit will not be modified.
     *
     * <p> If this method completes successfully then it returns {@link
     * CoderResult#UNDERFLOW}. If there is insufficient room in the output
     * buffer then it returns {@link CoderResult#OVERFLOW}. If this happens
     * then this method must be invoked again, with an output buffer that has
     * more room, in order to complete the current <a href="#steps">decoding
     * operation</a>.
     *
     * <p> If this decoder has already been flushed then invoking this method
     * has no effect.
     *
     * <p> This method invokes the {@link #implFlush implFlush} method to
     * perform the actual flushing operation. </p>
     *
     * @param  out
     *         The output character buffer
     *
     * @return  A coder-result object, either {@link CoderResult#UNDERFLOW} or
     *          {@link CoderResult#OVERFLOW}
     *
     * @throws  IllegalStateException
     *          If the previous step of the current decoding operation was an
     *          invocation neither of the {@link #flush flush} method nor of
     *          the three-argument {@link
     *          #decode(ByteBuffer,CharBuffer,boolean) decode} method
     *          with a value of <tt>true</tt> for the <tt>endOfInput</tt>
     *          parameter
     */
    public final CoderResult flush(CharBuffer out) {
        if (state == ST_END) {
            CoderResult cr = implFlush(out);
            if (cr.isUnderflow())
                state = ST_FLUSHED;
            return cr;
        }

        if (state != ST_FLUSHED)
            throwIllegalStateException(state, ST_FLUSHED);

        return CoderResult.UNDERFLOW; // Already flushed
    }

    /**
     * Flushes this decoder.
     *
     * <p> The default implementation of this method does nothing, and always
     * returns {@link CoderResult#UNDERFLOW}. This method should be overridden
     * by decoders that may need to write final characters to the output buffer
     * once the entire input sequence has been read.
     * </p>
     *
     * @param  out
     *         The output character buffer
     *
     * @return  A coder-result object, either {@link CoderResult#UNDERFLOW} or
     *          {@link CoderResult#OVERFLOW}
     */
    protected CoderResult implFlush(CharBuffer out) {
        return CoderResult.UNDERFLOW;
    }

    /**
     * Resets this decoder, clearing any internal state.
     *
     * <p> This method resets charset-independent state and also invokes the
     * {@link #implReset() implReset} method in order to perform any
     * charset-specific reset actions. </p>
     *
     * @return  This decoder
     *
     */
    public final CharsetDecoder reset() {
        implReset();
        state = ST_RESET;
        return this;
    }

    /**
     * Resets this decoder, clearing any charset-specific internal state.
     *
     * <p> The default implementation of this method does nothing. This method
     * should be overridden by decoders that maintain internal state. </p>
     */
    protected void implReset() { }

    /**
     * Decodes one or more bytes into one or more characters.
     *
     * <p> This method encapsulates the basic decoding loop, decoding as many
     * bytes as possible until it either runs out of input, runs out of room
     * in the output buffer, or encounters a decoding error. This method is
     * invoked by the {@link #decode decode} method, which handles result
     * interpretation and error recovery.
     *
     * <p> The buffers are read from, and written to, starting at their current
     * positions. At most {@link Buffer#remaining in.remaining()} bytes
     * will be read, and at most {@link Buffer#remaining out.remaining()}
     * characters will be written. The buffers' positions will be advanced to
     * reflect the bytes read and the characters written, but their marks and
     * limits will not be modified.
     *
     * <p> This method returns a {@link CoderResult} object to describe its
     * reason for termination, in the same manner as the {@link #decode decode}
     * method. Most implementations of this method will handle decoding errors
     * by returning an appropriate result object for interpretation by the
     * {@link #decode decode} method. An optimized implementation may instead
     * examine the relevant error action and implement that action itself.
     *
     * <p> An implementation of this method may perform arbitrary lookahead by
     * returning {@link CoderResult#UNDERFLOW} until it receives sufficient
     * input. </p>
     *
     * @param  in
     *         The input byte buffer
     *
     * @param  out
     *         The output character buffer
     *
     * @return  A coder-result object describing the reason for termination
     */
    protected abstract CoderResult decodeLoop(ByteBuffer in,
                                              CharBuffer out);

    /**
     * Convenience method that decodes the remaining content of a single input
     * byte buffer into a newly-allocated character buffer.
     *
     * <p> This method implements an entire <a href="#steps">decoding
     * operation</a>; that is, it resets this decoder, then it decodes the
     * bytes in the given byte buffer, and finally it flushes this
     * decoder. This method should therefore not be invoked if a decoding
     * operation is already in progress. </p>
     *
     * @param  in
     *         The input byte buffer
     *
     * @return A newly-allocated character buffer containing the result of the
     *         decoding operation. The buffer's position will be zero and its
     *         limit will follow the last character written.
     *
     * @throws  IllegalStateException
     *          If a decoding operation is already in progress
     *
     * @throws  MalformedInputException
     *          If the byte sequence starting at the input buffer's current
     *          position is not legal for this charset and the current malformed-input action
     *          is {@link CodingErrorAction#REPORT}
     *
     * @throws  UnmappableCharacterException
     *          If the byte sequence starting at the input buffer's current
     *          position cannot be mapped to an equivalent character sequence and
     *          the current unmappable-character action is {@link
     *          CodingErrorAction#REPORT}
     */
    public final CharBuffer decode(ByteBuffer in)
        throws CharacterCodingException
    {
        int n = (int)(in.remaining() * averageCharsPerByte());
        CharBuffer out = CharBuffer.allocate(n);

        if ((n == 0) && (in.remaining() == 0))
            return out;
        reset();
        for (;;) {
            CoderResult cr = in.hasRemaining() ?
                decode(in, out, true) : CoderResult.UNDERFLOW;
            if (cr.isUnderflow())
                cr = flush(out);

            if (cr.isUnderflow())
                break;
            if (cr.isOverflow()) {
                n = 2*n + 1;    // Ensure progress; n might be 0!
                CharBuffer o = CharBuffer.allocate(n);
                out.flip();
                o.put(out);
                out = o;
                continue;
            }
            cr.throwException();
        }
        out.flip();
        return out;
    }

    /**
     * Tells whether or not this decoder implements an auto-detecting charset.
     *
     * <p> The default implementation of this method always returns
     * <tt>false</tt>; it should be overridden by auto-detecting decoders to
     * return <tt>true</tt>. </p>
     *
     * @return  <tt>true</tt> if, and only if, this decoder implements an
     *          auto-detecting charset
     */
    public boolean isAutoDetecting() {
        return false;
    }

    /**
     * Tells whether or not this decoder has yet detected a
     * charset <i>(optional operation)</i>.
     *
     * <p> If this decoder implements an auto-detecting charset then at a
     * single point during a decoding operation this method may start returning
     * <tt>true</tt> to indicate that a specific charset has been detected in
     * the input byte sequence. Once this occurs, the {@link #detectedCharset
     * detectedCharset} method may be invoked to retrieve the detected charset.
     *
     * <p> That this method returns <tt>false</tt> does not imply that no bytes
     * have yet been decoded. Some auto-detecting decoders are capable of
     * decoding some, or even all, of an input byte sequence without fixing on
     * a particular charset.
     *
     * <p> The default implementation of this method always throws an {@link
     * UnsupportedOperationException}; it should be overridden by
     * auto-detecting decoders to return <tt>true</tt> once the input charset
     * has been determined. </p>
     *
     * @return  <tt>true</tt> if, and only if, this decoder has detected a
     *          specific charset
     *
     * @throws  UnsupportedOperationException
     *          If this decoder does not implement an auto-detecting charset
     */
    public boolean isCharsetDetected() {
        throw new UnsupportedOperationException();
    }

    /**
     * Retrieves the charset that was detected by this
     * decoder <i>(optional operation)</i>.
     *
     * <p> If this decoder implements an auto-detecting charset then this
     * method returns the actual charset once it has been detected. After that
     * point, this method returns the same value for the duration of the
     * current decoding operation. If not enough input bytes have yet been
     * read to determine the actual charset then this method throws an {@link
     * IllegalStateException}.
     *
     * <p> The default implementation of this method always throws an {@link
     * UnsupportedOperationException}; it should be overridden by
     * auto-detecting decoders to return the appropriate value.
     * </p>
     *
     * @return  The charset detected by this auto-detecting decoder,
     *          or <tt>null</tt> if the charset has not yet been determined
     *
     * @throws  IllegalStateException
     *          If insufficient bytes have been read to determine a charset
     *
     * @throws  UnsupportedOperationException
     *          If this decoder does not implement an auto-detecting charset
     */
    public Charset detectedCharset() {
        throw new UnsupportedOperationException();
    }

    private void throwIllegalStateException(int from, int to) {
        throw new IllegalStateException("Current state = " + stateNames[from]
                                        + ", new state = " + stateNames[to]);
    }

}
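
// Usage sketch (not part of the class above): the four-step decoding
// operation described in the class documentation (reset, decode with
// endOfInput=false, a final decode with endOfInput=true, then flush) might be
// driven by a caller roughly as follows. The class name DecodingOperationSketch,
// the helper names, the buffer sizes, and the choice of UTF-8 with the REPLACE
// action are all assumptions made for illustration; only public java.nio
// calls are used.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class DecodingOperationSketch {

    /** Decodes the chunks, in order, as one decoding operation (UTF-8 assumed). */
    static String decodeChunks(byte[][] chunks) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE);

        int maxChunk = 0;
        for (byte[] chunk : chunks)
            maxChunk = Math.max(maxChunk, chunk.length);
        // Room for one chunk plus a few undecoded bytes carried over from the
        // previous chunk (an incomplete UTF-8 sequence is at most 3 bytes).
        ByteBuffer in = ByteBuffer.allocate(maxChunk + 8);
        // Deliberately small so that the OVERFLOW path is exercised.
        CharBuffer out = CharBuffer.allocate(8);
        StringBuilder sink = new StringBuilder();

        decoder.reset();                              // step 1
        for (byte[] chunk : chunks) {                 // step 2
            in.put(chunk);
            in.flip();
            decodeAndDrain(decoder, in, out, false, sink);
            in.compact();                             // keep undecoded bytes
        }
        in.flip();
        decodeAndDrain(decoder, in, out, true, sink); // step 3
        while (decoder.flush(out).isOverflow())       // step 4
            drain(out, sink);
        drain(out, sink);
        return sink.toString();
    }

    /** Calls decode until it reports UNDERFLOW, emptying out after each call. */
    private static void decodeAndDrain(CharsetDecoder decoder, ByteBuffer in,
                                       CharBuffer out, boolean endOfInput,
                                       StringBuilder sink) {
        for (;;) {
            CoderResult cr = decoder.decode(in, out, endOfInput);
            drain(out, sink);
            if (cr.isUnderflow())
                return;
            if (cr.isError())   // not expected while both actions are REPLACE
                throw new IllegalStateException(cr.toString());
            // OVERFLOW: out has been emptied, so decode again.
        }
    }

    /** Moves the characters written so far into sink and empties out. */
    private static void drain(CharBuffer out, StringBuilder sink) {
        out.flip();
        sink.append(out);
        out.clear();
    }

    public static void main(String[] args) {
        String text = "h\u00e9llo, w\u00f6rld \u20ac";
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        // Split inside the two-byte encoding of U+00E9 to show the carry-over.
        byte[][] chunks = {
            Arrays.copyOfRange(bytes, 0, 2),
            Arrays.copyOfRange(bytes, 2, bytes.length)
        };
        System.out.println(decodeChunks(chunks).equals(text));
    }
}
```

// The compact() call after each non-final decode is what preserves the bytes
// remaining in the input buffer for the next invocation, as the decode
// documentation requires. With the REPORT action, a caller would instead
// inspect the returned CoderResult and decide how to recover.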