@InterfaceAudience.Private
@InterfaceStability.Evolving
public class ResettableFileInputStream
extends ResettableInputStream
implements RemoteMarkable, LengthMeasurable
This class makes the following assumptions:
The ability to reset() is dependent on the underlying PositionTracker instance's durability semantics.
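As a rough illustration of that dependency, the sketch below models a stream whose mark() writes the current offset to a tracker and whose reset() reads it back. All names in it (Tracker, MemoryTracker, TrackedStream) are hypothetical stand-ins, not the Flume classes. The tracker here lives in memory only, so a saved position would not survive a process restart; a durable tracker would be needed for that.

```java
import java.nio.charset.StandardCharsets;

class TrackerSketch {
    /** Hypothetical stand-in for a position tracker. */
    interface Tracker {
        void store(long position);
        long position();
    }

    /** Keeps the position in memory only, so it is lost when the process exits. */
    static final class MemoryTracker implements Tracker {
        private long position;
        @Override public void store(long position) { this.position = position; }
        @Override public long position() { return position; }
    }

    /** Toy byte stream that resumes from the tracker, marks to it, and resets from it. */
    static final class TrackedStream {
        private final byte[] data;
        private final Tracker tracker;
        private int position;

        TrackedStream(byte[] data, Tracker tracker) {
            this.data = data;
            this.tracker = tracker;
            this.position = (int) tracker.position(); // resume where the tracker says
        }

        /** Returns the next byte, or -1 at the end of the data. */
        int read() { return position < data.length ? (data[position++] & 0xFF) : -1; }

        void mark() { tracker.store(position); }

        void reset() { position = (int) tracker.position(); }
    }

    public static void main(String[] args) {
        byte[] data = "abcdef".getBytes(StandardCharsets.US_ASCII);
        Tracker tracker = new MemoryTracker();

        TrackedStream first = new TrackedStream(data, tracker);
        first.read();  // 'a'
        first.read();  // 'b'
        first.mark();  // tracker now holds offset 2
        first.read();  // 'c'
        first.reset(); // back to offset 2
        System.out.println((char) first.read());

        // A second instance sharing the same tracker resumes from the mark.
        TrackedStream second = new TrackedStream(data, tracker);
        System.out.println((char) second.read());
    }
}
```

In this model, whatever the tracker forgets, the stream forgets too: a new TrackedStream built with a fresh tracker starts again from offset 0.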
A note on surrogate pairs:
The logic for decoding surrogate pairs is as follows:
If no character has been decoded by a "normal" pass, and the buffer still has remaining bytes,
then an attempt is made to read 2 characters in one pass.
If it succeeds, then the first char (high surrogate) is returned;
the second char (low surrogate) is recorded internally,
and is returned at the next call to readChar().
If it fails, then it is assumed that EOF has been reached.
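The steps above can be sketched with the standard java.nio.charset API. This is a minimal illustration, not the class's actual code: decodeNext and its int[] return convention are hypothetical, and it assumes a decoder (such as the JDK's UTF-8 decoder) that consumes no input when the output buffer is too small for a surrogate pair.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

class SurrogateDecodeSketch {
    /**
     * Decodes the next character from the input buffer.
     * Returns {first char, low surrogate or -1, bytes consumed};
     * the first element is -1 when nothing could be decoded.
     */
    static int[] decodeNext(ByteBuffer in, CharsetDecoder decoder) {
        int start = in.position();

        // "Normal" pass: the output buffer has room for exactly one char.
        CharBuffer one = CharBuffer.allocate(1);
        decoder.decode(in, one, false);
        one.flip();
        if (one.hasRemaining()) {
            return new int[] {one.get(), -1, in.position() - start};
        }

        // Nothing was decoded but bytes remain: try to read two chars in one pass.
        if (in.hasRemaining()) {
            CharBuffer two = CharBuffer.allocate(2);
            decoder.decode(in, two, false);
            two.flip();
            if (two.remaining() == 2) {
                char high = two.get();
                char low = two.get();
                return new int[] {high, low, in.position() - start};
            }
        }

        // Both attempts failed: treated as end of input.
        return new int[] {-1, -1, in.position() - start};
    }

    public static void main(String[] args) {
        // 'a' takes 1 byte in UTF-8; U+1F600 takes 4 bytes and decodes to two chars.
        ByteBuffer in = ByteBuffer.wrap("a\uD83D\uDE00".getBytes(StandardCharsets.UTF_8));
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();

        int[] first = decodeNext(in, decoder);
        int[] second = decodeNext(in, decoder);
        System.out.printf("%x consumed=%d%n", first[0], first[2]);
        System.out.printf("%x %x consumed=%d%n", second[0], second[1], second[2]);
    }
}
```

A caller following the description would return the first char immediately and hold on to the low surrogate until the next read.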
Impacts on position, mark and reset: when a surrogate pair is decoded, the position is incremented by the number of bytes taken to decode the entire pair (usually 4). This is the most reasonable choice, since it would not be advisable to reset a stream to a position pointing to the second char of a surrogate pair: such a dangling surrogate could not be properly decoded without its counterpart.
Thus the behaviour of mark and reset is as follows:
If mark() is called after a high surrogate has been returned by readChar(), the marked position will be that of the character following the low surrogate, not that of the low surrogate itself.
If reset() is called after a high surrogate has been returned by readChar(), the low surrogate is always returned by the next call to readChar(), before the stream is actually reset to the last marked position.

This ensures that no dangling high surrogate can ever be read as long as the same instance is used to read the whole pair. However, if reset() is called after a high surrogate has been returned by readChar(), and a new instance of ResettableFileInputStream is used to resume reading, then the low surrogate char will be lost, resulting in a corrupted sequence of characters (dangling high surrogate).
This situation is hopefully extremely unlikely to happen in real life.
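The mark and reset behaviour described above can be modelled with a small in-memory reader. This is a toy under stated assumptions, not the class's implementation: MarkResetSketch and its members are hypothetical, the input is assumed to be well-formed UTF-8, and the mark is kept in a plain field rather than in a PositionTracker.

```java
import java.nio.charset.StandardCharsets;

class MarkResetSketch {
    private final byte[] data;    // UTF-8 bytes, assumed well-formed
    private int position;         // byte offset of the next undecoded character
    private int markPosition;     // last marked byte offset
    private int pendingLow = -1;  // low surrogate still to be returned, or -1

    MarkResetSketch(String text) {
        this.data = text.getBytes(StandardCharsets.UTF_8);
    }

    /** Returns the next char, or -1 at the end of the data. */
    int readChar() {
        if (pendingLow != -1) {   // always finish a pair before anything else
            int low = pendingLow;
            pendingLow = -1;
            return low;
        }
        if (position >= data.length) {
            return -1;
        }
        int length = sequenceLength(data[position]);
        int codePoint = new String(data, position, length, StandardCharsets.UTF_8).codePointAt(0);
        position += length;       // the whole pair is consumed in one step
        if (Character.isSupplementaryCodePoint(codePoint)) {
            pendingLow = Character.lowSurrogate(codePoint);
            return Character.highSurrogate(codePoint);
        }
        return codePoint;
    }

    /** Records the offset after the last decoded character (after the whole pair, if any). */
    void mark() { markPosition = position; }

    /** Moves back to the mark; a pending low surrogate is still returned first. */
    void reset() { position = markPosition; }

    long getMarkPosition() { return markPosition; }

    /** Length in bytes of the UTF-8 sequence that starts with the given lead byte. */
    private static int sequenceLength(byte lead) {
        int b = lead & 0xFF;
        if (b < 0x80) return 1;
        if (b < 0xE0) return 2;
        if (b < 0xF0) return 3;
        return 4;
    }

    public static void main(String[] args) {
        MarkResetSketch reader = new MarkResetSketch("a\uD83D\uDE00b");
        reader.readChar();                            // 'a'
        reader.mark();                                // mark at offset 1
        int high = reader.readChar();                 // high surrogate
        reader.reset();                               // requested between the two halves
        int low = reader.readChar();                  // low surrogate still comes first
        int again = reader.readChar();                // then reading resumes from the mark
        System.out.printf("%x %x %x%n", high, low, again);
    }
}
```

If a second MarkResetSketch were created at the reset point instead of reusing the same instance, the pending low surrogate would be gone, which is the corruption case described above.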
| Modifier and Type | Field and Description |
|---|---|
| static int | DEFAULT_BUF_SIZE |
| static int | MIN_BUF_SIZE: The minimum acceptable buffer size to store bytes read from the underlying file. |

| Constructor and Description |
|---|
| ResettableFileInputStream(File file, PositionTracker tracker) |
| ResettableFileInputStream(File file, PositionTracker tracker, int bufSize, Charset charset, DecodeErrorPolicy decodeErrorPolicy) |
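
To make the roles of the bufSize, charset and decodeErrorPolicy arguments more concrete, here is a hedged sketch using only java.nio.charset. The Policy enum is a hypothetical stand-in for DecodeErrorPolicy, the mapping to CodingErrorAction is an assumption about how such a policy could be applied, and the MIN_BUF_SIZE value of 8 is chosen for illustration only.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

class DecoderSetupSketch {
    /** Hypothetical stand-in for a decode error policy. */
    enum Policy { FAIL, REPLACE, IGNORE }

    /** Assumed value, for illustration only. */
    static final int MIN_BUF_SIZE = 8;

    /** A requested size below the minimum is raised to the minimum. */
    static int effectiveBufSize(int requested) {
        return Math.max(requested, MIN_BUF_SIZE);
    }

    /** Builds a decoder that treats malformed and unmappable input per the policy. */
    static CharsetDecoder newDecoder(Charset charset, Policy policy) {
        CodingErrorAction action;
        switch (policy) {
            case FAIL:    action = CodingErrorAction.REPORT;  break;
            case REPLACE: action = CodingErrorAction.REPLACE; break;
            default:      action = CodingErrorAction.IGNORE;  break;
        }
        return charset.newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
    }

    static String decode(byte[] bytes, Policy policy) throws CharacterCodingException {
        return newDecoder(StandardCharsets.UTF_8, policy)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    public static void main(String[] args) throws CharacterCodingException {
        byte[] bytes = {'a', (byte) 0xFF, 'b'}; // 0xFF is never valid in UTF-8
        System.out.println(decode(bytes, Policy.REPLACE).length());
        System.out.println(decode(bytes, Policy.IGNORE));
        System.out.println(effectiveBufSize(1));
    }
}
```

With the FAIL policy in this sketch, decode throws a CharacterCodingException on the same input instead of returning a string.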

| Modifier and Type | Method and Description |
|---|---|
| void | close() |
| long | getMarkPosition(): Return the saved mark position without moving the mark pointer. |
| long | length(): Returns the total length of the stream or file. |
| void | mark(): Marks the current position in this input stream. |
| void | markPosition(long position): Indicate that the specified position should be returned to in the case of Resettable.reset() being called. |
| int | read(): Read a single byte of data from the stream. |
| int | read(byte[] b, int off, int len): Read multiple bytes of data from the stream. |
| int | readChar(): Read a single character. |
| void | reset(): Reset stream position to that set by ResettableInputStream.mark(). |
| void | seek(long newPos): Seek to the specified byte position in the stream. |
| long | tell(): Tell the current byte position. |
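
For readers unfamiliar with byte-offset seek and tell operations, the sketch below shows the same ideas with java.io.RandomAccessFile from the standard library. It is only an analogy: RandomAccessFile is not part of this class, and its getFilePointer() plays the role that tell() plays here. The demo method and its return convention are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.io.RandomAccessFile;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

class SeekTellSketch {
    /** Returns {byte read at offset 4, position after that read, result of reading at EOF}. */
    static long[] demo() {
        try {
            File file = File.createTempFile("seek-tell", ".txt");
            file.deleteOnExit();
            Files.write(file.toPath(), "0123456789".getBytes(StandardCharsets.US_ASCII));

            try (RandomAccessFile raf = new RandomAccessFile(file, "r")) {
                raf.seek(4);                          // absolute byte offset
                int value = raf.read();               // single byte at offset 4
                long position = raf.getFilePointer(); // current byte position
                raf.seek(raf.length());               // jump to the end of the file
                int atEof = raf.read();               // -1 signals end of file
                return new long[] {value, position, atEof};
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] result = demo();
        System.out.println((char) result[0] + " " + result[1] + " " + result[2]);
    }
}
```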

public static final int DEFAULT_BUF_SIZE

public static final int MIN_BUF_SIZE
The minimum acceptable buffer size to store bytes read from the underlying file.

public ResettableFileInputStream(File file, PositionTracker tracker) throws IOException
Parameters:
file - File to read
tracker - PositionTracker implementation to make offset position durable
Throws:
FileNotFoundException - If the file to read does not exist
IOException - If the position reported by the tracker cannot be sought

public ResettableFileInputStream(File file, PositionTracker tracker, int bufSize, Charset charset, DecodeErrorPolicy decodeErrorPolicy) throws IOException
Parameters:
file - File to read
tracker - PositionTracker implementation to make offset position durable
bufSize - Size of the underlying buffer used for input. If less than MIN_BUF_SIZE, a buffer of length MIN_BUF_SIZE will be created instead.
charset - Character set used for decoding text, as necessary
decodeErrorPolicy - A DecodeErrorPolicy instance to determine how the decoder should behave in case of malformed input and/or unmappable character.
Throws:
FileNotFoundException - If the file to read does not exist
IOException - If the position reported by the tracker cannot be sought

public int read()
throws IOException
Description copied from class: ResettableInputStream
Read a single byte of data from the stream.
read in class ResettableInputStream
Returns:
-1 if the end of the stream has been reached.
Throws:
IOException

public int read(byte[] b,
int off,
int len)
throws IOException
Description copied from class: ResettableInputStream
Read multiple bytes of data from the stream.
read in class ResettableInputStream
Parameters:
b - the buffer into which the data is read.
off - Offset into the array b at which the data is written.
len - the maximum number of bytes to read.
Returns:
-1 if the end of the stream has been reached.
Throws:
IOException

public int readChar()
throws IOException
Description copied from class: ResettableInputStream
Read a single character.
Note that this may lead to returning only one character in a 2-char surrogate pair sequence. When this happens, the underlying implementation should never persist a mark between two chars of a two-char surrogate pair sequence.
readChar in class ResettableInputStream
Throws:
IOException

public void mark()
throws IOException
Description copied from class: ResettableInputStream
Marks the current position in this input stream. A subsequent call to the reset method repositions this stream at the last marked position so that subsequent reads re-read the same bytes.
Marking a closed stream should not have any effect on the stream.
mark in interface Resettable
mark in class ResettableInputStream
Throws:
IOException - If there is an error while setting the mark position.
See Also:
InputStream.mark(int), InputStream.reset()

public void markPosition(long position)
throws IOException
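Description copied from interface: RemoteMarkable
Indicate that the specified position should be returned to in the case of Resettable.reset() being called.
markPosition in interface RemoteMarkable
Throws:
IOException

public long getMarkPosition()
throws IOException
Description copied from interface: RemoteMarkable
Return the saved mark position without moving the mark pointer.
getMarkPosition in interface RemoteMarkable
Throws:
IOException

public void reset()
throws IOException
Description copied from class: ResettableInputStream
Reset stream position to that set by ResettableInputStream.mark()
reset in interface Resettable
reset in class ResettableInputStream
Throws:
IOException

public long length()
throws IOException
Description copied from interface: LengthMeasurable
Returns the total length of the stream or file.
length in interface LengthMeasurable
Throws:
IOException

public long tell()
throws IOException
Description copied from class: ResettableInputStream
Tell the current byte position.
tell in interface Seekable
tell in class ResettableInputStream
Throws:
IOException

public void seek(long newPos)
throws IOException
Description copied from class: ResettableInputStream
Seek to the specified byte position in the stream.
seek in interface Seekable
seek in class ResettableInputStream
Parameters:
newPos - Absolute byte offset to seek to
Throws:
IOException

public void close()
throws IOException
close in interface Closeable
close in interface AutoCloseable
close in class ResettableInputStream
Throws:
IOException

Copyright © 2009-2022 Apache Software Foundation. All Rights Reserved.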