|
|||||||||
PREV CLASS NEXT CLASS | FRAMES NO FRAMES | ||||||||
SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD |
java.lang.Objectcom.googlecode.compressingcircularbuffers.CompressingCircularBuffer
public class CompressingCircularBuffer
Maintains a compressed, fixed-memory-footprint, "sketch" (summary) of a sampled data sequence. The compressed sequence is formed by applying a specified compression method to a series of approximately equal sized contiguous sequential chunks of the original sequence.
For example, with the default, AveragingCompressionMethod, sequential raw data samples (fed in via the update method) are averaged together in approximately equal sized contiguous chunks to form the compressed sequence. As new data comes in, those chunks expand in size (with associated averages adjusted accordingly) so that a buffer of the specified size can still contain a compressed form of the growing sequence.
In general, these chunk sizes will either 2k or 2k+1, where k is chosen so that the aggregated samples fill up the available buffer space (k starts at zero and increases as more samples come in and the buffer becomes full at the current compression level). Chunks at the beginning or at the end of the sequence use the larger aggregation interval, with the smaller interval used in the middle. You can use the methods getFirstSampleIndex and getLastSampleIndex to determine the number of samples aggregated into any specific compressed value.
Performance is very similar to that of an ordinary circular buffer: The algorithm updates the compressed sequence in at most one read from, and two writes to, its internal buffer per update. Thus it is appropriate for use when update times must be very predictable, since there is never any 'stalling' as large chunks of memory are reorganized to reflect a change in compression level. For example, if 'memory' were a large file on a hard disk, the algorithm could still have acceptable worst case update times (a couple of seeks/writes per update). As with ordinary circular buffers, it is also close to optimal in memory use: all buffer elements are filled with useful data.
This class uses an indexing scheme that exploits basic properties of modular arithmetic to quickly locate exactly those positions containing the 'compressed away' data that are available to store the new data. That's why it's more efficient than more straightforward approaches. In the associated paper "Compressing Circular Buffers" I explain this indexing scheme more completely.
Copyright, 2010, John C. Gunther. All rights reserved. Released under the terms of the Apache 2.0 licence.
Field Summary | |
---|---|
static int |
DEFAULT_MAX_COMPRESSION_RATIO
Default upper bound on compression ratio if none is specified in constructor. |
static int |
MAX_ALLOWED_UPDATES
Upper bound on maximum number of updates that can be performed without a reset. |
static int |
NOSUCHINDEX
Compressed element index returned when no compressed element matching given criteria is available. |
Constructor Summary | |
---|---|
CompressingCircularBuffer()
Creates a new compressing circular buffer. |
|
CompressingCircularBuffer(int nElements)
Creates a new compressing circular buffer. |
|
CompressingCircularBuffer(int nElements,
CompressionMethod compressionMethod)
Creates a new compressing circular buffer. |
|
CompressingCircularBuffer(int nElements,
CompressionMethod compressionMethod,
int minCompressionRatio)
Creates a new compressing circular buffer. |
|
CompressingCircularBuffer(int nElements,
CompressionMethod compressionMethod,
int minCompressionRatio,
int maxCompressionRatio)
Creates a new compressing circular buffer. |
Method Summary | |
---|---|
int |
getCompressedIndex(int iRawDataSample)
Retrieves the index of the compressed value whose aggregation interval includes the given raw data sample. |
double |
getCompressedValue(int iCompressed)
Retrieves compressed value associated with given index. |
int |
getCompressionRatio()
Retrieves the number of points that will be included in the next new compressed value that will be formed from samples fed in via the update method. |
int |
getFirstSampleIndex(int iCompressed)
Retrieves the index of the first uncompressed sequence value included in the specified compressed value; uncompressed sequence indexes start at 0. |
double |
getIncompleteCompressedValue()
Retrieves the "in progress", or incomplete compressed value. |
int |
getLastSampleIndex(int iCompressed)
Retrieves the index of the last uncompressed sequence value incorporated into the specified compressed value. |
int |
getNCompressed()
Retrieves the number of compressed values within the circular buffer. |
int |
getNCompressedSamples()
Retrieves the number of samples fed to the circular buffer that have been incorporated into the compressed sequence. |
int |
getNSamples()
Retrieves the number of samples fed to the circular buffer so far. |
void |
reset()
Returns the compressing circular buffer to it's initial state, discarding all previously seen data. |
void |
update(double sample)
Updates the compressed representation of the sequence to include a newly collected data value (sequence element). |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
public static final int MAX_ALLOWED_UPDATES
public static final int NOSUCHINDEX
public static final int DEFAULT_MAX_COMPRESSION_RATIO
Constructor Detail |
---|
public CompressingCircularBuffer(int nElements, CompressionMethod compressionMethod, int minCompressionRatio, int maxCompressionRatio)
nElements
- number of compressed elements used to represent
sequence. Must be an odd integer and at least 1. Default is 1.compressionMethod
- how compression will be performed. Use
an instance of one of the predefined compression methods (ThinningCompressionMethod,
AveragingCompressionMethod, SummingCompressionMethod, MinimizingCompresionMethod,
MaximizingCompressionMethod, or RandomSamplingCompressionMethod) or else
create your own by implementing the CompressionMethod interface.
Default is new AveragingCompressionMethod().minCompressionRatio
- minimum number of data points
contained in any compressed value. Default is 1.maxCompressionRatio
- maximum number of data points in any
compressed value. Must be ≥ minCompressionRatio. Note
that compressed values always contain a number of points equal to
minCompressionRatio*2k with k a non-negative integer.
Thus actual maximum is the largest integer of this form that does
not exceed the maxCompressionRatio that you specify. Once this
limit is reached, the sequence will behave like an ordinary
circular buffer (wrapping around and overwriting the oldest
compressed value). Default is DEFAULT_MAX_COMPRESSION_RATIO.DEFAULT_MAX_COMPRESSION_RATIO
,
ThinningCompressionMethod
,
AveragingCompressionMethod
,
SummingCompressionMethod
,
MinimizingCompressionMethod
,
MaximizingCompressionMethod
,
RandomSamplingCompressionMethod
public CompressingCircularBuffer(int nElements, CompressionMethod compressionMethod, int minCompressionRatio)
Convenience constructor. Same as:
CompressingCircularBuffer(nElements, compressionMethod, minCompressionRatio, DEFAULT_MAX_COMPRESSION_RATIO)
CompressingCircularBuffer(int, CompressionMethod, int, int)
public CompressingCircularBuffer(int nElements, CompressionMethod compressionMethod)
Convenience constructor. Same as:
CompressingCircularBuffer(nElements, compressionMethod, 1, DEFAULT_MAX_COMPRESSION_RATIO)
CompressingCircularBuffer(int, CompressionMethod, int, int)
public CompressingCircularBuffer(int nElements)
Convenience constructor. Same as:
CompressingCircularBuffer(nElements, new AveragingCompressionMethod(), 1, DEFAULT_MAX_COMPRESSION_RATIO)
CompressingCircularBuffer(int, CompressionMethod, int, int)
public CompressingCircularBuffer()
Convenience constructor. Same as:
CompressingCircularBuffer(1, new AveragingCompressionMethod(), 1, DEFAULT_MAX_COMPRESSION_RATIO)
CompressingCircularBuffer(int, CompressionMethod, int, int)
Method Detail |
---|
public void update(double sample)
For example, if you were sampling an on-line sensor once per minute, you might call update every minute, feeding in each newly sampled value. In that case, the compressing circular buffer would maintain an up-to-date compressed "sketch" of the entire sequence; that sketch would fit into the number of buffer elements that you specified in the compressing circular buffer's constructor.
sample
- next value of a sampled sequence to be incorporated into the compressed representation.public int getCompressedIndex(int iRawDataSample)
iRawDataSample
- raw data sample index
public int getFirstSampleIndex(int iCompressed)
Returns NOSUCHINDEX if there is currently no compressed value at the given index.
This method can be used when you need to determine the range of underlying uncompressed sequence values that a compressed value incorporates.
iCompressed
- the compressed sequence element whose sample index is to be retrieved
getLastSampleIndex
,
NOSUCHINDEX
public int getLastSampleIndex(int iCompressed)
Returns NOSUCHINDEX if there is currently no compressed value at the given index.
iCompressed
- the compressed sequence element whose sample index is to be retrieved
getFirstSampleIndex
,
NOSUCHINDEX
public int getNCompressedSamples()
public int getNSamples()
public int getNCompressed()
Note: Except during startup, this will always equal the number of elements in the buffer specified in the constructor.
public int getCompressionRatio()
Initially, this starts at minCompressionRatio, until the compressed values buffer fills up, in which case it doubles, until the buffer is filled again at the higher compression ratio, then it doubles again, and so forth.
public double getCompressedValue(int iCompressed)
Indexes range from 0...nElements-1, with order corresponding to the order of the underlying, uncompressed, data in the original sequence (0 for the oldest, nElements-1 for the newest).
iCompressed
- index of compressed value to be retrieved
public double getIncompleteCompressedValue()
This value reflects any recent updates not yet incorporated into a compressed value. For example, if the buffer is currently compressing at a 4 to 1 ratio, and the total number of updates were not a multiple of 4, then a compressed value for either the most recently seen, 1, 2, or 3 points would be returned.
Note that if no incomplete compressed value is available, Double.NaN will be returned.
public void reset()
|
|||||||||
PREV CLASS NEXT CLASS | FRAMES NO FRAMES | ||||||||
SUMMARY: NESTED | FIELD | CONSTR | METHOD | DETAIL: FIELD | CONSTR | METHOD |