JASSv2
Simple-16 integer compression.
#include <compress_integer_simple_16.h>


Public Member Functions | |
| compress_integer_simple_16 () | |
| Constructor. | |
| virtual | ~compress_integer_simple_16 () |
| Destructor. | |
| virtual size_t | encode (void *encoded, size_t encoded_buffer_length, const integer *source, size_t source_integers) |
| Encode a sequence of integers returning the number of bytes used for the encoding, or 0 if the encoded sequence doesn't fit in the buffer. | |
| virtual void | decode (integer *decoded, size_t integers_to_decode, const void *source, size_t source_length) |
| Decode a sequence of integers encoded with this codec. | |
Public Member Functions inherited from JASS::compress_integer | |
| compress_integer () | |
| Constructor. | |
| virtual | ~compress_integer () |
| Destructor. | |
Static Public Member Functions | |
| static void | unittest (void) |
| Unit test this class. | |
Static Public Member Functions inherited from JASS::compress_integer | |
| static size_t | d1_encode (integer *encoded, const integer *source, size_t source_integers) |
| Convert an array of integers into an array of D1 (delta, d-gap) encoded integers. | |
| static size_t | d1_decode (integer *decoded, const integer *source, size_t source_integers) |
| Convert a D1 encoded array of integers into an array of integers. | |
| static size_t | dn_encode (integer *encoded, const integer *source, size_t source_integers, size_t n=1) |
| Convert an array of integers into an array of Dn (delta, d-gap) encoded integers with a gap of n. | |
| static size_t | dn_decode (integer *decoded, const integer *source, size_t source_integers, size_t n=1) |
| Convert a Dn encoded array of integers into an array of integers. | |
| static void | unittest_one (compress_integer &encoder, const std::vector< uint32_t > &sequence) |
| Test one sequence to make sure it encodes and decodes to the same thing. Assert if not. | |
| static void | unittest (compress_integer &compressor, uint32_t staring_from=0) |
| Unit test this class, assert on failure. | |
Static Protected Attributes | |
| static const size_t | ints_packed_table [] |
| Number of integers packed into a word, given its mask type. | |
| static const size_t | can_pack_table [] |
| Bitmask map of the valid masks at an offset (column) for a given num_bits_needed (row). | |
| static const size_t | row_for_bits_needed [] |
| Translates 'bits_needed' to the appropriate 'row' offset for use with can_pack_table. | |
| static const size_t | invalid_masks_for_offset [] |
| Masks that are ANDed out at offsets where it is not known whether a word can be fully packed from that offset. | |
| static const size_t | simple16_shift_table [] |
| Number of bits to shift across when packing; this is the sum of the widths of the integers already packed. | |
Additional Inherited Members | |
Public Types inherited from JASS::compress_integer | |
| typedef uint32_t | integer |
| This class and descendants will work on integers of this size. Do not change without also changing JASS_COMPRESS_INTEGER_BITS_PER_INTEGER. | |
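The inherited d1_encode / dn_encode helpers listed above are described only as delta (d-gap) conversions. As a rough illustration of that idea, and not the JASS implementation, the sketch below stores each value as its difference from the value n places earlier. The function names dn_encode_sketch and dn_decode_sketch are hypothetical, and returning the number of integers written is an assumption, since the page does not say what the size_t return value means.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for dn_encode: the first n values are copied, every
// later value is stored as the gap to the value n places before it.
// Assumes source and encoded are separate buffers.
size_t dn_encode_sketch(uint32_t *encoded, const uint32_t *source, size_t source_integers, size_t n = 1)
{
    for (size_t i = 0; i < source_integers; i++)
        encoded[i] = (i < n) ? source[i] : source[i] - source[i - n];
    return source_integers;    // assumption: count of integers written
}

// Hypothetical stand-in for dn_decode: add each gap back onto the already
// decoded value n places earlier.
size_t dn_decode_sketch(uint32_t *decoded, const uint32_t *source, size_t source_integers, size_t n = 1)
{
    for (size_t i = 0; i < source_integers; i++)
        decoded[i] = (i < n) ? source[i] : source[i] + decoded[i - n];
    return source_integers;    // assumption: count of integers written
}
```

With n = 1 this is the usual D1 case: an ascending list such as 3, 7, 12, 20 becomes 3, 4, 5, 8, which needs fewer bits per integer and so suits the narrow Simple-16 fields.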
Simple-16 integer compression.
Simple-16 is an extension to Simple-9 that uses all 16 values of the 4-bit selector (rather than just 9) for encoding the payloads. This results in a more effective encoding, and because fewer words need to be read it is also faster than Simple-9. Note that, because there are only 28 bits in a payload, the maximum integer that can be encoded with Simple-16 (as with Simple-9) is (2^28) - 1 = 268,435,455. This is less than the number of documents in a large collection (such as ClueWeb).
The encodings are:
28 * 1-bit
7 * 2-bit and 14 * 1-bit
7 * 1-bit and 7 * 2-bit and 7 * 1-bit
14 * 1-bit and 7 * 2-bit
14 * 2-bit
1 * 4-bit and 8 * 3-bit
1 * 3-bit and 4 * 4-bit and 3 * 3-bit
7 * 4-bit
4 * 5-bit and 2 * 4-bit
2 * 4-bit and 4 * 5-bit
3 * 6-bit and 2 * 5-bit
2 * 5-bit and 3 * 6-bit
4 * 7-bit
1 * 10-bit and 2 * 9-bit
2 * 14-bit
1 * 28-bit
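To make the sixteen layouts concrete, here is a minimal sketch that writes them as a table, checks that each fills the 28-bit payload, and unpacks one 32-bit word. It rests on assumptions that this page does not confirm: the selector value is taken to be the position in the list (starting at 0), the selector is placed in the low 4 bits of the word, and integers are stored from the lowest payload bit upwards. The names run, layouts, payload_bits and unpack_word are hypothetical and not part of JASS.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// A run is "count integers of bits width"; a layout is up to three runs.
struct run { uint32_t count; uint32_t bits; };

// The 16 layouts in listed order; unused runs are {0, 0}.
// Assumption: the list position doubles as the selector value.
static const run layouts[16][3] = {
    {{28, 1}, {0, 0}, {0, 0}},  {{7, 2}, {14, 1}, {0, 0}},
    {{7, 1}, {7, 2}, {7, 1}},   {{14, 1}, {7, 2}, {0, 0}},
    {{14, 2}, {0, 0}, {0, 0}},  {{1, 4}, {8, 3}, {0, 0}},
    {{1, 3}, {4, 4}, {3, 3}},   {{7, 4}, {0, 0}, {0, 0}},
    {{4, 5}, {2, 4}, {0, 0}},   {{2, 4}, {4, 5}, {0, 0}},
    {{3, 6}, {2, 5}, {0, 0}},   {{2, 5}, {3, 6}, {0, 0}},
    {{4, 7}, {0, 0}, {0, 0}},   {{1, 10}, {2, 9}, {0, 0}},
    {{2, 14}, {0, 0}, {0, 0}},  {{1, 28}, {0, 0}, {0, 0}},
};

// Total payload width of one layout; each of the 16 comes to 28 bits.
uint32_t payload_bits(size_t selector)
{
    uint32_t total = 0;
    for (const run &r : layouts[selector])
        total += r.count * r.bits;
    return total;
}

// Unpack every integer held in one word.
// Assumption: selector in the low 4 bits, first integer in the lowest payload bits.
std::vector<uint32_t> unpack_word(uint32_t word)
{
    uint32_t payload = word >> 4;
    std::vector<uint32_t> out;
    for (const run &r : layouts[word & 0xF])
        for (uint32_t i = 0; i < r.count; i++)
        {
            out.push_back(payload & ((1u << r.bits) - 1));   // bits is at most 28
            payload >>= r.bits;
        }
    return out;
}
```

Under these assumptions a word built as ((5 | (9 << 14)) << 4) | 14 selects the 2 * 14-bit layout and unpacks to the two integers 5 and 9.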
See:
Zhang J, Long X, Suel T (2008) Performance of compressed inverted list caching in search engines. Proceedings of 17th Conference on the World Wide Web, pp 387-396
Yan H, Ding S, Suel T (2009) Inverted index compression and query processing with optimized document ordering. Proceedings of 18th Conference on the World Wide Web, pp 401-410
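| virtual void | decode (integer *decoded, size_t integers_to_decode, const void *source, size_t source_length) |
Decode a sequence of integers encoded with this codec.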
| decoded | [out] The sequence of decoded integers. |
| integers_to_decode | [in] The minimum number of integers to decode (it may decode more). |
| source | [in] The encoded integers. |
| source_length | [in] The length (in bytes) of the source buffer. |
Implements JASS::compress_integer.
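Because integers_to_decode is only a minimum, the decoded buffer needs some slack. The sketch below sizes that buffer under one assumption that this page does not state: the decoder overshoots only by finishing the word it is in. Since one word holds at most 28 integers (the 28 * 1-bit layout), the overshoot would then be at most 27 integers. The names max_integers_per_word, padded_capacity and make_decode_buffer are hypothetical helpers, not part of JASS.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Most integers one Simple-16 word can hold (the 28 * 1-bit layout).
constexpr size_t max_integers_per_word = 28;

// Capacity that tolerates a decoder which always completes its final word.
// Assumption: overshoot is limited to that final word, so at most 27 extras.
constexpr size_t padded_capacity(size_t integers_to_decode)
{
    return integers_to_decode + max_integers_per_word - 1;
}

// Allocate a zero-filled output buffer with the padding included.
std::vector<uint32_t> make_decode_buffer(size_t integers_to_decode)
{
    return std::vector<uint32_t>(padded_capacity(integers_to_decode));
}
```

A caller would pass the buffer's data() pointer as decoded and still read back only the first integers_to_decode entries.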
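| virtual size_t | encode (void *encoded, size_t encoded_buffer_length, const integer *source, size_t source_integers) |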
Encode a sequence of integers returning the number of bytes used for the encoding, or 0 if the encoded sequence doesn't fit in the buffer.
| encoded | [out] The sequence of bytes that is the encoded sequence. |
| encoded_buffer_length | [in] The length (in bytes) of the output buffer, encoded. |
| source | [in] The sequence of integers to encode. |
| source_integers | [in] The length (in integers) of the source buffer. |
Implements JASS::compress_integer.
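The return contract above (bytes used, or 0 when the output does not fit) can be illustrated with a simple greedy encoder. This is a sketch under stated assumptions, not the table-driven method that JASS uses: it tries the sixteen layouts in listed order and takes the first one that the next integers fill completely, places the selector in the low 4 bits, and also returns 0 when a value needs more than 28 bits. The names run, layouts, fits and encode_sketch are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

struct run { uint32_t count; uint32_t bits; };

// The 16 layouts in listed order (densest first); unused runs are {0, 0}.
static const run layouts[16][3] = {
    {{28, 1}, {0, 0}, {0, 0}},  {{7, 2}, {14, 1}, {0, 0}},
    {{7, 1}, {7, 2}, {7, 1}},   {{14, 1}, {7, 2}, {0, 0}},
    {{14, 2}, {0, 0}, {0, 0}},  {{1, 4}, {8, 3}, {0, 0}},
    {{1, 3}, {4, 4}, {3, 3}},   {{7, 4}, {0, 0}, {0, 0}},
    {{4, 5}, {2, 4}, {0, 0}},   {{2, 4}, {4, 5}, {0, 0}},
    {{3, 6}, {2, 5}, {0, 0}},   {{2, 5}, {3, 6}, {0, 0}},
    {{4, 7}, {0, 0}, {0, 0}},   {{1, 10}, {2, 9}, {0, 0}},
    {{2, 14}, {0, 0}, {0, 0}},  {{1, 28}, {0, 0}, {0, 0}},
};

// True if the next integers fill this layout completely and each fits its field.
static bool fits(const run *layout, const uint32_t *source, size_t remaining)
{
    size_t at = 0;
    for (int r = 0; r < 3; r++)
        for (uint32_t i = 0; i < layout[r].count; i++, at++)
            if (at >= remaining || (source[at] >> layout[r].bits) != 0)
                return false;
    return true;
}

// Returns the number of bytes written, or 0 if the buffer is too small
// (or, as a choice of this sketch, if a value needs more than 28 bits).
size_t encode_sketch(void *encoded, size_t encoded_buffer_length, const uint32_t *source, size_t source_integers)
{
    uint8_t *out = static_cast<uint8_t *>(encoded);
    size_t used = 0, pos = 0;
    while (pos < source_integers)
    {
        if (used + sizeof(uint32_t) > encoded_buffer_length)
            return 0;                               // no room for another word
        uint32_t selector = 0;
        while (selector < 16 && !fits(layouts[selector], source + pos, source_integers - pos))
            selector++;
        if (selector == 16)
            return 0;                               // value wider than 28 bits
        uint32_t payload = 0, shift = 0;
        for (int r = 0; r < 3; r++)
            for (uint32_t i = 0; i < layouts[selector][r].count; i++)
            {
                payload |= source[pos++] << shift;  // shift stays below 28 here
                shift += layouts[selector][r].bits;
            }
        uint32_t word = (payload << 4) | selector;  // assumption: selector in low 4 bits
        std::memcpy(out + used, &word, sizeof(word));
        used += sizeof(word);
    }
    return used;
}
```

Callers should treat a return value of 0 as "nothing usable was written" and retry with a larger buffer rather than inspecting the partial output.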
| static const size_t | can_pack_table [] | static, protected |
Bitmask map of the valid masks at an offset (column) for a given num_bits_needed (row).
| static const size_t | ints_packed_table [] | static, protected |
Number of integers packed into a word, given its mask type.
| static const size_t | invalid_masks_for_offset [] | static, protected |
Masks that are ANDed out at offsets where it is not known whether a word can be fully packed from that offset.
| static const size_t | row_for_bits_needed [] | static, protected |
Translates 'bits_needed' to the appropriate 'row' offset for use with can_pack_table.
| static const size_t | simple16_shift_table [] | static, protected |
Number of bits to shift across when packing; this is the sum of the widths of the integers already packed.