JASSv2
Simple-9 integer compression. More...
#include <compress_integer_simple_9.h>


Classes | |
| class | lookup |
| Lookup table storing how many integers are encoded and how they are encoded. More... | |
Public Member Functions | |
| compress_integer_simple_9 () | |
| Constructor. | |
| virtual | ~compress_integer_simple_9 () |
| Destructor. | |
| virtual size_t | encode (void *encoded, size_t encoded_buffer_length, const integer *source, size_t source_integers) |
| Encode a sequence of integers returning the number of bytes used for the encoding, or 0 if the encoded sequence doesn't fit in the buffer. More... | |
| virtual void | decode (integer *decoded, size_t integers_to_decode, const void *source, size_t source_length) |
| Decode a sequence of integers encoded with this codec. More... | |
Public Member Functions inherited from JASS::compress_integer | |
| compress_integer () | |
| Constructor. | |
| virtual | ~compress_integer () |
| Destructor. | |
Static Public Member Functions | |
| static void | unittest (void) |
| Unit test this class. | |
Static Public Member Functions inherited from JASS::compress_integer | |
| static size_t | d1_encode (integer *encoded, const integer *source, size_t source_integers) |
| Convert an array of integers into an array of D1 (delta, d-gap) encoded integers. More... | |
| static size_t | d1_decode (integer *decoded, const integer *source, size_t source_integers) |
| Convert a D1 encoded array of integers into an array of integers. More... | |
| static size_t | dn_encode (integer *encoded, const integer *source, size_t source_integers, size_t n=1) |
| Convert an array of integers into an array of Dn (delta, d-gap) encoded integers with a gap of n. More... | |
| static size_t | dn_decode (integer *decoded, const integer *source, size_t source_integers, size_t n=1) |
| Convert a Dn encoded array of integers into an array of integers. More... | |
| static void | unittest_one (compress_integer &encoder, const std::vector< uint32_t > &sequence) |
| Test one sequence to make sure it encodes and decodes to the same thing. Assert if not. More... | |
| static void | unittest (compress_integer &compressor, uint32_t staring_from=0) |
| Unit test this class, assert on failure. More... | |
Static Protected Attributes | |
| static const lookup | simple9_table [] |
| The table mapping bits to selectors and masks. More... | |
| static const uint32_t | bits_to_use [] |
| The number of bits used to store an integer that needs the given number of bits. More... | |
| static const uint32_t | table_row [] |
| Given the number of bits, which row of simple9_table should be used? More... | |
| static const uint32_t | ints_packed_table [] |
| Number of integers packed into a 32-bit word, given its mask type. More... | |
| static const uint32_t | can_pack_table [] |
| Bitmask map for valid masks at an offset (column) for some num_bits_needed (row). More... | |
| static const uint32_t | row_for_bits_needed [] |
| Translates 'bits_needed' to the appropriate 'row' offset for use with can_pack_table. More... | |
| static const uint32_t | invalid_masks_for_offset [] |
| AND out masks for offsets where we don't know if we can fully pack for that offset. More... | |
| static const uint32_t | simple9_shift_table [] |
| Number of bits to shift when packing - 9 rows for simple-9. More... | |
Additional Inherited Members | |
Public Types inherited from JASS::compress_integer | |
| typedef uint32_t | integer |
| This class and descendants will work on integers of this size. Do not change without also changing JASS_COMPRESS_INTEGER_BITS_PER_INTEGER. | |
Simple-9 integer compression.
Simple-9 compression bit-packs as many integers as possible into a 32-bit word. All integers in a word are packed into the same number of bits. The top 4 bits of each 32-bit word hold a selector that identifies the encoding, and the remaining 28 bits hold the payload. Because the payload has only 28 bits, the largest integer that can be encoded with Simple-9 is (2^28) - 1 = 268,435,455. This is less than the number of documents in a large collection (such as ClueWeb).
In essence, it encodes into a 32-bit word: 28 * 1-bit integers, 14 * 2-bit integers, 9 * 3-bit integers, 7 * 4-bit integers, 5 * 5-bit integers, 4 * 7-bit integers, 3 * 9-bit integers, 2 * 14-bit integers, or 1 * 28-bit integer.
See: V. Anh, A. Moffat (2005), Inverted Index Compression Using Word-Aligned Binary Codes, Information Retrieval, 8(1):151-166
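The word layout described above can be sketched as follows. This is a minimal illustration, not the JASS implementation: the names `simple9_layout`, `layouts`, `pack_word`, and `unpack_word` are hypothetical, and both the selector numbering (0 for 28 * 1-bit up to 8 for 1 * 28-bit) and the order of integers within the payload (first integer in the lowest bits) are assumptions.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout table: selector s packs `count` integers of `bits` bits each.
struct simple9_layout
{
    uint32_t count;
    uint32_t bits;
};

static const simple9_layout layouts[9] =
{
    {28, 1}, {14, 2}, {9, 3}, {7, 4}, {5, 5}, {4, 7}, {3, 9}, {2, 14}, {1, 28}
};

// Pack layouts[selector].count integers into one 32-bit word.
// Assumes every value in source fits in layouts[selector].bits bits.
uint32_t pack_word(uint32_t selector, const uint32_t *source)
{
    assert(selector < 9);
    const simple9_layout &layout = layouts[selector];
    uint32_t payload = 0;
    for (uint32_t i = 0; i < layout.count; i++)
        payload |= source[i] << (i * layout.bits);
    return (selector << 28) | payload;          // selector in the top 4 bits
}

// Unpack one word into decoded and return how many integers were written.
uint32_t unpack_word(uint32_t word, uint32_t *decoded)
{
    const uint32_t selector = word >> 28;
    assert(selector < 9);
    const simple9_layout &layout = layouts[selector];
    const uint32_t mask = (1u << layout.bits) - 1;
    for (uint32_t i = 0; i < layout.count; i++)
        decoded[i] = (word >> (i * layout.bits)) & mask;
    return layout.count;
}
```

With this numbering, three 9-bit values would be packed with selector 6, leaving one payload bit unused.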
| decode (integer *decoded, size_t integers_to_decode, const void *source, size_t source_length) | virtual |
Decode a sequence of integers encoded with this codec.
| decoded | [out] The sequence of decoded integers. |
| integers_to_decode | [in] The minimum number of integers to decode (it may decode more). |
| source | [in] The encoded integers. |
| source_length | [in] The length (in bytes) of the source buffer. |
Implements JASS::compress_integer.
Reimplemented in JASS::compress_integer_carry_8b, JASS::compress_integer_relative_10, and JASS::compress_integer_carryover_12.
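The note that decoding may produce more than integers_to_decode follows from decoding whole words at a time. The sketch below illustrates that behaviour under stated assumptions: `decode_sketch`, `decode_count`, and `decode_bits` are hypothetical names, the selector numbering is assumed, and the source length is given in 32-bit words rather than bytes for brevity. It is not the JASS decoder.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed selector numbering (densest first); the real tables may differ.
static const uint32_t decode_count[9] = {28, 14, 9, 7, 5, 4, 3, 2, 1};
static const uint32_t decode_bits[9]  = {1, 2, 3, 4, 5, 7, 9, 14, 28};

// Decode whole words until at least integers_to_decode integers are written.
// Returns the number actually written, which can exceed the request by up to
// 27, so the decoded buffer needs that much slack.
size_t decode_sketch(uint32_t *decoded, size_t integers_to_decode,
                     const uint32_t *source, size_t source_words)
{
    size_t written = 0;
    for (size_t w = 0; w < source_words && written < integers_to_decode; w++)
    {
        const uint32_t selector = source[w] >> 28;
        assert(selector < 9);
        const uint32_t bits = decode_bits[selector];
        const uint32_t mask = (1u << bits) - 1;
        uint32_t payload = source[w];
        for (uint32_t i = 0; i < decode_count[selector]; i++)
        {
            decoded[written++] = payload & mask;   // the mask never reaches the selector bits
            payload >>= bits;
        }
    }
    return written;
}
```

Asking for 3 integers from a word that holds 28 one-bit values writes all 28, which is why a caller of a decoder like this would over-allocate the output buffer.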
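| encode (void *encoded, size_t encoded_buffer_length, const integer *source, size_t source_integers) | virtual |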
Encode a sequence of integers returning the number of bytes used for the encoding, or 0 if the encoded sequence doesn't fit in the buffer.
| encoded | [out] The sequence of bytes that is the encoded sequence. |
| encoded_buffer_length | [in] The length (in bytes) of the output buffer, encoded. |
| source | [in] The sequence of integers to encode. |
| source_integers | [in] The length (in integers) of the source buffer. |
Implements JASS::compress_integer.
Reimplemented in JASS::compress_integer_carry_8b, JASS::compress_integer_relative_10, and JASS::compress_integer_carryover_12.
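The contract above (bytes used on success, 0 when the output does not fit) can be illustrated with a simple greedy packer. This is a sketch under assumptions, not the JASS encoder: `encode_sketch`, `encode_count`, and `encode_bits` are hypothetical, the selector numbering is assumed, and the handling of a short tail (falling back to a sparser layout) is one possible choice. It also returns 0 for a value above (2^28) - 1, which is an assumption about how such input would be treated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed selector numbering (densest first); the real simple9_table may differ.
static const uint32_t encode_count[9] = {28, 14, 9, 7, 5, 4, 3, 2, 1};
static const uint32_t encode_bits[9]  = {1, 2, 3, 4, 5, 7, 9, 14, 28};

size_t encode_sketch(void *encoded, size_t encoded_buffer_length,
                     const uint32_t *source, size_t source_integers)
{
    assert(encoded != nullptr);
    uint32_t *out = static_cast<uint32_t *>(encoded);   // assumes a 4-byte aligned buffer
    size_t words = 0;
    size_t pos = 0;

    while (pos < source_integers)
    {
        if ((words + 1) * sizeof(uint32_t) > encoded_buffer_length)
            return 0;                                   // output buffer too small

        // Find the densest layout whose slots can all be filled from the remaining input.
        uint32_t selector = 0;
        for (; selector < 9; selector++)
        {
            if (encode_count[selector] > source_integers - pos)
                continue;                               // not enough integers left
            bool fits = true;
            for (uint32_t i = 0; i < encode_count[selector] && fits; i++)
                fits = (source[pos + i] >> encode_bits[selector]) == 0;
            if (fits)
                break;
        }
        if (selector == 9)
            return 0;                                   // value needs more than 28 bits

        uint32_t word = selector << 28;
        for (uint32_t i = 0; i < encode_count[selector]; i++)
            word |= source[pos + i] << (i * encode_bits[selector]);
        out[words++] = word;
        pos += encode_count[selector];
    }
    return words * sizeof(uint32_t);
}
```

Because every layout occupies exactly one 32-bit word, the returned size is always a multiple of 4 bytes in this sketch.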
| bits_to_use [] | static protected |
The number of bits used to store an integer that needs the given number of bits.
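Since Simple-9 only offers the widths 1, 2, 3, 4, 5, 7, 9, 14, and 28, an integer that needs, say, 6 bits is stored in 7. The mapping can be sketched as a function rather than a table. The names `bits_needed` and `bits_to_use_sketch` are hypothetical, and the actual contents of bits_to_use are not shown in this documentation, so treat this as an illustration of the idea only.

```cpp
#include <cassert>
#include <cstdint>

// Number of bits needed to represent value (0 is treated as needing 1 bit).
uint32_t bits_needed(uint32_t value)
{
    uint32_t bits = 1;
    while (value >>= 1)
        bits++;
    return bits;
}

// Smallest Simple-9 width that can hold `needed` bits, or 0 if none can.
uint32_t bits_to_use_sketch(uint32_t needed)
{
    assert(needed >= 1);
    static const uint32_t widths[9] = {1, 2, 3, 4, 5, 7, 9, 14, 28};
    for (uint32_t width : widths)
        if (width >= needed)
            return width;
    return 0;                       // more than 28 bits cannot be stored
}
```

A precomputed array indexed by the number of bits needed gives the same answer with a single lookup, which is presumably why the class stores this as a table.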
| can_pack_table [] | static protected |
Bitmask map for valid masks at an offset (column) for some num_bits_needed (row).
| ints_packed_table [] | static protected |
Number of integers packed into a 32-bit word, given its mask type.
| invalid_masks_for_offset [] | static protected |
AND out masks for offsets where we don't know if we can fully pack for that offset.
| row_for_bits_needed [] | static protected |
Translates 'bits_needed' to the appropriate 'row' offset for use with can_pack_table.
| simple9_shift_table [] | static protected |
Number of bits to shift when packing - 9 rows for simple-9.
| simple9_table [] | static protected |
The table mapping bits to selectors and masks.
| table_row [] | static protected |
Given the number of bits, which row of simple9_table should be used?