QMX as originally published (with bug fixes and a few method name changes)

#include <compress_integer_qmx_original.h>

Public Member Functions
	compress_integer_qmx_original ()
	Constructor.

virtual	~compress_integer_qmx_original ()
	Destructor.

virtual size_t	encode (void *encoded, size_t encoded_buffer_length, const integer *source, size_t source_integers)
	Encode a sequence of integers returning the number of bytes used for the encoding, or 0 if the encoded sequence doesn't fit in the buffer.

virtual void	decode (integer *decoded, size_t integers_to_decode, const void *source, size_t source_length)
	Decode a sequence of integers encoded with this codec.

Public Member Functions inherited from JASS::compress_integer
	compress_integer ()
	Constructor.

virtual	~compress_integer ()
	Destructor.

Static Public Member Functions
static void	unittest_one (const std::vector< uint32_t > &sequence)
	Test one sequence to make sure it encodes and decodes to the same thing. Assert if not.

static void	unittest (void)
	Unit test this class.

Static Public Member Functions inherited from JASS::compress_integer
static size_t	d1_encode (integer *encoded, const integer *source, size_t source_integers)
	Convert an array of integers into an array of D1 (delta, d-gap) encoded integers.

static size_t	d1_decode (integer *decoded, const integer *source, size_t source_integers)
	Convert a D1 encoded array of integers into an array of integers.

static size_t	dn_encode (integer *encoded, const integer *source, size_t source_integers, size_t n=1)
	Convert an array of integers into an array of Dn (delta, d-gap) encoded integers with a gap of n.

static size_t	dn_decode (integer *decoded, const integer *source, size_t source_integers, size_t n=1)
	Convert a Dn encoded array of integers into an array of integers.

static void	unittest_one (compress_integer &encoder, const std::vector< uint32_t > &sequence)
	Test one sequence to make sure it encodes and decodes to the same thing. Assert if not.

static void	unittest (compress_integer &compressor, uint32_t staring_from=0)
	Unit test this class, assert on failure.
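
The D1 and Dn (d-gap) conversions listed above can be illustrated with a minimal sketch. It assumes that the first n values are stored unchanged and that every later value is stored as its difference from the value n positions earlier. The function names dn_encode_sketch and dn_decode_sketch are hypothetical, work on std::vector rather than raw pointers, and are not the JASS implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch of Dn (d-gap) encoding with a gap of n.
// Assumption: the first n integers are copied as-is, every later integer
// is replaced by its difference from the integer n positions before it.
std::vector<uint32_t> dn_encode_sketch(const std::vector<uint32_t> &source, size_t n = 1)
{
    assert(n >= 1);
    std::vector<uint32_t> encoded(source.size());
    for (size_t i = 0; i < source.size(); i++)
        encoded[i] = i < n ? source[i] : source[i] - source[i - n];
    return encoded;
}

// Inverse of the sketch above: add each gap to the already decoded
// integer n positions earlier.
std::vector<uint32_t> dn_decode_sketch(const std::vector<uint32_t> &source, size_t n = 1)
{
    assert(n >= 1);
    std::vector<uint32_t> decoded(source.size());
    for (size_t i = 0; i < source.size(); i++)
        decoded[i] = i < n ? source[i] : source[i] + decoded[i - n];
    return decoded;
}
```

With n=1 this is ordinary delta encoding. Small gaps need fewer bits, which is what makes the bit-packed encodings described in the Detailed Description effective on sorted sequences.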

Private Member Functions
void	write_out (uint8_t **buffer, uint32_t *source, uint32_t raw_count, uint32_t size_in_bits, uint8_t **length_buffer)
	Encode and write out the sequence into the buffer.

	compress_integer_qmx_original (const compress_integer_qmx_original &obj)
	Not permitted to copy an object of this type due to memory use.

Private Attributes
uint8_t *	length_buffer
	Stores the number of bits needed to compress each integer.

uint64_t	length_buffer_length
	The length of length_buffer.

uint32_t *	full_length_buffer
	If the run_length is too short then 0-pad into this buffer.

Additional Inherited Members
Public Types inherited from JASS::compress_integer
typedef uint32_t	integer
	This class and descendants will work on integers of this size. Do not change without also changing JASS_COMPRESS_INTEGER_BITS_PER_INTEGER.

Detailed Description

QMX as originally published (with bug fixes and a few method name changes)

The original QMX source code with a bug fix to do with short string encoding and with the interface changed to fit the JASS requirements. For details see:

A. Trotman (2014), Compression, SIMD, and Postings Lists, Proceedings of the 19th Australasian Document Computing Symposium (ADCS 2014)

QMX is a version of BinPacking where we pack into a 128-bit SSE register the following:

	256 0-bit words
	128 1-bit words
	64 2-bit words
	40 3-bit words
	32 4-bit words
	24 5-bit words
	20 6-bit words
	16 8-bit words
	12 10-bit words
	8 16-bit words
	4 32-bit words

or pack into two 128-bit words (i.e. 256 bits) the following:

	36 7-bit words
	28 9-bit words
	20 12-bit words
	12 21-bit words
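
The packings listed above can be written down as a lookup table. The sketch below is illustrative only: the names packing, packings, bits_needed and smallest_packing_for are hypothetical, and choosing a width per single value is a simplification (the description here does not say how the encoder chooses a width for a whole block).

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// One row per packing: bit width, integers per block, payload size in bits.
struct packing
{
    uint32_t bits_per_integer;
    uint32_t integers;
    uint32_t payload_bits;      // 128 (one SSE word) or 256 (two SSE words)
};

// The 15 packings from the description, sorted by bit width.
constexpr std::array<packing, 15> packings =
{{
    {0, 256, 128}, {1, 128, 128}, {2, 64, 128}, {3, 40, 128}, {4, 32, 128},
    {5, 24, 128}, {6, 20, 128}, {7, 36, 256}, {8, 16, 128}, {9, 28, 256},
    {10, 12, 128}, {12, 20, 256}, {16, 8, 128}, {21, 12, 256}, {32, 4, 128}
}};

// Number of bits needed to represent value (0 needs 0 bits).
inline uint32_t bits_needed(uint32_t value)
{
    uint32_t bits = 0;
    while (value != 0)
    {
        bits++;
        value >>= 1;
    }
    return bits;
}

// Narrowest listed packing whose width can hold value.
inline const packing &smallest_packing_for(uint32_t value)
{
    uint32_t needed = bits_needed(value);
    for (const auto &candidate : packings)
        if (candidate.bits_per_integer >= needed)
            return candidate;
    assert(needed <= 32);       // unreachable for uint32_t input
    return packings.back();
}
```

Widths with no row of their own round up to the next listed width, so an 11-bit value would use the 12-bit packing and a 17-bit value the 21-bit packing.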

This gives us 15 possible combinations. The combination is stored in the top 4 bits of a selector byte. The bottom 4 bits of the selector store a run-length (the number of such sequences seen in a row).
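
The selector byte layout can be sketched as follows. The helper names make_selector and split_selector are hypothetical, and the sketch stores the raw 4-bit values only. How a combination number maps to a bit width, and how the run-length nibble maps to an actual count, are not specified here and are left open.

```cpp
#include <cassert>
#include <cstdint>

// The two 4-bit fields of a selector byte.
struct selector_parts
{
    uint8_t combination;    // top 4 bits
    uint8_t run_length;     // bottom 4 bits
};

// Pack a combination number and a run-length nibble into one selector byte.
inline uint8_t make_selector(uint8_t combination, uint8_t run_length)
{
    assert(combination < 16 && run_length < 16);
    return static_cast<uint8_t>((combination << 4) | (run_length & 0x0F));
}

// Split a selector byte back into its two 4-bit fields.
inline selector_parts split_selector(uint8_t selector)
{
    return {static_cast<uint8_t>(selector >> 4), static_cast<uint8_t>(selector & 0x0F)};
}
```

Four bits allow 16 combination values, of which 15 are used by the packings above.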

The 128-bit (or 256-bit) packed binary values are stored first. Then we store the selectors. Finally, stored variable-byte encoded, is a pointer to the start of the selectors (from the end of the sequence).

This way, all reads and writes are 128-bit word aligned, except when addressing the selectors (and the pointer to the selectors). These reads are byte aligned.

Note: There is currently 1 unused encoding (i.e. 16 unused selector values). These might in the future be used for encoding exceptions, much as PForDelta does.

This code differs from the original as published in two ways. First, two bugs are fixed (an overflow on reading the buffer to be encoded, and an edge case at the end of the encoded string). Second, it has been changed to remove the SIMD-word alignment requirement.

Member Function Documentation

◆ decode()

void JASS::compress_integer_qmx_original::decode	(	integer *	decoded,
		size_t	integers_to_decode,
		const void *	source,
		size_t	source_length
	)

virtual

Decode a sequence of integers encoded with this codec.

Parameters

decoded	[out] The sequence of decoded integers.
integers_to_decode	[in] The minimum number of integers to decode (it may decode more).
source	[in] The encoded integers.
source_length	[in] The length (in bytes) of the source buffer.

Implements JASS::compress_integer.

◆ encode()

size_t JASS::compress_integer_qmx_original::encode	(	void *	encoded,
		size_t	encoded_buffer_length,
		const integer *	source,
		size_t	source_integers
	)

virtual

Encode a sequence of integers returning the number of bytes used for the encoding, or 0 if the encoded sequence doesn't fit in the buffer.

Parameters

encoded	[out] The sequence of bytes that is the encoded sequence.
encoded_buffer_length	[in] The length (in bytes) of the output buffer, encoded.
source	[in] The sequence of integers to encode.
source_integers	[in] The length (in integers) of the source buffer.

Returns: The number of bytes used to encode the integer sequence, or 0 on error (i.e. overflow).

Implements JASS::compress_integer.

◆ unittest_one()

void JASS::compress_integer_qmx_original::unittest_one ( const std::vector< uint32_t > & sequence )

static

Test one sequence to make sure it encodes and decodes to the same thing. Assert if not.

Parameters

sequence [in] the sequence to encode.

◆ write_out()

void JASS::compress_integer_qmx_original::write_out	(	uint8_t **	buffer,
		uint32_t *	source,
		uint32_t	raw_count,
		uint32_t	size_in_bits,
		uint8_t **	length_buffer
	)

private

Encode and write out the sequence into the buffer.

Parameters

buffer	[in] where to write the encoded sequence
source	[in] the integer sequence to encode
raw_count	[in] the number of integers to encode
size_in_bits	[in] the size, in bits, of the largest integer
length_buffer	[in] the length of buffer, in bytes

The documentation for this class was generated from the following files:

source/compress_integer_qmx_original.h
source/compress_integer_qmx_original.cpp
