JASSv2
JASS::index_manager_sequential Class Reference

Non-thread-safe indexer object.

#include <index_manager_sequential.h>

Classes

class  delegate
 Base class for the callback function called by iterate.
 

Public Member Functions

 index_manager_sequential ()
 Constructor.
 
virtual ~index_manager_sequential ()
 Destructor.
 
virtual void begin_document (const slice &document_primary_key)
 Tell this object that you're about to start indexing a new document.
 
virtual void set_primary_keys (const std::vector< slice > &keys)
 Add a list of primary keys to the current list. Normally used to set the list without actually indexing (warning).
 
virtual void term (const parser::token &term)
 Hand a new term from the token stream to this object.
 
virtual void term (const parser::token &term, const std::vector< posting > &postings_list)
 Hand a new term with a pre-computed postings list to this object.
 
virtual void term (const parser::token &term, compress_integer::integer docid)
 Hand a new term and a document id to this object, appending the docid to the end of the term's postings list (with tf=1).
 
virtual void text_render (std::ostream &stream) const
 Dump a human-readable version of the index down the stream.
 
virtual void iterate (index_manager::delegate &callback)
 Iterate over the index calling callback.operator() with each postings list.
 
virtual void iterate (index_manager::quantizing_delegate &quantizer, index_manager::delegate &callback)
 Iterate over the index, passing each postings list to the quantizer, which quantizes it and then calls callback.operator().
 
- Public Member Functions inherited from JASS::index_manager
 index_manager ()
 Constructor.
 
virtual ~index_manager ()
 Destructor.
 
virtual void end_document (compress_integer::integer document_length)
 Tell this object that you've finished with the current document (and are about to move on to the next, or are completely finished).
 
virtual std::vector< compress_integer::integer > & get_document_length_vector (void)
 Return a reference to the document length vector.
 
virtual void set_document_length_vector (std::vector< compress_integer::integer > &new_lengths)
 Replace the document length vector with the one passed to this function (warning).
 
compress_integer::integer get_highest_document_id (void) const
 Return the number of documents that have been successfully indexed or are in the process of being indexed.
 

Static Public Member Functions

static void unittest_build_index (index_manager_sequential &index, const std::string &document_collection)
 Build an index for the 10 sample documents. This is used by several unit tests that need a valid index.
 
static void unittest (void)
 Unit test this class.
 
- Static Public Member Functions inherited from JASS::index_manager
static void unittest (void)
 Unit test this class.
 

Protected Member Functions

void make_space (void)
 Make sure all the internal buffers needed for iteration have been allocated.
 

Private Attributes

allocator_pool memory
 All memory is allocated from this allocator.
 
hash_table< slice, index_postings, 24 > index
 The index is a hash table of index_postings keyed on the term (a slice).
 
dynamic_array< slice > primary_key
 The list of primary keys (i.e. external document identifiers) allocated in memory.
 
compress_integer::integer * document_ids
 The re-used buffer storing decoded document ids.
 
index_postings_impact::impact_type * term_frequencies
 The re-used buffer storing the term frequencies.
 
size_t temporary_size
 The number of bytes in temporary.
 
uint8_t * temporary
 Temporary buffer - cannot be used to store anything between calls.
 

Detailed Description

Non-thread-safe indexer object.

This class is a non-thread-safe indexer used for regular sequential indexing. It manages its own memory, uses a hash table with direct chaining (each chain held in an unbalanced tree), and supports a positional index.
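
The call sequence implied by the member list (begin_document(), then term() for each token, then end_document()) can be sketched with a small stand-in class. This is a toy model, not the JASS implementation: the class name toy_indexer, the use of std::string instead of slice and parser::token, the use of std::unordered_map instead of the JASS hash_table, and document ids counting from 1 are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical toy model of a sequential indexer; NOT the JASS implementation.
class toy_indexer
{
public:
    struct posting
    {
        uint32_t docid;                   // internal document id (counts from 1 in this toy)
        std::vector<uint32_t> positions;  // word positions within that document
    };

    // Start a new document and remember its external identifier.
    void begin_document(const std::string &primary_key)
    {
        assert(!in_document);
        primary_keys.push_back(primary_key);
        position = 0;
        in_document = true;
    }

    // Record one occurrence of a token in the current document.
    void term(const std::string &token)
    {
        assert(in_document);
        const uint32_t docid = static_cast<uint32_t>(primary_keys.size());
        std::vector<posting> &list = index[token];
        if (list.empty() || list.back().docid != docid)
            list.push_back(posting{docid, {}});
        list.back().positions.push_back(++position);
    }

    // Finish the current document, recording its length.
    void end_document(uint32_t length)
    {
        assert(in_document);
        document_lengths.push_back(length);
        in_document = false;
    }

    const std::vector<posting> &postings(const std::string &token) const
    {
        return index.at(token);
    }

    uint32_t highest_document_id() const
    {
        return static_cast<uint32_t>(primary_keys.size());
    }

private:
    std::unordered_map<std::string, std::vector<posting>> index;  // term -> postings list
    std::vector<std::string> primary_keys;                        // external document ids
    std::vector<uint32_t> document_lengths;
    uint32_t position = 0;
    bool in_document = false;
};
```

Because the object keeps per-document state (the current position and the "in a document" flag) without any locking, a single instance of this toy, like the class documented here, must only be driven from one thread at a time.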

Member Function Documentation

◆ begin_document()

virtual void JASS::index_manager_sequential::begin_document ( const slice & document_primary_key )
inline virtual

Tell this object that you're about to start indexing a new document.

Parameters
[in] document_primary_key  The document's primary key (or external document identifier).

Reimplemented from JASS::index_manager.

◆ iterate() [1/2]

virtual void JASS::index_manager_sequential::iterate ( index_manager::delegate & callback )
inline virtual

Iterate over the index calling callback.operator() with each postings list.

Parameters
[in] callback  The callback to call.

Reimplemented from JASS::index_manager.
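
The callback pattern described above can be sketched as follows. This is a minimal illustration, not the JASS API: the names toy_delegate, counting_delegate, and toy_index, and the operator() signature taking a term and a vector of document ids, are assumptions; the real index_manager::delegate signature is not shown on this page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a delegate base class (signature is an assumption).
class toy_delegate
{
public:
    virtual ~toy_delegate() = default;
    virtual void operator()(const std::string &term, const std::vector<uint32_t> &docids) = 0;
};

// A concrete delegate that counts the lists and postings it is shown.
class counting_delegate : public toy_delegate
{
public:
    std::size_t lists = 0;
    std::size_t postings = 0;

    void operator()(const std::string &, const std::vector<uint32_t> &docids) override
    {
        lists++;
        postings += docids.size();
    }
};

// Toy index: term -> list of document ids.
using toy_index = std::map<std::string, std::vector<uint32_t>>;

// Call the delegate once for each postings list in the index.
void iterate(const toy_index &index, toy_delegate &callback)
{
    for (const auto &entry : index)
    {
        assert(!entry.second.empty());  // toy invariant: no term has an empty list
        callback(entry.first, entry.second);
    }
}
```

Passing the delegate by reference lets the caller read back whatever state it accumulated (here, the two counters) once iteration has finished.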

◆ iterate() [2/2]

virtual void JASS::index_manager_sequential::iterate ( index_manager::quantizing_delegate & quantizer,
index_manager::delegate & callback
)
inline virtual

Iterate over the index, passing each postings list to the quantizer, which quantizes it and then calls callback.operator().

Parameters
[in] quantizer  The quantizer that will quantize and then call the serialiser callback.
[in] callback  The callback that the quantizer should call.

Reimplemented from JASS::index_manager.
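
The two-stage arrangement (the quantizer receives each postings list, transforms it, then forwards it to the callback) might look like the sketch below. Everything here is hypothetical: toy_quantizer simply caps each value at a maximum, which is a stand-in chosen for brevity and not the quantization JASS performs, and none of the names or signatures are taken from the library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

using toy_posting = std::pair<uint32_t, uint32_t>;  // (docid, term frequency or impact)
using toy_index = std::vector<std::pair<std::string, std::vector<toy_posting>>>;

// Hypothetical final-stage callback (for example, a serialiser).
class toy_delegate
{
public:
    virtual ~toy_delegate() = default;
    virtual void operator()(const std::string &term, const std::vector<toy_posting> &postings) = 0;
};

// Hypothetical quantizer: caps each value, then hands the result to the callback.
class toy_quantizer
{
public:
    explicit toy_quantizer(uint32_t cap) : largest_impact(cap)
    {
        assert(cap >= 1);
    }

    void operator()(toy_delegate &callback, const std::string &term,
                    const std::vector<toy_posting> &postings) const
    {
        std::vector<toy_posting> quantized;
        quantized.reserve(postings.size());
        for (const toy_posting &p : postings)
            quantized.emplace_back(p.first, std::min(p.second, largest_impact));
        callback(term, quantized);
    }

private:
    uint32_t largest_impact;
};

// A delegate that just records every posting it receives.
class collecting_delegate : public toy_delegate
{
public:
    std::vector<toy_posting> seen;

    void operator()(const std::string &, const std::vector<toy_posting> &postings) override
    {
        seen.insert(seen.end(), postings.begin(), postings.end());
    }
};

// Walk the index, routing every postings list through the quantizer.
void iterate(const toy_index &index, const toy_quantizer &quantizer, toy_delegate &callback)
{
    for (const auto &entry : index)
        quantizer(callback, entry.first, entry.second);
}
```

The callback never sees the raw values in this arrangement; it only receives what the quantizer chooses to forward.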

◆ set_primary_keys()

virtual void JASS::index_manager_sequential::set_primary_keys ( const std::vector< slice > & keys )
inline virtual

Add a list of primary keys to the current list. Normally used to set the list without actually indexing (warning).

Normally this method is called only when an index is being "pushed" into an object rather than built by indexing one document at a time. The method appends to the end of the primary key list, which is assumed to be empty before the call but might not be if some indexing has already happened.

Parameters
[in] keys  The vector of primary keys.

Reimplemented from JASS::index_manager.
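
The warning concerns append semantics: keys passed in are added after any keys already recorded by earlier indexing. A minimal sketch of that behaviour, using a hypothetical toy_key_list class with std::string keys rather than the JASS types:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of the primary key list only; not the JASS class.
class toy_key_list
{
public:
    // Indexing a document records its primary key.
    void begin_document(const std::string &primary_key)
    {
        primary_keys.push_back(primary_key);
    }

    // Appends to the existing list; it does not replace what is already there.
    void set_primary_keys(const std::vector<std::string> &keys)
    {
        primary_keys.insert(primary_keys.end(), keys.begin(), keys.end());
    }

    std::size_t size() const
    {
        return primary_keys.size();
    }

    const std::string &key(std::size_t which) const
    {
        assert(which < primary_keys.size());
        return primary_keys[which];
    }

private:
    std::vector<std::string> primary_keys;
};
```

If one document has been indexed before two keys are pushed in, the list holds three keys, with the indexed document's key first. A caller who expects the pushed keys to start at the first slot must therefore push them into an object that has not indexed anything yet.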

◆ term() [1/3]

virtual void JASS::index_manager_sequential::term ( const parser::token & term )
inline virtual

Hand a new term from the token stream to this object.

Parameters
[in] term  The term from the token stream.

Reimplemented from JASS::index_manager.

◆ term() [2/3]

virtual void JASS::index_manager_sequential::term ( const parser::token & term,
const std::vector< posting > & postings_list
)
inline virtual

Hand a new term with a pre-computed postings list to this object.

Parameters
[in] term  The term from the token stream.
[in] postings_list  The pre-computed D1-encoded postings list.

Reimplemented from JASS::index_manager.
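
As an illustration of what a D1-encoded list looks like, the sketch below assumes that D1 means each document id is stored as its difference from the preceding id (a d-gap), with the first id stored as its difference from zero. That reading is an assumption, as are the function names d1_encode and d1_decode; neither is taken from the JASS source.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Convert strictly increasing document ids (starting at 1 or more) into gaps.
std::vector<uint32_t> d1_encode(const std::vector<uint32_t> &docids)
{
    std::vector<uint32_t> gaps;
    gaps.reserve(docids.size());
    uint32_t previous = 0;
    for (uint32_t docid : docids)
    {
        assert(docid > previous);  // ids must be strictly increasing
        gaps.push_back(docid - previous);
        previous = docid;
    }
    return gaps;
}

// Recover the document ids by keeping a running sum of the gaps.
std::vector<uint32_t> d1_decode(const std::vector<uint32_t> &gaps)
{
    std::vector<uint32_t> docids;
    docids.reserve(gaps.size());
    uint32_t current = 0;
    for (uint32_t gap : gaps)
    {
        current += gap;
        docids.push_back(current);
    }
    return docids;
}
```

Under this assumption the ids 2, 5, 6, 20 are stored as the gaps 2, 3, 1, 14. Gaps are usually smaller than the ids themselves, which is what makes such lists cheaper to compress.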

◆ term() [3/3]

virtual void JASS::index_manager_sequential::term ( const parser::token & term,
compress_integer::integer docid
)
inline virtual

Hand a new term and a document id to this object, appending the docid to the end of the term's postings list (with tf=1).

Parameters
[in] term  The term from the token stream.
[in] docid  The document id to append to the end of the postings list (with tf=1).

◆ text_render()

virtual void JASS::index_manager_sequential::text_render ( std::ostream & stream ) const
inline virtual

Dump a human-readable version of the index down the stream.

Parameters
[in] stream  The stream to write to.

Reimplemented from JASS::index_manager.

◆ unittest_build_index()

static void JASS::index_manager_sequential::unittest_build_index ( index_manager_sequential & index,
const std::string & document_collection
)
inline static

Build an index for the 10 sample documents. This is used by several unit tests that need a valid index.

Parameters
[out] index  The index once built.
[in] document_collection  The documents to index.

The documentation for this class was generated from the following file: