JASSv2
JASS::index_manager Class Reference

Base class for holding the index during indexing.

#include <index_manager.h>

[Inheritance diagram for JASS::index_manager]

Classes

class  delegate
 Base class for the callback function called by iterate.
 
class  quantizing_delegate
 Base class for the callback function called by iterate.
 

Public Member Functions

 index_manager ()
 Constructor.
 
virtual ~index_manager ()
 Destructor.
 
virtual void set_primary_keys (const std::vector< slice > &keys)
 Add a list of primary keys to the current list. Normally used to set it without actually indexing (warning).
 
virtual void begin_document (const slice &primary_key)
 Tell this object that you're about to start indexing a new object.
 
virtual void term (const parser::token &term)
 Hand a new term from the token stream to this object.
 
virtual void term (const parser::token &term, const std::vector< posting > &postings_list)
 Hand a new term with a pre-computed postings list to this object.
 
virtual void end_document (compress_integer::integer document_length)
 Tell this object that you've finished with the current document (and are about to move on to the next, or are completely finished).
 
virtual std::vector< compress_integer::integer > & get_document_length_vector (void)
 Return a reference to the document length vector.
 
virtual void set_document_length_vector (std::vector< compress_integer::integer > &new_lengths)
 Replace the document length vector with the one passed to this function (warning).
 
virtual void text_render (std::ostream &stream) const
 unimplemented: Dump a human-readable version of the index down the stream.
 
virtual void iterate (delegate &callback)
 Iterate over the index calling callback.operator() with each postings list.
 
virtual void iterate (index_manager::quantizing_delegate &quantizer, index_manager::delegate &callback)
 Iterate over the index calling callback.operator() with each postings list.
 
compress_integer::integer get_highest_document_id (void) const
 Return the number of documents that have been successfully indexed or are in the process of being indexed.
 

Static Public Member Functions

static void unittest (void)
 Unit test this class.
 

Private Attributes

compress_integer::integer highest_document_id
 The highest document_id seen so far (counts from 1).
 
std::vector< compress_integer::integer > document_length_vector
 vector of document lengths.
 

Detailed Description

Base class for holding the index during indexing.

This class is a base class that defines the interface for different approaches to indexing. Once an object of this type has been declared, it is used by calling begin_document() at the beginning of each document, end_document() at the end of each document, and term() for each term in the token stream (e.g. "the cat and the dog" is 5 tokens: "the", "cat", "and", "the", "dog"). This class does not stem and it does not remove stop words; that behaviour is external to this class. To find out how many documents have been indexed up to a given point, call get_highest_document_id(). When subclassing, remember to call this class's methods from the overridden methods in the subclass.
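The calling sequence described above can be sketched as follows. This is a minimal illustration, not JASS code: sketch_index_manager, term_counting_index, index_document() and get_document_lengths() are hypothetical names, std::string stands in for slice and parser::token, and the stand-in stores lengths from index 0, which may differ from the real class.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the base class. It tracks only the document
// count and the document lengths; std::string replaces slice and parser::token.
class sketch_index_manager
{
public:
    virtual ~sketch_index_manager() = default;

    virtual void begin_document(const std::string & /* primary_key */)
    {
        highest_document_id++;          // document ids count from 1
    }

    virtual void term(const std::string & /* token */)
    {
        // The base class stores nothing per term in this sketch.
    }

    virtual void end_document(std::uint32_t document_length)
    {
        document_lengths.push_back(document_length);
    }

    std::uint32_t get_highest_document_id() const
    {
        return highest_document_id;
    }

    const std::vector<std::uint32_t> &get_document_lengths() const
    {
        return document_lengths;
    }

private:
    std::uint32_t highest_document_id = 0;
    std::vector<std::uint32_t> document_lengths;
};

// A subclass that counts term occurrences, calling the base-class method
// from its override as the description recommends.
class term_counting_index : public sketch_index_manager
{
public:
    void term(const std::string &token) override
    {
        sketch_index_manager::term(token);
        collection_frequency[token]++;
    }

    std::map<std::string, std::uint32_t> collection_frequency;
};

// Index one whitespace-separated document; returns its length in tokens.
std::uint32_t index_document(sketch_index_manager &index, const std::string &primary_key, const std::string &text)
{
    std::istringstream stream(text);
    std::string token;
    std::uint32_t length = 0;

    index.begin_document(primary_key);
    assert(index.get_highest_document_id() >= 1);
    while (stream >> token)
    {
        index.term(token);
        length++;
    }
    index.end_document(length);
    return length;
}
```

Because index_document() takes the base-class reference, the same driver loop works with any subclass, which is the point of defining the interface in a base class.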

Member Function Documentation

◆ begin_document()

virtual void JASS::index_manager::begin_document ( const slice &  primary_key)
inline virtual

Tell this object that you're about to start indexing a new object.

Parameters
primary_key [in] The primary key (i.e. the external document identifier) of this document.

Reimplemented in JASS::index_manager_sequential.

◆ get_document_length_vector()

virtual std::vector<compress_integer::integer>& JASS::index_manager::get_document_length_vector ( void  )
inline virtual

Return a reference to the document length vector.

Returns
The document length vector. This is only valid for as long as the index_manager object exists.

◆ iterate() [1/2]

virtual void JASS::index_manager::iterate ( delegate &  callback)
inline virtual

Iterate over the index calling callback.operator() with each postings list.

Parameters
callback [in] The callback to call.

Reimplemented in JASS::index_manager_sequential.

◆ iterate() [2/2]

virtual void JASS::index_manager::iterate ( index_manager::quantizing_delegate &  quantizer,
index_manager::delegate &  callback 
)
inline virtual

Iterate over the index calling callback.operator() with each postings list.

Parameters
quantizer [in] The quantizer that quantizes each postings list and then calls the serialiser callback.
callback [in] The callback that the quantizer should call.

Reimplemented in JASS::index_manager_sequential.
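The callback pattern behind iterate() can be sketched as below. Everything here is an assumption made for illustration: sketch_posting, sketch_delegate, sketch_index and posting_counter are hypothetical names, and the real delegate's operator() may take different arguments. The sketch only shows the shape of the technique, an abstract callback object invoked once per postings list.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical posting: a document id and a term frequency.
struct sketch_posting
{
    std::uint32_t document_id;
    std::uint32_t term_frequency;
};

// Hypothetical callback base class; the real operator() signature may differ.
class sketch_delegate
{
public:
    virtual ~sketch_delegate() = default;
    virtual void operator()(const std::string &term, const std::vector<sketch_posting> &postings) = 0;
};

// Hypothetical index: a map from term to postings list.
class sketch_index
{
public:
    std::map<std::string, std::vector<sketch_posting>> postings_lists;

    // Call the callback once per postings list, in term order.
    void iterate(sketch_delegate &callback) const
    {
        for (const auto &entry : postings_lists)
            callback(entry.first, entry.second);
    }
};

// Example delegate: count the lists and the postings it is handed.
class posting_counter : public sketch_delegate
{
public:
    std::size_t lists = 0;
    std::size_t postings_seen = 0;

    void operator()(const std::string & /* term */, const std::vector<sketch_posting> &postings) override
    {
        assert(!postings.empty());      // this sketch assumes no empty lists
        lists++;
        postings_seen += postings.size();
    }
};
```

A serialiser or a statistics collector would be written the same way: derive from the delegate, keep state in the object, and let iterate() drive it.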

◆ set_document_length_vector()

virtual void JASS::index_manager::set_document_length_vector ( std::vector< compress_integer::integer > &  new_lengths)
inline virtual

Replace the document length vector with the one passed to this function (warning).

Parameters
new_lengths [in] The new document length vector.

It is possible that new_lengths.size() differs from the current largest document number. If so, the largest document number is set to the number of documents in new_lengths, and future calls to index a single document will fail (the alternative is that documents in the middle get lengths of 0).

◆ set_primary_keys()

virtual void JASS::index_manager::set_primary_keys ( const std::vector< slice > &  keys)
inline virtual

Add a list of primary keys to the current list. Normally used to set it without actually indexing (warning).

Normally this method is only called when an index is being "pushed" into an object rather than built by indexing one document at a time. It appends to the end of the primary key list, which is assumed to be empty before the call but might not be if some indexing has already happened.

Parameters
keys [in] The vector of primary keys.

Reimplemented in JASS::index_manager_sequential.

◆ term() [1/2]

virtual void JASS::index_manager::term ( const parser::token &  term)
inline virtual

Hand a new term from the token stream to this object.

Parameters
term [in] The term from the token stream.

Reimplemented in JASS::index_manager_sequential.

◆ term() [2/2]

virtual void JASS::index_manager::term ( const parser::token &  term,
const std::vector< posting > &  postings_list 
)
inline virtual

Hand a new term with a pre-computed postings list to this object.

Parameters
term [in] The term from the token stream.
postings_list [in] The pre-computed D1-encoded postings list.

Reimplemented in JASS::index_manager_sequential.
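This page does not define "D1-encoded". A common reading in information retrieval, assumed here, is that each document id is stored as the difference from the previous id (a d-gap), with ids in strictly ascending order. The sketch below shows that transformation and its inverse on plain integers. d1_encode() and d1_decode() are hypothetical helpers, not JASS functions, and the real posting type carries more than a document id.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Convert strictly ascending document ids into first-order differences.
std::vector<std::uint32_t> d1_encode(const std::vector<std::uint32_t> &document_ids)
{
    std::vector<std::uint32_t> gaps;
    gaps.reserve(document_ids.size());
    std::uint32_t previous = 0;
    for (std::uint32_t id : document_ids)
    {
        assert(id > previous);          // ids count from 1 and must ascend
        gaps.push_back(id - previous);
        previous = id;
    }
    return gaps;
}

// Invert d1_encode() with a running sum.
std::vector<std::uint32_t> d1_decode(const std::vector<std::uint32_t> &gaps)
{
    std::vector<std::uint32_t> document_ids;
    document_ids.reserve(gaps.size());
    std::uint32_t current = 0;
    for (std::uint32_t gap : gaps)
    {
        current += gap;
        document_ids.push_back(current);
    }
    return document_ids;
}
```

Small gaps compress better than large absolute ids, which is the usual motivation for storing postings this way.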

◆ text_render()

virtual void JASS::index_manager::text_render ( std::ostream &  stream) const
inline virtual

unimplemented: Dump a human-readable version of the index down the stream.

Parameters
stream [in] The stream to write to.

Reimplemented in JASS::index_manager_sequential.


The documentation for this class was generated from the following file: