|
JASSv2
|
Non-thread-safe indexer object.
#include <index_manager_sequential.h>


Classes | |
| class | delegate |
| Base class for the callback function called by iterate. | |
Public Member Functions | |
| index_manager_sequential () | |
| Constructor. | |
| virtual | ~index_manager_sequential () |
| Destructor. | |
| virtual void | begin_document (const slice &document_primary_key) |
| Tell this object that you're about to start indexing a new document. | |
| virtual void | set_primary_keys (const std::vector< slice > &keys) |
| Add a list of primary keys to the current list. Normally used to set the list without actually indexing (warning). | |
| virtual void | term (const parser::token &term) |
| Hand a new term from the token stream to this object. | |
| virtual void | term (const parser::token &term, const std::vector< posting > &postings_list) |
| Hand a new term with a pre-computed postings list to this object. | |
| virtual void | term (const parser::token &term, compress_integer::integer docid) |
| Hand a new term and a document id to this object; the id is added to the end of the term's postings list with tf=1. | |
| virtual void | text_render (std::ostream &stream) const |
| Dump a human-readable version of the index down the stream. | |
| virtual void | iterate (index_manager::delegate &callback) |
| Iterate over the index calling callback.operator() with each postings list. | |
| virtual void | iterate (index_manager::quantizing_delegate &quantizer, index_manager::delegate &callback) |
| Iterate over the index calling callback.operator() with each postings list. | |
Public Member Functions inherited from JASS::index_manager | |
| index_manager () | |
| Constructor. | |
| virtual | ~index_manager () |
| Destructor. | |
| virtual void | end_document (compress_integer::integer document_length) |
| Tell this object that you've finished with the current document (and are about to move on to the next, or are completely finished). | |
| virtual std::vector< compress_integer::integer > & | get_document_length_vector (void) |
| Return a reference to the document length vector. | |
| virtual void | set_document_length_vector (std::vector< compress_integer::integer > &new_lengths) |
| Replace the document length vector with the one passed to this function (warning). | |
| compress_integer::integer | get_highest_document_id (void) const |
| Return the number of documents that have been successfully indexed or are in the process of being indexed. | |
Static Public Member Functions | |
| static void | unittest_build_index (index_manager_sequential &index, const std::string &document_collection) |
| Build an index for the 10 sample documents. This is used by several unit tests that need a valid index. | |
| static void | unittest (void) |
| Unit test this class. | |
Static Public Member Functions inherited from JASS::index_manager | |
| static void | unittest (void) |
| Unit test this class. | |
Protected Member Functions | |
| void | make_space (void) |
| Make sure all the internal buffers needed for iteration have been allocated. | |
Private Attributes | |
| allocator_pool | memory |
| All memory is allocated from this allocator. | |
| hash_table< slice, index_postings, 24 > | index |
| The index is a hash table of index_postings keyed on the term (a slice). | |
| dynamic_array< slice > | primary_key |
| The list of primary keys (i.e. external document identifiers) allocated in memory. | |
| compress_integer::integer * | document_ids |
| The re-used buffer storing decoded document ids. | |
| index_postings_impact::impact_type * | term_frequencies |
| The re-used buffer storing the term frequencies. | |
| size_t | temporary_size |
| The number of bytes in temporary. | |
| uint8_t * | temporary |
| Temporary buffer - cannot be used to store anything between calls. | |
Non-thread-safe indexer object.
This class is a non-thread-safe indexer used for regular sequential indexing. It manages its own memory, uses a hash table with direct chaining (each chain is an unbalanced tree), and supports a positional index.
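The description above mentions a hash table whose collisions are chained in an unbalanced tree. The following is a minimal sketch of that general idea, not the JASS `hash_table` implementation: the class name `tree_chained_table`, the use of `std::string` keys, and the `std::hash` hash function are all assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <memory>
#include <string>

// Hypothetical illustration: a fixed-size hash table in which each bucket
// is the root of an unbalanced binary search tree ordered by key.
template <typename VALUE, size_t BITS = 8>
class tree_chained_table
{
private:
    struct node
    {
        std::string key;
        VALUE value;
        std::unique_ptr<node> left, right;

        explicit node(const std::string &key) : key(key), value() {}
    };

    // 2^BITS buckets, all initially empty (null).
    std::array<std::unique_ptr<node>, (size_t(1) << BITS)> buckets;

public:
    // Return the value stored for key, creating a value-initialised
    // entry if the key has not been seen before.
    VALUE &operator[](const std::string &key)
    {
        size_t slot = std::hash<std::string>{}(key) & ((size_t(1) << BITS) - 1);
        std::unique_ptr<node> *current = &buckets[slot];

        // Walk the tree for this bucket; no rebalancing is ever done.
        while (*current != nullptr)
        {
            int cmp = key.compare((*current)->key);
            if (cmp == 0)
                return (*current)->value;
            current = cmp < 0 ? &(*current)->left : &(*current)->right;
        }

        *current = std::make_unique<node>(key);
        return (*current)->value;
    }
};
```

With enough buckets each tree stays shallow, so the lack of balancing rarely matters in practice; that trade-off is presumably why a simple tree is acceptable here.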
| virtual void begin_document (const slice &document_primary_key) | inline virtual |
Tell this object that you're about to start indexing a new document.
| document_primary_key | [in] The document's primary key (or external document identifier). |
Reimplemented from JASS::index_manager.
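The member list suggests a document-at-a-time protocol: `begin_document()`, then one `term()` call per token, then `end_document()`. The sketch below imitates that call order with a toy class. Everything in it is hypothetical (`toy_indexer`, `std::string` keys and tokens, `std::map` storage), and it assumes document ids start at 1 so that the highest id equals the number of documents started.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a sequential indexer; not the JASS class.
class toy_indexer
{
private:
    std::vector<std::string> primary_keys;      // external document identifiers
    std::vector<uint32_t> document_lengths;     // one entry per finished document
    // term -> list of (document id, term frequency), in document id order
    std::map<std::string, std::vector<std::pair<uint32_t, uint32_t>>> postings;

public:
    // Start a new document; its internal id is its position in primary_keys.
    void begin_document(const std::string &primary_key)
    {
        primary_keys.push_back(primary_key);
    }

    // Record one occurrence of token in the current document.
    void term(const std::string &token)
    {
        assert(!primary_keys.empty());          // begin_document() must come first
        uint32_t docid = static_cast<uint32_t>(primary_keys.size());
        auto &list = postings[token];
        if (!list.empty() && list.back().first == docid)
            list.back().second++;               // seen again in this document
        else
            list.emplace_back(docid, 1);        // first occurrence in this document
    }

    // Finish the current document and remember its length.
    void end_document(uint32_t document_length)
    {
        document_lengths.push_back(document_length);
    }

    uint32_t highest_document_id() const
    {
        return static_cast<uint32_t>(primary_keys.size());
    }

    // Look up the frequency of token in docid (0 if absent).
    uint32_t term_frequency(const std::string &token, uint32_t docid) const
    {
        auto found = postings.find(token);
        if (found == postings.end())
            return 0;
        for (const auto &posting : found->second)
            if (posting.first == docid)
                return posting.second;
        return 0;
    }
};
```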
| virtual void iterate (index_manager::delegate &callback) | inline virtual |
Iterate over the index calling callback.operator() with each postings list.
| callback | [in] The callback to call. |
Reimplemented from JASS::index_manager.
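The callback style used by `iterate()` can be illustrated with a small, self-contained sketch. The types below (`toy_posting`, `toy_delegate`, `toy_index`, `count_postings`) are hypothetical stand-ins, and the arguments passed to `operator()` are an assumption; the real `index_manager::delegate` interface may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical posting: a document id and a term frequency.
struct toy_posting
{
    uint32_t document_id;
    uint32_t term_frequency;
};

// Hypothetical base class for callbacks; subclasses override operator().
class toy_delegate
{
public:
    virtual ~toy_delegate() = default;
    virtual void operator()(const std::string &term, const std::vector<toy_posting> &postings) = 0;
};

// Hypothetical index that hands each postings list to a callback.
class toy_index
{
public:
    std::map<std::string, std::vector<toy_posting>> postings_lists;

    void iterate(toy_delegate &callback) const
    {
        for (const auto &entry : postings_lists)
            callback(entry.first, entry.second);
    }
};

// Example callback: count the terms and postings it is shown.
class count_postings : public toy_delegate
{
public:
    size_t terms = 0;
    size_t postings = 0;

    void operator()(const std::string &, const std::vector<toy_posting> &list) override
    {
        terms++;
        postings += list.size();
    }
};
```

Keeping the traversal inside the index and the work inside the delegate means a serialiser, a statistics collector, or a quantizer can all reuse the same traversal code.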
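| virtual void iterate (index_manager::quantizing_delegate &quantizer, index_manager::delegate &callback) | inline virtual |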
Iterate over the index calling callback.operator() with each postings list.
| quantizer | [in] The quantizer that will quantize then call the serialiser callback. |
| callback | [in] The callback that the quantizer should call. |
Reimplemented from JASS::index_manager.
| virtual void set_primary_keys (const std::vector< slice > &keys) | inline virtual |
Add a list of primary keys to the current list. Normally used to set the list without actually indexing (warning).
Normally this method would only be called when an index is being "pushed" into an object rather than built by indexing one document at a time. The method appends to the end of the primary key list, which is assumed to be empty before the call but might not be if some indexing has already happened.
| keys | [in] The vector of primary keys. |
Reimplemented from JASS::index_manager.
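The warning about appending is easy to miss, so here is a minimal sketch of the behaviour described. The struct `toy_key_list`, its `add_primary_key()` helper, and the use of `std::string` in place of `slice` are assumptions for illustration only.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the primary key list held by an indexer.
struct toy_key_list
{
    std::vector<std::string> primary_key;

    // What document-at-a-time indexing would do for each new document.
    void add_primary_key(const std::string &key)
    {
        primary_key.push_back(key);
    }

    // Append (not replace): existing keys are kept and the new ones follow.
    void set_primary_keys(const std::vector<std::string> &keys)
    {
        primary_key.insert(primary_key.end(), keys.begin(), keys.end());
    }
};
```

If one document has already been indexed, calling `set_primary_keys()` with two keys leaves three keys in the list, which is why the method is best called on an object that has not indexed anything yet.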
| virtual void term (const parser::token &term) | inline virtual |
Hand a new term from the token stream to this object.
| term | [in] The term from the token stream. |
Reimplemented from JASS::index_manager.
| virtual void term (const parser::token &term, const std::vector< posting > &postings_list) | inline virtual |
Hand a new term with a pre-computed postings list to this object.
| term | [in] The term from the token stream. |
| postings_list | [in] The pre-computed D1-encoded postings list |
Reimplemented from JASS::index_manager.
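The parameter description refers to a D1-encoded postings list. Assuming D1 means that each document id is stored as the difference from the previous id (first-order delta encoding), the idea can be sketched as below. The function names `d1_encode` and `d1_decode` are hypothetical, and plain `uint32_t` ids stand in for the `posting` type.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Turn ascending document ids into gaps: each output is id - previous id,
// with the first id stored as its gap from 0.
std::vector<uint32_t> d1_encode(const std::vector<uint32_t> &document_ids)
{
    std::vector<uint32_t> gaps;
    gaps.reserve(document_ids.size());
    uint32_t previous = 0;
    for (uint32_t id : document_ids)
    {
        assert(id >= previous);         // input must be in non-decreasing order
        gaps.push_back(id - previous);
        previous = id;
    }
    return gaps;
}

// Reverse the encoding with a running sum of the gaps.
std::vector<uint32_t> d1_decode(const std::vector<uint32_t> &gaps)
{
    std::vector<uint32_t> document_ids;
    document_ids.reserve(gaps.size());
    uint32_t current = 0;
    for (uint32_t gap : gaps)
    {
        current += gap;
        document_ids.push_back(current);
    }
    return document_ids;
}
```

Gaps are usually much smaller than the ids themselves, which is what makes the list compress well with an integer codec.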
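| virtual void term (const parser::token &term, compress_integer::integer docid) | inline virtual |
Hand a new term and a document id to this object; the id is added to the end of the term's postings list with tf=1.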
| term | [in] The term from the token stream. |
| docid | [in] The docid to append to the end of the list (with tf=1). |
| virtual void text_render (std::ostream &stream) const | inline virtual |
Dump a human-readable version of the index down the stream.
| stream | [in] The stream to write to. |
Reimplemented from JASS::index_manager.
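| static void unittest_build_index (index_manager_sequential &index, const std::string &document_collection) | inline static |
Build an index for the 10 sample documents. This is used by several unit tests that need a valid index.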
| index | [out] The index once built. |
| document_collection | [in] The documents to index. |