Load and deserialise a JASS v1 index.
#include <deserialised_jass_v1.h>
|
| class | metadata |
| | Metadata for a given term, including a pointer to the postings and the number of impacts.
|
| |
| class | segment_header |
| | Each impact-ordered segment contains a header with the impact score and the pointers to documents.
|
| |
| class | segment_header_on_disk |
| | Each impact-ordered segment contains a header with the impact score and the pointers to documents.
|
| |
|
| virtual size_t | read_primary_keys (const std::string &primary_key_filename=PRIMARY_KEY_FILENAME) |
| | Read the JASS v1 index primary key file.
|
| |
| virtual size_t | read_vocabulary (const std::string &vocab_filename=VOCAB_FILENAME, const std::string &terms_filename=TERMS_FILENAME) |
| | Read the JASS v1 index vocabulary files.
|
| |
| virtual size_t | read_postings (const std::string &postings_filename=POSTINGS_FILENAME) |
| | Read the JASS v1 index postings file.
|
| |
| size_t | read_index_explicit (const std::string &primary_key_filename=PRIMARY_KEY_FILENAME, const std::string &vocab_filename=VOCAB_FILENAME, const std::string &terms_filename=TERMS_FILENAME, const std::string &postings_filename=POSTINGS_FILENAME) |
| | Read a JASS v1 index into memory.
|
| |
|
|
static constexpr const char * | PRIMARY_KEY_FILENAME = "CIdoclist.bin" |
| |
|
static constexpr const char * | VOCAB_FILENAME = "CIvocab.bin" |
| |
|
static constexpr const char * | TERMS_FILENAME = "CIvocab_terms.bin" |
| |
|
static constexpr const char * | POSTINGS_FILENAME = "CIpostings.bin" |
| |
◆ deserialised_jass_v1()
| JASS::deserialised_jass_v1::deserialised_jass_v1 |
( |
bool |
verbose = false | ) |
|
|
inline explicit |
Constructor.
- Parameters
-
| verbose | [in] Should the index reading methods produce messages on stdout? |
◆ begin()
| auto JASS::deserialised_jass_v1::begin |
( |
void |
| ) |
|
|
inline |
Return an iterator to the start of the vocabulary.
- Returns
- An iterator to the start of the vocabulary.
◆ codex()
| compress_integer * JASS::deserialised_jass_v1::codex |
( |
std::string & |
name, |
|
|
int32_t & |
d_ness |
|
) |
| const |
Return a pointer to a decompressor that can be used with this index.
- Parameters
-
| name | [out] The name of the compression codex |
| d_ness | [out] Whether the codex requires D0, D1, etc. decoding (-1 if it supports decode_and_process via decode_none) |
- Returns
- A pointer to a compress_integer that can decode the given codex
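The d_ness value can be read as the number of differencing rounds the caller has to undo after decompression. The sketch below assumes that D0 means the integers are stored as-is and D1 means each integer is stored as the gap from its predecessor; that reading, and the helper name undo_differencing, are assumptions made for illustration and are not part of JASS. A d_ness of -1 is treated like 0 here, because in that case the codex handles decoding and processing itself.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper, not part of JASS: undo d_ness rounds of differencing
// in place. A d_ness of 0 or less leaves the values untouched.
inline void undo_differencing(std::vector<uint32_t> &values, int32_t d_ness)
{
    for (int32_t round = 0; round < d_ness; round++)
        for (std::size_t i = 1; i < values.size(); i++)
            values[i] += values[i - 1];   // running sum restores the previous level
}
```

Under these assumptions, the gaps 3, 2, 5 with a d_ness of 1 become 3, 5, 10, while a d_ness of 0 leaves them as 3, 2, 5.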
◆ document_count()
Return the number of documents in the collection.
- Returns
- the number of documents in the collection
◆ end()
| auto JASS::deserialised_jass_v1::end |
( |
void |
| ) |
|
|
inline |
Return an iterator to the end of the vocabulary.
- Returns
- An iterator to the end of the vocabulary.
◆ get_segment_list()
| virtual size_t JASS::deserialised_jass_v1::get_segment_list |
( |
segment_header * |
segments, |
|
|
metadata & |
metadata, |
|
|
size_t |
query_term_frequency, |
|
|
uint32_t & |
smallest, |
|
|
uint32_t & |
largest, |
|
|
query::DOCID_TYPE & |
document_frequency |
|
) |
| const |
|
inline virtual |
Extract the segment headers and return them in the parameter called segments.
- Parameters
-
| segments | [out] The list of segments for the given search term (caller must ensure this points to a large enough array) |
| metadata | [in] The metadata for the given search term |
| smallest | [out] The smallest impact score for this term |
| largest | [out] The largest impact score for this term |
- Returns
- The number of segments extracted and added to the list
Reimplemented in JASS::deserialised_jass_v2.
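The calling convention (a caller-owned output array, a returned count, and the impact range reported through out-parameters) can be sketched as follows. The type sketch_segment and the function collect_segments are hypothetical stand-ins; the real segment_header layout is not shown on this page, so only an impact score is modelled.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a segment header: only the impact score is modelled.
struct sketch_segment
{
    uint32_t impact;
};

// Copy the headers into the caller's array and report the impact range.
// The caller must make `out` large enough to hold source.size() entries.
inline std::size_t collect_segments(sketch_segment *out,
                                    const std::vector<sketch_segment> &source,
                                    uint32_t &smallest,
                                    uint32_t &largest)
{
    if (source.empty())
        return 0;   // smallest and largest are left unchanged

    smallest = largest = source.front().impact;
    std::size_t count = 0;
    for (const auto &segment : source)
    {
        out[count++] = segment;
        smallest = std::min(smallest, segment.impact);
        largest = std::max(largest, segment.impact);
    }
    return count;
}
```

Sizing the output buffer is the caller's job in this sketch, mirroring the note on the segments parameter; a std::vector sized to the number of impacts for the term is one way to do that.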
◆ postings()
| const uint8_t* JASS::deserialised_jass_v1::postings |
( |
void |
| ) |
const |
|
inline |
Return a pointer to the start of the postings "file".
- Returns
- A pointer to the start of the postings "file"
◆ postings_details()
| bool JASS::deserialised_jass_v1::postings_details |
( |
metadata & |
metadata, |
|
|
const query_term & |
term |
|
) |
| const |
|
inline |
Return the meta-data about the postings list.
- Parameters
-
| metadata | [out] If the term is found then this is changed to contain the metadata about the term |
| term | [in] Find the metadata for this term |
- Returns
- true on success, false on failure (e.g. term not in dictionary)
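The lookup contract (a bool result plus an out-parameter that changes only when the term is found) can be illustrated with a small stand-in. The names sketch_metadata and lookup are hypothetical, and a std::map replaces whatever dictionary structure the class actually uses.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for the per-term metadata; the field is illustrative.
struct sketch_metadata
{
    uint64_t impacts = 0;
};

// Returns true and fills `out` when the term is known; otherwise returns
// false and leaves `out` untouched.
inline bool lookup(const std::map<std::string, sketch_metadata> &vocabulary,
                   const std::string &term,
                   sketch_metadata &out)
{
    auto found = vocabulary.find(term);
    if (found == vocabulary.end())
        return false;
    out = found->second;
    return true;
}
```

A caller following this pattern checks the return value before reading the metadata, since a failed lookup leaves whatever was previously stored there.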
◆ primary_keys()
| const std::vector<std::string>& JASS::deserialised_jass_v1::primary_keys |
( |
void |
| ) |
const |
|
inline |
Return the list of primary keys as a std::vector<std::string>
- Returns
- A reference to a vector of primary keys
◆ read_index()
| size_t JASS::deserialised_jass_v1::read_index |
( |
const std::string & |
directory = "" | ) |
|
Read a JASS v1 index into memory.
- Parameters
-
| directory | [in] The directory to search for an index |
- Returns
- 0 on failure, non-zero on success
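One plausible way for a directory argument to combine with the default filenames is sketched below. How read_index() actually builds its paths is not stated on this page, so the helper index_path and its use of '/' as the separator are assumptions made for illustration only.

```cpp
#include <string>

// Hypothetical helper, not part of JASS: prefix a default index filename
// with a directory.
inline std::string index_path(const std::string &directory, const std::string &filename)
{
    if (directory.empty())
        return filename;                 // empty directory: use the filename as given
    if (directory.back() == '/')
        return directory + filename;     // separator already present
    return directory + "/" + filename;
}
```

With this sketch, an empty directory yields "CIdoclist.bin" unchanged, and both "index" and "index/" yield "index/CIdoclist.bin".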
◆ read_index_explicit()
| size_t JASS::deserialised_jass_v1::read_index_explicit |
( |
const std::string & |
primary_key_filename = PRIMARY_KEY_FILENAME, |
|
|
const std::string & |
vocab_filename = VOCAB_FILENAME, |
|
|
const std::string & |
terms_filename = TERMS_FILENAME, |
|
|
const std::string & |
postings_filename = POSTINGS_FILENAME |
|
) |
| |
|
protected |
Read a JASS v1 index into memory.
- Parameters
-
| primary_key_filename | [in] the name of the file containing the primary key list ("CIdoclist.bin") |
| vocab_filename | [in] the name of the file containing the vocabulary pointers ("CIvocab.bin") |
| terms_filename | [in] the name of the file containing the vocabulary strings ("CIvocab_terms.bin") |
| postings_filename | [in] the name of the file containing the postings ("CIpostings.bin") |
- Returns
- 0 on failure, non-zero on success
◆ read_postings()
| size_t JASS::deserialised_jass_v1::read_postings |
( |
const std::string & |
postings_filename = POSTINGS_FILENAME | ) |
|
|
protected virtual |
Read the JASS v1 index postings file.
- Parameters
-
| postings_filename | [in] the name of the file containing the postings ("CIpostings.bin") |
- Returns
- Size of the postings file, or 0 on failure
◆ read_primary_keys()
| size_t JASS::deserialised_jass_v1::read_primary_keys |
( |
const std::string & |
primary_key_filename = PRIMARY_KEY_FILENAME | ) |
|
|
protected virtual |
Read the JASS v1 index primary key file.
- Parameters
-
| primary_key_filename | [in] the name of the file containing the primary key list ("CIdoclist.bin") |
- Returns
- The number of documents in the collection (or 0 on error)
Reimplemented in JASS::deserialised_jass_v2.
◆ read_vocabulary()
| size_t JASS::deserialised_jass_v1::read_vocabulary |
( |
const std::string & |
vocab_filename = VOCAB_FILENAME, |
|
|
const std::string & |
terms_filename = TERMS_FILENAME |
|
) |
| |
|
protected virtual |
Read the JASS v1 index vocabulary files.
- Parameters
-
| vocab_filename | [in] the name of the file containing the vocabulary pointers ("CIvocab.bin") |
| terms_filename | [in] the name of the file containing the vocabulary strings ("CIvocab_terms.bin") |
- Returns
- The number of terms in the vocabulary (or 0 on error)
Reimplemented in JASS::deserialised_jass_v2.
The documentation for this class was generated from the following files: