JASSv2
JASS::deserialised_jass_v2 Class Reference

Load and deserialise a JASS v2 index.

#include <deserialised_jass_v2.h>

Inheritance diagram: JASS::deserialised_jass_v2 derives from JASS::deserialised_jass_v1.

Public Member Functions

 deserialised_jass_v2 (bool verbose=false)
 Constructor.
 
virtual ~deserialised_jass_v2 ()
 Destructor.
 
bool postings_details (metadata &metadata, const query_term &term) const
 Return the meta-data about the postings list.
 
virtual size_t get_segment_list (segment_header *segments, metadata &metadata, size_t query_term_frequency, uint32_t &smallest, uint32_t &largest, query::DOCID_TYPE &document_frequency) const
 Extract the segment headers and return them in the parameter called segments.
 
- Public Member Functions inherited from JASS::deserialised_jass_v1
 deserialised_jass_v1 (bool verbose=false)
 Constructor.
 
virtual ~deserialised_jass_v1 ()
 Destructor.
 
size_t read_index (const std::string &directory="")
 Read a JASS v1 index into memory.
 
compress_integer & codex (std::string &name, int32_t &d_ness) const
 Return a reference to a decompressor that can be used with this index.
 
const std::vector< std::string > & primary_keys (void) const
 Return the list of primary keys as a std::vector<std::string>.
 
const uint8_t * postings (void) const
 Return a pointer to the start of the postings "file".
 
query::DOCID_TYPE document_count (void) const
 Return the number of documents in the collection.
 
bool postings_details (metadata &metadata, const query_term &term) const
 Return the meta-data about the postings list.
 
auto begin (void)
 Return an iterator over the vocabulary.
 
auto end (void)
 Return an iterator to the end of the vocabulary.
 

Protected Member Functions

virtual size_t read_vocabulary (const std::string &vocab_filename="CIvocab.bin", const std::string &terms_filename="CIvocab_terms.bin")
 Read the JASS v2 index vocabulary files.
 
virtual size_t read_primary_keys (const std::string &primary_key_filename="CIdoclist.bin")
 Read the JASS v1 index primary key file.
 
- Protected Member Functions inherited from JASS::deserialised_jass_v1
virtual size_t read_postings (const std::string &postings_filename=POSTINGS_FILENAME)
 Read the JASS v1 index postings file.
 
size_t read_index_explicit (const std::string &primary_key_filename=PRIMARY_KEY_FILENAME, const std::string &vocab_filename=VOCAB_FILENAME, const std::string &terms_filename=TERMS_FILENAME, const std::string &postings_filename=POSTINGS_FILENAME)
 Read a JASS v1 index into memory.
 

Additional Inherited Members

- Protected Attributes inherited from JASS::deserialised_jass_v1
bool verbose
 Should this class produce diagnostics on stdout?
 
query::DOCID_TYPE documents
 The number of documents in the collection.
 
file::file_read_only primary_key_memory
 Memory used to store the primary key strings.
 
std::vector< std::string > primary_key_list
 The array of primary keys.
 
uint64_t terms
 The number of terms in the collection.
 
file::file_read_only vocabulary_memory
 Memory used to store the vocabulary pointers.
 
file::file_read_only vocabulary_terms_memory
 Memory used to store the vocabulary strings.
 
std::vector< metadata > vocabulary_list
 The array of vocabulary terms, sorted in alphabetical order.
 
file::file_read_only postings_memory
 Memory used to store the postings.
 

Detailed Description

Load and deserialise a JASS v2 index.

Constructor & Destructor Documentation

◆ deserialised_jass_v2()

JASS::deserialised_jass_v2::deserialised_jass_v2 ( bool  verbose = false)
inline explicit

Constructor.

Parameters
verbose [in] Should the index reading methods produce messages on stdout?

Member Function Documentation

◆ get_segment_list()

virtual size_t JASS::deserialised_jass_v2::get_segment_list ( segment_header *  segments,
metadata &  metadata,
size_t  query_term_frequency,
uint32_t &  smallest,
uint32_t &  largest,
query::DOCID_TYPE &  document_frequency 
) const
inline virtual

Extract the segment headers and return them in the parameter called segments.

Parameters
segments [out] The list of segments for the given search term (the caller must ensure this points to a large enough array)
metadata [in] The metadata for the given search term
smallest [out] The smallest impact score for this term
largest [out] The largest impact score for this term
Returns
The number of segments extracted and added to the list

Reimplemented from JASS::deserialised_jass_v1.
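
The caller-allocated output array described above can be illustrated with a minimal, self-contained sketch. It does not use the JASS headers: segment_header_sketch, its two fields, the capacity parameter, and get_segment_list_sketch are all hypothetical stand-ins, since this page does not show the real segment_header layout or how the stored headers are decoded. It only shows the calling convention: the caller owns the array, the function fills it, reports the smallest and largest impact through reference parameters, and returns the number of headers written.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for JASS::segment_header (the real layout is not shown on this page).
struct segment_header_sketch
	{
	uint32_t impact;      // assumed: impact score shared by the postings in this segment
	uint32_t postings;    // assumed: number of postings in this segment
	};

// Copy the stored headers into the caller-supplied array and report the impact range.
// The capacity parameter is an addition for this sketch; the documented method has no such parameter.
// Returns the number of headers written. If stored is empty, smallest is left at UINT32_MAX and largest at 0.
size_t get_segment_list_sketch(segment_header_sketch *segments, size_t capacity, const std::vector<segment_header_sketch> &stored, uint32_t &smallest, uint32_t &largest)
	{
	assert(stored.size() <= capacity);    // the caller must supply a large enough array

	smallest = UINT32_MAX;
	largest = 0;
	size_t count = 0;
	for (const auto &header : stored)
		{
		segments[count++] = header;
		if (header.impact < smallest)
			smallest = header.impact;
		if (header.impact > largest)
			largest = header.impact;
		}

	return count;
	}
```

A caller would typically declare a fixed-size array sized for the worst case, pass it in, and then iterate over the first n entries, where n is the return value.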

◆ postings_details()

bool JASS::deserialised_jass_v2::postings_details ( metadata &  metadata,
const query_term &  term 
) const
inline

Return the meta-data about the postings list.

Parameters
metadata [out] If the term is found then this is changed to contain the metadata about the term
term [in] Find the metadata for this term
Returns
true on success, false on failure (e.g. term not in dictionary)
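
As an illustration of the contract above (fill the output parameter and return true only when the term is found), here is a minimal sketch that searches a vocabulary sorted in alphabetical order. It is not the JASS implementation: metadata_sketch, its fields, and postings_details_sketch are hypothetical stand-ins, and the use of binary search via std::lower_bound is an assumption suggested only by the vocabulary being stored in sorted order.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for JASS::metadata (the real members are not shown on this page).
struct metadata_sketch
	{
	std::string term;     // assumed: the vocabulary term
	uint64_t offset;      // assumed: where the term's postings start
	};

// Binary search a vocabulary that is sorted by term.
// On success copy the entry into found and return true; otherwise leave found untouched and return false.
bool postings_details_sketch(metadata_sketch &found, const std::vector<metadata_sketch> &vocabulary, const std::string &term)
	{
	auto at = std::lower_bound(vocabulary.begin(), vocabulary.end(), term,
		[](const metadata_sketch &entry, const std::string &key)
			{
			return entry.term < key;
			});

	if (at == vocabulary.end() || at->term != term)
		return false;         // term not in the dictionary

	found = *at;
	return true;
	}
```

Because the output parameter is only written on success, a caller should check the return value before reading it.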

◆ read_primary_keys()

size_t JASS::deserialised_jass_v2::read_primary_keys ( const std::string &  primary_key_filename = "CIdoclist.bin")
protected virtual

Read the JASS v1 index primary key file.

Parameters
primary_key_filename [in] The name of the file containing the primary key list ("CIdoclist.bin")
Returns
The number of documents in the collection (or 0 on error)

Reimplemented from JASS::deserialised_jass_v1.

◆ read_vocabulary()

size_t JASS::deserialised_jass_v2::read_vocabulary ( const std::string &  vocab_filename = "CIvocab.bin",
const std::string &  terms_filename = "CIvocab_terms.bin" 
)
protected virtual

Read the JASS v2 index vocabulary files.

Parameters
vocab_filename [in] The name of the file containing the vocabulary pointers ("CIvocab.bin")
terms_filename [in] The name of the file containing the vocabulary strings ("CIvocab_terms.bin")
Returns
The number of terms in the collection (or 0 on error)

Reimplemented from JASS::deserialised_jass_v1.


The documentation for this class was generated from the following files: