JASSv2
The API to JASS's anytime search engine.
#include <JASS_anytime_api.h>


Classes | |
| class | thread_data |
Public Member Functions | |
| JASS_anytime_api () | |
| Constructor. | |
| virtual | ~JASS_anytime_api () |
| Destructor. | |
| JASS_ERROR | load_index (size_t index_version, const std::string &directory="", bool verbose=false) |
| Load a JASS index from the given directory. | |
| uint32_t | get_document_count (void) |
| Return the number of documents in the index. | |
| JASS_ERROR | get_encoding_scheme (std::string &codex_name, int32_t &d_ness) |
| Return the name of the compression algorithm and the delta (d-gap) size. | |
| std::string | get_encoding_scheme_name (void) |
| Return the name of the compression algorithm used on this index. | |
| int32_t | get_encoding_scheme_d (void) |
| Return the d-gap value used in this index. | |
| void | set_accumulator_manager (const std::string &name) |
| Set the name of the accumulator manager to use. | |
| JASS_ERROR | set_postings_to_process_proportion (double percent) |
| Set the number of postings to process as a proportion of the number of documents in the collection. | |
| JASS_ERROR | set_postings_to_process_relative (double percent) |
| Set the number of postings to process as a proportion of the number of postings for this query. | |
| JASS_ERROR | set_postings_to_process (size_t count) |
| Set the number of postings to process as an absolute number. | |
| uint32_t | get_postings_to_process (void) |
| Return the current maximum number of postings to process. This might be very large. | |
| uint32_t | get_max_top_k (void) |
| Return the largest possible top_k value, which might be smaller than the number of documents in the collection. | |
| JASS_ERROR | set_top_k (size_t k) |
| Set the maximum number of documents to return in a results list. | |
| size_t | get_top_k (void) |
| Return the current top-k value. | |
| JASS_ERROR | set_accumulator_width (size_t width) |
| Set the accumulator page-table width (assuming a page-table-like approach is being used). | |
| JASS_ERROR | use_ascii_parser (void) |
| Use the query parser that separates tokens on whitespace alone (this method is not normally needed). | |
| JASS_ERROR | use_query_parser (void) |
| Use the default query parser that understands alphanumerics, spaces, Unicode, and so on (this method is not normally needed). | |
| JASS_anytime_result | search (const std::string &query) |
| Search using the current index and the current parameters. | |
| JASS_ERROR | search (std::vector< JASS_anytime_thread_result > &output, std::vector< JASS_anytime_query > &query_list, size_t thread_count) |
| Search using the current index and the current parameters. | |
| std::vector< JASS_anytime_thread_result > | threaded_search (std::vector< std::string > &query_list, size_t thread_count) |
| Search using the current index and the current parameters. | |
Private Member Functions | |
| JASS_anytime_api (JASS_anytime_api &from)=delete | |
| JASS_anytime_api (JASS_anytime_api &&from)=delete | |
| JASS_anytime_api & | operator= (JASS_anytime_api &from)=delete |
| JASS_anytime_api & | operator= (JASS_anytime_api &&from)=delete |
| void | anytime (JASS_anytime_thread_result &output, std::vector< JASS_anytime_query > &query_list, size_t thread_number=0) |
| This method calls into the search engine with a set of queries and retrieves a set of results for each. | |
| thread_data & | get_thread_local_data (size_t thread_number) |
| Return the expensive-to-initialise thread local data that has already been allocated on index load. | |
Static Private Member Functions | |
| static void | anytime_bootstrap (JASS_anytime_api *thiss, JASS_anytime_thread_result &output, std::vector< JASS_anytime_query > &query_list, size_t thread_number) |
| Bootstrapping method for a thread to call into anytime(). | |
Private Attributes | |
| JASS::deserialised_jass_v1 * | index |
| The index. | |
| size_t | postings_to_process |
| The maximum number of postings to process. | |
| double | relative_postings_to_process |
| If not 1, this is the proportion of this query's postings that should be processed. | |
| size_t | top_k |
| The number of documents we want in the results list. | |
| JASS::parser_query::parser_type | which_query_parser |
| Use the simple ASCII parser or the regular query parser. | |
| size_t | accumulator_width |
| Width of the accumulator array. | |
| JASS_anytime_stats | stats |
| Stats for this "session". | |
| std::map< size_t, thread_data > | thread_local_data |
| Data needed by each thread (the accumulators array, etc) | |
| std::string | accumulator_manager |
| The name of the accumulator manager. | |
Static Private Attributes | |
| static constexpr size_t | MAX_QUANTUM = 0x0FFF |
| The maximum number of segments in a query. | |
| static constexpr size_t | MAX_TERMS_PER_QUERY = 1024 |
| The maximum number of terms in a query. | |
The API to JASS's anytime search engine.
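The member list above suggests a typical call order: load an index, set the search parameters, then search. The following is a minimal sketch of that order, not code taken from JASS. The helper name `configure_and_search` is hypothetical, and it is written as a template so that it compiles without the JASS headers; `Engine` is assumed to provide `load_index()`, `set_top_k()` and `search()` with the signatures listed on this page. Return codes are ignored for brevity, which real code should not do.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: Engine is assumed to look like JASS_anytime_api
// (load_index, set_top_k and search as listed on this page).
template <typename Engine>
auto configure_and_search(Engine &engine,
                          const std::string &index_directory,
                          std::size_t k,
                          const std::string &query)
    -> decltype(engine.search(query))
{
    // The index is loaded first because some setters on this page
    // (set_postings_to_process_proportion) require a loaded index.
    // Index version 2 is described on this page as the normal value.
    engine.load_index(2, index_directory, false);   // JASS_ERROR ignored in this sketch

    // Limit the length of the results list.
    engine.set_top_k(k);                            // JASS_ERROR ignored in this sketch

    // A leading number in the query is treated as the query id.
    return engine.search(query);
}
```

With the real library, `Engine` would be `JASS_anytime_api` and the returned object a `JASS_anytime_result`.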
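| void JASS_anytime_api::anytime | ( | JASS_anytime_thread_result & | output, |
| std::vector< JASS_anytime_query > & | query_list, | ||
| size_t | thread_number = 0 | ||
| ) |
private |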
This method calls into the search engine with a set of queries and retrieves a set of results for each.
| output | [out] The results for each query |
| query_list | [in] The list of queries to perform |
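| void JASS_anytime_api::anytime_bootstrap | ( | JASS_anytime_api * | thiss, |
| JASS_anytime_thread_result & | output, | ||
| std::vector< JASS_anytime_query > & | query_list, | ||
| size_t | thread_number | ||
| ) |
static private |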
Bootstrapping method for a thread to call into anytime()
| thiss | [in] Pointer to the object to call into |
| output | [out] The results for each query |
| query_list | [in] The list of queries to perform |
| thread_number | [in] The ID of this thread |
| JASS::query::DOCID_TYPE JASS_anytime_api::get_document_count | ( | void | ) |
Return the number of documents in the index.
| JASS_ERROR JASS_anytime_api::get_encoding_scheme | ( | std::string & | codex_name, |
| int32_t & | d_ness | ||
| ) |
Return the name of the compression algorithm and the delta (d-gap) size.
This method might not be exposed to languages (such as Python) that do not support non-const references to strings; use get_encoding_scheme_name() or get_encoding_scheme_d() instead.
| codex_name | [out] The compression scheme |
| d_ness | [out] The d-gap size (normally 1) |
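For readers unfamiliar with d-gaps, the sketch below shows the general idea: an ascending list of document ids is stored as differences, which are smaller numbers and so compress better. It is an illustration of the technique only. The function names are hypothetical, and the reading of the d value as "difference from the id d positions earlier" (with 0 meaning no delta) is an assumption; this page does not say how JASS interprets `d_ness` internally.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: replace each id with its gap from the id d positions
// earlier. The first d ids are stored unchanged. d == 0 is treated as
// "no delta" (an assumption made for this sketch).
std::vector<uint32_t> to_d_gaps(const std::vector<uint32_t> &docids, std::size_t d = 1)
{
    if (d == 0)
        return docids;
    std::vector<uint32_t> gaps(docids.size());
    for (std::size_t i = 0; i < docids.size(); i++)
        gaps[i] = i < d ? docids[i] : docids[i] - docids[i - d];
    return gaps;
}

// Inverse of to_d_gaps(): rebuild the ids by adding each gap to the
// already-decoded id d positions earlier.
std::vector<uint32_t> from_d_gaps(const std::vector<uint32_t> &gaps, std::size_t d = 1)
{
    if (d == 0)
        return gaps;
    std::vector<uint32_t> docids(gaps.size());
    for (std::size_t i = 0; i < gaps.size(); i++)
        docids[i] = i < d ? gaps[i] : gaps[i] + docids[i - d];
    return docids;
}
```

With d = 1, the ids 3, 7, 8, 15 become the gaps 3, 4, 1, 7, and decoding restores the original list.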
| int32_t JASS_anytime_api::get_encoding_scheme_d | ( | void | ) |
Return the d-gap value used in this index.
| std::string JASS_anytime_api::get_encoding_scheme_name | ( | void | ) |
Return the name of the compression algorithm used on this index.
| JASS::query::DOCID_TYPE JASS_anytime_api::get_max_top_k | ( | void | ) |
Return the largest possible top_k value, which might be smaller than the number of documents in the collection.
| uint32_t JASS_anytime_api::get_postings_to_process | ( | void | ) |
Return the current maximum number of postings to process. This might be very large.
| thread_data & JASS_anytime_api::get_thread_local_data | ( | size_t | thread_number | ) |
private |
Return the expensive-to-initialise thread local data that has already been allocated on index load.
| thread_number | [in] The thread number (counts from 0) |
| size_t JASS_anytime_api::get_top_k | ( | void | ) |
Return the current top-k value.
| JASS_ERROR JASS_anytime_api::load_index | ( | size_t | index_version, |
| const std::string & | directory = "", | ||
| bool | verbose = false | ||
| ) |
Load a JASS index from the given directory.
| index_version | [in] What version of the index this is - normally 2. |
| directory | [in] The path to the index, default = "." |
| verbose | [in] If true, diagnostics are printed while the index is loading, default = false |
| JASS_anytime_result JASS_anytime_api::search | ( | const std::string & | query | ) |
Search using the current index and the current parameters.
| query | [in] The query. If the query starts with a number, that number is assumed to be the query_id and NOT a search term (so it is not searched for). |
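The leading-number rule can be pictured with a small standalone sketch. This is not JASS's parser: `split_query_id` is a hypothetical helper, and the decision to count a digit run as a query id only when it is followed by whitespace or the end of the string is an assumption of this sketch.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical helper: returns {query_id, search terms}. query_id is empty
// when the query does not start with a standalone number.
std::pair<std::string, std::string> split_query_id(const std::string &query)
{
    std::size_t start = 0;
    while (start < query.size() && std::isspace(static_cast<unsigned char>(query[start])))
        start++;

    std::size_t end = start;
    while (end < query.size() && std::isdigit(static_cast<unsigned char>(query[end])))
        end++;

    // No leading digits, or digits glued to other characters (e.g. "3d"):
    // treat the whole query as search terms (assumption of this sketch).
    bool has_digits = end > start;
    bool at_token_end = end == query.size() || std::isspace(static_cast<unsigned char>(query[end]));
    if (!has_digits || !at_token_end)
        return std::make_pair(std::string(), query.substr(start));

    // Skip the whitespace between the id and the first search term.
    std::size_t rest = end;
    while (rest < query.size() && std::isspace(static_cast<unsigned char>(query[rest])))
        rest++;

    return std::make_pair(query.substr(start, end - start), query.substr(rest));
}
```

Under these assumptions, `"301 international crime"` splits into the id `301` and the terms `international crime`, so a query that genuinely needs to search for a leading number should be given an explicit id first.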
| JASS_ERROR JASS_anytime_api::search | ( | std::vector< JASS_anytime_thread_result > & | output, |
| std::vector< JASS_anytime_query > & | query_list, | ||
| size_t | thread_count | ||
| ) |
Search using the current index and the current parameters.
This method might not be exposed to languages (such as Python) that do not support non-const references to objects. Search one query at a time using search() and do the threading in the calling language if need be.
| output | [out] a vector of results, one for each thread |
| query_list | [in] A vector of queries to be spread over all the threads |
| thread_count | [in] The number of threads to use for searching |
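When the threading is done by the caller, the query list has to be divided among workers somehow. The sketch below shows one simple option, a static round-robin split that yields one bucket per thread. It is an assumption made for illustration: `partition_queries` is a hypothetical helper, and this page does not say how JASS itself assigns queries to threads.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: deal the queries out to thread_count buckets in
// round-robin order. Bucket t would then be searched by worker t, one
// query at a time.
std::vector<std::vector<std::string>> partition_queries(
    const std::vector<std::string> &query_list, std::size_t thread_count)
{
    if (thread_count == 0)
        thread_count = 1;   // always produce at least one bucket

    std::vector<std::vector<std::string>> per_thread(thread_count);
    for (std::size_t i = 0; i < query_list.size(); i++)
        per_thread[i % thread_count].push_back(query_list[i]);

    return per_thread;
}
```

A static split is easy to reason about but can leave threads idle when query costs vary; a shared work queue is the usual alternative.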
| void JASS_anytime_api::set_accumulator_manager | ( | const std::string & | name | ) |
inline |
Set the name of the accumulator manager to use.
| name | [in] The name of the manager to use |
| JASS_ERROR JASS_anytime_api::set_accumulator_width | ( | size_t | width | ) |
Set the accumulator page-table width (assuming a page-table-like approach is being used).
| width | [in] The width of the table will be set to 2^width |
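To make the 2^width relationship concrete, here is a minimal sketch of a page-table-like accumulator array. It is an illustration of the general technique, not the JASS implementation: the class `paged_accumulators` and all of its members are hypothetical. The accumulators are split into pages of 2^width entries, and a page is zeroed only the first time a query touches it, so resetting between queries costs one flag per page instead of one write per document.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical page-table-like accumulators (not the JASS class).
class paged_accumulators
{
    std::size_t width_;             // log2 of the page size
    std::vector<uint32_t> score_;   // one accumulator per document, padded to whole pages
    std::vector<char> stale_;       // 1 = page still holds values from an earlier query

public:
    paged_accumulators(std::size_t documents, std::size_t log2_width) :
        width_(log2_width),
        score_(((documents >> log2_width) + 1) << log2_width),
        stale_((documents >> log2_width) + 1, 1)
    {
        assert(log2_width < 32);
    }

    // Number of accumulators per page: 2^width.
    std::size_t page_size() const { return std::size_t{1} << width_; }

    // Add an impact score, zeroing the page first if it is stale.
    void add(std::size_t docid, uint32_t impact)
    {
        std::size_t page = docid >> width_;
        if (stale_[page])
        {
            std::fill(score_.begin() + (page << width_),
                      score_.begin() + ((page + 1) << width_), 0u);
            stale_[page] = 0;
        }
        score_[docid] += impact;
    }

    // A stale page reads as zero without being touched.
    uint32_t get(std::size_t docid) const
    {
        return stale_[docid >> width_] ? 0 : score_[docid];
    }

    // Start a new query: only the page flags are reset.
    void rewind() { std::fill(stale_.begin(), stale_.end(), 1); }
};
```

A small width means many small pages (cheap to zero, more flags to reset); a large width means the opposite, which is the trade-off a width setting exposes.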
| JASS_ERROR JASS_anytime_api::set_postings_to_process | ( | size_t | count | ) |
Set the number of postings to process as an absolute number.
An index does not need to be loaded first. Whichever of set_postings_to_process() and set_postings_to_process_proportion() is called last takes precedence. By default all postings are processed.
| count | [in] The maximum number of postings to process |
| JASS_ERROR JASS_anytime_api::set_postings_to_process_proportion | ( | double | percent | ) |
Set the number of postings to process as a proportion of the number of documents in the collection.
An index must be loaded before this method is called; if not, it returns JASS_ERROR_NO_INDEX and has no effect. Whichever of set_postings_to_process() and set_postings_to_process_proportion() is called last takes precedence. By default all postings are processed.
| percent | [in] The percentage to use (for example, 10 means 10%) |
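The arithmetic implied by this description can be sketched as follows. Everything in the block is a hypothetical stand-in: `status` plays the role of JASS_ERROR, `engine_state` holds just enough state for the example, and the exact rounding JASS applies is not stated on this page, so truncation is an assumption.

```cpp
#include <cstddef>
#include <limits>

// Hypothetical stand-in for JASS_ERROR.
enum class status { ok, no_index };

// Hypothetical stand-in for the relevant engine state. The default limit
// is the largest size_t, i.e. "process all postings".
struct engine_state
{
    bool index_loaded = false;
    std::size_t document_count = 0;
    std::size_t postings_to_process = std::numeric_limits<std::size_t>::max();
};

// Convert a percentage of the collection size into an absolute postings limit.
status set_proportion(engine_state &state, double percent)
{
    // Without an index the document count is unknown, so nothing changes.
    if (!state.index_loaded)
        return status::no_index;

    state.postings_to_process =
        static_cast<std::size_t>(state.document_count * percent / 100.0);
    return status::ok;
}
```

For a collection of 1,000,000 documents, a value of 10 gives a limit of 100,000 postings under these assumptions.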
| JASS_ERROR JASS_anytime_api::set_postings_to_process_relative | ( | double | percent | ) |
Set the number of postings to process as a proportion of the number of postings for this query.
An index does not need to be loaded first. This method takes precedence over set_postings_to_process() and set_postings_to_process_proportion(). By default all postings are processed.
| percent | [in] The percentage to use (for example, 10 means use 10% of the postings) |
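The precedence rule can be sketched with a small standalone model. The struct `postings_budget` and its members are hypothetical, and storing the relative setting as a fraction where 1 means "unset" is an assumption based on the description of the `relative_postings_to_process` attribute above; the real engine may combine the limits differently.

```cpp
#include <cstddef>
#include <limits>

// Hypothetical model of the two kinds of limit (not the JASS class).
struct postings_budget
{
    // Absolute limit; the default means "process all postings".
    std::size_t absolute_limit = std::numeric_limits<std::size_t>::max();
    // Fraction of the query's own postings; 1.0 means "not set".
    double relative = 1.0;

    void set_absolute(std::size_t count) { absolute_limit = count; }
    void set_relative(double percent) { relative = percent / 100.0; }

    // The limit that applies to a query with query_postings postings.
    // A relative setting, when present, wins regardless of call order.
    std::size_t limit_for(std::size_t query_postings) const
    {
        if (relative != 1.0)
            return static_cast<std::size_t>(query_postings * relative);
        return absolute_limit;
    }
};
```

Because the relative limit depends on each query's postings count, it can only be resolved at search time, which is consistent with no index being required when it is set.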
| JASS_ERROR JASS_anytime_api::set_top_k | ( | size_t | k | ) |
Set the maximum number of documents to return in a results list.
| k | [in] The k value for top-k search |
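As an illustration of what a top-k limit means for a results list, the sketch below selects the k highest-scoring documents with `std::partial_sort`. It is not how JASS maintains its results: `top_k_results` is a hypothetical helper, and breaking ties by ascending document id is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// (document id, score)
typedef std::pair<uint32_t, uint32_t> scored_document;

// Hypothetical helper: return at most k documents, highest score first.
std::vector<scored_document> top_k_results(std::vector<scored_document> scored, std::size_t k)
{
    // k is a maximum: fewer documents are returned if fewer were scored.
    k = std::min(k, scored.size());

    // Order only the first k positions; the rest stay unsorted.
    std::partial_sort(scored.begin(), scored.begin() + k, scored.end(),
        [](const scored_document &a, const scored_document &b)
        {
            if (a.second != b.second)
                return a.second > b.second;   // higher score first
            return a.first < b.first;         // assumed tie-break: lower id first
        });

    scored.resize(k);
    return scored;
}
```

Selecting only k results avoids fully sorting every scored document, which is why a small k is cheaper than asking for the whole collection.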
| std::vector< JASS_anytime_thread_result > JASS_anytime_api::threaded_search | ( | std::vector< std::string > & | query_list, |
| size_t | thread_count | ||
| ) |
Search using the current index and the current parameters.
| query_list | [in] A vector of queries to be spread over all the threads |
| thread_count | [in] The number of threads to use for searching |
| JASS_ERROR JASS_anytime_api::use_ascii_parser | ( | void | ) |
Use the query parser that separates tokens on whitespace alone (this method is not normally needed).
| JASS_ERROR JASS_anytime_api::use_query_parser | ( | void | ) |
Use the default query parser that understands alphanumerics, spaces, Unicode, and so on (this method is not normally needed).