JASSv2
JASS::ranking_function_atire_bm25 Class Reference

The ATIRE version of BM25.

#include <ranking_function_atire_bm25.h>

Public Member Functions

 ranking_function_atire_bm25 (double k1, double b, std::vector< compress_integer::integer > &document_lengths)
 Constructor.
 
 ~ranking_function_atire_bm25 ()
 Destructor.
 
forceinline void compute_idf_component (compress_integer::integer document_frequency, compress_integer::integer documents_in_collection)
 Called once per term (per query). Computes the IDF component of the ranking function and stores it internally.
 
forceinline void compute_tf_component (index_postings_impact::impact_type term_frequency)
 Compute and store internally the term-frequency based component of the ranking function (useful when postings lists are impact ordered).
 
forceinline double compute_score (compress_integer::integer document_id, index_postings_impact::impact_type term_frequency)
 Compute BM25 from the given document, assuming pieces have already been computed.
 

Static Public Member Functions

static void unittest (void)
 Unit test this class.
 

Private Attributes

double idf
 the IDF of the term being processed
 
double top_row
 the top-row of the ranking function for the term being processed (tf(td) * (k1 + 1))
 
double k1_plus_1
 k1 + 1
 
double mean_document_length
 the mean of the document lengths
 
std::vector< float > length_correction
 most of the bottom row of BM25 (k1 * ((1 - b) + b * length / mean_document_length)) for the current term being processed
 

Detailed Description

The ATIRE version of BM25.

Constructor & Destructor Documentation

◆ ranking_function_atire_bm25()

JASS::ranking_function_atire_bm25::ranking_function_atire_bm25 ( double  k1,
double  b,
std::vector< compress_integer::integer > &  document_lengths 
)
inline

Constructor.

Parameters
k1 [in] The BM25 k1 parameter; 0.9 is a good value.
b [in] The BM25 b parameter; 0.4 is a good value.
document_lengths [in] A vector holding the length of each document in the collection.

Member Function Documentation

◆ compute_idf_component()

forceinline void JASS::ranking_function_atire_bm25::compute_idf_component ( compress_integer::integer  document_frequency,
compress_integer::integer  documents_in_collection 
)
inline

Called once per term (per query). Computes the IDF component of the ranking function and stores it internally.

Parameters
document_frequency [in] The number of documents that contain this term.
documents_in_collection [in] The number of documents in the collection.
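
This page does not state the IDF formula. As a sketch only, the following assumes the form usually attributed to ATIRE's BM25, ln(N / n), where N is documents_in_collection and n is document_frequency. The function name atire_idf and the std::uint32_t parameter types are hypothetical stand-ins.

```cpp
#include <cmath>
#include <cstdint>

// Assumed ATIRE-style IDF: ln(N / n). The exact formula used by JASS is not
// given in this documentation, so treat this as an illustration.
double atire_idf(std::uint32_t document_frequency, std::uint32_t documents_in_collection)
{
    return std::log(static_cast<double>(documents_in_collection) / static_cast<double>(document_frequency));
}
```

Under this assumption the IDF is never negative when document_frequency is at most documents_in_collection, and it is zero for a term that occurs in every document.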

◆ compute_score()

forceinline double JASS::ranking_function_atire_bm25::compute_score ( compress_integer::integer  document_id,
index_postings_impact::impact_type  term_frequency 
)
inline

Compute BM25 from the given document, assuming pieces have already been computed.

First compute the IDF for the term using compute_idf_component(), then the TF component using compute_tf_component(), then call this.

Parameters
document_id [in] The ID of the document (used to look up the length).
term_frequency [in] The number of times the term occurs in the document.
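
The call order described above (IDF, then TF, then score) can be sketched with a stand-in class. The class bm25_sketch below is hypothetical: it mirrors the method names on this page, but uses std::uint32_t in place of the JASS integer and impact types, and assumes both the ln(N / n) IDF and the combination idf * top_row / (tf + length_correction). It is not the JASS implementation.

```cpp
#include <cmath>
#include <cstdint>
#include <numeric>
#include <vector>

// Stand-in for illustration only (not the JASS class).
class bm25_sketch
{
private:
    double idf = 0;                         // set by compute_idf_component()
    double top_row = 0;                     // set by compute_tf_component()
    double k1_plus_1;                       // k1 + 1
    std::vector<float> length_correction;   // k1 * ((1 - b) + b * length / mean)

public:
    bm25_sketch(double k1, double b, const std::vector<std::uint32_t> &document_lengths) :
        k1_plus_1(k1 + 1.0)
    {
        // Assumes at least one document with non-zero length.
        double total = std::accumulate(document_lengths.begin(), document_lengths.end(), 0.0);
        double mean = document_lengths.empty() ? 1.0 : total / document_lengths.size();
        for (std::uint32_t length : document_lengths)
            length_correction.push_back(static_cast<float>(k1 * ((1.0 - b) + b * length / mean)));
    }

    // Step 1: once per query term (assumed formula: ln(N / n)).
    void compute_idf_component(std::uint32_t document_frequency, std::uint32_t documents_in_collection)
    {
        idf = std::log(static_cast<double>(documents_in_collection) / document_frequency);
    }

    // Step 2: once per distinct term frequency.
    void compute_tf_component(std::uint32_t term_frequency)
    {
        top_row = term_frequency * k1_plus_1;
    }

    // Step 3: once per posting, reusing the stored idf and top_row.
    double compute_score(std::uint32_t document_id, std::uint32_t term_frequency) const
    {
        return idf * (top_row / (term_frequency + length_correction[document_id]));
    }
};
```

In this sketch, calling compute_score() before the other two methods would use the zero-initialised idf and top_row and return 0, which is why the order matters.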

◆ compute_tf_component()

forceinline void JASS::ranking_function_atire_bm25::compute_tf_component ( index_postings_impact::impact_type  term_frequency)
inline

Compute and store internally the term-frequency based component of the ranking function (useful when postings lists are impact ordered).

Parameters
term_frequency [in] The number of times the term occurs in the document.

The documentation for this class was generated from the following file: