crawlserv++  [under development]
Application for crawling and analyzing textual content of websites.
crawlservpp::Query::XPath Class Reference

Implements an XPath query using the pugixml library.

#include <XPath.hpp>

Classes

class  Exception
 Class for XPath exceptions.
 

Construction

 XPath (const std::string &xpath, bool textOnly)
 Constructor setting an XPath string and whether the result should be text-only.
 

Getters

bool getBool (const Parsing::XML &doc) const
 Gets a boolean result from performing the query on a parsed XML document.
 
void getFirst (const Parsing::XML &doc, std::string &resultTo) const
 Gets the first match from performing the query on a parsed XML document.
 
void getAll (const Parsing::XML &doc, std::vector< std::string > &resultTo) const
 Gets all matches from performing the query on a parsed XML document.
 
void getSubSets (const Parsing::XML &doc, std::vector< Parsing::XML > &resultTo) const
 Gets all matching subsets from performing the query on a parsed XML document.
 

Detailed Description

Implements an XPath query using the pugixml library.

For more information about the pugixml library, see its GitHub repository.

Constructor & Destructor Documentation

◆ XPath()

crawlservpp::Query::XPath::XPath ( const std::string &  xpath,
bool  textOnly 
)
inline

Constructor setting an XPath string and whether the result should be text-only.

Parameters
xpath  Const reference to a string containing the XPath expression.
textOnly  Set whether the query should result in raw text only.
Exceptions
XPath::Exception  if an error occurs during the compilation of the XPath expression.

Member Function Documentation

◆ getAll()

void crawlservpp::Query::XPath::getAll ( const Parsing::XML &  doc,
std::vector< std::string > &  resultTo 
) const
inline

Gets all matches from performing the query on a parsed XML document.

Parameters
doc  Const reference to an XML document parsed by tidy-html5.
resultTo  Reference to a vector to which the results will be written. The vector will be cleared even if an error occurs during execution of the query.
Exceptions
XPath::Exception  if no XPath query has been compiled, no XML document has been parsed, or an error occurs during the execution of the query.

Referenced by crawlservpp::Main::Server::tick().
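
The "cleared even if an error occurs" guarantee means a caller never sees stale results after a failed query. The following is a minimal sketch of that contract only, not the actual implementation: getAllSketch(), its queryFails flag, and entriesAfterCall() are hypothetical names, and std::runtime_error stands in for XPath::Exception.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a getter such as XPath::getAll(); NOT the real code.
// It models only the documented contract: clear the output first, then run the query.
void getAllSketch(bool queryFails, std::vector<std::string>& resultTo) {
    resultTo.clear(); // cleared before anything can throw

    if(queryFails) {
        // stands in for XPath::Exception
        throw std::runtime_error("error during execution of the query");
    }

    resultTo.emplace_back("first match");
    resultTo.emplace_back("second match");
}

// Returns how many entries the caller sees after the call, whether or not it threw.
std::size_t entriesAfterCall(bool queryFails) {
    // simulate a vector that is reused across several queries
    std::vector<std::string> results{"stale entry from an earlier query"};

    try {
        getAllSketch(queryFails, results);
    }
    catch(const std::runtime_error&) {
        // on error, 'results' is empty instead of holding the stale entry
    }

    return results.size();
}
```

Because the output is cleared up front, a result vector can be reused across queries without an explicit clear() between calls. Under the same assumption, getFirst() and getSubSets() would behave analogously for their string and vector outputs.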

◆ getBool()

bool crawlservpp::Query::XPath::getBool ( const Parsing::XML &  doc ) const
inline

Gets a boolean result from performing the query on a parsed XML document.

Parameters
doc  Const reference to an XML document parsed by tidy-html5.
Returns
True if there is at least one match after performing the query on the document, false otherwise.
Exceptions
XPath::Exception  if no XPath query has been compiled, no XML document has been parsed, or an error occurs during the execution of the query.

Referenced by crawlservpp::Main::Server::tick().

◆ getFirst()

void crawlservpp::Query::XPath::getFirst ( const Parsing::XML &  doc,
std::string &  resultTo 
) const
inline

Gets the first match from performing the query on a parsed XML document.

Parameters
doc  Const reference to an XML document parsed by tidy-html5.
resultTo  Reference to a string to which the result will be written. The string will be cleared even if an error occurs during execution of the query.
Exceptions
XPath::Exception  if no XPath query has been compiled, no XML document has been parsed, or an error occurs during the execution of the query.

Referenced by crawlservpp::Main::Server::tick().

◆ getSubSets()

void crawlservpp::Query::XPath::getSubSets ( const Parsing::XML &  doc,
std::vector< Parsing::XML > &  resultTo 
) const
inline

Gets all matching subsets from performing the query on a parsed XML document.

The subsets will be saved as parsed XML documents (Parsing::XML).

Parameters
doc  Const reference to an XML document parsed by tidy-html5.
resultTo  Reference to a vector to which the results will be written. The vector will be cleared even if an error occurs during execution of the query.
Exceptions
XPath::Exception  if no XPath query has been compiled, no XML document has been parsed, or an error occurs during the execution of the query.

References crawlservpp::Query::cDataHead, crawlservpp::Query::cDataTail, and crawlservpp::Helper::Strings::trim().

Referenced by crawlservpp::Main::Server::tick().
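
The references to cDataHead, cDataTail, and trim() suggest that getSubSets() trims whitespace and handles CDATA wrappers around matched content; how exactly it does so is not documented here. The sketch below shows one plausible way to unwrap such a string. The constant values, trimSketch(), and stripCData() are assumptions made for illustration and are not taken from the crawlserv++ sources.

```cpp
#include <string>

// Assumed values: the standard CDATA delimiters. The real constants are
// crawlservpp::Query::cDataHead and crawlservpp::Query::cDataTail.
const std::string cDataHeadSketch{"<![CDATA["};
const std::string cDataTailSketch{"]]>"};

// Hypothetical stand-in for crawlservpp::Helper::Strings::trim():
// returns a copy without leading and trailing whitespace.
std::string trimSketch(const std::string& in) {
    const char* const whitespace{" \t\r\n"};
    const auto first = in.find_first_not_of(whitespace);

    if(first == std::string::npos) {
        return std::string{}; // empty or whitespace only
    }

    const auto last = in.find_last_not_of(whitespace);

    return in.substr(first, last - first + 1);
}

// Trims the input and, if it is fully wrapped in a CDATA section,
// returns only the wrapped content.
std::string stripCData(const std::string& in) {
    std::string out{trimSketch(in)};
    const auto wrapperLength = cDataHeadSketch.size() + cDataTailSketch.size();

    if(
        out.size() >= wrapperLength
        && out.compare(0, cDataHeadSketch.size(), cDataHeadSketch) == 0
        && out.compare(
            out.size() - cDataTailSketch.size(),
            cDataTailSketch.size(),
            cDataTailSketch
        ) == 0
    ) {
        out = out.substr(cDataHeadSketch.size(), out.size() - wrapperLength);
    }

    return out;
}
```

With these assumptions, stripCData("  <![CDATA[<p>text</p>]]>\n") yields "<p>text</p>", while input without a CDATA wrapper is only trimmed.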


The documentation for this class was generated from the following file: