crawlserv++
[under development]
Application for crawling and analyzing textual content of websites.
Namespace for global JSON helper functions.
Classes
class | Exception |
Class for JSON exceptions.
Constants
constexpr auto | unicodeEscapeLength {6} |
The length of an escaped Unicode character in JSON code (including the '\u').
constexpr auto | unicodeEscapeDigit1 {2} |
The offset of the first Unicode character digit in JSON code (from the '\').
constexpr auto | unicodeEscapeDigit2 {3} |
The offset of the second Unicode character digit in JSON code (from the '\').
constexpr auto | unicodeEscapeDigit3 {4} |
The offset of the third Unicode character digit in JSON code (from the '\').
constexpr auto | unicodeEscapeDigit4 {5} |
The offset of the fourth Unicode character digit in JSON code (from the '\').
constexpr auto | numDebugChars {25} |
The number of characters to show before and after a JSON error.
Stringification
std::string | stringify (const std::vector< std::string > &vectorToStringify) |
Stringifies a vector of strings into one string containing a JSON array.
std::string | stringify (const std::string &stringToStringify) |
Converts a string into a JSON array with the string as the only element inside it.
std::string | stringify (const char *stringToStringify) |
Converts a string into a JSON array with the string as the only element inside it.
std::string | stringify (const std::vector< std::vector< std::pair< std::string, std::string >>> &vectorToStringify) |
Converts a vector of vectors of string pairs into a JSON array with corresponding objects containing [key, value] pairs.
std::string | stringify (const Struct::TextMap &textMapToStringify) |
Converts a text map into a JSON array with corresponding objects containing [key, value] pairs.
std::string | stringify (const rapidjson::Value &value) |
Stringifies a JSON value using RapidJSON.
std::string | stringify (const jsoncons::json &json) |
Stringifies a JSON value using jsoncons.
Parsing
std::string | cleanCopy (std::string_view json) |
Copies and cleans the given JSON code to prepare it for parsing.
rapidjson::Document | parseRapid (std::string_view json) |
Parses JSON code using RapidJSON.
jsoncons::json | parseCons (std::string_view json) |
Parses JSON code using jsoncons.
Struct::TextMap | parseTextMapJson (std::string_view json) |
Parses JSON code using RapidJSON and converts it into a text map.
std::vector< std::pair< std::size_t, std::size_t > > | parsePosLenPairsJson (std::string_view json) |
Parses JSON code using RapidJSON and converts it into [pos;length] pairs.
Memory
static void | free (rapidjson::Document &target) |
Frees memory by swapping.
Namespace for global JSON helper functions.
std::string cleanCopy (std::string_view json) | inline
Copies and cleans the given JSON code to prepare it for parsing.
Removes control characters and corrects escape sequences in the given JSON code.
If the given JSON code is empty, an empty string will be returned.
The escape sequences valid in JSON are \", \\, \/, \b, \f, \n, \r, \t, as well as \u + 4 hex digits.
json | A string view containing the JSON code to be copied and cleaned. |
References unicodeEscapeDigit1, unicodeEscapeDigit2, unicodeEscapeDigit3, unicodeEscapeDigit4, and unicodeEscapeLength.
Referenced by parseCons(), and parseRapid().
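The cleaning step described above can be sketched with the standard library alone. This is not the crawlserv++ implementation: cleanCopySketch() and isValidEscapeAt() are hypothetical names, and the sketch assumes that "removing control characters" means dropping bytes below 0x20 and that "correcting" an escape sequence means doubling a backslash that does not start a valid JSON escape. The constant values are the ones documented on this page.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>

// Values as documented on this page (offsets are counted from the '\').
constexpr std::size_t unicodeEscapeLength{6};
constexpr std::size_t unicodeEscapeDigit1{2};
constexpr std::size_t unicodeEscapeDigit4{5};

// Hypothetical helper: does a valid JSON escape sequence start at json[pos]?
// json[pos] is expected to be a backslash.
bool isValidEscapeAt(std::string_view json, std::size_t pos) {
    if(pos + 1 >= json.size()) {
        return false; // lone backslash at the very end
    }

    switch(json[pos + 1]) {
    case '"': case '\\': case '/': case 'b':
    case 'f': case 'n': case 'r': case 't':
        return true;

    case 'u':
        if(pos + unicodeEscapeLength > json.size()) {
            return false; // not enough room for four hex digits
        }

        for(std::size_t n{unicodeEscapeDigit1}; n <= unicodeEscapeDigit4; ++n) {
            if(std::isxdigit(static_cast<unsigned char>(json[pos + n])) == 0) {
                return false;
            }
        }

        return true;

    default:
        return false;
    }
}

// Assumption: control characters are dropped, invalid escapes get their
// backslash doubled, everything else is copied unchanged.
std::string cleanCopySketch(std::string_view json) {
    std::string result;

    result.reserve(json.size());

    for(std::size_t pos{0}; pos < json.size(); ++pos) {
        const unsigned char c{static_cast<unsigned char>(json[pos])};

        if(c < 0x20) {
            continue; // drop control character
        }

        if(c == '\\') {
            if(isValidEscapeAt(json, pos)) {
                result += json[pos];     // keep the backslash ...
                result += json[pos + 1]; // ... and the character it escapes
                ++pos;
            }
            else {
                result += "\\\\"; // escape the stray backslash
            }

            continue;
        }

        result += json[pos];
    }

    return result;
}
```

Under these assumptions, the input "x\qy" (with an invalid escape) becomes "x\\qy", while "\u00e9" is copied unchanged.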
static void free (rapidjson::Document &target) | inline, static
Frees memory by swapping.
target | The rapidjson object to be freed by swapping. |
Referenced by crawlservpp::Query::Container::clearQueryTarget(), crawlservpp::Query::Container::nextSubSet(), crawlservpp::Query::Container::reserveForSubSets(), and crawlservpp::Main::Server::tick().
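The swap idiom behind this function can be illustrated without RapidJSON. The following sketch uses a hypothetical generic helper, freeBySwapping(), and demonstrates it on a std::vector; the documented function applies the same idea to a rapidjson::Document. Clearing a container usually keeps its allocated storage, whereas swapping with a freshly constructed object hands the old storage to a temporary that is destroyed right away.

```cpp
#include <vector>

// Hypothetical generic version of the idiom; works for any type with a
// default constructor and a member swap().
template<typename T>
void freeBySwapping(T& target) {
    T empty{};          // fresh object that owns no data

    empty.swap(target); // target now holds the empty state
}                       // the old contents are released together with 'empty'
```

A possible use: after filling a std::vector<int> with a thousand elements, calling freeBySwapping() on it leaves the vector empty and without its former allocation.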
jsoncons::json parseCons (std::string_view json) | inline
Parses JSON code using jsoncons.
If the initial parsing fails and the JSON code contains backslashes, it tries again after escaping these backslashes, i.e., replacing '\' with '\\'.
For more information about jsoncons, see its GitHub repository.
json | A string view containing the JSON code to parse. |
Json::Exception | if an error occurs while parsing the given JSON code. |
References cleanCopy(), and crawlservpp::Helper::Strings::replaceAll().
Referenced by crawlservpp::Query::Container::addSubSetsFromQueryOnSubSet(), crawlservpp::Query::Container::getBoolFromQuery(), crawlservpp::Query::Container::getBoolFromQueryOnSubSet(), crawlservpp::Query::Container::getMultiFromQuery(), crawlservpp::Query::Container::getMultiFromQueryOnSubSet(), crawlservpp::Query::Container::getSingleFromQuery(), crawlservpp::Query::Container::getSingleFromQueryOnSubSet(), crawlservpp::Query::Container::reserveForSubSets(), crawlservpp::Query::Container::setSubSetsFromQuery(), and crawlservpp::Main::Server::tick().
std::vector< std::pair< std::size_t, std::size_t > > parsePosLenPairsJson (std::string_view json) | inline
Parses JSON code using RapidJSON and converts it into [pos;length] pairs.
If the given JSON code is empty, an empty vector will be returned.
For more information about RapidJSON, see its GitHub repository.
json | A string view containing the JSON code to be parsed and converted into pairs of numbers. |
Json::Exception | if the given string view does not contain valid JSON code or the contained JSON code does not describe a valid array of [pos;length] pairs. |
References parseRapid().
Referenced by crawlservpp::Module::Analyzer::Database::checkSources().
rapidjson::Document parseRapid (std::string_view json) | inline
Parses JSON code using RapidJSON.
If the initial parsing fails and the JSON code contains backslashes, it tries again after escaping these backslashes, i.e., replacing '\' with '\\'.
For more information about RapidJSON, see its GitHub repository.
json | A string view containing the JSON code to parse. |
Returns a RapidJSON document containing the parsed JSON.
Json::Exception | if an error occurs while parsing the given JSON code. |
References cleanCopy(), numDebugChars, and crawlservpp::Helper::Strings::replaceAll().
Referenced by crawlservpp::Query::Container::addSubSetsFromQueryOnSubSet(), crawlservpp::Main::Database::duplicateWebsite(), crawlservpp::Query::Container::getBoolFromQuery(), crawlservpp::Query::Container::getBoolFromQueryOnSubSet(), crawlservpp::Query::Container::getMultiFromQuery(), crawlservpp::Query::Container::getMultiFromQueryOnSubSet(), crawlservpp::Query::Container::getSingleFromQuery(), crawlservpp::Query::Container::getSingleFromQueryOnSubSet(), crawlservpp::Module::Config::loadConfig(), parsePosLenPairsJson(), parseTextMapJson(), crawlservpp::Query::Container::reserveForSubSets(), crawlservpp::Query::Container::setSubSetsFromQuery(), and crawlservpp::Main::Server::tick().
Struct::TextMap parseTextMapJson (std::string_view json) | inline
Parses JSON code using RapidJSON and converts it into a text map.
If the given JSON code is empty, an empty text map will be returned.
For more information about RapidJSON, see its GitHub repository.
json | A string view containing the JSON code to be parsed and converted into a text map. |
Json::Exception | if the given string view does not contain valid JSON code or the contained JSON code does not describe a valid text map. |
References parseRapid().
Referenced by crawlservpp::Module::Analyzer::Database::checkSources().
std::string stringify (const std::vector< std::string > &vectorToStringify) | inline
Stringifies a vector of strings into one string containing a JSON array.
Uses RapidJSON for conversion into JSON. For more information about RapidJSON, see its GitHub repository.
vectorToStringify | A const reference to the vector of strings to be combined and converted into valid JSON code. |
Referenced by crawlservpp::Module::Analyzer::Database::checkSources(), crawlservpp::Main::Database::duplicateWebsite(), crawlservpp::Query::JsonPointer::getAll(), crawlservpp::Query::JsonPointer::getFirst(), crawlservpp::Module::Parser::Thread::onReset(), crawlservpp::Module::Extractor::Thread::onReset(), crawlservpp::Query::Container::reserveForSubSets(), crawlservpp::Main::Server::tick(), and crawlservpp::Module::Analyzer::Thread::uploadResult().
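To make the result of this conversion concrete, here is a minimal sketch that builds a JSON array from a vector of strings by hand. It deliberately does not use RapidJSON, which the documented function relies on; stringifySketch() and escapeJsonString() are hypothetical names, and the compact output without whitespace is an assumption.

```cpp
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Escapes one string so that it can be placed inside a JSON string literal.
std::string escapeJsonString(const std::string& in) {
    std::string out;

    out.reserve(in.size());

    for(const char c : in) {
        switch(c) {
        case '"':  out += "\\\""; break;
        case '\\': out += "\\\\"; break;
        case '\b': out += "\\b";  break;
        case '\f': out += "\\f";  break;
        case '\n': out += "\\n";  break;
        case '\r': out += "\\r";  break;
        case '\t': out += "\\t";  break;

        default:
            if(static_cast<unsigned char>(c) < 0x20) {
                // remaining control characters become \u + 4 hex digits
                char buffer[7];

                std::snprintf(
                        buffer,
                        sizeof(buffer),
                        "\\u%04X",
                        static_cast<unsigned int>(static_cast<unsigned char>(c))
                );

                out += buffer;
            }
            else {
                out += c;
            }
        }
    }

    return out;
}

// Combines the given strings into one string containing a JSON array.
std::string stringifySketch(const std::vector<std::string>& strings) {
    std::string result("[");

    for(std::size_t index{0}; index < strings.size(); ++index) {
        if(index > 0) {
            result += ',';
        }

        result += '"';
        result += escapeJsonString(strings[index]);
        result += '"';
    }

    result += ']';

    return result;
}
```

With this sketch, a vector holding the strings a and b yields ["a","b"], and an empty vector yields [].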
std::string stringify (const std::string &stringToStringify) | inline
Converts a string into a JSON array with the string as the only element inside it.
Uses RapidJSON for conversion into JSON. For more information about RapidJSON, see its GitHub repository.
stringToStringify | A const reference to the string to be converted into a JSON array. |
std::string stringify (const char *stringToStringify) | inline
Converts a string into a JSON array with the string as the only element inside it.
Uses RapidJSON for conversion into JSON. For more information about RapidJSON, see its GitHub repository.
stringToStringify | A const pointer to a null-terminated string to be converted into a JSON array. |
References stringify().
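std::string stringify (const std::vector< std::vector< std::pair< std::string, std::string >>> &vectorToStringify) | inline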
Converts a vector of vectors of string pairs into a JSON array with corresponding objects containing [key, value] pairs.
Uses RapidJSON for conversion into JSON. For more information about RapidJSON, see its GitHub repository.
vectorToStringify | A const reference to the vector containing vectors of string pairs, each of which represents a [key, value] pair. |
std::string stringify (const Struct::TextMap &textMapToStringify) | inline
Converts a text map into a JSON array with corresponding objects containing [key, value] pairs.
Uses RapidJSON for conversion into JSON. For more information about RapidJSON, see its GitHub repository.
textMapToStringify | A const reference to the text map to be represented as [key, value] pairs. |
References crawlservpp::Helper::Utf8::length().
std::string stringify (const rapidjson::Value &value) | inline
Stringifies a JSON value using RapidJSON.
For more information about RapidJSON, see its GitHub repository.
value | The RapidJSON value to be stringified. |
std::string stringify (const jsoncons::json &json) | inline
Stringifies a JSON value using jsoncons.
For more information about jsoncons, see its GitHub repository.
json | The jsoncons value to be stringified. |
Referenced by stringify().
constexpr auto numDebugChars {25} | inline
The number of characters to show before and after a JSON error.
Referenced by parseRapid().
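How such a constant might be used to build an error excerpt is sketched below. errorContext() is a hypothetical helper, not a function of this namespace, and how parseRapid() actually formats its error messages is not shown on this page. The sketch declares the constant as std::size_t (the documented one is declared with auto) to avoid mixing signed and unsigned arithmetic.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <string_view>

// Value as documented on this page.
constexpr std::size_t numDebugChars{25};

// Hypothetical helper: returns up to numDebugChars characters on each
// side of the given error offset, clamped to the bounds of the JSON code.
std::string errorContext(std::string_view json, std::size_t errorOffset) {
    errorOffset = std::min(errorOffset, json.size());

    const std::size_t begin{
        errorOffset > numDebugChars ? errorOffset - numDebugChars : 0
    };
    const std::size_t end{std::min(errorOffset + numDebugChars, json.size())};

    return std::string(json.substr(begin, end - begin));
}
```

For an error in the middle of a long document, the excerpt is 50 characters long; near the beginning or end of the input, it is shortened accordingly.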
constexpr auto unicodeEscapeDigit1 {2} | inline
The offset of the first Unicode character digit in JSON code (from the '\').
Referenced by cleanCopy().
constexpr auto unicodeEscapeDigit2 {3} | inline
The offset of the second Unicode character digit in JSON code (from the '\').
Referenced by cleanCopy().
constexpr auto unicodeEscapeDigit3 {4} | inline
The offset of the third Unicode character digit in JSON code (from the '\').
Referenced by cleanCopy().
constexpr auto unicodeEscapeDigit4 {5} | inline
The offset of the fourth Unicode character digit in JSON code (from the '\').
Referenced by cleanCopy().
constexpr auto unicodeEscapeLength {6} | inline
The length of an escaped Unicode character in JSON code (including the '\u').
Referenced by cleanCopy().