crawlserv++
[under development]
Application for crawling and analyzing textual content of websites.
Namespace for global string helper functions.
Constants | |
| constexpr std::array | utfWhitespaces |
| UTF-8 whitespaces used by utfTidy(). | |
| constexpr auto | checkHexLength {3} |
| Length of a two-digit hexadecimal number including the preceding percentage sign. | |
| constexpr auto | randCharSet |
| Characters to be chosen from for random string generation performed by generateRandom(). | |
Replacing | |
| void | replaceAll (std::string &strInOut, std::string_view needle, std::string_view replacement) |
| Replaces all occurrences within a string with another string. | |
Conversion | |
| bool | stringToBool (std::string inputString) |
| Converts a string into a boolean value. | |
Number Format Checking | |
| bool | isDec (std::string_view inputString) |
| Checks whether a string contains only decimal digits and max. one dot (.). | |
| bool | isHex (std::string_view inputString) |
| Checks whether a string contains only hexadecimal digits. | |
Trimming | |
| void | trim (std::string &stringToTrim) |
| Removes whitespaces around a string. | |
Joining | |
| std::string | join (const std::vector< std::string > &strings, char delimiter, bool ignoreEmpty) |
| Concatenates all elements of a vector into a single string. | |
| std::string | join (const std::vector< std::string > &strings, std::string_view delimiter, bool ignoreEmpty) |
| Concatenates all elements of a vector into a single string. | |
| std::string | join (std::queue< std::string > &strings, char delimiter, bool ignoreEmpty) |
| Concatenates all elements of a queue into a single string. | |
| std::string | join (std::queue< std::string > &strings, std::string_view delimiter, bool ignoreEmpty) |
| Concatenates all elements of a queue into a single string. | |
| void | join (const std::vector< std::string > &strings, char delimiter, bool ignoreEmpty, std::string &appendTo) |
| Concatenates all elements of a vector and appends them to a string. | |
| void | join (const std::vector< std::string > &strings, std::string_view delimiter, bool ignoreEmpty, std::string &appendTo) |
| Concatenates all elements of a vector and appends them to a string. | |
| void | join (std::queue< std::string > &strings, char delimiter, bool ignoreEmpty, std::string &appendTo) |
| Concatenates all elements of a queue and appends them to a string. | |
| void | join (std::queue< std::string > &strings, std::string_view delimiter, bool ignoreEmpty, std::string &appendTo) |
| Concatenates all elements of a queue and appends them to a string. | |
Splitting | |
| std::vector< std::string > | split (const std::string &str, char delimiter) |
| Splits a string into a vector of strings using the given delimiter. | |
| std::vector< std::string > | split (std::string_view str, std::string_view delimiter) |
| Splits a string into a vector of strings using the given delimiter. | |
| std::queue< std::string > | splitToQueue (std::string_view str, char delimiter, bool removeEmpty) |
| Splits a string into a queue of strings using the given delimiter. | |
| std::queue< std::string > | splitToQueue (std::string_view str, std::string_view delimiter, bool removeEmpty) |
| Splits a string into a queue of strings using the given delimiter. | |
Sorting | |
| void | sortAndRemoveDuplicates (std::vector< std::string > &vectorOfStrings, bool caseSensitive) |
| Sorts the given vector of strings and removes duplicates. | |
Escape Characters | |
| char | getFirstOrEscapeChar (std::string_view from) |
| Gets the first character or an escaped character from the beginning of the given string. | |
Encoding | |
| void | encodePercentage (std::string &stringToEncode) |
| Encodes percentage signs that are not followed by a two-digit hexadecimal number with %25. | |
Tidying | |
| void | utfTidy (std::string &stringToTidy) |
| Removes new lines and unnecessary spaces, including UTF-8 whitespaces. | |
Name Checking | |
| bool | checkDomainName (std::string_view name) |
| Checks whether the given string is a valid domain name. | |
| bool | checkSQLName (std::string_view name) |
| Checks whether the given string is a valid name for MySQL tables and fields. | |
Random String Generation | |
| std::string | generateRandom (std::size_t length) |
| Generates a random alpha-numerical string of the given length. | |
Namespace for global string helper functions.
| bool checkDomainName (std::string_view name) | inline |
Checks whether the given string is a valid domain name.
/, \, and '.
| name | View of the string to be checked for a valid domain name. |
Referenced by crawlservpp::Main::Server::tick().
| bool checkSQLName (std::string_view name) | inline |
Checks whether the given string is a valid name for MySQL tables and fields.
| name | View of the string to be checked for a valid MySQL table or field name. |
Referenced by crawlservpp::Main::Database::clearTable(), crawlservpp::Main::Database::isTargetTable(), crawlservpp::Module::Config::option(), crawlservpp::Main::Database::readColumnAsStrings(), crawlservpp::Main::Database::readTableAsStrings(), and crawlservpp::Main::Server::tick().
| void encodePercentage (std::string &stringToEncode) | inline |
Encodes percentage signs that are not followed by a two-digit hexadecimal number with %25.
| stringToEncode | Reference to the string in which the percentage signs will be encoded in-situ. |
References checkHexLength, and isHex().
Referenced by crawlservpp::Parsing::URI::escapeUri().
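The rule described above can be illustrated with a minimal sketch. It is not the crawlserv++ implementation: the `sketch` namespace and the use of `std::isxdigit` in place of isHex() are assumptions made for illustration only.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

namespace sketch {
    // Hypothetical stand-in: replaces every '%' that is not followed by
    // two hexadecimal digits with "%25", modifying the string in-situ.
    inline void encodePercentage(std::string& s) {
        std::size_t pos{};

        while ((pos = s.find('%', pos)) != std::string::npos) {
            const bool escaped{
                pos + 2 < s.size()
                && std::isxdigit(static_cast<unsigned char>(s[pos + 1])) != 0
                && std::isxdigit(static_cast<unsigned char>(s[pos + 2])) != 0
            };

            if (!escaped) {
                s.insert(pos + 1, "25"); // '%' becomes "%25"
            }

            pos += 3; // skip the (now) three-character escape sequence
        }
    }
}
```

With this sketch, `100% sure` becomes `100%25 sure`, while an already encoded `a%20b` is left untouched.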
| std::string generateRandom (std::size_t length) | inline |
Generates a random alpha-numerical string of the given length.
| length | Length of the string to be generated. |
References randCharSet.
Referenced by crawlservpp::Main::WebServer::getIP(), and crawlservpp::Main::Server::tick().
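A possible way to generate such a string is sketched below. The character set, the random engine, and the `sketch` namespace are assumptions; this page does not show the contents of randCharSet or how the library seeds its generator.

```cpp
#include <cstddef>
#include <random>
#include <string>
#include <string_view>

namespace sketch {
    // Assumed character set (digits and ASCII letters), standing in for randCharSet.
    inline constexpr std::string_view charSet{
        "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
    };

    // Hypothetical stand-in: picks 'length' characters uniformly from charSet.
    inline std::string generateRandom(std::size_t length) {
        thread_local std::mt19937 engine{std::random_device{}()};
        std::uniform_int_distribution<std::size_t> pick{0, charSet.size() - 1};

        std::string result;

        result.reserve(length);

        for (std::size_t i{}; i < length; ++i) {
            result.push_back(charSet[pick(engine)]);
        }

        return result;
    }
}
```

Note that `std::mt19937` is not cryptographically secure, so a sketch like this should not be used for secrets.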
| char getFirstOrEscapeChar (std::string_view from) | inline |
Gets the first character or an escaped character from the beginning of the given string.
Supported escape sequences: \n, \t, and \\.
| from | A view of the string to extract the character from. |
Referenced by crawlservpp::Module::Config::option().
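The following sketch shows one way the three supported escape sequences could be resolved. The return value for an empty view and the handling of unsupported sequences are assumptions, as is the `sketch` namespace.

```cpp
#include <string_view>

namespace sketch {
    // Hypothetical stand-in: returns the character that the beginning of
    // 'from' stands for, resolving \n, \t, and \\ if present.
    inline char getFirstOrEscapeChar(std::string_view from) {
        if (from.empty()) {
            return '\0'; // assumption: empty input yields a null character
        }

        if (from.front() == '\\' && from.size() > 1) {
            switch (from[1]) {
            case 'n':
                return '\n';
            case 't':
                return '\t';
            case '\\':
                return '\\';
            default:
                break; // assumption: other sequences yield the backslash itself
            }
        }

        return from.front();
    }
}
```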
| bool isDec (std::string_view inputString) | inline |
Checks whether a string contains only decimal digits and max. one dot (.).
| inputString | A view into the string to check for decimal digits. |
Returns: True, if the string contains only decimal digits (0-9), max. one dot (.) and no whitespaces. False otherwise.
Referenced by crawlservpp::Data::ImportExport::OpenDocument::cell().
| bool isHex (std::string_view inputString) | inline |
Checks whether a string contains only hexadecimal digits.
Case-insensitive.
| inputString | A view into the string to check for hexadecimal digits. |
Returns: True, if the string contains only hexadecimal digits (0-F) and no whitespaces. False otherwise.
Referenced by encodePercentage().
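Both number format checks can be sketched in a few lines. The code below is an illustration in a hypothetical `sketch` namespace, not the library's source; in particular, how empty strings are treated is an assumption (both functions accept them here).

```cpp
#include <cctype>
#include <string_view>

namespace sketch {
    // Hypothetical stand-in: only the digits 0-9 and at most one dot are allowed.
    inline bool isDec(std::string_view inputString) {
        bool dot{false};

        for (const char c : inputString) {
            if (c == '.') {
                if (dot) {
                    return false; // second dot
                }

                dot = true;
            }
            else if (c < '0' || c > '9') {
                return false; // anything else, including whitespace
            }
        }

        return true;
    }

    // Hypothetical stand-in: only 0-9, a-f, and A-F are allowed.
    inline bool isHex(std::string_view inputString) {
        for (const char c : inputString) {
            if (std::isxdigit(static_cast<unsigned char>(c)) == 0) {
                return false;
            }
        }

        return true;
    }
}
```

Under these assumptions, `3.14` and `42` pass isDec(), `1.2.3` and `1e5` do not; `1aF` passes isHex(), while `0x1A` fails because of the `x`.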
| std::string join (const std::vector< std::string > &strings, char delimiter, bool ignoreEmpty) | inline |
Concatenates all elements of a vector into a single string.
| strings | Constant reference to a vector containing the strings to be concatenated. |
| delimiter | A character to be inserted between the concatenated strings. |
| ignoreEmpty | Ignore empty strings when concatenating the given elements. |
Referenced by crawlservpp::Data::ImportExport::Text::exportList(), crawlservpp::Module::Parser::Thread::onReset(), and crawlservpp::Module::Extractor::Thread::onReset().
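The effect of ignoreEmpty is easiest to see in a small sketch. The function below lives in a hypothetical `sketch` namespace and only mirrors the documented signature; the exact placement of delimiters around empty elements in the library is an assumption.

```cpp
#include <string>
#include <vector>

namespace sketch {
    // Hypothetical stand-in: joins the elements with a single-character
    // delimiter, optionally skipping empty elements.
    inline std::string join(
            const std::vector<std::string>& strings,
            char delimiter,
            bool ignoreEmpty
    ) {
        std::string result;
        bool first{true};

        for (const auto& element : strings) {
            if (ignoreEmpty && element.empty()) {
                continue;
            }

            if (!first) {
                result.push_back(delimiter);
            }

            result += element;
            first = false;
        }

        return result;
    }
}
```

In this sketch, joining `{"a", "", "b"}` with `,` yields `a,b` when ignoreEmpty is true and `a,,b` when it is false.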
| std::string join (const std::vector< std::string > &strings, std::string_view delimiter, bool ignoreEmpty) | inline |
Concatenates all elements of a vector into a single string.
| strings | Constant reference to a vector containing the strings to be concatenated. |
| delimiter | View of a string to be inserted between the concatenated strings. |
| ignoreEmpty | Ignore empty strings when concatenating the given elements. |
| std::string join (std::queue< std::string > &strings, char delimiter, bool ignoreEmpty) | inline |
Concatenates all elements of a queue into a single string.
| strings | Reference to a queue containing the strings to be concatenated. |
| delimiter | A character to be inserted between the concatenated strings. |
| ignoreEmpty | Ignore empty strings when concatenating the given elements. |
| std::string join (std::queue< std::string > &strings, std::string_view delimiter, bool ignoreEmpty) | inline |
Concatenates all elements of a queue into a single string.
| strings | Reference to a queue containing the strings to be concatenated. |
| delimiter | View of a string to be inserted between the concatenated strings. |
| ignoreEmpty | Ignore empty strings when concatenating the given elements. |
| void join (const std::vector< std::string > &strings, char delimiter, bool ignoreEmpty, std::string &appendTo) | inline |
Concatenates all elements of a vector and appends them to a string.
| strings | Constant reference to a vector containing the strings to be concatenated. |
| delimiter | A character to be inserted between the concatenated strings. |
| ignoreEmpty | Ignore empty strings when concatenating the given elements. |
| appendTo | The string to which the concatenated elements, separated by the given delimiter, will be appended. It will remain unchanged if no elements have been concatenated. |
| void join (const std::vector< std::string > &strings, std::string_view delimiter, bool ignoreEmpty, std::string &appendTo) | inline |
Concatenates all elements of a vector and appends them to a string.
| strings | Constant reference to a vector containing the strings to be concatenated. |
| delimiter | A view of the string to be inserted between the concatenated strings. |
| ignoreEmpty | Ignore empty strings when concatenating the given elements. |
| appendTo | The string to which the concatenated elements, separated by the given delimiter, will be appended. It will remain unchanged if no elements have been concatenated. |
| void join (std::queue< std::string > &strings, char delimiter, bool ignoreEmpty, std::string &appendTo) | inline |
Concatenates all elements of a queue and appends them to a string.
| strings | Reference to a queue containing the strings to be concatenated. |
| delimiter | A character to be inserted between the concatenated strings. |
| ignoreEmpty | Ignore empty strings when concatenating the given elements. |
| appendTo | The string to which the concatenated elements, separated by the given delimiter, will be appended. It will remain unchanged if no elements have been concatenated. |
| void join (std::queue< std::string > &strings, std::string_view delimiter, bool ignoreEmpty, std::string &appendTo) | inline |
Concatenates all elements of a queue and appends them to a string.
| strings | Reference to a queue containing the strings to be concatenated. |
| delimiter | A view of the string to be inserted between the concatenated strings. |
| ignoreEmpty | Ignore empty strings when concatenating the given elements. |
| appendTo | The string to which the concatenated elements, separated by the given delimiter, will be appended. It will remain unchanged if no elements have been concatenated. |
| void replaceAll (std::string &strInOut, std::string_view needle, std::string_view replacement) | inline |
Replaces all occurrences within a string with another string.
while loop for that.
| strInOut | A reference to the string in which the occurrences will be replaced. |
| needle | A string view defining the occurrence to be replaced. |
| replacement | A string view defining the replacement. |
Referenced by crawlservpp::Data::ImportExport::OpenDocument::cell(), crawlservpp::Helper::DateTime::fixFinnishMonths(), crawlservpp::Helper::DateTime::fixFrenchMonths(), crawlservpp::Helper::DateTime::fixRussianMonths(), crawlservpp::Helper::DateTime::fixUkrainianMonths(), crawlservpp::Query::JsonPointer::getAll(), crawlservpp::Query::JsonPointer::getSubSets(), crawlservpp::Helper::DateTime::handle12hTime(), crawlservpp::Query::JsonPointer::JsonPointer(), crawlservpp::Module::Extractor::Thread::onReset(), crawlservpp::Module::Crawler::Thread::onReset(), crawlservpp::Helper::Json::parseCons(), crawlservpp::Helper::Json::parseRapid(), and utfTidy().
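A loop-based replacement of this kind can be sketched with `std::string::find` and `std::string::replace`. The code below is an illustration in a hypothetical `sketch` namespace; the guard against an empty needle and the decision not to rescan replaced text are assumptions about sensible behavior, not statements about the library.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

namespace sketch {
    // Hypothetical stand-in: replaces every occurrence of 'needle' in-situ.
    inline void replaceAll(
            std::string& strInOut,
            std::string_view needle,
            std::string_view replacement
    ) {
        if (needle.empty()) {
            return; // assumption: an empty needle would otherwise never terminate
        }

        std::size_t pos{};

        while ((pos = strInOut.find(needle, pos)) != std::string::npos) {
            strInOut.replace(pos, needle.size(), replacement);

            // continue after the inserted text, so it is not searched again
            pos += replacement.size();
        }
    }
}
```

Advancing past the replacement keeps the loop finite even when the replacement contains the needle, for example when replacing `-` with `--`.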
| void sortAndRemoveDuplicates (std::vector< std::string > &vectorOfStrings, bool caseSensitive) | inline |
Sorts the given vector of strings and removes duplicates.
| vectorOfStrings | Reference to the vector of strings, which will be sorted and from which duplicates will be removed in-situ. |
| caseSensitive | True, if the removal should be performed case-sensitively. False otherwise. |
Referenced by crawlservpp::Module::Crawler::Thread::onReset().
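The usual sort, unique, and erase idiom covers both modes. The sketch below is an assumption-laden illustration in a hypothetical `sketch` namespace: it lowercases ASCII characters only, and which spelling survives among case-insensitive duplicates is a choice of this sketch, not documented behavior.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

namespace sketch {
    // Helper (hypothetical): ASCII-only lowercase copy of a string.
    inline std::string toLower(std::string s) {
        std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
            return static_cast<char>(std::tolower(c));
        });

        return s;
    }

    // Hypothetical stand-in: sorts the vector and removes duplicates in-situ.
    inline void sortAndRemoveDuplicates(std::vector<std::string>& v, bool caseSensitive) {
        if (caseSensitive) {
            std::sort(v.begin(), v.end());
            v.erase(std::unique(v.begin(), v.end()), v.end());

            return;
        }

        // stable sort keeps the first spelling of each case-insensitive duplicate
        std::stable_sort(v.begin(), v.end(), [](const std::string& a, const std::string& b) {
            return toLower(a) < toLower(b);
        });

        v.erase(
                std::unique(v.begin(), v.end(), [](const std::string& a, const std::string& b) {
                    return toLower(a) == toLower(b);
                }),
                v.end()
        );
    }
}
```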
| std::vector< std::string > split (const std::string &str, char delimiter) | inline |
Splits a string into a vector of strings using the given delimiter.
| str | A const reference to the string to be split up. |
| delimiter | The character around which the resulting elements will be split. |
References split().
Referenced by crawlservpp::Data::Lemmatizer::clear(), crawlservpp::Main::Database::connect(), and crawlservpp::Helper::Portability::enumLocales().
| std::vector< std::string > split (std::string_view str, std::string_view delimiter) | inline |
Splits a string into a vector of strings using the given delimiter.
| str | A view of the string to be split up. |
| delimiter | A view of the string around which the resulting elements will be split. |
Referenced by split(), and splitToQueue().
| std::queue< std::string > splitToQueue (std::string_view str, char delimiter, bool removeEmpty) | inline |
Splits a string into a queue of strings using the given delimiter.
| str | A view of the string to be split up. |
| delimiter | The character around which the resulting elements will be split. |
| removeEmpty | Set whether to ignore empty strings and not add them to the resulting queue. |
References split().
Referenced by crawlservpp::Wrapper::TidyDoc::cleanAndRepair(), crawlservpp::Wrapper::TidyDoc::getOutput(), crawlservpp::Data::ImportExport::Text::importList(), and crawlservpp::Wrapper::TidyDoc::parse().
| std::queue< std::string > splitToQueue (std::string_view str, std::string_view delimiter, bool removeEmpty) | inline |
Splits a string into a queue of strings using the given delimiter.
| str | A view of the string to be split up. |
| delimiter | A view of the string around which the resulting elements will be split. |
| removeEmpty | Set whether to ignore empty strings and not add them to the resulting queue. |
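Since splitToQueue() is documented as referencing split(), the relationship between the two can be sketched as follows. Both functions live in a hypothetical `sketch` namespace; keeping empty elements in split() and returning the whole input for an empty delimiter are assumptions of this sketch.

```cpp
#include <cstddef>
#include <queue>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

namespace sketch {
    // Hypothetical stand-in: splits 'str' around every occurrence of 'delimiter'.
    inline std::vector<std::string> split(std::string_view str, std::string_view delimiter) {
        std::vector<std::string> result;

        if (delimiter.empty()) {
            result.emplace_back(str); // assumption: nothing to split around

            return result;
        }

        std::size_t begin{};

        for (;;) {
            const std::size_t end{str.find(delimiter, begin)};

            if (end == std::string_view::npos) {
                result.emplace_back(str.substr(begin)); // last element

                break;
            }

            result.emplace_back(str.substr(begin, end - begin));

            begin = end + delimiter.size();
        }

        return result;
    }

    // Hypothetical stand-in: same splitting, but into a queue, optionally
    // dropping empty elements.
    inline std::queue<std::string> splitToQueue(
            std::string_view str,
            std::string_view delimiter,
            bool removeEmpty
    ) {
        std::queue<std::string> result;

        for (auto& part : split(str, delimiter)) {
            if (removeEmpty && part.empty()) {
                continue;
            }

            result.push(std::move(part));
        }

        return result;
    }
}
```

In this sketch, splitting `a,,b` around `,` gives three elements with split(), and two elements with splitToQueue() when removeEmpty is set.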
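| bool stringToBool (std::string inputString) | inline |
Converts a string into a boolean value.
Only case-insensitive variations of "true" will be converted into true.
| inputString | The string to be converted into a boolean value. |
Returns: True, if the given string is a case-insensitive variation of "true". False otherwise.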
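A case-insensitive comparison of this kind can be sketched by lowercasing the by-value argument. The code is an illustration in a hypothetical `sketch` namespace; whether the library tolerates surrounding whitespace is not stated here, so this sketch does not.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

namespace sketch {
    // Hypothetical stand-in: true only for "true" in any mix of upper and lower case.
    inline bool stringToBool(std::string inputString) {
        std::transform(
                inputString.begin(),
                inputString.end(),
                inputString.begin(),
                [](unsigned char c) {
                    return static_cast<char>(std::tolower(c));
                }
        );

        return inputString == "true";
    }
}
```

Values such as `1` or `yes` therefore convert to false in this sketch.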
| void trim (std::string &stringToTrim) | inline |
Removes whitespaces around a string.
| stringToTrim | Reference to the string to be trimmed in-situ. |
Referenced by crawlservpp::Main::Database::cloneTable(), crawlservpp::Main::Database::connect(), crawlservpp::Main::WebServer::getIP(), crawlservpp::Query::XPath::getSubSets(), crawlservpp::Query::JsonPath::JsonPath(), crawlservpp::Query::JsonPointer::JsonPointer(), crawlservpp::Module::Crawler::Thread::onReset(), crawlservpp::Module::Config::option(), crawlservpp::Parsing::URI::parseLink(), crawlservpp::Data::Stemmer::stemEnglish(), crawlservpp::Main::Server::tick(), and utfTidy().
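In-situ trimming is commonly written with two `erase` calls. The sketch below, in a hypothetical `sketch` namespace, assumes that "whitespace" means whatever `std::isspace` reports in the current C locale; the set of characters the library actually removes is not listed on this page.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

namespace sketch {
    // Hypothetical stand-in: removes leading and trailing whitespace in-situ.
    inline void trim(std::string& stringToTrim) {
        const auto notSpace = [](unsigned char c) {
            return std::isspace(c) == 0;
        };

        // remove leading whitespace
        stringToTrim.erase(
                stringToTrim.begin(),
                std::find_if(stringToTrim.begin(), stringToTrim.end(), notSpace)
        );

        // remove trailing whitespace
        stringToTrim.erase(
                std::find_if(stringToTrim.rbegin(), stringToTrim.rend(), notSpace).base(),
                stringToTrim.end()
        );
    }
}
```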
| void utfTidy (std::string &stringToTidy) | inline |
Removes new lines and unnecessary spaces, including UTF-8 whitespaces.
| stringToTidy | Reference to the string from which new lines and unnecessary spaces will be removed in-situ. |
References replaceAll(), trim(), and utfWhitespaces.
Referenced by crawlservpp::Module::Parser::Thread::onReset(), and crawlservpp::Module::Extractor::Thread::onReset().
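The tidying steps can be approximated as: map UTF-8 whitespaces and line breaks to plain spaces, collapse runs of spaces, then trim. The sketch below follows that outline in a hypothetical `sketch` namespace. The three byte sequences are an assumed subset (the contents of utfWhitespaces are not shown on this page), and turning line breaks into single spaces rather than deleting them is likewise an assumption.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <string>
#include <string_view>

namespace sketch {
    // Assumed subset of UTF-8 encoded whitespaces, standing in for utfWhitespaces.
    inline constexpr std::array<std::string_view, 3> whitespaces{
        "\xC2\xA0",     // U+00A0 NO-BREAK SPACE
        "\xE2\x80\x83", // U+2003 EM SPACE
        "\xE3\x80\x80"  // U+3000 IDEOGRAPHIC SPACE
    };

    // Hypothetical stand-in: tidies the string in-situ.
    inline void utfTidy(std::string& s) {
        // 1. replace multi-byte whitespaces with a plain space
        for (const auto ws : whitespaces) {
            std::size_t pos{};

            while ((pos = s.find(ws, pos)) != std::string::npos) {
                s.replace(pos, ws.size(), " ");

                ++pos;
            }
        }

        // 2. replace line breaks and tabs with a plain space
        for (char& c : s) {
            if (c == '\n' || c == '\r' || c == '\t') {
                c = ' ';
            }
        }

        // 3. collapse runs of spaces into one space
        s.erase(
                std::unique(s.begin(), s.end(), [](char a, char b) {
                    return a == ' ' && b == ' ';
                }),
                s.end()
        );

        // 4. trim spaces at both ends
        const auto first{s.find_first_not_of(' ')};

        if (first == std::string::npos) {
            s.clear();

            return;
        }

        s.erase(0, first);
        s.erase(s.find_last_not_of(' ') + 1);
    }
}
```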
| constexpr auto checkHexLength {3} | inline |
Length of a two-digit hexadecimal number including the preceding percentage sign.
Referenced by encodePercentage().
| constexpr auto randCharSet | inline |
Characters to be chosen from for random string generation performed by generateRandom().
Referenced by generateRandom().
| constexpr std::array utfWhitespaces | inline |
UTF-8 whitespaces used by utfTidy().
Referenced by utfTidy().