crawlserv++  [under development]
Application for crawling and analyzing textual content of websites.
German.hpp File Reference
#include <cstddef>
#include <string>

Namespaces

 crawlservpp::Data::Stemmer
 Namespace for linguistic stemmers.
 

Functions

void crawlservpp::Data::Stemmer::stemGerman (std::string &token)
 Stems a token in German.
 

Variables

constexpr auto crawlservpp::Data::Stemmer::minLengthStrip2 {6}
 Minimum length of a token to strip two letters from the end or the beginning.
 
constexpr auto crawlservpp::Data::Stemmer::minLengthStrip1 {4}
 Minimum length of a token to strip one letter from the end.
 
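The two length thresholds can be read as guards: letters are only removed when the token is long enough. The following is a minimal sketch of such a guard, not the code in German.hpp. The helper `stripIfLongEnough` and the suffixes used with it are hypothetical, this page does not say which suffixes `stemGerman` removes, and the constants are local copies of the documented values.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Local copies of the documented thresholds (assumed here to count bytes).
constexpr std::size_t minLengthStrip2{6};
constexpr std::size_t minLengthStrip1{4};

// Hypothetical helper: removes `suffix` from the end of `token`, but only if
// the token has at least `minLength` bytes. Returns true if it was removed.
bool stripIfLongEnough(std::string& token, const std::string& suffix, std::size_t minLength) {
    assert(!suffix.empty());

    // Too short for this rule, or shorter than the suffix itself.
    if (token.size() < minLength || token.size() < suffix.size()) {
        return false;
    }

    // The token must actually end with the suffix.
    if (token.compare(token.size() - suffix.size(), suffix.size(), suffix) != 0) {
        return false;
    }

    token.erase(token.size() - suffix.size());
    return true;
}
```

With these values, a six-letter token such as "kinder" may lose a two-letter ending, while a four-letter token such as "oder" is left alone by the same rule.
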
constexpr auto crawlservpp::Data::Stemmer::binInv {0xff}
 Literal for binary inversion.
 
constexpr auto crawlservpp::Data::Stemmer::toLowerCase {32}
 Number to add to make uppercase ASCII letters lowercase.
 
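The offset of 32 works because each uppercase ASCII letter sits exactly 32 code points below its lowercase form ('A' is 65, 'a' is 97). The following is a minimal sketch of that arithmetic. The helper `toLowerAscii` is hypothetical and not part of German.hpp, and the constant is a local copy of the documented value.

```cpp
// Local copy of the documented offset.
constexpr char toLowerCase{32};

// Hypothetical helper: lowercases one ASCII letter, leaves other bytes unchanged.
constexpr char toLowerAscii(char c) {
    return (c >= 'A' && c <= 'Z') ? static_cast<char>(c + toLowerCase) : c;
}

static_assert(toLowerAscii('A') == 'a', "'A' (65) + 32 == 'a' (97)");
static_assert(toLowerAscii('z') == 'z', "lowercase letters are unchanged");
static_assert(toLowerAscii('1') == '1', "non-letters are unchanged");
```

The range check matters: adding 32 to a byte outside 'A' to 'Z' would turn it into an unrelated character.
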
constexpr auto crawlservpp::Data::Stemmer::utf8mb2 {0xC3}
 First byte of 2-byte UTF-8 characters for umlauts and sharp s (ß).
 
constexpr auto crawlservpp::Data::Stemmer::utf8mb3 {0xE1}
 First byte of 3-byte UTF-8 character for capital sharp s (ẞ).
 
constexpr auto crawlservpp::Data::Stemmer::umlautA2sm {0xA4}
 Second byte of UTF-8 umlaut ä.
 
constexpr auto crawlservpp::Data::Stemmer::umlautA2l {0x84}
 Second byte of UTF-8 umlaut Ä.
 
constexpr auto crawlservpp::Data::Stemmer::umlautO2sm {0xB6}
 Second byte of UTF-8 umlaut ö.
 
constexpr auto crawlservpp::Data::Stemmer::umlautO2l {0x96}
 Second byte of UTF-8 umlaut Ö.
 
constexpr auto crawlservpp::Data::Stemmer::umlautU2sm {0xBC}
 Second byte of UTF-8 umlaut ü.
 
constexpr auto crawlservpp::Data::Stemmer::umlautU2l {0x9C}
 Second byte of UTF-8 umlaut Ü.
 
constexpr auto crawlservpp::Data::Stemmer::sharpS2sm {0x9F}
 Second byte of UTF-8 sharp s (ß).
 
constexpr auto crawlservpp::Data::Stemmer::sharpS2l {0xBA}
 Second byte of UTF-8 capital sharp s (ẞ).
 
constexpr auto crawlservpp::Data::Stemmer::sharpS3l {0x9E}
 Third byte of UTF-8 capital sharp s (ẞ).
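
The byte constants above describe how the German special characters appear in a UTF-8 string: ä, Ä, ö, Ö, ü, Ü and ß are two-byte sequences starting with 0xC3, and ẞ is the three-byte sequence 0xE1 0xBA 0x9E. The following sketch shows one way such constants can be used to lowercase these characters byte by byte. It is an illustration under assumptions, not the logic of `stemGerman`: the helpers `byteAt` and `lowerGermanSpecials` are hypothetical, and the constants are local copies of the documented values.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Local copies of the documented byte values.
constexpr unsigned char utf8mb2{0xC3};
constexpr unsigned char utf8mb3{0xE1};
constexpr unsigned char umlautA2sm{0xA4};
constexpr unsigned char umlautA2l{0x84};
constexpr unsigned char umlautO2sm{0xB6};
constexpr unsigned char umlautO2l{0x96};
constexpr unsigned char umlautU2sm{0xBC};
constexpr unsigned char umlautU2l{0x9C};
constexpr unsigned char sharpS2sm{0x9F};
constexpr unsigned char sharpS2l{0xBA};
constexpr unsigned char sharpS3l{0x9E};

// Hypothetical helper: reads a byte as unsigned, since plain char may be signed.
inline unsigned char byteAt(const std::string& s, std::size_t i) {
    return static_cast<unsigned char>(s[i]);
}

// Hypothetical helper: returns a copy of `in` with Ä, Ö, Ü and ẞ lowercased.
// All other bytes, including ASCII letters, are copied unchanged.
std::string lowerGermanSpecials(const std::string& in) {
    std::string out;
    out.reserve(in.size());

    for (std::size_t i = 0; i < in.size(); ++i) {
        const unsigned char b0 = byteAt(in, i);

        if (b0 == utf8mb2 && i + 1 < in.size()) {
            // Two-byte sequence: swap the second byte if it is a capital umlaut.
            unsigned char b1 = byteAt(in, i + 1);
            if (b1 == umlautA2l) { b1 = umlautA2sm; }
            else if (b1 == umlautO2l) { b1 = umlautO2sm; }
            else if (b1 == umlautU2l) { b1 = umlautU2sm; }
            out.push_back(static_cast<char>(b0));
            out.push_back(static_cast<char>(b1));
            ++i; // skip the second byte
        }
        else if (b0 == utf8mb3 && i + 2 < in.size()
                && byteAt(in, i + 1) == sharpS2l && byteAt(in, i + 2) == sharpS3l) {
            // Capital sharp s (E1 BA 9E) becomes sharp s (C3 9F).
            out.push_back(static_cast<char>(utf8mb2));
            out.push_back(static_cast<char>(sharpS2sm));
            i += 2; // skip the second and third byte
        }
        else {
            out.push_back(in[i]);
        }
    }

    // Only the capital sharp s changes length (3 bytes to 2), so the result never grows.
    assert(out.size() <= in.size());
    return out;
}
```

For example, passing the bytes of "Äpfel" (`"\xC3\x84pfel"`) yields the bytes of "äpfel" (`"\xC3\xA4pfel"`). Bytes are compared as `unsigned char` because values such as 0xC3 do not fit in a signed `char`.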