crawlserv++  [under development]
Application for crawling and analyzing textual content of websites.
crawlservpp::Data::TokenRemover Class Reference

Token remover and trimmer.

#include <TokenRemover.hpp>

Token removal and trimming

void remove (std::string &token, const std::string &dictionary)
 Removes a token if found in the dictionary.
 
void trim (std::string &token, const std::string &dictionary)
 Removes dictionary entries from the beginning and the end of a string.
 

Cleanup

void clear ()
 Clears the token remover, freeing the memory used by all dictionaries.
 

Detailed Description

Token remover and trimmer.

Member Function Documentation

◆ clear()

void crawlservpp::Data::TokenRemover::clear ( )
inline

Clears the token remover, freeing the memory used by all dictionaries.

References crawlservpp::Data::dictDir and crawlservpp::Helper::FileSystem::getPathSeparator().
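
As a rough illustration of what freeing the memory of all dictionaries can look like, the sketch below assumes that loaded dictionaries are cached in a map keyed by dictionary name. The `DictionaryCache` type, its members, and `clearExample()` are hypothetical and not taken from crawlserv++; the actual implementation may store and release its dictionaries differently.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>

// Hypothetical stand-in for a cache of loaded dictionaries (not the crawlserv++ type).
struct DictionaryCache {
    using Dictionary = std::unordered_set<std::string>;

    std::unordered_map<std::string, Dictionary> dictionaries;

    // Number of dictionaries currently held in memory.
    std::size_t size() const { return dictionaries.size(); }

    // Drops all dictionaries. Swapping with an empty temporary hands the old
    // contents to the temporary, which destroys them at the end of the statement.
    void clear() {
        std::unordered_map<std::string, Dictionary>{}.swap(dictionaries);
    }
};

// Example: load one (assumed) dictionary, then clear the cache.
inline void clearExample() {
    DictionaryCache cache;

    cache.dictionaries["stopwords"].insert("the");
    cache.dictionaries["stopwords"].insert("and");
    assert(cache.size() == 1);

    cache.clear();
    assert(cache.size() == 0);
}
```

The swap idiom is used here instead of a plain `dictionaries.clear()` because the latter is allowed to keep the bucket array allocated.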

◆ remove()

void crawlservpp::Data::TokenRemover::remove ( std::string &  token,
const std::string &  dictionary 
)
inline

Removes a token if found in the dictionary.

Parameters
token: Reference to a string containing the token to be removed, if necessary.
dictionary: Constant reference to a string containing the name of the dictionary used to check whether the token should be removed.

References crawlservpp::Helper::Memory::free().
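
The following minimal sketch illustrates the described behavior under two assumptions: the dictionary is an in-memory set of strings rather than a dictionary looked up by name, and "removing" a token means leaving it empty. The names `Dictionary`, `removeIfListed()`, and `removeExample()` are hypothetical and not part of crawlserv++.

```cpp
#include <cassert>
#include <string>
#include <unordered_set>

// Hypothetical in-memory dictionary; the documented function takes a dictionary name instead.
using Dictionary = std::unordered_set<std::string>;

// Empties the token if it is listed in the dictionary (assumed meaning of
// "removed"); otherwise the token is left unchanged.
inline void removeIfListed(std::string& token, const Dictionary& dictionary) {
    if (dictionary.find(token) != dictionary.end()) {
        token.clear();
    }
}

// Example: a listed token is emptied, an unlisted one is kept.
inline void removeExample() {
    const Dictionary stopwords{"the", "and", "of"};

    std::string listed{"the"};
    std::string unlisted{"crawler"};

    removeIfListed(listed, stopwords);
    removeIfListed(unlisted, stopwords);

    assert(listed.empty());
    assert(unlisted == "crawler");
}
```

Because the token is passed by non-const reference, the caller sees the result in place and can check `token.empty()` afterwards to skip removed tokens.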

◆ trim()

void crawlservpp::Data::TokenRemover::trim ( std::string &  token,
const std::string &  dictionary 
)
inline

Removes dictionary entries from the beginning and the end of a string.

Parameters
token: Reference to a string containing the token to be trimmed, if necessary.
dictionary: Constant reference to a string containing the name of the dictionary used to check whether part(s) of the token should be trimmed.
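
Trimming can be sketched as follows, assuming an in-memory list of entries instead of a dictionary looked up by name. Whether the actual implementation strips entries repeatedly, as done here, is an assumption; the names `Dictionary`, `trimListed()`, and `trimExample()` are hypothetical and not part of crawlserv++.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical in-memory dictionary; the documented function takes a dictionary name instead.
using Dictionary = std::vector<std::string>;

// Strips dictionary entries from the beginning and the end of the token,
// repeating until no entry matches at either end (assumed behavior).
inline void trimListed(std::string& token, const Dictionary& dictionary) {
    bool changed{true};

    while (changed && !token.empty()) {
        changed = false;

        for (const auto& entry : dictionary) {
            // Skip empty entries and entries longer than what is left of the token.
            if (entry.empty() || entry.size() > token.size()) {
                continue;
            }

            if (token.compare(0, entry.size(), entry) == 0) {
                // Entry found at the beginning: remove it.
                token.erase(0, entry.size());
                changed = true;
            }
            else if (token.compare(token.size() - entry.size(), entry.size(), entry) == 0) {
                // Entry found at the end: remove it.
                token.erase(token.size() - entry.size());
                changed = true;
            }
        }
    }
}

// Example: nested punctuation is stripped from both ends, inner text is kept.
inline void trimExample() {
    const Dictionary punctuation{"\"", "(", ")"};

    std::string token{"(\"word\")"};
    trimListed(token, punctuation);
    assert(token == "word");

    std::string inner{"a(b"};
    trimListed(inner, punctuation);
    assert(inner == "a(b");
}
```

A token that consists only of dictionary entries ends up empty in this sketch, which mirrors the effect of removing it entirely.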

The documentation for this class was generated from the following file: