crawlserv++
[under development]
Application for crawling and analyzing textual content of websites.
This is the complete list of members for crawlservpp::Data::Corpus, including all inherited members.
| ArticleFunc typedef | crawlservpp::Data::Corpus | |
| articleMap | crawlservpp::Data::Corpus | protected |
| clear() | crawlservpp::Data::Corpus | inline |
| combineContinuous(Tokens &chunks, std::vector< TextMap > &articleMaps, std::vector< TextMap > &dateMaps, bool deleteInputData) | crawlservpp::Data::Corpus | inline |
| combineTokenized(Tokens &chunks, Sizes &tokenNums, std::vector< TextMap > &articleMaps, std::vector< TextMap > &dateMaps, std::vector< SentenceMap > &sentenceMaps, bool deleteInputData) | crawlservpp::Data::Corpus | inline |
| copyChunksContinuous(std::size_t chunkSize, Tokens &to, std::vector< TextMap > &articleMapsTo, std::vector< TextMap > &dateMapsTo) const | crawlservpp::Data::Corpus | inline |
| copyChunksTokenized(std::size_t chunkSize, Tokens &to, Sizes &tokenNumsTo, std::vector< TextMap > &articleMapsTo, std::vector< TextMap > &dateMapsTo, std::vector< SentenceMap > &sentenceMapsTo) const | crawlservpp::Data::Corpus | inline |
| copyContinuous(std::string &to) const | crawlservpp::Data::Corpus | inline |
| copyContinuous(std::string &to, TextMap &articleMapTo, TextMap &dateMapTo) const | crawlservpp::Data::Corpus | inline |
| Corpus(bool consistencyChecks) | crawlservpp::Data::Corpus | inline explicit |
| Corpus(std::vector< Corpus > &others, bool consistencyChecks, StatusSetter &statusSetter) | crawlservpp::Data::Corpus | inline |
| corpus | crawlservpp::Data::Corpus | protected |
| create(Tokens &texts, bool deleteInputData) | crawlservpp::Data::Corpus | inline |
| create(Tokens &texts, std::vector< std::string > &articleIds, std::vector< std::string > &dateTimes, bool deleteInputData) | crawlservpp::Data::Corpus | inline |
| DateArticleSentenceMap typedef | crawlservpp::Data::Corpus | |
| dateMap | crawlservpp::Data::Corpus | protected |
| empty() const | crawlservpp::Data::Corpus | inline |
| filterArticles(const ArticleFunc &callbackArticle, StatusSetter &statusSetter) | crawlservpp::Data::Corpus | inline |
| filterByDate(const std::string &from, const std::string &to) | crawlservpp::Data::Corpus | inline |
| get(std::size_t index) const | crawlservpp::Data::Corpus | inline |
| get(const std::string &id) const | crawlservpp::Data::Corpus | inline |
| getArticleMap() | crawlservpp::Data::Corpus | inline |
| getArticles() const | crawlservpp::Data::Corpus | inline |
| getcArticleMap() const | crawlservpp::Data::Corpus | inline |
| getcCorpus() const | crawlservpp::Data::Corpus | inline |
| getcDateMap() const | crawlservpp::Data::Corpus | inline |
| getCorpus() | crawlservpp::Data::Corpus | inline |
| getcSentenceMap() const | crawlservpp::Data::Corpus | inline |
| getcTokens() const | crawlservpp::Data::Corpus | inline |
| getDate(const std::string &date) const | crawlservpp::Data::Corpus | inline |
| getDateMap() | crawlservpp::Data::Corpus | inline |
| getDateTokenized(const std::string &date) const | crawlservpp::Data::Corpus | inline |
| getNumTokens() const | crawlservpp::Data::Corpus | inline |
| getSentenceMap() | crawlservpp::Data::Corpus | inline |
| getTokenized(std::size_t index) const | crawlservpp::Data::Corpus | inline |
| getTokenized(const std::string &id) const | crawlservpp::Data::Corpus | inline |
| getTokens() | crawlservpp::Data::Corpus | inline |
| hasArticleMap() const | crawlservpp::Data::Corpus | inline |
| hasDateMap() const | crawlservpp::Data::Corpus | inline |
| hasSentenceMap() const | crawlservpp::Data::Corpus | inline |
| isTokenized() const | crawlservpp::Data::Corpus | inline |
| PositionLength typedef | crawlservpp::Data::Corpus | |
| SentenceFunc typedef | crawlservpp::Data::Corpus | |
| sentenceMap | crawlservpp::Data::Corpus | protected |
| SentenceMap typedef | crawlservpp::Data::Corpus | |
| SentenceMapEntry typedef | crawlservpp::Data::Corpus | |
| size() const | crawlservpp::Data::Corpus | inline |
| Sizes typedef | crawlservpp::Data::Corpus | |
| substr(std::size_t from, std::size_t len) | crawlservpp::Data::Corpus | inline |
| tokenize(const std::vector< std::uint16_t > &manipulators, const std::vector< std::string > &models, const std::vector< std::string > &dictionaries, const std::vector< std::string > &languages, std::uint64_t freeMemoryEvery, StatusSetter &statusSetter) | crawlservpp::Data::Corpus | inline |
| tokenizeCustom(const std::optional< SentenceFunc > &callback, std::uint64_t freeMemoryEvery, StatusSetter &statusSetter) | crawlservpp::Data::Corpus | inline |
| Tokens typedef | crawlservpp::Data::Corpus | |
| tokens | crawlservpp::Data::Corpus | protected |