crawlserv++
[under development]
Application for crawling and analyzing textual content of websites.
| ▼Ncrawlservpp | The global namespace of crawlserv++ |
| ►NData | Namespace for different types of data |
| ►NCompression | Namespace for data compression |
| ►NZip | Namespace for compressing and decompressing zip |
| CException | Class for zip exceptions |
| ►NZlib | Namespace for compressing and decompressing zlib |
| CException | Class for zlib exceptions |
| ►NFile | Namespace for functions accessing files |
| CException | Class for file exceptions |
| ►NImportExport | Namespace for the import and export of data |
| ►CCorpus | Class representing a text corpus |
| CException | Class for corpus-specific exceptions |
| CGetColumn | Structure for retrieving the values in a table column |
| CGetColumns | Structure for retrieving multiple table columns of the same type |
| CGetColumnsMixed | Structure for retrieving multiple table columns of different types |
| CGetFields | Structure for retrieving multiple values of the same type from a table column |
| CGetFieldsMixed | Structure for retrieving multiple values of different types from a table column |
| CGetValue | Structure for retrieving one value from a table column |
| CInsertFields | Structure for inserting multiple values of the same type into a table |
| CInsertFieldsMixed | Structure for inserting multiple values of different types into a row |
| CInsertValue | Structure for inserting one value into a table |
| CLemmatizer | Lemmatizer |
| ►CPickleDict | Simple Python pickle dictionary |
| CException | Class for Python pickle exceptions |
| CSentiment | Implementation of the VADER sentiment analysis algorithm |
| CSentimentScores | Structure for VADER sentiment scores |
| ►CTagger | Multilingual POS (part of speech) tagger using Wapiti by Thomas Lavergne |
| CException | POS (part of speech)-tagging exception |
| CTokenCorrect | Corrects tokens using an aspell dictionary |
| CTokenRemover | Token remover and trimmer |
| ►CTopicModel | Topic modeller |
| CException | Class for topic modelling-specific exceptions |
| CUpdateFields | Structure for updating multiple values of the same type in a table |
| CUpdateFieldsMixed | Structure for updating multiple values of different types in a table |
| CUpdateValue | Structure for updating one value in a table |
| CValue | A generic value |
| ►NHelper | Namespace for global helper functions |
| ►NDateTime | Namespace for global date/time helper functions |
| CException | Class for date/time exceptions |
| CLocaleException | Class for date/time locale exceptions |
| ►NFileSystem | Namespace for global file system helper functions |
| CException | Class for file system exceptions |
| ►NJson | Namespace for global JSON helper functions |
| CException | Class for JSON exceptions |
| ►NUtf8 | Namespace for global UTF-8 encoding functions |
| CException | Class for UTF-8 exceptions |
| CCommaLocale | |
| CDotLocale | |
| ►NMain | Namespace for the main classes of the program |
| CApp | Main application |
| ►CConfigFile | Configuration file |
| CException | Class for configuration file exceptions |
| ►CDatabase | Class handling database access for the command-and-control and its threads |
| CConnectionException | Class for database connection exceptions |
| CException | Class for generic database exceptions |
| CIncorrectPathException | Class for incorrect path exceptions |
| CPrivilegesException | Class for insufficient privileges exceptions |
| CStorageEngineException | Class for storage engine exceptions |
| CTransaction | Wrapper class for in-scope transactions |
| CWrongArgumentsException | Class for wrong arguments exceptions |
| CException | Base class for all exceptions thrown by the application |
| CServer | The command-and-control server |
| CSignalHandler | |
| ►CWebServer | Embedded web server class using the mongoose library |
| CException | Class for web server exceptions |
| ►NModule | Namespace for the different modules run by threads |
| ►NAnalyzer | Namespace for analyzer classes |
| ►NAlgo | Namespace for algorithm classes |
| CAllTokens | Counts all tokens in a corpus |
| CAssoc | Empty algorithm template |
| CAssocOverTime | Empty algorithm template |
| CCorpusGenerator | Algorithm building a text corpus and creating corpus statistics from the input data |
| CEmpty | Empty algorithm template |
| CExtractIds | Extracts the parsed IDs from a filtered corpus |
| CSentimentOverTime | Sentiment analysis using the VADER algorithm |
| CTermsOverTime | Algorithm counting specific terms in a text corpus over time |
| CTopicModelling | Topic modeller |
| CWordsOverTime | Counts the occurrence of articles, sentences, and tokens in a corpus over time |
| ►CConfig | Abstract configuration for analyzers, to be implemented by algorithm classes |
| CEntries | Configuration entries for analyzer threads |
| ►CDatabase | Class providing database functionality for analyzer threads by implementing Wrapper::Database |
| CException | Class for analyzer-specific database exceptions |
| ►CThread | Abstract class providing thread functionality to algorithm (child) classes |
| CException | Class for analyzer exceptions to be used by algorithms |
| ►NCrawler | Namespace for crawler classes |
| ►CConfig | Configuration for crawlers |
| CEntries | Configuration entries for crawler threads |
| CException | Class for crawler configuration exceptions |
| ►CDatabase | Class providing database functionality for crawler threads by implementing Wrapper::Database |
| CException | Class for crawler-specific database exceptions |
| ►CThread | Crawler thread |
| CException | Class for crawler exceptions |
| ►NExtractor | Namespace for extractor classes |
| ►CConfig | Configuration for extractors |
| CEntries | Configuration entries for extractor threads |
| CException | Class for extractor configuration exceptions |
| ►CDatabase | Class providing database functionality for extractor threads by implementing Wrapper::Database |
| CException | Class for extractor database exceptions |
| ►CThread | Extractor thread |
| CException | Class for extractor exceptions |
| ►NParser | Namespace for parser classes |
| ►CConfig | Configuration for parsers |
| CEntries | Configuration entries for parser threads |
| CException | Class for parser configuration exceptions |
| ►CDatabase | Class providing database functionality for parser threads by implementing Wrapper::Database |
| CException | Class for parser database exceptions |
| ►CThread | Parser thread |
| CException | Class for parser exceptions |
| ►CConfig | Abstract class as base for module-specific configurations |
| CException | Class for configuration exceptions |
| ►CDatabase | Class handling database access for threads |
| CException | Class for Module::Database exceptions |
| ►CThread | Abstract class providing module-independent thread functionality |
| CException | Class for generic thread exceptions |
| ►NNetwork | Namespace for networking classes |
| ►NFTPUpload | |
| CState | Stores content and status of an FTP upload |
| ►CConfig | Abstract class containing the network-specific configuration for threads |
| CEntries | Network-specific configuration entries for threads |
| ►CCurl | Provides an interface to the libcurl library for sending and receiving data over the network |
| CException | Class for libcurl exceptions |
| CDownloader | Downloader using the libcurl library to download a URL in an extra thread |
| ►CTorControl | Controls a TOR service via a TOR control server/port, if available |
| CException | Class for TOR control exceptions |
| ►NParsing | Namespace for classes parsing HTML, URIs, and XML |
| ►CHTML | Parses and cleans HTML markup |
| CException | Class for HTML exceptions |
| ►CURI | Parser for RFC 3986 URIs that can also analyze their relationships with each other |
| CException | Class for URI exceptions |
| ►CXML | Parses HTML markup into clean XML |
| CException | Class for XML exceptions |
| ►NQuery | Namespace for classes handling queries |
| ►CContainer | Query container |
| CException | Class for query container exceptions |
| ►CJsonPath | Implements a JSONPath query using the jsoncons library |
| CException | Class for JSONPath exceptions |
| ►CJsonPointer | Implements an extended JSONPointer query using the RapidJSON library |
| CException | Class for JSONPointer exceptions |
| ►CRegEx | Implements a RegEx query using the PCRE2 library |
| CException | Class for RegEx exceptions |
| ►CXPath | Implements an XPath query using the pugixml library |
| CException | Class for XPath exceptions |
| ►NStruct | Namespace for data structures |
| CAlgoThreadProperties | Properties of an algorithm thread |
| CConfigItem | Configuration item containing its category, name, and JSON value |
| CConfigProperties | Configuration properties containing its module, name, and JSON string |
| CCorpusProperties | Corpus properties containing the type, table, and column name of its source |
| CCrawlStatsTick | Statistics for crawling tick |
| CCrawlTimersContent | Timers for crawling content |
| CCrawlTimersTick | Timers for crawling tick |
| CDatabaseSettings | Database settings containing its host, port, user, password, schema, and compression |
| CDataEntry | A data entry containing either parsed or extracted data |
| CModuleOptions | Module options containing the thread ID, as well as ID and namespace of website and URL list used by the thread |
| CNetworkSettings | Network settings containing the default proxy as well as host, port, and password of the TOR control server |
| CQueryProperties | Query properties containing its name, text, type, and result type(s) |
| CQueryStruct | Structure to identify a query including its type and result type(s) |
| CServerCommandResponse | Response from the command-and-control server |
| CServerSettings | Server settings containing its port, as well as allowed clients, origins, and actions |
| CStatusSetter | Structure containing all the data needed to keep the status of a thread updated |
| CTableColumn | Structure for table columns containing its name, type, reference, and indexing |
| CTableProperties | Table properties containing its name, columns, data directory, and compression |
| CTargetTableProperties | Target table properties containing its type, website, URL list, table names, columns, and compression |
| CTextMapEntry | Text map entry |
| CThreadDatabaseEntry | Information about a thread as stored in the database, containing both the options for and the status of the thread |
| CThreadOptions | Thread options containing the name of the module run, as well as the IDs of the website, URL list, and configuration used |
| CThreadStatus | Thread status containing its ID, status message, pause state, and progress |
| CTopicModelInfo | Structure containing information about the currently trained Hierarchical Dirichlet Process (HDP) model |
| CUrlListProperties | Properties of a URL list containing its namespace and name |
| CWebsiteProperties | Website properties containing its domain, namespace, name, and data directory |
| ►NTimer | Namespace for timers |
| CSimple | A simple timer |
| CSimpleHR | A simple timer with high resolution |
| CStartStop | A simple start/stop watch |
| CStartStopHR | A simple start/stop watch with high resolution |
| ►NWrapper | Namespace for RAII wrappers and Wrapper::Database |
| ►CAspellChecker | RAII wrapper for aspell spell checkers |
| CException | Class for aspell spell checker-specific exceptions |
| ►CAspellConfig | RAII wrapper for aspell configurations |
| CException | Class for aspell configuration-specific exceptions |
| CAspellList | RAII wrapper for aspell word lists |
| CCurl | RAII wrapper for handles of the libcurl API |
| CCurlList | RAII wrapper for lists used by the libcurl API |
| CDatabase | Wrapper class providing the database functionality of Module::Database to its child classes |
| CDatabaseLock | Template class for safe in-scope database locks |
| CDatabaseTryLock | Template class for safe in-scope database locks |
| CPCRE | RAII wrapper for Perl-compatible regular expressions |
| CPCREMatch | RAII wrapper for Perl-compatible regular expression matches |
| CPreparedSqlStatement | RAII wrapper for prepared MySQL statements |
| CTidyBuffer | RAII wrapper for buffers used by the tidy-html5 API |
| ►CTidyDoc | RAII wrapper for documents used by the tidy-html5 API |
| CException | Class for tidy-html5 document exceptions |
| CURI | RAII wrapper for the RFC 3986 URI structure used by uriparser |
| CURIQueryList | RAII wrapper for the URI query list used by uriparser |
| CZipArchive | RAII wrapper for ZIP archives used by libzip |
| CZipSource | RAII wrapper for sources used by libzip |