crawlserv++
[under development]
Application for crawling and analyzing textual content of websites.
Controls a TOR service via a TOR control server/port, if available.
#include <TorControl.hpp>
Classes
class | Exception
Class for TOR control exceptions.
Construction and Destruction
TorControl (std::string_view controlServer, std::uint16_t controlPort, std::string_view controlPassword)
Constructor creating context and socket for the connection to the TOR control server/port.
virtual | ~TorControl ()
Destructor shutting down remaining connections to the TOR control server/port if necessary.
Getter
bool | active () const noexcept
Gets whether a TOR control server/port is set.
Setters
void | setNewIdentityMin (std::uint64_t seconds)
Sets the time (in seconds) in which to ignore requests for a new identity.
void | setNewIdentityMax (std::uint64_t seconds)
Sets the time (in seconds) after which to automatically request a new TOR identity.
Identity
bool | newIdentity ()
Requests a new TOR identity via the set TOR control server/port.
Tick
void | tick ()
Checks whether to request a new TOR identity.
Copy and Move
TorControl (TorControl &)=delete
Deleted copy constructor.
TorControl & | operator= (TorControl &)=delete
Deleted copy assignment operator.
TorControl (TorControl &&)=delete
Deleted move constructor.
TorControl & | operator= (TorControl &&)=delete
Deleted move assignment operator.
Controls a TOR service via a TOR control server/port, if available.
Allows crawlserv++ to automatically request a new TOR identity when needed, provided that the TOR control server/port has been set in the configuration.
This class is used both by crawler and by extractor threads.
It uses the asio library for connecting to the TOR control server/port. For more information, see its GitHub repository.
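TorControl (std::string_view controlServer, std::uint16_t controlPort, std::string_view controlPassword) | inline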
Constructor creating context and socket for the connection to the TOR control server/port.
controlServer | A string view containing the address of the TOR control server. It will be copied into the instance of the class for later use. |
controlPort | The port used for controlling the TOR service. |
controlPassword | A string view containing the password with which to authenticate to the TOR control server/port. It will be copied into the instance of the class for later use. |
~TorControl () | inline, virtual
Destructor shutting down remaining connections to the TOR control server/port if necessary.
TorControl (TorControl &)=delete
Deleted copy constructor.
TorControl (TorControl &&)=delete
Deleted move constructor.
bool active () const noexcept | inline, noexcept
Gets whether a TOR control server/port is set.
Referenced by crawlservpp::Module::Extractor::Thread::onReset(), and crawlservpp::Module::Crawler::Thread::onReset().
bool newIdentity () | inline
Requests a new TOR identity via the set TOR control server/port.
Sends the NEWNYM signal to the TOR control server/port, requesting a new circuit. The request will be ignored if not enough time (set via setNewIdentityMin()) has passed since the last request or since this instance was started.
TorControl::Exception | if no TOR control server/port has been set, if authentication with the given password to the TOR control server/port failed, or if an error occurred while connecting to the TOR control server/port. |
References crawlservpp::Network::responseCodeLength, crawlservpp::Timer::Simple::tick(), and crawlservpp::Data::File::write().
Referenced by crawlservpp::Module::Extractor::Thread::onReset(), crawlservpp::Module::Crawler::Thread::onReset(), and tick().
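The exchange described above can be sketched at the text level. The Tor control protocol is line-based: a client authenticates with `AUTHENTICATE`, then sends `SIGNAL NEWNYM`, and a reply starting with the three-digit code `250` indicates success. The helper names below are hypothetical and the network I/O is left out entirely; this only shows how the command lines might be built and how a reply code might be checked, not how crawlserv++ implements it.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Length of the numeric status code at the start of each reply line.
constexpr std::size_t kResponseCodeLength = 3;

// Wraps a value in double quotes, escaping embedded quotes and backslashes,
// as required for quoted strings in the Tor control protocol.
std::string quoteForTorControl(std::string_view raw) {
    std::string quoted{"\""};
    for (const char c : raw) {
        if (c == '"' || c == '\\') {
            quoted.push_back('\\');
        }
        quoted.push_back(c);
    }
    quoted.push_back('"');
    return quoted;
}

// Builds the authentication command for password-based authentication.
std::string makeAuthenticateCommand(std::string_view password) {
    return "AUTHENTICATE " + quoteForTorControl(password) + "\r\n";
}

// Builds the command that asks Tor to switch to new circuits.
std::string makeNewIdentityCommand() {
    return "SIGNAL NEWNYM\r\n";
}

// Returns true if the reply starts with the success code 250.
bool isSuccessReply(std::string_view reply) {
    return reply.size() >= kResponseCodeLength
            && reply.substr(0, kResponseCodeLength) == "250";
}
```

In a real client, each command would be written to the control socket and the reply read back before the next command is sent; any reply that is not a success would be a natural place to throw an exception.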
TorControl & operator= (TorControl &)=delete
Deleted copy assignment operator.
TorControl & operator= (TorControl &&)=delete
Deleted move assignment operator.
void setNewIdentityMax (std::uint64_t seconds) | inline
Sets the time (in seconds) after which to automatically request a new TOR identity.
After the time has passed (and tick() is executed), a new TOR identity will be automatically requested.
seconds | Time (in seconds) after which to automatically request a new TOR identity. Set it to zero (default) for no automatic request of new TOR identities. |
References crawlservpp::Timer::Simple::tick().
Referenced by crawlservpp::Module::Extractor::Thread::onReset(), and crawlservpp::Module::Crawler::Thread::onReset().
void setNewIdentityMin (std::uint64_t seconds) | inline
Sets the time (in seconds) in which to ignore requests for a new identity.
After a new TOR identity has been requested (or this instance of the TOR controller has been started), all requests for a new TOR identity will be discarded for the given amount of time.
seconds | Time (in seconds) in which to ignore requests for a new identity. Set it to zero (default) if every request for a new TOR identity should be sent to the TOR control server/port. |
References crawlservpp::Timer::Simple::tick().
Referenced by crawlservpp::Module::Extractor::Thread::onReset(), and crawlservpp::Module::Crawler::Thread::onReset().
void tick () | inline
Checks whether to request a new TOR identity.
This function will be called every server tick and will request a new TOR identity if the time set via setNewIdentityMax() has passed.
References crawlservpp::Network::millisecondsPerSecond, newIdentity(), and crawlservpp::Timer::Simple::tick().
Referenced by crawlservpp::Module::Extractor::Thread::onTick(), and crawlservpp::Module::Crawler::Thread::onTick().
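The interplay of the minimum and maximum times can be illustrated with a simplified sketch. `IdentitySchedule` is a hypothetical class, not the crawlserv++ implementation: it only decides whether a request should be sent, takes the current time as an argument so that it can be tested without waiting, and treats zero as "disabled" for both settings, as described for the setters.

```cpp
#include <chrono>
#include <cstdint>

// Hypothetical, simplified model of the timing rules; performs no network I/O.
class IdentitySchedule {
public:
    using Clock = std::chrono::steady_clock;

    // The minimum time also applies right after the instance has been started.
    explicit IdentitySchedule(Clock::time_point start) : last_(start) {}

    void setMin(std::uint64_t seconds) {
        min_ = std::chrono::seconds(static_cast<std::chrono::seconds::rep>(seconds));
    }

    void setMax(std::uint64_t seconds) {
        max_ = std::chrono::seconds(static_cast<std::chrono::seconds::rep>(seconds));
    }

    // Explicit request: returns false if it falls inside the minimum time.
    bool request(Clock::time_point now) {
        if (min_.count() > 0 && now - last_ < min_) {
            return false; // too early, the request is discarded
        }
        last_ = now; // remember when the last identity was requested
        return true;
    }

    // Periodic check: requests a new identity once the maximum time has passed.
    bool tick(Clock::time_point now) {
        if (max_.count() == 0 || now - last_ < max_) {
            return false; // automatic requests disabled, or not yet due
        }
        return request(now);
    }

private:
    Clock::time_point last_;
    std::chrono::seconds min_{0};
    std::chrono::seconds max_{0};
};
```

Passing the time in explicitly is a design choice for this sketch only; a class like the one documented here would more likely query its own timer inside tick().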