crawlserv++  [under development]
Application for crawling and analyzing textual content of websites.
crawlservpp::Network::TorControl Class Reference

Controls a TOR service via a TOR control server/port, if available.

#include <TorControl.hpp>

Classes

class  Exception
 Class for TOR control exceptions.
 

Construction and Destruction

 TorControl (std::string_view controlServer, std::uint16_t controlPort, std::string_view controlPassword)
 Constructor creating context and socket for the connection to the TOR control server/port.
 
virtual ~TorControl ()
 Destructor shutting down remaining connections to the TOR control server/port if necessary.
 

Getter

bool active () const noexcept
 Gets whether a TOR control server/port is set.
 

Setters

void setNewIdentityMin (std::uint64_t seconds)
 Sets the time (in seconds) in which to ignore requests for a new identity.
 
void setNewIdentityMax (std::uint64_t seconds)
 Sets the time (in seconds) after which to automatically request a new TOR identity.
 

Identity

bool newIdentity ()
 Requests a new TOR identity via the set TOR control server/port.
 

Tick

void tick ()
 Checks whether to request a new TOR identity.
 

Copy and Move

The class is not copyable and not moveable.

 TorControl (TorControl &)=delete
 Deleted copy constructor.
 
TorControl & operator= (TorControl &)=delete
 Deleted copy assignment operator.
 
 TorControl (TorControl &&)=delete
 Deleted move constructor.
 
TorControl & operator= (TorControl &&)=delete
 Deleted move assignment operator.
 

Detailed Description

Controls a TOR service via a TOR control server/port, if available.

Allows crawlserv++ to automatically request a new TOR identity when needed if the TOR control server/port has been set in the configuration.

This class is used both by crawler and by extractor threads.

It uses the asio library for connecting to the TOR control server/port. For more information, see its GitHub repository.

See also
Network::Config

Constructor & Destructor Documentation

◆ TorControl() [1/3]

crawlservpp::Network::TorControl::TorControl ( std::string_view  controlServer,
std::uint16_t  controlPort,
std::string_view  controlPassword 
)
inline

Constructor creating context and socket for the connection to the TOR control server/port.

Parameters
controlServer  A string view containing the address of the TOR control server. It will be copied into the instance of the class for later use.
controlPort  The port used for controlling the TOR service.
controlPassword  A string view containing the password with which to authenticate to the TOR control server/port. It will be copied into the instance of the class for later use.
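
Copying the string views matters because a view does not own its characters. The sketch below illustrates the idea with a hypothetical ControlSettings class that is not part of crawlserv++; it only mirrors the parameter handling described above. The port 9051 and the password are placeholder values.

```cpp
#include <cstdint>
#include <string>
#include <string_view>

// Hypothetical holder that mirrors only the parameter handling described above.
class ControlSettings {
public:
    ControlSettings(
            std::string_view controlServer,
            std::uint16_t controlPort,
            std::string_view controlPassword
    ) : server(controlServer), port(controlPort), password(controlPassword) {}

    const std::string& getServer() const noexcept { return server; }
    std::uint16_t getPort() const noexcept { return port; }
    const std::string& getPassword() const noexcept { return password; }

private:
    std::string server;   // owning copy of the characters behind the view
    std::uint16_t port;
    std::string password; // owning copy of the characters behind the view
};

// Builds settings from a local buffer that is changed and then destroyed.
inline ControlSettings makeSettings() {
    std::string buffer{"localhost"};
    ControlSettings settings{buffer, 9051, "secret"};

    buffer.assign("overwritten"); // does not affect the stored copy

    return settings;
}
```

Had the class stored the views themselves, the server address would dangle once the buffer went out of scope.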

◆ ~TorControl()

crawlservpp::Network::TorControl::~TorControl ( )
inline virtual

Destructor shutting down remaining connections to the TOR control server/port if necessary.

Note
Errors during shutdown will be written to stderr.

◆ TorControl() [2/3]

crawlservpp::Network::TorControl::TorControl ( TorControl &  )
delete

Deleted copy constructor.

◆ TorControl() [3/3]

crawlservpp::Network::TorControl::TorControl ( TorControl &&  )
delete

Deleted move constructor.

Member Function Documentation

◆ active()

bool crawlservpp::Network::TorControl::active ( ) const
inline noexcept

Gets whether a TOR control server/port is set.

Returns
True, if a TOR control server/port is set. False otherwise.

Referenced by crawlservpp::Module::Extractor::Thread::onReset(), and crawlservpp::Module::Crawler::Thread::onReset().

◆ newIdentity()

bool crawlservpp::Network::TorControl::newIdentity ( )
inline

Requests a new TOR identity via the set TOR control server/port.

The request will be ignored if not enough time (set via setNewIdentityMin()) has passed. Otherwise, the NEWNYM signal is sent to the TOR control server/port, requesting a new circuit.

Note
The TOR service itself does not allow too many requests for new circuits during a specific period of time.
Returns
True if a new identity has been requested. False otherwise.
Exceptions
TorControl::Exception  if no TOR control server/port has been set, authentication with the given password to the TOR control server/port failed, or an error occurred while connecting to the TOR control server/port.

References crawlservpp::Network::responseCodeLength, crawlservpp::Timer::Simple::tick(), and crawlservpp::Data::File::write().

Referenced by crawlservpp::Module::Extractor::Thread::onReset(), crawlservpp::Module::Crawler::Thread::onReset(), and tick().
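
The exchange with the control port can be sketched as plain text commands. The sketch below is not the class's actual implementation and performs no network I/O. It only shows the wire format as defined by the Tor control protocol: an AUTHENTICATE command carrying the password as a quoted string, a SIGNAL NEWNYM command, and reply lines that start with a three-digit status code (250 on success). The helper names and the constant codeLength are hypothetical; the latter is assumed to correspond to the referenced responseCodeLength.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Assumed value: Tor control replies start with a three-digit status code.
constexpr std::size_t codeLength{3};

// Builds an AUTHENTICATE command carrying the password as a quoted string.
inline std::string makeAuthCommand(std::string_view password) {
    std::string command{"AUTHENTICATE \""};

    for(const char c : password) {
        if(c == '"' || c == '\\') {
            command.push_back('\\'); // escape quotes and backslashes
        }

        command.push_back(c);
    }

    command += "\"\r\n";

    return command;
}

// Builds the command that asks the TOR service to use new circuits.
inline std::string makeNewIdentityCommand() {
    return "SIGNAL NEWNYM\r\n";
}

// Checks whether a reply line starts with the success status code 250.
inline bool isSuccess(std::string_view reply) {
    return reply.size() >= codeLength && reply.substr(0, codeLength) == "250";
}
```

A caller would send the first command, check the reply with isSuccess(), and only then send the second. A failed check is the kind of condition for which the real class throws TorControl::Exception.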

◆ operator=() [1/2]

TorControl& crawlservpp::Network::TorControl::operator= ( TorControl &  )
delete

Deleted copy assignment operator.

◆ operator=() [2/2]

TorControl& crawlservpp::Network::TorControl::operator= ( TorControl &&  )
delete

Deleted move assignment operator.
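
Because all four operations are deleted, an instance can neither be copied nor moved, so an owning class has to construct it in place. The sketch below uses a hypothetical class Pinned with the same deleted special members and a hypothetical Owner that creates it lazily via std::optional. Neither is part of crawlserv++.

```cpp
#include <optional>
#include <type_traits>

// Hypothetical class with the same deleted special members as listed above.
class Pinned {
public:
    explicit Pinned(int initialValue) : value(initialValue) {}

    Pinned(Pinned&) = delete;
    Pinned& operator=(Pinned&) = delete;
    Pinned(Pinned&&) = delete;
    Pinned& operator=(Pinned&&) = delete;

    int get() const noexcept { return value; }

private:
    int value;
};

// The type traits confirm that the class is neither copyable nor movable.
static_assert(!std::is_copy_constructible_v<Pinned>);
static_assert(!std::is_move_constructible_v<Pinned>);
static_assert(!std::is_copy_assignable_v<Pinned>);
static_assert(!std::is_move_assignable_v<Pinned>);

// An owner can still create the object lazily by constructing it in place.
struct Owner {
    std::optional<Pinned> control;

    void init(int initialValue) { control.emplace(initialValue); }
};
```

Holding the object through std::unique_ptr would be another option if the owner itself needs to remain movable.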

◆ setNewIdentityMax()

void crawlservpp::Network::TorControl::setNewIdentityMax ( std::uint64_t  seconds)
inline

Sets the time (in seconds) after which to automatically request a new TOR identity.

After the time has passed (and tick() is executed), a new TOR identity will be automatically requested.

Parameters
seconds  Time (in seconds) after which to automatically request a new TOR identity. Set it to zero (default) for no automatic request of new TOR identities.

References crawlservpp::Timer::Simple::tick().

Referenced by crawlservpp::Module::Extractor::Thread::onReset(), and crawlservpp::Module::Crawler::Thread::onReset().

◆ setNewIdentityMin()

void crawlservpp::Network::TorControl::setNewIdentityMin ( std::uint64_t  seconds)
inline

Sets the time (in seconds) in which to ignore requests for a new identity.

After a new TOR identity has been requested (or after this instance of the TOR controller has been started), all requests for a new TOR identity will be discarded for the given amount of time.

Parameters
seconds  Time (in seconds) in which to ignore requests for a new identity. Set it to zero (default) if every request for a new TOR identity should be sent to the TOR control server/port.

References crawlservpp::Timer::Simple::tick().

Referenced by crawlservpp::Module::Extractor::Thread::onReset(), and crawlservpp::Module::Crawler::Thread::onReset().

◆ tick()

void crawlservpp::Network::TorControl::tick ( )
inline

Checks whether to request a new TOR identity.

This function will be called every server tick and will request a new TOR identity if necessary.

References crawlservpp::Network::millisecondsPerSecond, newIdentity(), and crawlservpp::Timer::Simple::tick().

Referenced by crawlservpp::Module::Extractor::Thread::onTick(), and crawlservpp::Module::Crawler::Thread::onTick().
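
How the minimum and the maximum time interact with tick() can be modeled without a real clock. The sketch below is a hypothetical IdentityTimer that is not part of crawlserv++; it takes the current time in milliseconds as an argument instead of using crawlservpp::Timer::Simple. The exact comparisons (for example, whether a request is sent at or only after the maximum time) are assumptions.

```cpp
#include <cstdint>

// Hypothetical model of the timing rules; the current time is passed in
// explicitly (in milliseconds) so that the logic does not need a real clock.
class IdentityTimer {
public:
    // The timer starts when the instance is created.
    explicit IdentityTimer(std::uint64_t nowMs) : lastMs(nowMs) {}

    void setMin(std::uint64_t seconds) { minMs = seconds * msPerSecond; }
    void setMax(std::uint64_t seconds) { maxMs = seconds * msPerSecond; }

    // Explicit request: discarded while the minimum time has not passed.
    bool request(std::uint64_t nowMs) {
        if(minMs > 0 && nowMs - lastMs < minMs) {
            return false;
        }

        lastMs = nowMs;
        ++sent;

        return true;
    }

    // Periodic check: requests automatically once the maximum time has passed.
    void tick(std::uint64_t nowMs) {
        if(maxMs > 0 && nowMs - lastMs >= maxMs) {
            request(nowMs);
        }
    }

    unsigned requestsSent() const noexcept { return sent; }

private:
    static constexpr std::uint64_t msPerSecond{1000};

    std::uint64_t lastMs;
    std::uint64_t minMs{0}; // zero: never discard requests
    std::uint64_t maxMs{0}; // zero: never request automatically
    unsigned sent{0};
};
```

With a minimum of 10 and a maximum of 60 seconds, a request 5 seconds after start would be discarded, one after 10 seconds would be sent, and a tick 60 seconds after that would send another one.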


The documentation for this class was generated from the following file: