Implementation of a log archiver using asynchronous reader and writer threads.

#include <logarchiver.h>

Inherits thread_wrapper_t.

Public Member Functions
	LogArchiver (const sm_options &options)

	LogArchiver (ArchiveIndex *, LogConsumer *, ArchiverHeap *, BlockAssembly *)

virtual	~LogArchiver ()

virtual void	run ()

void	activate (lsn_t endLSN=lsn_t::null, bool wait=true)

void	shutdown ()

bool	requestFlushAsync (lsn_t)

void	requestFlushSync (lsn_t)

void	archiveUntilLSN (lsn_t)

std::shared_ptr< ArchiveIndex >	getIndex ()

lsn_t	getNextConsumedLSN ()

void	setEager (bool e)

bool	getEager () const

Public Member Functions inherited from thread_wrapper_t
	thread_wrapper_t ()

virtual	~thread_wrapper_t ()

virtual void	before_run ()

virtual void	after_run ()

void	spawn ()

void	fork ()

void	join ()

Static Public Attributes
static const bool	DFT_EAGER = true

static const bool	DFT_READ_WHOLE_BLOCKS = true

static const int	DFT_GRACE_PERIOD = 1000000

Private Member Functions
void	replacement ()

bool	selection ()

void	pushIntoHeap (logrec_t *, bool duplicate)

bool	waitForActivation ()

bool	processFlushRequest ()

bool	isLogTooSlow ()

bool	shouldActivate (bool logTooSlow)

Private Attributes
std::shared_ptr< ArchiveIndex >	index

LogConsumer *	consumer

ArchiverHeap *	heap

BlockAssembly *	blkAssemb

MergerDaemon *	merger

std::atomic< bool >	shutdownFlag

ArchiverControl	control

bool	selfManaged

bool	eager

bool	readWholeBlocks

int	slowLogGracePeriod

lsn_t	nextActLSN

lsn_t	flushReqLSN

Detailed Description

Implementation of a log archiver using asynchronous reader and writer threads.

The log archiver runs as a background daemon whose execution is controlled by an ArchiverControl object. Once a log archiver thread is created and forked, it waits for an activation to start working. The caller thread must invoke the activate() method to perform this activation.

Log archiving works in activation cycles: the archiver first waits for an activation and then consumes the recovery log up to a given LSN value (see activate(lsn_t, bool)). This cycle repeats in a loop until the method shutdown() is invoked. Once shutdown is invoked, the current cycle is not interrupted. Instead, it finishes consuming the log up to the LSN given in the last successful activation and only then exits. The destructor also invokes shutdown() if this has not been done yet.
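The activation cycle described above can be illustrated with a minimal, self-contained sketch. It is not the LogArchiver code: ToyArchiver, Lsn, start(), and consumed() are hypothetical names, a plain integer stands in for lsn_t, and a mutex with a condition variable stands in for ArchiverControl. It only shows the waiting, the "consume up to the requested LSN" step, and a shutdown that lets a pending cycle finish.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstdint>
#include <mutex>
#include <thread>

// Hypothetical stand-in for lsn_t: a plain 64-bit counter.
using Lsn = std::uint64_t;

class ToyArchiver {
public:
    ~ToyArchiver() { shutdown(); }  // the destructor also shuts down

    // Create the background thread; it then waits for an activation.
    void start() {
        assert(!worker_.joinable());
        worker_ = std::thread([this] { run(); });
    }

    // Ask the worker to consume the log up to endLSN.
    // With wait == true, block until that LSN has been consumed.
    // (Calling this with wait == true after shutdown() would block forever.)
    void activate(Lsn endLSN, bool wait = true) {
        std::unique_lock<std::mutex> lk(mtx_);
        if (endLSN > requested_) requested_ = endLSN;
        cv_.notify_all();
        if (wait) cv_.wait(lk, [&] { return consumed_ >= endLSN; });
    }

    // Signal shutdown; a pending cycle is completed before the worker exits.
    void shutdown() {
        {
            std::lock_guard<std::mutex> lk(mtx_);
            shutdown_ = true;
        }
        cv_.notify_all();
        if (worker_.joinable()) worker_.join();
    }

    Lsn consumed() {
        std::lock_guard<std::mutex> lk(mtx_);
        return consumed_;
    }

private:
    void run() {
        std::unique_lock<std::mutex> lk(mtx_);
        for (;;) {
            // Wait for an activation or for shutdown.
            cv_.wait(lk, [&] { return shutdown_ || requested_ > consumed_; });
            if (requested_ > consumed_) {
                Lsn target = requested_;
                lk.unlock();
                // A real archiver would read and sort log records here.
                lk.lock();
                consumed_ = target;
                cv_.notify_all();
            } else {
                return;  // shutdown requested and nothing left to consume
            }
        }
    }

    std::mutex mtx_;
    std::condition_variable cv_;
    Lsn requested_ = 0;
    Lsn consumed_ = 0;
    bool shutdown_ = false;
    std::thread worker_;
};
```

In this sketch, calling start(), then activate(100, true), returns once consumed() reports 100. A later activate(250, false) followed by shutdown() still leaves consumed() at 250, because the worker checks for pending work before honoring the shutdown flag.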

The class LogArchiver itself serves merely as an orchestrator of its components, which are:

LogArchiver::LogConsumer, which encapsulates a reader thread and parses individual log records from the recovery log.
LogArchiver::ArchiverHeap, which performs run generation by sorting the input stream given by the log consumer.
LogArchiver::BlockAssembly, which consumes the sorted output from the heap, builds indexed blocks of log records (used for instant restore), and passes them over to the asynchronous writer thread.
LogArchiver::ArchiveDirectory, which represents the set of sorted runs that compose the log archive itself. It manages filesystem operations to read from and write to the log archive, controls access to the archive index, and provides scanning facilities used by restore.

One activation cycle consists of consuming all log records from the log consumer, which must first be opened with the given "end LSN". Each log record is then inserted into the heap until it becomes full. Then, log records are removed from the heap (usually in bulk, e.g., one block at a time) and passed to the block assembly component. The cycle finishes once all log records up to the given LSN are inserted into the heap, which does not necessarily mean that the persistent log archive will contain all those log records. The only way to enforce that is to perform a shutdown. This design maintains the heap always as full as possible, which generates runs whose size is (i) as large as possible and (ii) independent of the activation behavior.
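The following sketch illustrates why the end of a cycle does not imply that all records are in the archive. It is a simplification under stated assumptions: ToyPipeline, cycle(), drainOneBlock(), and drainAll() are hypothetical names, a log record is reduced to an integer key, a std::priority_queue stands in for the archiver heap, and a vector stands in for the block assembly. Run boundaries are deliberately ignored here.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Hypothetical simplification: a log record is just an integer sort key.
using Key = int;

struct ToyPipeline {
    std::size_t heapCapacity;  // maximum number of records held in the heap
    std::size_t blockSize;     // records removed per bulk removal
    std::priority_queue<Key, std::vector<Key>, std::greater<Key>> heap;  // min-heap
    std::vector<Key> archived;  // stands in for the block assembly output

    ToyPipeline(std::size_t capacity, std::size_t block)
        : heapCapacity(capacity), blockSize(block) {
        assert(heapCapacity > 0 && blockSize > 0);
    }

    // Remove up to blockSize of the smallest records from the heap.
    void drainOneBlock() {
        for (std::size_t i = 0; i < blockSize && !heap.empty(); ++i) {
            archived.push_back(heap.top());
            heap.pop();
        }
    }

    // One activation cycle: insert every input record, draining only when full.
    void cycle(const std::vector<Key>& input) {
        for (Key k : input) {
            if (heap.size() == heapCapacity) drainOneBlock();
            heap.push(k);
        }
        // The cycle ends here; records still in the heap are not archived yet.
    }

    // Shutdown-like step: flush everything that remains in the heap.
    void drainAll() {
        while (!heap.empty()) drainOneBlock();
    }
};
```

With a capacity of 4 and a block size of 2, a cycle over the keys 5, 3, 8, 1, 9, 2 archives only two records and leaves four in the heap. They reach the output only after drainAll(), which plays the role of the shutdown mentioned above.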

In the typical operation mode, a LogArchiver instance is constructed using the sm_options provided by the user, but for tests and external experiments, it can also be constructed by passing instances of these four components above.

A note on processing older log partitions (TODO): Before the archiver was implemented, the log manager would delete a partition once it was eliminated from the list of 8 open partitions. The compiler flag KEEP_LOG_PARTITIONS was used to omit the delete operation, leaving the complete history of the database in the log directory. However, if log archiving is enabled, the archiver should take over the responsibility of deleting old log partitions. Currently, if the flag is not set and the archiver cannot keep up with the growth of the log, partitions may be deleted before they have been archived.

See also: LogArchiver::LogConsumer; LogArchiver::ArchiverHeap; LogArchiver::BlockAssembly; LogArchiver::ArchiveDirectory

Author: Caetano Sauer

Constructor & Destructor Documentation

§ LogArchiver() [1/2]

LogArchiver::LogArchiver ( const sm_options & options )

§ LogArchiver() [2/2]

LogArchiver::LogArchiver	(	ArchiveIndex *	d,
		LogConsumer *	c,
		ArchiverHeap *	h,
		BlockAssembly *	b
	)

§ ~LogArchiver()

LogArchiver::~LogArchiver ( )

virtual

Member Function Documentation

§ activate()

void LogArchiver::activate	(	lsn_t	endLSN = `lsn_t::null`,
		bool	wait = `true`
	)

§ archiveUntilLSN()

void LogArchiver::archiveUntilLSN ( lsn_t lsn )

§ getEager()

bool LogArchiver::getEager ( ) const

inline

§ getIndex()

std::shared_ptr<ArchiveIndex> LogArchiver::getIndex ( )

inline

§ getNextConsumedLSN()

lsn_t LogArchiver::getNextConsumedLSN ( )

inline

§ isLogTooSlow()

bool LogArchiver::isLogTooSlow ( )

private

§ processFlushRequest()

bool LogArchiver::processFlushRequest ( )

private

§ pushIntoHeap()

void LogArchiver::pushIntoHeap	(	logrec_t *	lr,
		bool	duplicate
	)

private

§ replacement()

void LogArchiver::replacement ( )

private

Replacement part of the replacement-selection algorithm. Fetches log records from the read buffer into the sort workspace and adds a corresponding entry to the heap. When the workspace is full, selection is invoked until there is space available for the current log record.

Unlike standard replacement selection, runs are limited to the size of the workspace, in order to maintain a simple non-overlapping mapping between regions of the input file (i.e., the recovery log) and the runs. To achieve that, we change the logic that assigns run numbers to incoming records:

a) Standard RS: if the incoming key is larger than the last record written, assign it to the current run, otherwise to the next run.
b) Log-archiving RS: keep track of the run number currently being written, always assigning incoming records to a greater run. Once all records from the current run are removed from the heap, increment the counter. To start, initial input records are assigned to run 1 until the workspace is full, after which incoming records are assigned to run 2.
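The run-number rule for log-archiving RS can be sketched as follows. This is an illustration, not the actual replacement() and selection() code: ToyRunGenerator, Entry, finish(), and output() are hypothetical, records are reduced to integer keys, and the heap is ordered by (run, key) so that the smallest record of the oldest run is always selected first.

```cpp
#include <cassert>
#include <cstddef>
#include <queue>
#include <tuple>
#include <vector>

// Hypothetical heap entry: run number plus an integer sort key.
struct Entry {
    int run;
    int key;
};

// Orders the priority queue as a min-heap on (run, key).
struct EntryGreater {
    bool operator()(const Entry& a, const Entry& b) const {
        return std::tie(a.run, a.key) > std::tie(b.run, b.key);
    }
};

class ToyRunGenerator {
public:
    explicit ToyRunGenerator(std::size_t capacity) : capacity_(capacity) {
        assert(capacity_ > 0);
    }

    // Replacement: add one record, making room by selection if needed.
    void replacement(int key) {
        if (heap_.size() == capacity_) {
            filled_ = true;
            selection();
        }
        // Before the workspace fills for the first time, records join run 1.
        // Afterwards they always join the run after the one being written.
        int run = filled_ ? currentRun_ + 1 : currentRun_;
        heap_.push(Entry{run, key});
    }

    // Selection: emit the smallest record of the oldest run in the heap.
    bool selection() {
        if (heap_.empty()) return false;
        Entry e = heap_.top();
        heap_.pop();
        // If the top belongs to the next run, the current run has left the
        // heap completely, so the run counter advances.
        currentRun_ = e.run;
        output_.push_back(e);
        return true;
    }

    // Drain the heap completely (e.g., at shutdown).
    void finish() {
        while (selection()) {
        }
    }

    const std::vector<Entry>& output() const { return output_; }

private:
    std::size_t capacity_;
    bool filled_ = false;
    int currentRun_ = 1;
    std::priority_queue<Entry, std::vector<Entry>, EntryGreater> heap_;
    std::vector<Entry> output_;
};
```

With a workspace of 3 records and the input keys 5, 1, 4, 2, 6, 3, this sketch produces run 1 = (1, 4, 5) and run 2 = (2, 3, 6). Each run covers a contiguous region of the input. Standard RS would instead place key 2 in run 1, since it is larger than the last key written (1), which lets runs grow beyond the workspace size and blurs the mapping between input regions and runs.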

§ requestFlushAsync()

bool LogArchiver::requestFlushAsync ( lsn_t reqLSN )

§ requestFlushSync()

void LogArchiver::requestFlushSync ( lsn_t reqLSN )

§ run()

void LogArchiver::run ( )

virtual

Implements thread_wrapper_t.

§ selection()

bool LogArchiver::selection ( )

private

Selection part of replacement-selection algorithm. Takes the smallest record from the heap and copies it to the write buffer, one IO block at a time. The block header contains the run number (1 byte) and the logical size of the block (4 bytes). The former is required so that the asynchronous writer thread knows when to start a new run file. The latter simplifies the write process by not allowing records to be split in the middle by block boundaries.
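A block header of this shape could be encoded as in the sketch below. The byte layout is an assumption made for illustration only (run number at offset 0, logical size in the next four bytes, little-endian); the documentation states the field widths but not their order or endianness. ToyBlockHeader, writeHeader(), readHeader(), and fits() are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical in-memory view of the header described above.
struct ToyBlockHeader {
    std::uint8_t run;           // run number, 1 byte
    std::uint32_t logicalSize;  // bytes of valid data in the block, 4 bytes
};

constexpr std::size_t kHeaderSize = 5;  // 1 + 4 bytes

// Assumed layout: run at offset 0, size at offsets 1..4, little-endian.
inline void writeHeader(unsigned char* block, ToyBlockHeader h) {
    block[0] = h.run;
    for (int i = 0; i < 4; ++i) {
        block[1 + i] =
            static_cast<unsigned char>((h.logicalSize >> (8 * i)) & 0xFFu);
    }
}

inline ToyBlockHeader readHeader(const unsigned char* block) {
    ToyBlockHeader h;
    h.run = block[0];
    h.logicalSize = 0;
    for (int i = 0; i < 4; ++i) {
        h.logicalSize |= static_cast<std::uint32_t>(block[1 + i]) << (8 * i);
    }
    return h;
}

// A record is added only if it fits entirely; otherwise the block is closed
// at its current logical size and the record goes into the next block.
inline bool fits(std::uint32_t logicalSize, std::uint32_t recordLength,
                 std::uint32_t blockSize) {
    assert(logicalSize <= blockSize);
    return recordLength <= blockSize - logicalSize;
}
```

A reader of such a block would stop parsing at logicalSize and treat the remainder as padding, so no record ever straddles a block boundary. A change in the run field from one block to the next is the signal on which a writer could open a new run file.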

§ setEager()

void LogArchiver::setEager ( bool e )

inline

§ shouldActivate()

bool LogArchiver::shouldActivate ( bool logTooSlow )

private

§ shutdown()

void LogArchiver::shutdown ( )

§ waitForActivation()

bool LogArchiver::waitForActivation ( )

private

Member Data Documentation

§ blkAssemb

BlockAssembly* LogArchiver::blkAssemb

private

§ consumer

LogConsumer* LogArchiver::consumer

private

§ control

ArchiverControl LogArchiver::control

private

§ DFT_EAGER

const bool LogArchiver::DFT_EAGER = true

static

§ DFT_GRACE_PERIOD

const int LogArchiver::DFT_GRACE_PERIOD = 1000000

static

§ DFT_READ_WHOLE_BLOCKS

const bool LogArchiver::DFT_READ_WHOLE_BLOCKS = true

static

§ eager

bool LogArchiver::eager

private

§ flushReqLSN

lsn_t LogArchiver::flushReqLSN

private

§ heap

ArchiverHeap* LogArchiver::heap

private

§ index

std::shared_ptr<ArchiveIndex> LogArchiver::index

private

§ merger

MergerDaemon* LogArchiver::merger

private

§ nextActLSN

lsn_t LogArchiver::nextActLSN

private

§ readWholeBlocks

bool LogArchiver::readWholeBlocks

private

§ selfManaged

bool LogArchiver::selfManaged

private

§ shutdownFlag

std::atomic<bool> LogArchiver::shutdownFlag

private

§ slowLogGracePeriod

int LogArchiver::slowLogGracePeriod

private

The documentation for this class was generated from the following files:

src/sm/logarchiver.h
src/sm/logarchiver.cpp
