JASSv2
JASS::instream Class Reference (abstract)

Read data from an input stream.

#include <instream.h>

Inheritance diagram for JASS::instream:
JASS::instream_document_trec, JASS::instream_file, JASS::instream_memory

Public Member Functions

 instream (allocator *memory=nullptr, instream *source=nullptr)
 Constructor.
 
virtual ~instream ()
 Destructor.
 
virtual void read (document &buffer)=0
 Read at most buffer.contents.size() bytes of data into buffer, resizing on eof.
 
size_t fetch (void *buffer, size_t bytes)
 fetch() generates a document object, sets its contents to the passed buffer, calls read(), and returns the number of bytes of data read.
 

Protected Attributes

allocator * memory
 Any and all memory allocation must happen using this object.
 
instream * source
 If this object is reading from another instream then this is that instream.
 

Detailed Description

Read data from an input stream.

This is the abstract base class for reading data from an input source. If the indexer, for example, needs to read from a file, then an instance of a subclass of this class can be used and, once it has been created, the user need not know where the data is coming from. It is an abstraction that gives input streams a generic interface for getting data.

There are two kinds of these objects: ones that read from a stream such as a file, and ones that generate documents ready for indexing. They share the same interface so that it is possible to chain them together to form pipelines such as read_file | de-zip | de-tar | index.
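
The chaining idea can be sketched in a few lines. This is a minimal illustration under stated assumptions and is not JASS code: the names stream, memory_stage, upper_stage, and drain are hypothetical, std::string stands in for the document buffer, and upper-casing stands in for a real transformation such as de-zip. The point it shows is that a consumer only ever sees the shared read() interface, whatever sits upstream.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

namespace sketch
{
// Hypothetical shared interface: read at most buffer.size() bytes and shrink
// the buffer to the number of bytes actually read (empty means end of input).
class stream
    {
    public:
        virtual ~stream() = default;
        virtual void read(std::string &buffer) = 0;
    };

// A stage that reads from an origin (here, text held in memory).
class memory_stage : public stream
    {
    std::string data;
    size_t position = 0;

    public:
        explicit memory_stage(std::string text) : data(std::move(text)) {}

        void read(std::string &buffer) override
            {
            size_t bytes = std::min(buffer.size(), data.size() - position);
            buffer.assign(data, position, bytes);
            position += bytes;
            }
    };

// A stage that reads from its predecessor and transforms the result.
class upper_stage : public stream
    {
    stream &source;

    public:
        explicit upper_stage(stream &predecessor) : source(predecessor) {}

        void read(std::string &buffer) override
            {
            source.read(buffer);
            for (char &c : buffer)
                c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
            }
    };

// Consume any stage in fixed-size chunks until an empty read signals the end.
inline std::string drain(stream &from, size_t chunk_size)
    {
    std::string all;
    std::string chunk;
    for (;;)
        {
        chunk.assign(chunk_size, '\0');
        from.read(chunk);
        if (chunk.empty())
            break;
        all += chunk;
        }
    return all;
    }
}
```

Because drain() takes a reference to the base class, it can be handed either stage directly or a chain of any length built from them.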

The code that constructs a complex pipeline does not want to keep track of every pointer to each part of the stream and free them on completion, so this object deletes its predecessor in the pipeline when it is itself deleted. This propagates down the pipeline, which is eventually cleaned up from the bottom up.
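
A minimal sketch of this ownership rule follows. It is an illustration only: the class stage, its log parameter, and build_and_destroy() are hypothetical and are not part of JASS. Each stage owns the raw pointer to its predecessor and deletes it in its destructor, so destroying the last stage destroys the whole chain.

```cpp
#include <string>
#include <utility>
#include <vector>

namespace sketch
{
// Hypothetical pipeline stage that owns (and deletes) its predecessor.
class stage
    {
    std::string name;
    stage *source;                      // predecessor in the pipeline, or nullptr
    std::vector<std::string> &log;      // records the order destructors are entered

    public:
        stage(std::string stage_name, std::vector<std::string> &destructor_log, stage *predecessor = nullptr) :
            name(std::move(stage_name)),
            source(predecessor),
            log(destructor_log)
            {
            }

        // Copying would lead to a double delete of the predecessor, so forbid it.
        stage(const stage &) = delete;
        stage &operator=(const stage &) = delete;

        ~stage()
            {
            log.push_back(name);
            delete source;              // deleting nullptr is a no-op
            }
    };

// Build a three-stage pipeline, let its last stage go out of scope, and
// return the names of the stages in the order their destructors were entered.
inline std::vector<std::string> build_and_destroy()
    {
    std::vector<std::string> log;
        {
        stage *file = new stage("file", log);
        stage *unzip = new stage("unzip", log, file);
        stage root("parse", log, unzip);
        }                               // root is destroyed here, taking unzip and file with it
    return log;
    }
}
```

The caller keeps track of only one object (root in the sketch); the pointers to the earlier stages never need to be freed by hand.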

An example tying together documents, instreams, and parsing to count the number of documents and non-unique symbols is:

/*
    PARSER_USE.CPP
    --------------
    Copyright (c) 2016 Andrew Trotman
    Released under the 2-clause BSD license (See:https://en.wikipedia.org/wiki/BSD_licenses)
*/
#include <stdio.h>

#include "parser.h"
#include "instream_file.h"
#include "instream_document_trec.h"

/*
    MAIN()
    ------
*/
int main(int argc, char *argv[])
    {
    /*
        allocate a document object and a parser object.
    */
    JASS::document document;
    JASS::parser parser;

    /*
        build a pipeline - recall that deletes cascade so file is deleted when source goes out of scope.
        The two construction lines below are assumed: instream_file is taken to open the file named on
        the command line and instream_document_trec to wrap it. Check the headers for the exact
        constructor signatures.
    */
    JASS::instream_file *file = new JASS::instream_file(argv[1]);
    JASS::instream_document_trec source(*file);

    /*
        this program counts documents and alphabetic tokens in those documents.
    */
    size_t total_documents = 0;
    size_t alphas = 0;

    /*
        read documents, then parse them.
    */
    do
        {
        /*
            read the next document into the same memory the last document used.
        */
        document.rewind();
        source.read(document);

        /*
            eof is signaled as an empty document.
        */
        if (document.isempty())
            break;

        /*
            count documents.
        */
        total_documents++;

        /*
            now parse the document.
        */
        parser.set_document(document);
        bool finished = false;
        do
            {
            /*
                get the next token
            */
            const auto &token = parser.get_next_token();

            /*
                what type is that token (the enumerator names in the case labels are assumed)
            */
            switch (token.type)
                {
                case JASS::parser::token::eof:
                    /*
                        At end of document so signal to leave the loop.
                    */
                    finished = true;
                    break;
                case JASS::parser::token::alpha:
                    /*
                        Count the number of alphabetic tokens.
                    */
                    alphas++;
                    break;
                default:
                    /*
                        else ignore the token.
                    */
                    break;
                }
            }
        while (!finished);
        }
    while (!document.isempty());

    /*
        Dump out the number of documents and the number of tokens.
    */
    printf("Documents:%lld\n", (long long)total_documents);
    printf("alphas :%lld\n", (long long)alphas);

    return 0;
    }

Constructor & Destructor Documentation

§ instream()

JASS::instream::instream ( allocator * memory = nullptr,
instream * source = nullptr
)
inline

Constructor.

Parameters
memory [in] If this object needs to allocate memory (for example, a buffer) then it should be allocated from this pool.
source [in] This object reads data from source before processing it and passing it on via read().

§ ~instream()

virtual JASS::instream::~instream ( )
inline virtual

Destructor.

This destructor not only cleans up this object but also any object that is earlier in the pipeline - so a deletion of the root of the pipeline will delete the entire pipeline.

Member Function Documentation

§ fetch()

size_t JASS::instream::fetch ( void *  buffer,
size_t  bytes 
)
inline

fetch() generates a document object, sets its contents to the passed buffer, calls read(), and returns the number of bytes of data read.

Parameters
buffer [in] Buffer to read into.
bytes [in] The maximum number of bytes to read into the buffer.
Returns
The number of bytes that were read into the buffer.
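
The relationship between fetch() and read() can be sketched as follows. This is a hedged illustration, not the JASS implementation: slice is a hypothetical stand-in for the document contents, and memory_stream is a hypothetical subclass that serves bytes from a string. The non-virtual fetch() wraps the caller's raw buffer, delegates to the virtual read(), and reports how many bytes read() left in the wrapper.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

namespace sketch
{
// Hypothetical stand-in for the contents of a document: a pointer and a length.
struct slice
    {
    char *data;
    size_t size;
    };

class stream
    {
    public:
        virtual ~stream() = default;

        // Read at most buffer.size bytes, then set buffer.size to the number read.
        virtual void read(slice &buffer) = 0;

        // Wrap the caller's memory, delegate to read(), and return the byte count.
        size_t fetch(void *buffer, size_t bytes)
            {
            slice wrapper{static_cast<char *>(buffer), bytes};
            read(wrapper);
            return wrapper.size;
            }
    };

// Hypothetical subclass that serves bytes from a string held in memory.
class memory_stream : public stream
    {
    std::string data;
    size_t position = 0;

    public:
        explicit memory_stream(std::string text) : data(std::move(text)) {}

        void read(slice &buffer) override
            {
            size_t bytes = std::min(buffer.size, data.size() - position);
            if (bytes != 0)
                std::memcpy(buffer.data, data.data() + position, bytes);
            position += bytes;
            buffer.size = bytes;        // shrinks on the final partial read and at end of input
            }
    };
}
```

With an 8-byte buffer and the 11-byte input "hello world", successive calls to fetch() in this sketch return 8, then 3, then 0.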

§ read()

virtual void JASS::instream::read ( document & buffer )
pure virtual

Read at most buffer.contents.size() bytes of data into buffer, resizing on eof.

Parameters
buffer [out] At most buffer.contents.size() bytes of data are read from source into buffer, which is then resized to the number of bytes actually read.

Implemented in JASS::instream_document_trec, JASS::instream_file, and JASS::instream_memory.


The documentation for this class was generated from the following file: