A toolkit for working with phylogenetic data.
v0.20.0
InputStream Class Reference

#include <genesis/utils/io/input_stream.hpp>

Detailed Description

Stream interface for reading data from an InputSource, keeping track of line and column counters.

This class provides similar functionality to std::istream, but has a different way of handling the stream and characters. The main differences are:

  • The stream is not automatically advanced after reading a char. This is because otherwise the line and column would already point to the next char while processing the last. Thus, advance() or the increment operator++() have to be called to get to the next char in the stream.
  • Line feed chars (LF or \n, as used in Unix-like systems) and carriage return chars (CR or \r, the new line delimiter in older Mac systems, and part of the CR+LF new lines used in Windows) are handled differently. CR chars, LF chars, and the whole CR+LF combination are each turned into a single line feed char (\n) in this iterator. This ensures that every new line delimiter is internally represented as one LF, independently of the file format, which makes parsing much easier.
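The new line handling described above can be sketched as a standalone function. This is only an illustration under assumptions: `normalize_newlines` is a hypothetical helper that works on a whole string, whereas the class itself performs the translation while streaming.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not part of genesis:
// turn every CR, LF, and CR+LF into a single LF.
std::string normalize_newlines( std::string const& input )
{
    std::string result;
    result.reserve( input.size() );
    for( std::size_t i = 0; i < input.size(); ++i ) {
        if( input[i] == '\r' ) {
            // A CR directly followed by an LF is one line break: skip the LF.
            if( i + 1 < input.size() && input[i + 1] == '\n' ) {
                ++i;
            }
            result += '\n';
        } else {
            // LF and all other chars are copied unchanged.
            result += input[i];
        }
    }
    return result;
}
```

With this sketch, an input mixing Windows, old Mac, and Unix line endings ends up with LF delimiters only, so a parser has to test for just one new line char.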

It has two member functions line() and column() that return the corresponding values for the current iterator position. Also, at() can be used to get a textual representation of the current position. The member function current() furthermore provides a checked version of the dereference operator.
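The interplay of explicit advancing and position counters can be illustrated with a small stand-in class. `MiniStream` below is hypothetical and reads from a `std::string`; it only mirrors the member names documented here and says nothing about how the actual class is implemented.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in, not the genesis class: a cursor over a string
// that only moves when advance() or operator++() is called.
class MiniStream
{
public:
    explicit MiniStream( std::string text )
        : text_( std::move( text ))
    {}

    bool good() const { return pos_ < text_.size(); }

    // Reading does not move the cursor; returns '\0' at the end of the data.
    char operator*() const { return good() ? text_[pos_] : '\0'; }

    MiniStream& advance()
    {
        if( !good() ) {
            return *this;
        }
        if( text_[pos_] == '\n' ) {
            // The new line char is the last char of its line.
            ++line_;
            column_ = 1;
        } else {
            ++column_;
        }
        ++pos_;
        return *this;
    }

    MiniStream& operator++() { return advance(); }

    std::size_t line() const { return line_; }
    std::size_t column() const { return column_; }

    // Position in the form "line:column".
    std::string at() const
    {
        return std::to_string( line_ ) + ":" + std::to_string( column_ );
    }

private:
    std::string text_;
    std::size_t pos_ = 0;
    std::size_t line_ = 1;
    std::size_t column_ = 1;
};
```

Because reading and advancing are separate steps, a parser can report at() for the char it is currently looking at, which is the motivation given in the first bullet point above.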

Implementation details inspired by fast-cpp-csv-parser by Ben Strasser, see also Acknowledgements.

Definition at line 76 of file input_stream.hpp.

Public Member Functions

 InputStream ()
 
 InputStream (std::unique_ptr< BaseInputSource > input_source)
 
 InputStream (self_type const &)=delete
 
 InputStream (self_type &&)=delete
 
 ~InputStream ()
 
self_type & advance ()
 Move to the next char in the stream and advance the counters.

std::string at () const
 Return a textual representation of the current input position in the form "line:column".

size_t column () const
 Return the current column of the input stream.

char current () const
 Return the current char, with some checks.

bool eof () const
 Return true iff the input reached its end.

char get_char ()
 Extract a single char from the input.

std::pair< char *, size_t > get_line ()
 Return the current line and move to the beginning of the next.

bool good () const
 Return true iff the input is good (not end of data) and can be read from.

size_t line () const
 Return the current line of the input stream.

 operator bool () const
 Return true iff the input is good (not end of data) and can be read from. Shortcut for good().

char operator* () const
 Dereference operator. Return the current char.

self_type & operator++ ()
 Move to the next char in the stream. Shortcut for advance().

self_type & operator= (self_type const &)=delete

self_type & operator= (self_type &&)=delete

std::string source_name () const
 Get the input source name where this stream reads from.
 

Public Types

using self_type = InputStream
 
using value_type = char
 

Static Public Attributes

static const size_t BlockLength = 1 << 24
 Block length for internal buffering. More...
 

Constructor & Destructor Documentation

InputStream ( )
inline

Definition at line 100 of file input_stream.hpp.

InputStream ( std::unique_ptr< BaseInputSource > input_source )
inline explicit

Definition at line 110 of file input_stream.hpp.

~InputStream ( )
inline

Definition at line 117 of file input_stream.hpp.

InputStream ( self_type const & )
delete

InputStream ( self_type && )
delete

Member Function Documentation

self_type& advance ( )
inline

Move to the next char in the stream and advance the counters.

Definition at line 173 of file input_stream.hpp.

std::string at ( ) const
inline

Return a textual representation of the current input position in the form "line:column".

Definition at line 323 of file input_stream.hpp.

size_t column ( ) const
inline

Return the current column of the input stream.

The counter starts with column 1 for each line of the input stream. New line characters \n are included in counting and count as the last character of a line.

Definition at line 314 of file input_stream.hpp.

char current ( ) const
inline

Return the current char, with some checks.

This function is similar to the dereference operator, but additionally performs two checks:

  • End of input: If this function is called when there is no more data left in the input, it throws a std::runtime_error.
  • Current char: This iterator is meant for ASCII (or similar) text format encodings with single bytes, and its output should be usable for lookup tables etc. Thus, this function ensures that the char is in the range [0, 127]. If not, an std::domain_error is thrown.

Usually, those two conditions are checked in the parser anyway, so in most cases it is preferred to use the dereference operator instead.
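The two checks can be sketched as follows. `checked_char` is a hypothetical free function operating on a string and an index; the exception types follow the description above, while the message texts are made up for illustration.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical free function, not the genesis implementation:
// return data[pos] after performing the two checks described above.
char checked_char( std::string const& data, std::size_t pos )
{
    // Check 1: end of input.
    if( pos >= data.size() ) {
        throw std::runtime_error( "Unexpected end of input." );
    }

    // Check 2: the char has to be in [0, 127]. Plain char may be signed,
    // so compare via unsigned char to catch all bytes >= 128.
    char const c = data[pos];
    if( static_cast<unsigned char>( c ) > 127 ) {
        throw std::domain_error( "Invalid input char: not in [0, 127]." );
    }
    return c;
}
```

A char that passes both checks can be used directly as an index into a 128-entry lookup table.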

Definition at line 155 of file input_stream.hpp.

bool eof ( ) const
inline

Return true iff the input reached its end.

Definition at line 348 of file input_stream.hpp.

char get_char ( )
inline

Extract a single char from the input.

Return the current char and move to the next one.

Definition at line 214 of file input_stream.hpp.

std::pair< char*, size_t > get_line ( )
inline

Return the current line and move to the beginning of the next.

The function finds the end of the current line, starting from the current position. It returns a pointer to the current position and the length of the line. Furthermore, a null char is set at the end of the line, replacing the new line char. This allows downstream parsers to directly use the returned pointer as a C string.

The stream is left at the first char of the next line.

Definition at line 235 of file input_stream.hpp.

bool good ( ) const
inline

Return true iff the input is good (not end of data) and can be read from.

Definition at line 331 of file input_stream.hpp.

size_t line ( ) const
inline

Return the current line of the input stream.

The counter starts at line 1 at the beginning of the input stream.

Definition at line 303 of file input_stream.hpp.

operator bool ( ) const
inline explicit

Return true iff the input is good (not end of data) and can be read from. Shortcut for good().

Definition at line 340 of file input_stream.hpp.

char operator* ( ) const
inline

Dereference operator. Return the current char.

Definition at line 136 of file input_stream.hpp.

self_type& operator++ ( )
inline

Move to the next char in the stream. Shortcut for advance().

Definition at line 203 of file input_stream.hpp.

self_type& operator= ( self_type const & )
delete

self_type& operator= ( self_type && )
delete

std::string source_name ( ) const
inline

Get the input source name where this stream reads from.

Depending on the type of input, this is either

  • "input string",
  • "input stream" or
  • "input file <filename>"

This is mainly useful for user output like log and error messages.

Definition at line 364 of file input_stream.hpp.

Member Typedef Documentation

using self_type = InputStream

Definition at line 93 of file input_stream.hpp.

using value_type = char

Definition at line 94 of file input_stream.hpp.

Member Data Documentation

const size_t BlockLength = 1 << 24
static

Block length for internal buffering.

The buffer uses three blocks of this size (1 << 24 bytes, that is 16 MiB, each). This is also the maximum line length that can be read at once with get_line(). If this is too short, increase BlockLength.
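The sizes mentioned above can be checked at compile time. The constants below are local to this sketch: `BlockLength` mirrors the documented value, while `BlockCount` and `BufferBytes` are hypothetical names for the three blocks and the resulting total.

```cpp
#include <cstddef>

// Local constants for illustration only. BlockLength mirrors the documented
// value; BlockCount and BufferBytes are hypothetical names.
constexpr std::size_t BlockLength = 1 << 24;
constexpr std::size_t BlockCount  = 3;
constexpr std::size_t BufferBytes = BlockCount * BlockLength;

// 1 << 24 bytes are 16 MiB, so three blocks need 48 MiB in total.
static_assert( BlockLength == 16777216, "1 << 24 is 16777216 bytes" );
static_assert( BlockLength / ( 1024 * 1024 ) == 16, "one block is 16 MiB" );
static_assert( BufferBytes / ( 1024 * 1024 ) == 48, "three blocks are 48 MiB" );
```

Under these assumptions, doubling the block length to 1 << 25 would allow lines of up to 32 MiB at the cost of 96 MiB of buffer memory.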

Definition at line 91 of file input_stream.hpp.


The documentation for this class was generated from the following file: