A toolkit for working with phylogenetic data.
v0.24.0
InputStream Class Reference

#include <genesis/utils/io/input_stream.hpp>

Detailed Description

Stream interface for reading data from an InputSource, that keeps track of line and column counters.

This class provides similar functionality to std::istream, but has a different way of handling the stream and characters. The main differences are:

  • The stream is not automatically advanced after reading a char. Otherwise, the line and column counters would already point to the next char while the current one is still being processed. Thus, advance() or the increment operator++() has to be called to get to the next char in the stream.
  • The handling of line feed chars (LF or \n, as used in Unix-like systems) and carriage return chars (CR or \r, which are the new line delimiters in older Mac systems, and which are part of the CR+LF new lines as used in Windows) is different. Both CR and LF chars (and the whole CR+LF combination) are turned into a single line feed char (\n) in this iterator. This ensures that all new line delimiters are internally represented as one LF, independently of the file format, which makes parsing much easier.
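The two differences listed above can be illustrated with a small stand-in. This is only a sketch and not the genesis implementation: ToyStream and read_normalized() are hypothetical names, and the class works on an in-memory string instead of an InputSource. It mimics the documented behavior, namely that reading does not advance, and that CR, LF, and CR+LF all surface as a single \n.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in, NOT the genesis InputStream. It only mimics the
// documented behavior on an in-memory string.
class ToyStream
{
public:
    explicit ToyStream( std::string data ) : data_( std::move( data )) {}

    // True while there is data left to read.
    explicit operator bool() const { return pos_ < data_.size(); }

    // Return the current char without advancing. A CR is reported as LF.
    char operator*() const
    {
        assert( pos_ < data_.size() );
        return data_[pos_] == '\r' ? '\n' : data_[pos_];
    }

    // Move to the next char. A CR+LF pair is skipped as one unit.
    ToyStream& operator++()
    {
        assert( pos_ < data_.size() );
        if( data_[pos_] == '\r' && pos_ + 1 < data_.size() && data_[pos_ + 1] == '\n' ) {
            ++pos_;
        }
        ++pos_;
        return *this;
    }

private:
    std::string data_;
    std::size_t pos_ = 0;
};

// Read everything through the toy stream, advancing explicitly after each char.
std::string read_normalized( std::string const& text )
{
    ToyStream stream( text );
    std::string result;
    while( stream ) {
        result += *stream; // reading does not move the stream
        ++stream;          // advancing is a separate step
    }
    return result;
}
```

With this sketch, an input that mixes all three new line styles comes out with \n only, and dereferencing twice without an increment in between yields the same char both times.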

The class has two member functions, line() and column(), that return the corresponding values for the current iterator position. In addition, at() returns a textual representation of the current position, and current() provides a checked version of the dereference operator.

Implementation details inspired by fast-cpp-csv-parser by Ben Strasser, see also Acknowledgements.

Definition at line 80 of file input_stream.hpp.

Public Member Functions

 InputStream ()
 
 InputStream (std::shared_ptr< BaseInputSource > input_source)
 
 InputStream (self_type const &)=delete
 
 InputStream (self_type &&other)
 
 ~InputStream ()
 
self_type & advance ()
 Move to the next char in the stream and advance the counters.
 
std::string at () const
 Return a textual representation of the current input position in the form "line:column".
 
size_t column () const
 Return the current column of the input stream.
 
char current () const
 Return the current char, with some checks.
 
bool eof () const
 Return true iff the input reached its end.
 
char get_char ()
 Extract a single char from the input.
 
void get_line (std::string &target)
 Read the current line, append it to the target, and move to the beginning of the next line.
 
std::string get_line ()
 Read the current line and move to the beginning of the next.
 
bool good () const
 Return true iff the input is good (not end of data) and can be read from.
 
size_t line () const
 Return the current line of the input stream.
 
 operator bool () const
 Return true iff the input is good (not end of data) and can be read from. Shortcut for good().
 
char operator* () const
 Dereference operator. Return the current char.
 
self_type & operator++ ()
 Move to the next char in the stream. Shortcut for advance().
 
self_type & operator= (self_type const &)=delete
 
self_type & operator= (self_type &&other)
 
std::string source_name () const
 Get the input source name where this stream reads from.
 

Public Types

using self_type = InputStream
 
using value_type = char
 

Static Public Attributes

static const size_t BlockLength = 1 << 22
 Block length for internal buffering.
 

Constructor & Destructor Documentation

◆ InputStream() [1/4]

InputStream ( )
inline

Definition at line 104 of file input_stream.hpp.

◆ InputStream() [2/4]

InputStream ( std::shared_ptr< BaseInputSource > input_source)
inline explicit

Definition at line 114 of file input_stream.hpp.

◆ ~InputStream()

~InputStream ( )
inline

Definition at line 121 of file input_stream.hpp.

◆ InputStream() [3/4]

InputStream ( self_type const &  )
delete

◆ InputStream() [4/4]

InputStream ( self_type &&  other)
inline

Definition at line 129 of file input_stream.hpp.

Member Function Documentation

◆ advance()

self_type& advance ( )
inline

Move to the next char in the stream and advance the counters.

Definition at line 214 of file input_stream.hpp.

◆ at()

std::string at ( ) const
inline

Return a textual representation of the current input position in the form "line:column".

Definition at line 481 of file input_stream.hpp.

◆ column()

size_t column ( ) const
inline

Return the current column of the input stream.

The counter starts with column 1 for each line of the input stream. New line characters \n are included in counting and count as the last character of a line.
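The counting rule can be made concrete with a short sketch. The helper char_positions() below is hypothetical and not part of genesis. It assumes the text is already normalized to \n line endings, and labels every char with its position in the "line:column" form that at() uses.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not part of genesis: label each char of an already
// newline-normalized text with its "line:column" position, following the
// counting rules described above.
std::vector<std::string> char_positions( std::string const& text )
{
    std::vector<std::string> result;
    std::size_t line = 1;   // lines are counted from 1
    std::size_t column = 1; // columns are counted from 1 within each line
    for( char c : text ) {
        result.push_back( std::to_string( line ) + ":" + std::to_string( column ));
        if( c == '\n' ) {
            // The new line char itself was the last column of its line.
            ++line;
            column = 1;
        } else {
            ++column;
        }
    }
    return result;
}
```

For the text "ab\nc", this yields the positions 1:1, 1:2, 1:3 (the new line char) and 2:1.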

Definition at line 472 of file input_stream.hpp.

◆ current()

char current ( ) const
inline

Return the current char, with some checks.

This function is similar to the dereference operator, but additionally performs two checks:

  • End of input: If this function is called when there is no more data left in the input, it throws a std::runtime_error.
  • Current char: This iterator is meant for ASCII (or similar) text format encodings with single bytes, and its output should be usable for lookup tables etc. Thus, this function ensures that the char is in the range [0, 127]. If not, an std::domain_error is thrown.

Usually, those two conditions are checked in the parser anyway, so in most cases it is preferred to use the dereference operator instead.
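As a rough illustration of the two checks, the following sketch applies them to a plain string. The function checked_char() and its error messages are hypothetical and do not reproduce the genesis code; only the exception types follow the description above.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical free function, not the genesis code: return the char at `pos`,
// applying the two checks described above.
char checked_char( std::string const& data, std::size_t pos )
{
    // Check 1: no more data left.
    if( pos >= data.size() ) {
        throw std::runtime_error( "Unexpected end of input." );
    }

    // Check 2: only chars in [0, 127] are accepted. Going through
    // unsigned char gives the same result whether plain char is signed or not.
    char const c = data[pos];
    if( static_cast<unsigned char>( c ) > 127 ) {
        throw std::domain_error( "Invalid input char: not in range [0, 127]." );
    }
    return c;
}
```

A parser that already tests for the end of input and validates each char in its own loop gains nothing from repeating these checks, which is why the unchecked dereference operator is usually preferred there.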

Definition at line 196 of file input_stream.hpp.

◆ eof()

bool eof ( ) const
inline

Return true iff the input reached its end.

Definition at line 506 of file input_stream.hpp.

◆ get_char()

char get_char ( )
inline

Extract a single char from the input.

Return the current char and move to the next one.

Definition at line 256 of file input_stream.hpp.

◆ get_line() [1/2]

void get_line ( std::string &  target)
inline

Read the current line, append it to the target, and move to the beginning of the next line.

The function finds the end of the current line, starting from the current position, and appends the content to the given target, excluding the trailing new line char(s). The stream is left at the first char of the next line.
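The append semantics can be sketched on a plain string. The helper append_line() below is hypothetical and not the genesis code. It assumes the data is already normalized to \n line endings and uses an index in place of the internal stream position.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper, not the genesis code. `data` is assumed to be already
// normalized to '\n' line endings; `pos` is the current read position.
void append_line( std::string const& data, std::size_t& pos, std::string& target )
{
    std::size_t const end = data.find( '\n', pos );
    if( end == std::string::npos ) {
        // Last line without a trailing new line: take the rest of the data.
        target.append( data, pos, std::string::npos );
        pos = data.size();
    } else {
        target.append( data, pos, end - pos ); // content without the '\n'
        pos = end + 1;                         // first char of the next line
    }
}
```

Because the content is appended, a target that already holds text keeps it. Calling the helper twice on "first\nsecond" with a target of "> " leaves "> firstsecond" in the target, so callers that want one line at a time need to clear the target between calls.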

Definition at line 275 of file input_stream.hpp.

◆ get_line() [2/2]

std::string get_line ( )
inline

Read the current line and move to the beginning of the next.

The function finds the end of the current line, starting from the current position, and returns the content, excluding the trailing new line char(s). The stream is left at the first char of the next line.

Definition at line 445 of file input_stream.hpp.

◆ good()

bool good ( ) const
inline

Return true iff the input is good (not end of data) and can be read from.

Definition at line 489 of file input_stream.hpp.

◆ line()

size_t line ( ) const
inline

Return the current line of the input stream.

The counter starts at line 1 at the beginning of the input stream.

Definition at line 461 of file input_stream.hpp.

◆ operator bool()

operator bool ( ) const
inline explicit

Return true iff the input is good (not end of data) and can be read from. Shortcut for good().

Definition at line 498 of file input_stream.hpp.

◆ operator*()

char operator* ( ) const
inline

Dereference operator. Return the current char.

Definition at line 177 of file input_stream.hpp.

◆ operator++()

self_type& operator++ ( )
inline

Move to the next char in the stream. Shortcut for advance().

Definition at line 245 of file input_stream.hpp.

◆ operator=() [1/2]

self_type& operator= ( self_type const &  )
delete

◆ operator=() [2/2]

self_type& operator= ( self_type &&  other)
inline

Definition at line 137 of file input_stream.hpp.

◆ source_name()

std::string source_name ( ) const
inline

Get the input source name where this stream reads from.

Depending on the type of input, this is either

  • "input string",
  • "input stream" or
  • "input file <filename>"

This is mainly useful for user output like log and error messages.
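One possible way to combine the source name with the position from at() in an error message is sketched below. The function parse_error_message() and the message layout are assumptions made for illustration. The sketch takes the two strings as plain arguments, so it does not depend on genesis.

```cpp
#include <string>

// Hypothetical helper, not part of genesis: build an error message from a
// description, the source name, and a "line:column" position string.
std::string parse_error_message(
    std::string const& what,
    std::string const& source_name,
    std::string const& position
) {
    return what + " in " + source_name + " at " + position + ".";
}
```

Called with "Unexpected char", "input string", and "3:14", this returns "Unexpected char in input string at 3:14.". With a real stream, the last two arguments would come from source_name() and at().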

Definition at line 522 of file input_stream.hpp.

Member Typedef Documentation

◆ self_type

using self_type = InputStream

Definition at line 97 of file input_stream.hpp.

◆ value_type

using value_type = char

Definition at line 98 of file input_stream.hpp.

Member Data Documentation

◆ BlockLength

const size_t BlockLength = 1 << 22
static

Block length for internal buffering.

The buffer uses three blocks of this size (4MB each). This is also the maximum line length that can be read at a time with get_line(). If this is too short, change the BlockLength.
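The sizes mentioned above can be checked with a few compile-time constants. The names block_length, block_count, and buffer_bytes are chosen for this sketch only; the value 1 << 22 and the count of three blocks are taken from the description.

```cpp
#include <cstddef>

// Names are hypothetical; the values follow the description above.
constexpr std::size_t block_length = std::size_t{ 1 } << 22;
constexpr std::size_t block_count  = 3;
constexpr std::size_t buffer_bytes = block_count * block_length;

// 1 << 22 bytes are 4 * 1024 * 1024 = 4194304 bytes per block.
static_assert( block_length == 4194304, "one block is 4 * 1024 * 1024 bytes" );

// Three such blocks add up to 12582912 bytes in total.
static_assert( buffer_bytes == 12582912, "three blocks are 12 * 1024 * 1024 bytes" );
```

Doubling the maximum line length would mean a shift of 23 instead of 22, which also doubles the total buffer size under the three-block layout.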

Definition at line 95 of file input_stream.hpp.


The documentation for this class was generated from the following file:

  • input_stream.hpp