#include <genesis/utils/io/input_stream.hpp>
Stream interface for reading data from an InputSource that keeps track of line and column counters.
This class provides similar functionality to std::istream, but handles the stream and its characters differently. The main differences are:
- The handling of line feed chars (LF or \n, as used in Unix-like systems) and carriage return chars (CR or \r, which are the new line delimiters in many Mac systems, and which are part of the CR+LF new lines as used in Windows) is different. Both CR and LF chars (and the whole CR+LF combination) are turned into single line feed chars (\n) in this iterator. This ensures that all new line delimiters are internally represented as one LF, independently of the file format, which makes parsing much easier.
- It has two member functions, line() and column(), that return the corresponding values for the current iterator position. Also, at() can be used to get a textual representation of the current position. The member function current() furthermore provides a checked version of the dereference operator.
Implementation details inspired by fast-cpp-csv-parser by Ben Strasser, see also Acknowledgements.
Definition at line 88 of file input_stream.hpp.
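The new line handling and the position counters described above can be illustrated with a small standalone sketch. It is not the genesis implementation: NormalizingCursor and normalize() are hypothetical names, and the cursor reads from an in-memory string instead of an input source. It only shows how CR, LF, and CR+LF can be reported as a single \n while 1-based line and column counters are kept.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical stand-in, not the genesis implementation: a cursor over an
// in-memory string that reports CR, LF, and CR+LF as a single '\n' and
// keeps 1-based line and column counters.
class NormalizingCursor
{
public:
    explicit NormalizingCursor( std::string text )
        : text_( std::move( text ))
    {}

    // True while there is still a char to read.
    bool good() const { return pos_ < text_.size(); }

    // Current char, with a CR (alone or as part of CR+LF) reported as '\n'.
    // Must only be called while good() is true.
    char current() const
    {
        return text_[pos_] == '\r' ? '\n' : text_[pos_];
    }

    // Move to the next char and update the counters.
    void advance()
    {
        bool const newline = ( current() == '\n' );
        // Skip the LF of a CR+LF pair, so that the pair counts as one char.
        if( text_[pos_] == '\r' && pos_ + 1 < text_.size() && text_[pos_ + 1] == '\n' ) {
            ++pos_;
        }
        ++pos_;
        if( newline ) {
            ++line_;
            column_ = 1;
        } else {
            ++column_;
        }
    }

    std::size_t line() const { return line_; }
    std::size_t column() const { return column_; }

    // Position in the form "line:column".
    std::string at() const
    {
        return std::to_string( line_ ) + ":" + std::to_string( column_ );
    }

private:
    std::string text_;
    std::size_t pos_ = 0;
    std::size_t line_ = 1;
    std::size_t column_ = 1;
};

// Collect all chars of a text as the cursor reports them.
inline std::string normalize( std::string const& text )
{
    std::string out;
    for( NormalizingCursor c( text ); c.good(); c.advance() ) {
        out += c.current();
    }
    return out;
}
```

In this sketch, the new line char itself counts as the last column of its line, and the counters only change in advance(), never when reading the current char.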
Public Member Functions
InputStream ()
InputStream (self_type &&other)
InputStream (self_type const &)=delete
InputStream (std::shared_ptr< BaseInputSource > input_source)
~InputStream ()
self_type & advance ()
    Move to the next char in the stream and advance the counters.
std::string at () const
    Return a textual representation of the current input position in the form "line:column".
std::pair< char const *, size_t > buffer ()
    Direct access to the internal buffer.
size_t column () const
    Return the current column of the input stream.
char current () const
    Return the current char, with some checks.
bool eof () const
    Return true iff the input reached its end.
char get_char ()
    Extract a single char from the input.
std::string get_line ()
    Read the current line and move to the beginning of the next.
void get_line (std::string &target)
    Read the current line, append it to the target, and move to the beginning of the next line.
bool good () const
    Return true iff the input is good (not end of data) and can be read from.
void jump_unchecked (size_t n)
    Jump forward in the stream by a certain amount of chars.
size_t line () const
    Return the current line of the input stream.
operator bool () const
    Return true iff the input is good (not end of data) and can be read from. Shortcut for good().
char operator* () const
    Dereference operator. Return the current char.
self_type & operator++ ()
    Move to the next char in the stream. Shortcut for advance().
self_type & operator= (self_type &&other)
self_type & operator= (self_type const &)=delete
char read_char_or_throw (char const criterion)
    Lexing function that reads a single char from the stream and checks whether it equals the provided one.
char read_char_or_throw (std::function< bool(char)> criterion)
    Lexing function that reads a single char from the stream and checks whether it fulfills the provided criterion.
std::string source_name () const
    Get the input source name where this stream reads from.
Public Types
using self_type = InputStream
using value_type = char
Static Public Attributes
static const size_t BlockLength = 1 << 22
    Block length for internal buffering.
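InputStream () [inline]
Definition at line 110 of file input_stream.hpp.
InputStream (std::shared_ptr< BaseInputSource > input_source) [inline, explicit]
Definition at line 120 of file input_stream.hpp.
InputStream (self_type &&other) [inline]
Definition at line 127 of file input_stream.hpp.
InputStream (self_type const &) [delete]
~InputStream () [inline]
Definition at line 134 of file input_stream.hpp.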
self_type & advance () [inline]
Move to the next char in the stream and advance the counters.
Definition at line 187 of file input_stream.hpp.
std::string at () const [inline]
Return a textual representation of the current input position in the form "line:column".
Definition at line 437 of file input_stream.hpp.
std::pair< char const *, size_t > buffer () [inline]
Direct access to the internal buffer.
This function returns a pointer to the internal buffer, as well as the length (past the end) that is currently buffered. This is meant for special file parsers that can optimize better when using this - but it is highly dangerous to use if you do not know what you are doing!
The idea is as follows: With access to the buffer, parse data as needed, keeping track of how many chars have been processed. Then, use the jump_unchecked() function to update this stream to the new position of the stream (the char after the last one being processed by the parsing).
Caveat: Never parse and jump across new line characters (or, for that matter, carriage return characters, which won't be automatically converted when using the buffer directly)! This would invalidate the line counting!
Caveat: Never read after the end of the buffer, that is, never access the char at the returned last position buffer + length or after!
Definition at line 390 of file input_stream.hpp.
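The workflow described for buffer() and jump_unchecked() might look like the following sketch. The helper parse_digits() is hypothetical and not part of genesis. It works on a plain pointer and length, stops before any new line or carriage return char, and never reads at or past buffer + length, in line with the two caveats above. It does not guard against integer overflow.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <utility>

// Hypothetical helper, not part of genesis: parse an unsigned decimal number
// directly from a (pointer, length) pair such as the one returned by buffer().
// Returns the parsed value and the number of chars that were consumed.
inline std::pair<std::uint64_t, std::size_t> parse_digits(
    char const* buffer, std::size_t length
) {
    std::uint64_t value = 0;
    std::size_t consumed = 0;

    // Only digits are consumed, so the loop also stops at '\n' and '\r',
    // and the index check keeps all reads strictly before buffer + length.
    while( consumed < length && buffer[consumed] >= '0' && buffer[consumed] <= '9' ) {
        value = value * 10 + static_cast<std::uint64_t>( buffer[consumed] - '0' );
        ++consumed;
    }
    return { value, consumed };
}
```

Assuming the pointer returned by buffer() refers to the current stream position, a parser would call parse_digits() with the two values of the returned pair and then pass the consumed count to jump_unchecked().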
size_t column () const [inline]
Return the current column of the input stream.
The counter starts with column 1 for each line of the input stream. New line characters \n are included in counting and count as the last character of a line.
Definition at line 428 of file input_stream.hpp.
char current () const [inline]
Return the current char, with some checks.
This function is similar to the dereference operator, but additionally performs two checks: if the first one fails, a std::runtime_error is thrown; if the second one fails, a std::domain_error is thrown.
Usually, those two conditions are checked in the parser anyway, so in most cases it is preferred to use the dereference operator instead.
Definition at line 169 of file input_stream.hpp.
bool eof () const [inline]
Return true iff the input reached its end.
Definition at line 462 of file input_stream.hpp.
char get_char () [inline]
Extract a single char from the input.
Return the current char and move to the next one.
Definition at line 231 of file input_stream.hpp.
std::string get_line () [inline]
Read the current line and move to the beginning of the next.
The function finds the end of the current line, starting from the current position, and returns the content, excluding the trailing new line char(s). The stream is left at the first char of the next line.
Definition at line 284 of file input_stream.hpp.
void get_line (std::string & target)
Read the current line, append it to the target, and move to the beginning of the next line.
The function finds the end of the current line, starting from the current position, and appends the content to the given target, excluding the trailing new line char(s). The stream is left at the first char of the next line.
Definition at line 127 of file input_stream.cpp.
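The behavior described for get_line() can be sketched on a plain string. The function append_line() below is a hypothetical stand-in, not the genesis implementation. It uses an explicit position instead of a stream and treats LF, CR, and CR+LF as line endings.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in, not the genesis implementation: append the line that
// starts at `pos` to `target`, without its trailing new line char(s), and move
// `pos` to the first char of the next line (or to the end of the text).
// Assumes pos <= text.size().
inline void append_line( std::string const& text, std::size_t& pos, std::string& target )
{
    // Find the end of the current line.
    std::size_t end = pos;
    while( end < text.size() && text[end] != '\n' && text[end] != '\r' ) {
        ++end;
    }
    target.append( text, pos, end - pos );

    // Skip exactly one line ending: LF, CR, or CR+LF.
    if( end < text.size() && text[end] == '\r' ) {
        ++end;
    }
    if( end < text.size() && text[end] == '\n' ) {
        ++end;
    }
    pos = end;
}
```

Because the content is appended, any text already in the target is kept in front of the line that was read.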
bool good () const [inline]
Return true iff the input is good (not end of data) and can be read from.
Definition at line 445 of file input_stream.hpp.
void jump_unchecked (size_t n)
Jump forward in the stream by a certain amount of chars.
This is meant to update the stream position after using buffer() for direct parsing. See the caveats there!
In particular, the jump must never go past the current buffer end and must not cross new lines. That is, this function is not meant as a way to jump to an arbitrary (later) position in a file!
Definition at line 609 of file input_stream.cpp.
size_t line () const [inline]
Return the current line of the input stream.
The counter starts at line 1 for the input stream.
Definition at line 417 of file input_stream.hpp.
operator bool () const [inline, explicit]
Return true iff the input is good (not end of data) and can be read from. Shortcut for good().
Definition at line 454 of file input_stream.hpp.
char operator* () const [inline]
Dereference operator. Return the current char.
Definition at line 150 of file input_stream.hpp.
self_type & operator++ () [inline]
Move to the next char in the stream. Shortcut for advance().
Definition at line 196 of file input_stream.hpp.
InputStream & operator= (self_type && other)
Definition at line 52 of file input_stream.cpp.
char read_char_or_throw (char const criterion)
Lexing function that reads a single char from the stream and checks whether it equals the provided one.
If not, the function throws std::runtime_error. The stream is advanced by one position and the char is returned. For a similar function that checks the value of the current char but does not advance in the stream, see affirm_char_or_throw().
Definition at line 89 of file input_stream.cpp.
char read_char_or_throw (std::function< bool(char)> criterion)
Lexing function that reads a single char from the stream and checks whether it fulfills the provided criterion.
If not, the function throws std::runtime_error. The stream is advanced by one position and the char is returned. For a similar function that checks the value of the current char but does not advance in the stream, see affirm_char_or_throw().
Definition at line 104 of file input_stream.cpp.
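The two read_char_or_throw() overloads can be pictured with the following standalone sketch. It is an assumption-laden illustration, not the genesis code: Cursor is a hypothetical stand-in for the stream, the functions are free functions rather than members, and the error message text is made up.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a stream: a text plus a read position.
struct Cursor
{
    std::string text;
    std::size_t pos;
};

// Read one char, check it against the criterion, advance, and return the char.
// Throws std::runtime_error at the end of the input or if the check fails.
inline char read_char_or_throw( Cursor& cursor, std::function<bool(char)> criterion )
{
    if( cursor.pos >= cursor.text.size() || !criterion( cursor.text[cursor.pos] )) {
        throw std::runtime_error(
            "Unexpected input at position " + std::to_string( cursor.pos )
        );
    }
    return cursor.text[cursor.pos++];
}

// Overload that checks for one specific char.
inline char read_char_or_throw( Cursor& cursor, char criterion )
{
    return read_char_or_throw(
        cursor, [criterion]( char c ){ return c == criterion; }
    );
}
```

In this sketch, the position is left unchanged when the check fails, so the caller can still report where the unexpected char was found.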
std::string source_name () const [inline]
Get the input source name where this stream reads from.
The returned name depends on the type of input. This is mainly useful for user output like log and error messages.
Definition at line 478 of file input_stream.hpp.
using self_type = InputStream
Definition at line 103 of file input_stream.hpp.
using value_type = char
Definition at line 104 of file input_stream.hpp.
const size_t BlockLength = 1 << 22 [static]
Block length for internal buffering.
The buffer uses three blocks of this size (4MB each).
Definition at line 101 of file input_stream.hpp.
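As a quick check of the sizes stated above, the arithmetic can be written out as compile-time assertions. The constant is redefined here only for illustration, and TotalBufferSize is a hypothetical name that assumes exactly three blocks.

```cpp
#include <cassert>
#include <cstddef>

// Same value as the documented constant; redefined here only for illustration.
constexpr std::size_t BlockLength = 1 << 22;

// Total buffer size, assuming exactly three blocks as stated above.
constexpr std::size_t TotalBufferSize = 3 * BlockLength;

// 1 << 22 bytes is 4 * 1024 * 1024 bytes per block.
static_assert( BlockLength == 4 * 1024 * 1024, "one block is 4 MB" );

// Three such blocks add up to 12 * 1024 * 1024 bytes.
static_assert( TotalBufferSize == 12582912, "three blocks are 12 MB" );
```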