A library for working with phylogenetic and population genetic data.
v0.27.0
LambdaIterator< T, D > Class Template Reference

#include <genesis/utils/containers/lambda_iterator.hpp>

Detailed Description

template<class T, class D = EmptyLambdaIteratorData>
class genesis::utils::LambdaIterator< T, D >

Type erasure for iterators, using std::function to eliminate the underlying input type.

This class offers an abstraction to get a uniform iterator type over a set of underlying iterators of different types. It expects a function (most likely, you want to use a lambda) that converts the underlying data into the desired type T, which is where the type erasure happens. The data that is iterated over is stored here, and the lambda indicates the end of the iteration by returning false.

Example:

    // Convert from an iterator over VcfRecord to Variant.
    auto beg = vcf_range.begin();
    auto end = vcf_range.end();

    // Create the conversion with type erasure via the lambda function.
    auto generator = LambdaIterator<Variant>(
        [beg, end]( Variant& var ) mutable {
            if( beg != end ) {
                var = convert_to_variant(*beg);
                ++beg;
                return true;
            } else {
                return false;
            }
        }
    );

    // Iterate over generator.begin() and generator.end()
    for( auto const& variant : generator ) {
        // Use each variant here.
    }

For other types of iterators, instead of beg and end, other input can be used:

    // Use a pileup iterator, which does not offer begin and end.
    auto it = SimplePileupInputIterator( utils::from_file( pileup_file_.value ), reader );
    auto generator = LambdaIterator<Variant>(
        [it]( Variant& var ) mutable {
            if( it ) {
                var = *it;
                ++it;
                return true;
            } else {
                return false;
            }
        }
    );

And accordingly for other underlying iterator types.
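
The type erasure idea behind these examples can be sketched without the library. The following is a minimal standalone illustration, not the genesis implementation: IntSource, make_source() and drain() are hypothetical names introduced here, and the element type is fixed to int for brevity. Different iterator types are hidden behind the same std::function signature, so the consuming code never sees them.

```cpp
#include <cassert>
#include <functional>
#include <list>
#include <vector>

// Hypothetical stand-in for the uniform generator type: all sources end up
// behind the same std::function signature, bool( T& ), with T = int here.
using IntSource = std::function<bool( int& )>;

// Wrap any pair of iterators whose elements convert to int.
// The underlying container has to outlive the returned source.
template<class It>
IntSource make_source( It beg, It end )
{
    return [beg, end]( int& out ) mutable {
        if( beg == end ) {
            return false; // signal the end of the underlying input
        }
        out = static_cast<int>( *beg );
        ++beg;
        return true;
    };
}

// Drain a source without knowing which iterator type feeds it.
std::vector<int> drain( IntSource source )
{
    std::vector<int> result;
    int value = 0;
    while( source( value )) {
        result.push_back( value );
    }
    return result;
}
```

Here, a std::vector<long> and a std::list<short> can both be passed through make_source() and consumed by the same drain() function, which is the effect that LambdaIterator provides for input iterators over files.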

In addition to the type T that we iterate over, for user convenience, we also offer a data storage variable of the template type D (typedef'd as LambdaIterator::Data). This data is provided at construction of the LambdaIterator, and can be accessed via the data() functions. It is a generic extra variable to store iterator-specific information. As the LambdaIterator is intended to be initializable with just a lambda function that yields the elements to traverse, there is otherwise no convenient way to access related information of the underlying iterator. For example, when iterating over a file, one might want to store the file name or other characteristics of the input in the data().
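
To illustrate why such a payload is convenient, here is a minimal standalone sketch of the pattern. It is an assumption-laden miniature and not the genesis class: SourceWithData, SourceInfo, make_counting_source() and the file name "example.vcf" are all made up for this example.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical miniature of the pattern: a generator function plus a
// user-defined payload of type D that travels with it.
template<class T, class D>
class SourceWithData
{
public:
    SourceWithData( std::function<bool( T& )> get_element, D data )
        : get_element_( std::move( get_element ))
        , data_( std::move( data ))
    {}

    // Fetch the next element; returns false once the input is exhausted.
    bool next( T& out ) { return get_element_( out ); }

    D&       data()       { return data_; }
    D const& data() const { return data_; }

private:
    std::function<bool( T& )> get_element_;
    D data_;
};

// Hypothetical payload describing where the elements come from.
struct SourceInfo
{
    std::string file_name;
};

// Build a source that counts from 1 to 3 and remembers its (made up) origin.
inline SourceWithData<int, SourceInfo> make_counting_source()
{
    int i = 0;
    return SourceWithData<int, SourceInfo>(
        [i]( int& out ) mutable {
            if( i >= 3 ) { return false; }
            out = ++i;
            return true;
        },
        SourceInfo{ "example.vcf" }
    );
}
```

Code that only receives the finished source can still report where the elements came from via data(), although the lambda itself exposes nothing about its input.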

The class furthermore offers filters and transformations of the underlying iterator data, using the functions add_filter(), add_transform(), and add_transform_filter(). These can all be mixed, and are executed as a combined list in the order in which they were added (for example, first a filter, then a transformation, then a filter again). This makes it easy to skip elements of the underlying iterator without the need to add an additional layer of abstraction.
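
The ordering behaviour described above can be sketched as follows. This is a hypothetical standalone miniature (IntPipeline and run_pipeline() are names invented here, and the element type is fixed to int); it assumes that all three kinds of steps can be stored with one common signature, which may differ from how genesis implements it internally.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical miniature of the chaining logic. Every step is stored with the
// same signature: it may modify the element, and returns false to drop it.
class IntPipeline
{
public:
    IntPipeline& add_filter( std::function<bool( int const& )> filter )
    {
        steps_.push_back( [filter]( int& v ){ return filter( v ); } );
        return *this;
    }

    IntPipeline& add_transform( std::function<void( int& )> transform )
    {
        steps_.push_back( [transform]( int& v ){ transform( v ); return true; } );
        return *this;
    }

    IntPipeline& add_transform_filter( std::function<bool( int& )> step )
    {
        steps_.push_back( std::move( step ));
        return *this;
    }

    // Run all steps in the order in which they were added, and stop at the
    // first step that rejects the element.
    bool apply( int& value ) const
    {
        for( auto const& step : steps_ ) {
            if( ! step( value )) { return false; }
        }
        return true;
    }

private:
    std::vector<std::function<bool( int& )>> steps_;
};

// Apply the pipeline to each input element and keep those that pass.
std::vector<int> run_pipeline( IntPipeline const& pipeline, std::vector<int> input )
{
    std::vector<int> kept;
    for( auto& value : input ) {
        if( pipeline.apply( value )) { kept.push_back( value ); }
    }
    return kept;
}
```

For instance, adding a filter for even numbers, then a transformation that multiplies by 10, then a filter for values below 50 turns the input 1, 2, 3, 4, 6 into 20, 40: the odd numbers never reach the transformation, and 6 is dropped only after it has become 60.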

Lastly, the class offers block buffering in a separate thread, for speed. This capability takes care of the underlying iterator processing (including potential file parsing etc.), and buffers blocks of elements, so that the user of this class has faster access to them. For example, when processing data along a genome with lots of computations per position, it makes sense to run the file reading in a separate thread and buffer positions as needed, which this class does automatically. This can be activated by setting the block_size() to the intended number of elements to be buffered. By default, this is set to 0, meaning that no buffering is done. Note that small buffer sizes can induce overhead for the thread synchronisation; we hence recommend using block sizes of 1000 or greater, as needed.
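
The general technique of prefetching one block while the previous one is being processed can be sketched with std::async. This is only an illustration of the idea under simplifying assumptions, not the buffering code used by genesis: read_block(), for_each_buffered() and buffered_sum() are hypothetical helpers, and the sketch requires a block size greater than 0.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <future>
#include <vector>

// Read up to block_size elements from the source into a fresh block.
// A block shorter than block_size means the source is exhausted.
template<class T>
std::vector<T> read_block( std::function<bool( T& )>& source, std::size_t block_size )
{
    std::vector<T> block;
    block.reserve( block_size );
    T element{};
    while( block.size() < block_size && source( element )) {
        block.push_back( element );
    }
    return block;
}

// Consume all elements, prefetching the next block in a separate thread
// while the current block is being handed to the process function.
template<class T>
void for_each_buffered(
    std::function<bool( T& )> source,
    std::size_t block_size,
    std::function<void( T const& )> process
) {
    assert( block_size > 0 );
    auto current = read_block<T>( source, block_size );
    while( ! current.empty() ) {
        bool const more = ( current.size() == block_size );
        std::future<std::vector<T>> next;
        if( more ) {
            // Only the reader thread touches `source` while this task runs.
            next = std::async( std::launch::async, [&source, block_size]() {
                return read_block<T>( source, block_size );
            });
        }
        for( auto const& element : current ) {
            process( element );
        }
        if( ! more ) { break; }
        current = next.get(); // wait for the prefetched block
    }
}

// Small demonstration helper: sum the numbers 1..n via the buffered loop.
inline long buffered_sum( int n, std::size_t block_size )
{
    int i = 0;
    long sum = 0;
    for_each_buffered<int>(
        [i, n]( int& out ) mutable {
            if( i >= n ) { return false; }
            out = ++i;
            return true;
        },
        block_size,
        [&sum]( int const& v ) { sum += v; }
    );
    return sum;
}
```

With very small blocks, starting and joining a task per block dominates the run time, which matches the recommendation above to use large block sizes. Depending on the toolchain, programs using std::async may need to be linked with thread support (for example -pthread).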

We are aware that with all this extra functionality, the class is slightly overloaded, and that the filters and the block buffering would typically go in separate classes for modularity. However, we are taking user convenience and speed into account here: Instead of having to add filters or a block buffer wrapper around each input iterator that is then wrapped in a LambdaIterator anyway, we rather take care of this in one place; this also reduces levels of abstraction, and hence (hopefully) increases processing speed.

See also
VariantInputIterator for a use case of this iterator that allows traversing different input file types that are all convertible to Variant.

Definition at line 150 of file lambda_iterator.hpp.

Public Member Functions

 LambdaIterator ()=default
 
 LambdaIterator (self_type &&)=default
 
 LambdaIterator (self_type const &)=default
 
 LambdaIterator (std::function< bool(value_type &)> get_element, Data &&data, size_t block_size=DEFAULT_BLOCK_SIZE)
 Create an iterator over some underlying content. More...
 
 LambdaIterator (std::function< bool(value_type &)> get_element, Data const &data, size_t block_size=DEFAULT_BLOCK_SIZE)
 Create an iterator over some underlying content. More...
 
 LambdaIterator (std::function< bool(value_type &)> get_element, size_t block_size=DEFAULT_BLOCK_SIZE)
 Create an iterator over some underlying content. More...
 
 ~LambdaIterator ()=default
 
self_type & add_filter (std::function< bool(T const &)> filter)
 Add a filter function that is applied to each element of the iteration. More...
 
self_type & add_transform (std::function< void(T &)> transform)
 Add a transformation function that is applied to each element of the iteration. More...
 
self_type & add_transform_filter (std::function< bool(T &)> filter)
 Add a transformation and filter function that is applied to each element of the iteration. More...
 
Iterator begin ()
 
size_t block_size () const
 Get the currently set block size used for buffering the input data. More...
 
self_type & block_size (size_t value)
 Set the block size used for buffering the input data. More...
 
self_type & clear_filters_and_transformations ()
 Clear all filters and transformations. More...
 
Data & data ()
 Access the data stored in the iterator. More...
 
Data const & data () const
 Access the data stored in the iterator. More...
 
Iterator end ()
 
 operator bool () const
 Return whether a function to get elements was assigned to this generator, that is, whether it is default constructed (false) or not (true). More...
 
self_type & operator= (self_type &&)=default
 
self_type & operator= (self_type const &)=default
 

Public Types

using Data = D
 
using difference_type = std::ptrdiff_t
 
using iterator_category = std::input_iterator_tag
 
using pointer = value_type const *
 
using reference = value_type const &
 
using self_type = LambdaIterator
 
using value_type = T
 

Public Attributes

friend Iterator
 

Static Public Attributes

static const size_t DEFAULT_BLOCK_SIZE = 0
 Default size for block buffering. More...
 

Classes

class  Iterator
 Internal iterator over the data. More...
 

Constructor & Destructor Documentation

◆ LambdaIterator() [1/6]

LambdaIterator ( )
default

◆ LambdaIterator() [2/6]

LambdaIterator ( std::function< bool(value_type &)>  get_element,
size_t  block_size = DEFAULT_BLOCK_SIZE 
)
inline

Create an iterator over some underlying content.

The constructor expects a function that takes an element by reference and assigns it its new value at each iteration. The function returns true if there was an element (iteration still ongoing), or false once the end of the underlying iterator is reached.

Definition at line 666 of file lambda_iterator.hpp.
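
How a function with this contract can drive a range-based for loop is sketched below. This is a hypothetical standalone miniature, not the genesis code: FunctionRange and collect_squares() are names introduced for illustration, and filters, data, and buffering are left out.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical miniature range: adapts a bool( T& ) function to range-based for.
template<class T>
class FunctionRange
{
public:
    class Iterator
    {
    public:
        Iterator() = default; // past-the-end iterator

        explicit Iterator( FunctionRange* parent )
            : parent_( parent )
        {
            advance(); // load the first element, if there is one
        }

        T const& operator*() const { return parent_->current_; }

        Iterator& operator++()
        {
            advance();
            return *this;
        }

        // For this sketch, it suffices to tell "ended" from "not ended".
        bool operator!=( Iterator const& other ) const
        {
            return parent_ != other.parent_;
        }

    private:
        void advance()
        {
            if( parent_ && ! parent_->get_element_( parent_->current_ )) {
                parent_ = nullptr; // the function returned false: we are done
            }
        }

        FunctionRange* parent_ = nullptr;
    };

    explicit FunctionRange( std::function<bool( T& )> get_element )
        : get_element_( std::move( get_element ))
    {}

    Iterator begin() { return Iterator( this ); }
    Iterator end()   { return Iterator(); }

private:
    std::function<bool( T& )> get_element_;
    T current_{};
};

// Demonstration helper: collect the squares 1, 4, 9, ... of the first n integers.
inline std::vector<int> collect_squares( int n )
{
    int i = 0;
    FunctionRange<int> range( [i, n]( int& out ) mutable {
        if( i >= n ) { return false; }
        ++i;
        out = i * i;
        return true;
    });
    std::vector<int> result;
    for( auto const& value : range ) {
        result.push_back( value );
    }
    return result;
}
```

As with any single-pass input iterator, the elements are consumed while iterating, so such a range can be traversed only once.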

◆ LambdaIterator() [3/6]

LambdaIterator ( std::function< bool(value_type &)>  get_element,
Data const &  data,
size_t  block_size = DEFAULT_BLOCK_SIZE 
)
inline

Create an iterator over some underlying content.

The constructor expects a function that takes an element by reference and assigns it its new value at each iteration. The function returns true if there was an element (iteration still ongoing), or false once the end of the underlying iterator is reached.

Additionally, data can be given here, which we simply store and make accessible via data(). This is a convenience so that iterators generated via a make function can for example forward their input source name for user output.

Definition at line 681 of file lambda_iterator.hpp.

◆ LambdaIterator() [4/6]

LambdaIterator ( std::function< bool(value_type &)>  get_element,
Data &&  data,
size_t  block_size = DEFAULT_BLOCK_SIZE 
)
inline

Create an iterator over some underlying content.

The constructor expects a function that takes an element by reference and assigns it its new value at each iteration. The function returns true if there was an element (iteration still ongoing), or false once the end of the underlying iterator is reached.

Additionally, data can be given here, which we simply store and make accessible via data(). This is a convenience so that iterators generated via a make function can for example forward their input source name for user output.

This version of the constructor takes the data by r-value reference, for moving it.

Definition at line 696 of file lambda_iterator.hpp.

◆ ~LambdaIterator()

~LambdaIterator ( )
default

◆ LambdaIterator() [5/6]

LambdaIterator ( self_type const &  )
default

◆ LambdaIterator() [6/6]

LambdaIterator ( self_type &&  )
default

Member Function Documentation

◆ add_filter()

self_type& add_filter ( std::function< bool(T const &)>  filter)
inline

Add a filter function that is applied to each element of the iteration.

If the function returns false, the element is skipped in the iteration.

Note that all of add_transform(), add_filter(), and add_transform_filter() are chained in the order in which they are added - meaning that they can be mixed as needed. For example, it makes sense to first filter by some property, and then apply transformations only on those elements that passed the filter to avoid unneeded work.

Definition at line 785 of file lambda_iterator.hpp.

◆ add_transform()

self_type& add_transform ( std::function< void(T &)>  transform)
inline

Add a transformation function that is applied to each element of the iteration.

Note that all of add_transform(), add_filter(), and add_transform_filter() are chained in the order in which they are added - meaning that they can be mixed as needed. For example, it makes sense to first filter by some property, and then apply transformations only on those elements that passed the filter to avoid unneeded work.

Definition at line 767 of file lambda_iterator.hpp.

◆ add_transform_filter()

self_type& add_transform_filter ( std::function< bool(T &)>  filter)
inline

Add a transformation and filter function that is applied to each element of the iteration.

This can be used to transform and filter an element at the same time, as a shortcut where several steps might be needed at once. If the function returns false, the element is skipped in the iteration.

Note that all of add_transform(), add_filter(), and add_transform_filter() are chained in the order in which they are added - meaning that they can be mixed as needed. For example, it makes sense to first filter by some property, and then apply transformations only on those elements that passed the filter to avoid unneeded work.

Definition at line 805 of file lambda_iterator.hpp.

◆ begin()

Iterator begin ( )
inline

Definition at line 720 of file lambda_iterator.hpp.

◆ block_size() [1/2]

size_t block_size ( ) const
inline

Get the currently set block size used for buffering the input data.

Definition at line 830 of file lambda_iterator.hpp.

◆ block_size() [2/2]

self_type& block_size ( size_t  value)
inline

Set the block size used for buffering the input data.

Shall not be changed after iteration has started, that is, after calling begin().

By default, this is set to 0, meaning that no buffering is done. Note that small buffer sizes can induce overhead for the thread synchronisation; we hence recommend to use block sizes of 1000 or greater, as needed.

Definition at line 844 of file lambda_iterator.hpp.

◆ clear_filters_and_transformations()

self_type& clear_filters_and_transformations ( )
inline

Clear all filters and transformations.

Definition at line 818 of file lambda_iterator.hpp.

◆ data() [1/2]

Data& data ( )
inline

Access the data stored in the iterator.

Definition at line 750 of file lambda_iterator.hpp.

◆ data() [2/2]

Data const& data ( ) const
inline

Access the data stored in the iterator.

Definition at line 742 of file lambda_iterator.hpp.

◆ end()

Iterator end ( )
inline

Definition at line 725 of file lambda_iterator.hpp.

◆ operator bool()

operator bool ( ) const
inline

Return whether a function to get elements was assigned to this generator, that is, whether it is default constructed (false) or not (true).

Definition at line 734 of file lambda_iterator.hpp.

◆ operator=() [1/2]

self_type& operator= ( self_type &&  )
default

◆ operator=() [2/2]

self_type& operator= ( self_type const &  )
default

Member Typedef Documentation

◆ Data

using Data = D

Definition at line 165 of file lambda_iterator.hpp.

◆ difference_type

using difference_type = std::ptrdiff_t

Definition at line 162 of file lambda_iterator.hpp.

◆ iterator_category

using iterator_category = std::input_iterator_tag

Definition at line 163 of file lambda_iterator.hpp.

◆ pointer

using pointer = value_type const*

Definition at line 160 of file lambda_iterator.hpp.

◆ reference

using reference = value_type const&

Definition at line 161 of file lambda_iterator.hpp.

◆ self_type

using self_type = LambdaIterator

Definition at line 158 of file lambda_iterator.hpp.

◆ value_type

using value_type = T

Definition at line 159 of file lambda_iterator.hpp.

Member Data Documentation

◆ DEFAULT_BLOCK_SIZE

const size_t DEFAULT_BLOCK_SIZE = 0
static

Default size for block buffering.

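This is the block size that is used unless a different value is given. With a block size greater than 0, the class buffers blocks of elements of that size, with the buffer loaded in a separate thread, in order to speed up iterating over elements that need some processing, such as input files, which is the typical use case of this class. The default of 0 means that no buffering is done.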

Definition at line 174 of file lambda_iterator.hpp.

◆ Iterator

friend Iterator

Definition at line 714 of file lambda_iterator.hpp.


The documentation for this class was generated from the following file:
lambda_iterator.hpp