A toolkit for working with phylogenetic data.
v0.18.0
TaxopathParser Class Reference

#include <genesis/taxonomy/formats/taxopath_parser.hpp>

Detailed Description

Helper class to parse a taxonomic path string into a Taxopath object.

This class bundles the parameters used for parsing taxonomic path strings and offers functions for the actual parsing. This allows the parsing process to be customized, for example in TaxonomyReader, and prevents code duplication in places where the input is a taxonomic path string. The result of the parsing process is a Taxopath object. See there for details.

The elements are expected to be separated by one of the chars given in delimiters(). The default is ';'.

For example: The input string

Tax_1; Tax_2 ;;Tax_4;

is parsed into the Taxopath

[ "Tax_1", "Tax_2", "Tax_2", "Tax_4" ]

That is, missing elements are filled with the preceding ones. This is a common technique in taxonomic databases, which is useful for unspecified taxa in deeper taxonomies.

Furthermore, if the string ends with the delimiter char, the resulting empty last element is removed by default. See above for an example of this; see remove_trailing_delimiter() to change that behaviour and instead keep this last element. Also, the first taxon in the string cannot be empty; otherwise a std::runtime_error is thrown.
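The fill-up rule described above can be illustrated with a small stand-alone sketch. It is not the genesis implementation: the function name fill_empty_elements is hypothetical, and the sketch assumes that the path has already been split and trimmed into a vector of strings.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper, not part of genesis: replace each empty element
// with the element before it, as described in the class documentation.
std::vector<std::string> fill_empty_elements( std::vector<std::string> elements )
{
    // The first taxon has no predecessor to copy from, so it must not be empty.
    if( !elements.empty() && elements.front().empty() ) {
        throw std::runtime_error( "First taxon in the taxonomic path is empty." );
    }

    // Walking left to right means that runs of empty elements
    // all receive the last non-empty name before them.
    for( std::size_t i = 1; i < elements.size(); ++i ) {
        if( elements[i].empty() ) {
            elements[i] = elements[i - 1];
        }
    }
    return elements;
}
```

Given the elements "Tax_1", "Tax_2", "" and "Tax_4", this sketch returns the list shown in the example above, with the empty third element replaced by "Tax_2".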

Definition at line 81 of file taxopath_parser.hpp.

Public Member Functions

 TaxopathParser ()=default
 
 TaxopathParser (TaxopathParser const &)=default
 
 TaxopathParser (TaxopathParser &&)=default
 
 ~TaxopathParser ()=default
 
TaxopathParser & delimiters (std::string const &value)
 Set the chars used to split the taxonomic path string.
 
std::string delimiters () const
 Return the currently set delimiter chars used to split the taxonomic path string.
 
Taxopath from_string (std::string const &taxopath) const
 Parse a taxonomic path string into a Taxopath object and return it.
 
Taxopath from_taxon (Taxon const &taxon) const
 Helper function to turn a Taxon into a Taxopath.
 
Taxopath operator() (std::string const &taxopath) const
 Shortcut function alias for from_string().
 
Taxopath operator() (Taxon const &taxon) const
 Shortcut function alias for from_taxon().
 
TaxopathParser & operator= (TaxopathParser const &)=default
 
TaxopathParser & operator= (TaxopathParser &&)=default
 
TaxopathParser & remove_trailing_delimiter (bool value)
 Set whether to remove an empty taxonomic element at the end, if it occurs.
 
bool remove_trailing_delimiter () const
 Return whether trailing delimiters are currently removed from the taxonomic path string.
 
TaxopathParser & trim_whitespaces (bool value)
 Set whether to trim whitespaces around the taxonomic elements after splitting them.
 
bool trim_whitespaces () const
 Return whether whitespaces are currently trimmed off the taxonomic elements.
 

Constructor & Destructor Documentation

TaxopathParser ( )
default
~TaxopathParser ( )
default
TaxopathParser ( TaxopathParser const &  )
default
TaxopathParser ( TaxopathParser &&  )
default

Member Function Documentation

TaxopathParser & delimiters ( std::string const &  value)

Set the chars used to split the taxonomic path string.

Those chars are used to split the taxon name into its hierarchical parts. Default is ';', as this is the usual value in many databases. See Taxopath for details.

If this value is set to multiple chars (string longer than 1), any of them is used for splitting.

Example: The taxonomic path string

Archaea;Euryarchaeota;Halobacteria;

is split into "Archaea", "Euryarchaeota" and "Halobacteria".
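To illustrate what "any of them is used for splitting" means, here is a minimal sketch using only the standard library. The name split_at_any is hypothetical and the code is not taken from genesis. It keeps the empty element that follows a trailing delimiter, which this sketch assumes is dealt with in a separate step (see remove_trailing_delimiter()).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper, not the genesis implementation: split `path` at every
// char that occurs in `delimiters`, keeping empty elements.
std::vector<std::string> split_at_any(
    std::string const& path,
    std::string const& delimiters = ";"
) {
    std::vector<std::string> elements;
    std::size_t start = 0;
    while( true ) {
        // Find the next position that holds any one of the delimiter chars.
        std::size_t const pos = path.find_first_of( delimiters, start );
        if( pos == std::string::npos ) {
            // No delimiter left: the rest of the string is the last element.
            elements.push_back( path.substr( start ));
            break;
        }
        elements.push_back( path.substr( start, pos - start ));
        start = pos + 1;
    }
    return elements;
}
```

With the default ";", the example string above yields "Archaea", "Euryarchaeota", "Halobacteria" and one trailing empty element. With delimiters set to ";|", a string such as "A;B|C" is split at both chars.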

Definition at line 166 of file taxopath_parser.cpp.

std::string delimiters ( ) const

Return the currently set delimiter chars used to split the taxonomic path string.

See the setter for details.

Definition at line 177 of file taxopath_parser.cpp.

Taxopath from_string ( std::string const &  taxopath) const

Parse a taxonomic path string into a Taxopath object and return it.

See the class description for details on what this parser does.

Definition at line 53 of file taxopath_parser.cpp.

Taxopath from_taxon ( Taxon const &  taxon) const

Helper function to turn a Taxon into a Taxopath.

This function is probably not needed often, as the Taxopath is a helper object for getting from a taxonomic path string to a Taxon object, not the other way round. To get the string from a Taxon, see the TaxopathGenerator class instead.

However, this function might still be useful in some cases. You never know.

Definition at line 120 of file taxopath_parser.cpp.

Taxopath operator() ( std::string const &  taxopath) const

Shortcut function alias for from_string().

This shortcut allows a TaxopathParser object to be used as a functor.

Definition at line 106 of file taxopath_parser.cpp.

Taxopath operator() ( Taxon const &  taxon) const

Shortcut function alias for from_taxon().

This shortcut allows a TaxopathParser object to be used as a functor.

Definition at line 143 of file taxopath_parser.cpp.

TaxopathParser& operator= ( TaxopathParser const &  )
default
TaxopathParser& operator= ( TaxopathParser &&  )
default
TaxopathParser & remove_trailing_delimiter ( bool  value)

Set whether to remove an empty taxonomic element at the end, if it occurs.

In many taxonomic databases, the taxonomic string representation ends with a ';' by default. When splitting such a string, this results in an empty last element. If this option is set to true (default), this element is removed from the Taxopath.

If set to false, the element is not removed, but instead treated as a normal "empty" element, which means that it is replaced by the value of the preceding element. See the class description for details on that.
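The effect of this option can be sketched as a small step applied to the already split elements. The helper name drop_trailing_empty is hypothetical and this is not the genesis code. The sketch only removes the element; when the option is false, it assumes the empty element is left in place for the usual fill-up with the preceding element.

```cpp
#include <string>
#include <vector>

// Hypothetical helper, not part of genesis: drop the empty last element
// that results from a trailing delimiter, if `remove_trailing` is set.
std::vector<std::string> drop_trailing_empty(
    std::vector<std::string> elements,
    bool remove_trailing = true
) {
    if( remove_trailing && !elements.empty() && elements.back().empty() ) {
        elements.pop_back();
    }
    // With remove_trailing == false, the empty element is kept and later
    // handled like any other empty element.
    return elements;
}
```

For the elements "Archaea", "Euryarchaeota", "Halobacteria" and "", the sketch returns three elements with the option set to true and all four with it set to false.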

Definition at line 225 of file taxopath_parser.cpp.

bool remove_trailing_delimiter ( ) const

Return whether trailing delimiters are currently removed from the taxonomic path string.

See the setter for details.

Definition at line 237 of file taxopath_parser.cpp.

TaxopathParser & trim_whitespaces ( bool  value)

Set whether to trim whitespaces around the taxonomic elements after splitting them.

Default is true. If set to true, white spaces are trimmed off the taxa after splitting them. This is helpful if the input string is copied from a spreadsheet application or CSV file, where spaces between cells might be added.

If set to false, all elements are left as they are.

Example: The line

Archaea; Aigarchaeota; Aigarchaeota Incertae Sedis; 11091   class   123

contains spaces both between the taxa names (separated by ;) and within the names. Only the former are trimmed, while the latter are left as they are.
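The distinction between outer and inner spaces can be shown with a short stand-alone sketch. The function trim_element and its default set of whitespace chars are assumptions made for illustration, not the genesis implementation.

```cpp
#include <string>

// Hypothetical helper, not part of genesis: remove whitespace at the
// beginning and end of a single element, leaving inner whitespace untouched.
std::string trim_element(
    std::string const& element,
    std::string const& whitespace = " \t\n\r\f\v"
) {
    auto const first = element.find_first_not_of( whitespace );
    if( first == std::string::npos ) {
        // The element consists of whitespace only.
        return "";
    }
    auto const last = element.find_last_not_of( whitespace );
    return element.substr( first, last - first + 1 );
}
```

Applied to the element " Aigarchaeota Incertae Sedis" from the line above, only the leading space is removed. The spaces inside "11091   class   123" are kept as they are.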

Definition at line 198 of file taxopath_parser.cpp.

bool trim_whitespaces ( ) const

Return whether whitespaces are currently trimmed off the taxonomic elements.

See the setter for details.

Definition at line 209 of file taxopath_parser.cpp.


The documentation for this class was generated from the following files: