A toolkit for working with phylogenetic data.
v0.24.0
TaxopathParser Class Reference

#include <genesis/taxonomy/formats/taxopath_parser.hpp>

Detailed Description

Helper class to parse a taxonomic path string into a Taxopath object.

This class bundles the parameters used for parsing taxonomic path strings and offers functions for the actual parsing. This allows customization of the parsing process, for example in TaxonomyReader. Furthermore, it prevents code duplication in places where the input is a taxonomic path string. The result of the parsing process is a Taxopath object. See that class for details.

The elements are expected to be separated by single chars, as given by delimiters(). The default is ';'.

For example: The input string

Tax_1; Tax_2 ;;Tax_4;

is parsed into the Taxopath

[ "Tax_1", "Tax_2", "Tax_2", "Tax_4" ]

That is, missing elements are filled up with the preceding ones. This is a common technique in taxonomic databases, which is useful for unspecified taxa in deeper taxonomies.

Furthermore, if the string ends with the delimiter char, the resulting empty last element is removed by default. See above for an example of this; see remove_trailing_delimiter() to change that behaviour and instead keep this last element. Also, the first taxon in the string cannot be empty; otherwise, a std::runtime_error is thrown.
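
The rules described above can be sketched with the standard library alone. This is only an illustration of the documented behaviour, not the actual genesis implementation: the function name sketch_parse_taxopath, its parameters, the use of std::vector<std::string> in place of a Taxopath, and the set of chars treated as whitespace are all assumptions made for this example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Illustration only: mimics the documented behaviour with the standard library.
// The function name and signature are hypothetical, not part of genesis.
std::vector<std::string> sketch_parse_taxopath(
    std::string const& input,
    std::string const& delimiters = ";",
    bool trim_whitespaces = true,
    bool remove_trailing_delimiter = true
) {
    std::string const blanks = " \t\r\n"; // assumed whitespace set
    std::vector<std::string> elements;

    // Split at every delimiter char, keeping empty fields.
    std::size_t start = 0;
    for (;;) {
        std::size_t const pos = input.find_first_of(delimiters, start);
        std::size_t const len = (pos == std::string::npos) ? pos : pos - start;
        std::string field = input.substr(start, len);

        // Optionally trim whitespace around (not within) the element.
        if (trim_whitespaces) {
            std::size_t const first = field.find_first_not_of(blanks);
            std::size_t const last = field.find_last_not_of(blanks);
            field = (first == std::string::npos)
                ? std::string()
                : field.substr(first, last - first + 1);
        }
        elements.push_back(field);

        if (pos == std::string::npos) {
            break;
        }
        start = pos + 1;
    }

    // The first taxon must not be empty.
    if (elements.front().empty()) {
        throw std::runtime_error("First taxon of the taxonomic path is empty.");
    }

    // Optionally drop the empty element caused by a trailing delimiter.
    if (remove_trailing_delimiter && elements.back().empty()) {
        elements.pop_back();
    }

    // Fill each remaining empty element with the preceding one.
    for (std::size_t i = 1; i < elements.size(); ++i) {
        if (elements[i].empty()) {
            elements[i] = elements[i - 1];
        }
    }
    return elements;
}
```

Under these assumptions, calling sketch_parse_taxopath("Tax_1; Tax_2 ;;Tax_4;") yields the four elements of the example above, while an input such as ";Tax_2" throws.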

Definition at line 81 of file taxopath_parser.hpp.

Public Member Functions

 TaxopathParser ()=default
 
 TaxopathParser (TaxopathParser const &)=default
 
 TaxopathParser (TaxopathParser &&)=default
 
 ~TaxopathParser ()=default
 
TaxopathParser & delimiters (std::string const &value)
 Set the chars used to split the taxonomic path string.
 
std::string delimiters () const
 Return the currently set delimiter chars used to split the taxonomic path string.
 
TaxopathParser & operator= (TaxopathParser const &)=default
 
TaxopathParser & operator= (TaxopathParser &&)=default
 
Taxopath parse (std::string const &taxopath) const
 Parse a taxonomic path string into a Taxopath object and return it.
 
Taxopath parse (Taxon const &taxon) const
 Helper function to turn a Taxon into a Taxopath.
 
TaxopathParser & remove_trailing_delimiter (bool value)
 Set whether to remove an empty taxonomic element at the end, if it occurs.
 
bool remove_trailing_delimiter () const
 Return whether trailing delimiters are currently removed from the taxonomic path string.
 
TaxopathParser & trim_whitespaces (bool value)
 Set whether to trim whitespaces around the taxonomic elements after splitting them.
 
bool trim_whitespaces () const
 Return whether whitespaces are currently trimmed off the taxonomic elements.
 

Constructor & Destructor Documentation

◆ TaxopathParser() [1/3]

TaxopathParser ( )
default

◆ ~TaxopathParser()

~TaxopathParser ( )
default

◆ TaxopathParser() [2/3]

TaxopathParser ( TaxopathParser const &  )
default

◆ TaxopathParser() [3/3]

TaxopathParser ( TaxopathParser &&  )
default

Member Function Documentation

◆ delimiters() [1/2]

TaxopathParser& delimiters ( std::string const &  value)
inline

Set the chars used to split the taxonomic path string.

These chars are used to split the taxonomic path string into its hierarchical parts. The default is ';', as this is the usual value in many databases. See Taxopath for details.

If this value is set to multiple chars (string longer than 1), any of them is used for splitting.

Example: The taxonomic path string

Archaea;Euryarchaeota;Halobacteria;

is split into "Archaea", "Euryarchaeota" and "Halobacteria".
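
To illustrate what "any of them is used for splitting" means when several delimiter chars are set, here is a minimal sketch using std::string::find_first_of. The helper split_on_any is a hypothetical name for this example and is not part of genesis; it keeps empty fields and does not apply the trimming or trailing-delimiter handling of the parser.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not genesis API): split `text` at every char that
// occurs in `delimiters`, keeping empty fields.
std::vector<std::string> split_on_any(
    std::string const& text,
    std::string const& delimiters
) {
    std::vector<std::string> parts;
    std::size_t start = 0;
    std::size_t pos = 0;
    while ((pos = text.find_first_of(delimiters, start)) != std::string::npos) {
        parts.push_back(text.substr(start, pos - start));
        start = pos + 1;
    }
    // Remainder after the last delimiter (empty if the text ends with one).
    parts.push_back(text.substr(start));
    return parts;
}
```

With delimiters set to ";|", the string "Archaea;Euryarchaeota|Halobacteria" would be split at both the ';' and the '|' in this sketch.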

Definition at line 138 of file taxopath_parser.hpp.

◆ delimiters() [2/2]

std::string delimiters ( ) const
inline

Return the currently set delimiter chars used to split the taxonomic path string.

See the setter for details.

Definition at line 149 of file taxopath_parser.hpp.

◆ operator=() [1/2]

TaxopathParser& operator= ( TaxopathParser const &  )
default

◆ operator=() [2/2]

TaxopathParser& operator= ( TaxopathParser &&  )
default

◆ parse() [1/2]

Taxopath parse ( std::string const &  taxopath) const

Parse a taxonomic path string into a Taxopath object and return it.

See the class description for details on what this parser does.

Definition at line 48 of file taxopath_parser.cpp.

◆ parse() [2/2]

Taxopath parse ( Taxon const &  taxon) const

Helper function to turn a Taxon into a Taxopath.

This function is probably not needed often, as the Taxopath is an intermediate object on the way from a taxonomic path string to a Taxon object, but not the other way round. In order to get the string from a Taxon, see the TaxopathGenerator class instead.

However, this function might still be useful in some cases. You never know.
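
As a rough idea of what such a conversion involves, the following sketch collects the names from a taxon up to its root and reverses them. It rests on assumptions: SketchTaxon is a hypothetical stand-in with just a name and a parent pointer, not the genesis Taxon class, and the documentation above does not state how parse() performs the conversion internally.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical minimal taxon for this example: a name plus a non-owning
// pointer to its parent (nullptr for a top-level taxon).
struct SketchTaxon {
    std::string name;
    SketchTaxon const* parent;
};

// Collect the names from the given taxon up to the root, then reverse them
// so that the path runs from the highest rank down to the taxon itself.
std::vector<std::string> sketch_taxon_to_path(SketchTaxon const& taxon)
{
    std::vector<std::string> names;
    for (SketchTaxon const* t = &taxon; t != nullptr; t = t->parent) {
        names.push_back(t->name);
    }
    std::reverse(names.begin(), names.end());
    return names;
}
```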

Definition at line 96 of file taxopath_parser.cpp.

◆ remove_trailing_delimiter() [1/2]

TaxopathParser& remove_trailing_delimiter ( bool  value)
inline

Set whether to remove an empty taxonomic element at the end, if it occurs.

In many taxonomic databases, the taxonomic string representation ends with a ';' by default. When splitting such a string, this results in an empty last element. If this option is set to true (default), this element is removed from the Taxopath.

If set to false, the element is not removed, but instead treated as a normal "empty" element, which means that it is replaced by the value of the preceding element. See the class description for details on that.
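
The difference between the two settings can be sketched on a list of elements that has already been split. The helper handle_trailing_element is a hypothetical name used only for this illustration and is not part of genesis.

```cpp
#include <string>
#include <vector>

// Hypothetical helper (not genesis API): apply the documented trailing-element
// rule to elements that were already split at the delimiter.
std::vector<std::string> handle_trailing_element(
    std::vector<std::string> elements,
    bool remove_trailing_delimiter
) {
    // Nothing to do unless the last element is empty.
    if (elements.empty() || !elements.back().empty()) {
        return elements;
    }
    if (remove_trailing_delimiter) {
        // true: drop the empty last element.
        elements.pop_back();
    } else if (elements.size() > 1) {
        // false: treat it like any other empty element and copy its predecessor.
        elements.back() = elements[elements.size() - 2];
    }
    return elements;
}
```

For the split of "Archaea;Euryarchaeota;Halobacteria;", the first setting leaves three elements, while the second leaves four, with "Halobacteria" repeated at the end.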

Definition at line 197 of file taxopath_parser.hpp.

◆ remove_trailing_delimiter() [2/2]

bool remove_trailing_delimiter ( ) const
inline

Return whether trailing delimiters are currently removed from the taxonomic path string.

See the setter for details.

Definition at line 209 of file taxopath_parser.hpp.

◆ trim_whitespaces() [1/2]

TaxopathParser& trim_whitespaces ( bool  value)
inline

Set whether to trim whitespaces around the taxonomic elements after splitting them.

Default is true. If set to true, leading and trailing whitespace is trimmed from each taxon name after splitting. This is helpful if the input string is copied from some spreadsheet application or CSV file, where spaces between cells might be added.

If set to false, all elements are left as they are.

Example: The line

Archaea; Aigarchaeota; Aigarchaeota Incertae Sedis; 11091   class   123

contains spaces both between the taxon names (separated by ;) and within the names. Only the former are trimmed, while the latter are left as they are.
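
A minimal sketch of this kind of trimming, applied to a single element after splitting, is shown here. The helper trim_element and the set of chars it treats as whitespace are assumptions for this example, not part of genesis.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not genesis API): remove leading and trailing
// whitespace from one element, leaving inner whitespace untouched.
std::string trim_element(std::string const& element)
{
    std::string const blanks = " \t\r\n"; // assumed whitespace set
    std::size_t const first = element.find_first_not_of(blanks);
    if (first == std::string::npos) {
        // The element consists of whitespace only.
        return std::string();
    }
    std::size_t const last = element.find_last_not_of(blanks);
    return element.substr(first, last - first + 1);
}
```

Applied to the last element of the line above, " 11091   class   123", only the leading space is removed; the runs of spaces inside the element stay as they are.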

Definition at line 170 of file taxopath_parser.hpp.

◆ trim_whitespaces() [2/2]

bool trim_whitespaces ( ) const
inline

Return whether whitespaces are currently trimmed off the taxonomic elements.

See the setter for details.

Definition at line 181 of file taxopath_parser.hpp.


The documentation for this class was generated from the following files: