#include <genesis/taxonomy/formats/taxopath_parser.hpp>
Helper class to parse a taxonomic path string into a Taxopath object.
This class bundles the parameters used for parsing taxonomic path strings and offers functions for the actual parsing. This allows the parsing process to be customized, for example in TaxonomyReader. Furthermore, it prevents code duplication in places where the input is a taxonomic path string. The result of the parsing process is a Taxopath object. See there for details.
The elements are expected to be separated by single chars; the chars returned by delimiters() are used to split them. The default is ';'.
For example: The input string
Tax_1; Tax_2 ;;Tax_4;
is parsed into the Taxopath
[ "Tax_1", "Tax_2", "Tax_2", "Tax_4" ]
That is, missing elements are filled with the preceding ones - this is a common technique in taxonomic databases, which is useful for unspecified taxa in deeper taxonomies.
Furthermore, if the string ends with the delimiter char, the resulting empty last element is removed by default. The example above shows this; see remove_trailing_delimiter() to change that behaviour and keep this last element instead. Also, the first taxon in the string cannot be empty; otherwise, a std::runtime_error is thrown.
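The fill-in rule and the check on the first taxon can be illustrated with a minimal standalone sketch. It is not the genesis implementation: the function name `sketch_fill_missing` is hypothetical, it works on a plain `std::vector<std::string>` instead of a Taxopath, and it assumes the elements have already been split and trimmed.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Illustrative stand-in, NOT the genesis implementation.
// Takes elements that were already split and trimmed, and replaces each
// empty element with the element preceding it.
std::vector<std::string> sketch_fill_missing(std::vector<std::string> elements)
{
    // The first taxon has no predecessor to copy from, so it must be present.
    if (elements.empty() || elements.front().empty()) {
        throw std::runtime_error("First taxon in the taxonomic path is empty.");
    }

    // Every later empty element inherits the (possibly already filled) previous one.
    for (std::size_t i = 1; i < elements.size(); ++i) {
        if (elements[i].empty()) {
            elements[i] = elements[i - 1];
        }
    }
    return elements;
}
```

Because each element is filled from its already processed predecessor, a run of several missing elements all receive the last specified taxon name.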
Definition at line 81 of file taxopath_parser.hpp.
Public Member Functions
TaxopathParser ()=default
TaxopathParser (TaxopathParser &&)=default
TaxopathParser (TaxopathParser const &)=default
~TaxopathParser ()=default
std::string delimiters () const
    Return the currently set delimiter chars used to split the taxonomic path string.
TaxopathParser & delimiters (std::string const &value)
    Set the chars used to split the taxonomic path string.
TaxopathParser & operator= (TaxopathParser &&)=default
TaxopathParser & operator= (TaxopathParser const &)=default
Taxopath parse (std::string const &taxopath) const
    Parse a taxonomic path string into a Taxopath object and return it.
Taxopath parse (Taxon const &taxon) const
    Helper function to turn a Taxon into a Taxopath.
bool remove_trailing_delimiter () const
    Return whether trailing delimiters are currently removed from the taxonomic path string.
TaxopathParser & remove_trailing_delimiter (bool value)
    Set whether to remove an empty taxonomic element at the end, if it occurs.
bool trim_whitespaces () const
    Return whether whitespace is currently trimmed off the taxonomic elements.
TaxopathParser & trim_whitespaces (bool value)
    Set whether to trim whitespace around the taxonomic elements after splitting them.
inline std::string delimiters () const
Return the currently set delimiter chars used to split the taxonomic path string.
See the setter for details.
Definition at line 149 of file taxopath_parser.hpp.
inline TaxopathParser & delimiters (std::string const &value)
Set the chars used to split the taxonomic path string.
Those chars are used to split the taxon name into its hierarchical parts. Default is ';', as this is the usual value in many databases. See Taxopath for details.
If this value is set to multiple chars (string longer than 1), any of them is used for splitting.
Example: The taxonomic path string
Archaea;Euryarchaeota;Halobacteria;
is split into "Archaea", "Euryarchaeota" and "Halobacteria".
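Splitting on any of several delimiter chars can be sketched with `std::string::find_first_of`. This is a minimal standalone illustration, not the code genesis uses: the name `sketch_split_any` is hypothetical, and the function returns plain strings and keeps empty elements, including the one after a trailing delimiter.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Illustrative stand-in, NOT the genesis implementation.
// Splits `path` at every char that occurs in `delimiters`, keeping empty elements.
std::vector<std::string> sketch_split_any(
    std::string const& path,
    std::string const& delimiters
) {
    std::vector<std::string> elements;
    std::size_t start = 0;
    while (true) {
        // Position of the next char that matches ANY of the delimiter chars.
        std::size_t const pos = path.find_first_of(delimiters, start);
        if (pos == std::string::npos) {
            // No further delimiter: the rest of the string is the last element.
            elements.push_back(path.substr(start));
            break;
        }
        elements.push_back(path.substr(start, pos - start));
        start = pos + 1;
    }
    return elements;
}
```

With the delimiters set to `";|"`, for instance, both `;` and `|` end an element. The empty element that this sketch produces after a trailing `;` is what the remove_trailing_delimiter() setting then deals with.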
Definition at line 138 of file taxopath_parser.hpp.
Taxopath parse (std::string const &taxopath) const
Parse a taxonomic path string into a Taxopath object and return it.
See the class description for details on what this parser does.
Definition at line 48 of file taxopath_parser.cpp.
Taxopath parse (Taxon const &taxon) const
Helper function to turn a Taxon into a Taxopath.
This function is probably not needed often, as the Taxopath is a helper object on the way from a taxonomic path string to a Taxon object, but not the other way round. In order to get the string from a Taxon, see the TaxopathGenerator class instead.
However, this function might still be useful in some cases. You never know.
Definition at line 96 of file taxopath_parser.cpp.
inline bool remove_trailing_delimiter () const
Return whether trailing delimiters are currently removed from the taxonomic path string.
See the setter for details.
Definition at line 209 of file taxopath_parser.hpp.
inline TaxopathParser & remove_trailing_delimiter (bool value)
Set whether to remove an empty taxonomic element at the end, if it occurs.
In many taxonomic databases, the taxonomic string representation ends with a ';' by default. When splitting such a string, this results in an empty last element. If this option is set to true (default), this element is removed from the Taxopath.
If set to false, the element is not removed, but instead treated as a normal "empty" element, which means that it is replaced by the value of the preceding element. See the class description for details on that.
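The two outcomes for a trailing empty element can be sketched as follows. This is a standalone illustration under stated assumptions, not the genesis implementation: `sketch_handle_trailing` is a hypothetical name, and the input is assumed to be a list of elements that were already split, so that a trailing delimiter shows up as an empty last string.

```cpp
#include <string>
#include <vector>

// Illustrative stand-in, NOT the genesis implementation.
// Applies the described handling of an empty last element.
std::vector<std::string> sketch_handle_trailing(
    std::vector<std::string> elements,
    bool remove_trailing_delimiter
) {
    // Nothing to do unless there is an empty last element with a predecessor.
    if (elements.size() < 2 || !elements.back().empty()) {
        return elements;
    }

    if (remove_trailing_delimiter) {
        // Option set to true (the default): drop the empty element.
        elements.pop_back();
    } else {
        // Option set to false: treat it like any other empty element,
        // that is, fill it with the preceding element.
        elements.back() = elements[elements.size() - 2];
    }
    return elements;
}
```

For the elements of `Archaea;Euryarchaeota;Halobacteria;`, the first branch yields three elements, while the second yields four, with "Halobacteria" appearing twice at the end.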
Definition at line 197 of file taxopath_parser.hpp.
inline bool trim_whitespaces () const
Return whether whitespace is currently trimmed off the taxonomic elements.
See the setter for details.
Definition at line 181 of file taxopath_parser.hpp.
inline TaxopathParser & trim_whitespaces (bool value)
Set whether to trim whitespace around the taxonomic elements after splitting them.
Default is true. If set to true, leading and trailing whitespace is trimmed off each taxon after splitting. This is helpful if the input string is copied from a spreadsheet application or CSV file, where spaces between cells might be added.
If set to false, all elements are left as they are.
Example: The line
Archaea; Aigarchaeota; Aigarchaeota Incertae Sedis; 11091 class 123
contains spaces both between the taxa names (separated by ;), as well as within the names. Only the former will be trimmed, while the latter are left as they are.
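The difference between surrounding and inner whitespace can be shown with a small standalone sketch. It is not the genesis implementation: `sketch_trim` is a hypothetical name, and the set of chars treated as whitespace here is an assumption made for illustration.

```cpp
#include <cstddef>
#include <string>

// Illustrative stand-in, NOT the genesis implementation.
// Removes leading and trailing whitespace from one already-split element.
std::string sketch_trim(std::string const& element)
{
    // Assumption: these chars count as whitespace.
    char const* const whitespace = " \t\n\r\f\v";

    std::size_t const first = element.find_first_not_of(whitespace);
    if (first == std::string::npos) {
        // The element consists of whitespace only.
        return "";
    }
    std::size_t const last = element.find_last_not_of(whitespace);

    // Everything between the first and last non-whitespace char is kept,
    // including any spaces inside the name.
    return element.substr(first, last - first + 1);
}
```

Applied to the elements of the line above, the leading space of " Aigarchaeota Incertae Sedis" is dropped, while the spaces inside the name are kept.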
Definition at line 170 of file taxopath_parser.hpp.