A library for working with phylogenetic and population genetic data.
v0.27.0
SignatureSpecifications Class Reference

#include <genesis/sequence/functions/signature_specifications.hpp>

Detailed Description

Specifications for calculating signatures (like k-mer counts) from Sequences.

This class stores settings needed for signature functions like signature_counts(), signature_frequencies(), signature_symmetrized_frequencies() etc. It mainly stores the alphabet() and k() to use for these calculations.

It also serves as storage and lookup for the index tables that those functions need. The indices are thus created only once per instance of this class, that is, once per combination of alphabet and k. This saves time when calculating signatures for many Sequences.

Definition at line 64 of file signature_specifications.hpp.
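
The build-once behaviour described above can be illustrated with a small, self-contained sketch. CachedKmerSpec and its members are hypothetical names, not part of genesis, and the class only assumes that a table is built on first access and reused afterwards; it does not reproduce the actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in, not the genesis class: it only illustrates
// building a k-mer table on first use and reusing it afterwards.
class CachedKmerSpec
{
public:
    CachedKmerSpec( std::string alphabet, std::size_t k )
        : alphabet_( std::move( alphabet )), k_( k )
    {
        assert( !alphabet_.empty() );
    }

    // Return the cached k-mer table, building it on the first call only.
    std::vector<std::string> const& kmers() const
    {
        if( !built_ ) {
            build_();
        }
        return kmers_;
    }

    // Number of times the table was actually built (for demonstration).
    std::size_t build_count() const { return build_count_; }

private:
    void build_() const
    {
        ++build_count_;
        kmers_.assign( 1, std::string() );
        // Extend every prefix by every alphabet char, k times.
        for( std::size_t i = 0; i < k_; ++i ) {
            std::vector<std::string> next;
            next.reserve( kmers_.size() * alphabet_.size() );
            for( auto const& prefix : kmers_ ) {
                for( char c : alphabet_ ) {
                    next.push_back( prefix + c );
                }
            }
            kmers_ = std::move( next );
        }
        built_ = true;
    }

    std::string alphabet_;
    std::size_t k_;
    mutable std::vector<std::string> kmers_;
    mutable bool built_ = false;
    mutable std::size_t build_count_ = 0;
};
```

Creating one such object and passing it by const reference to every per-sequence computation means the table is built a single time, however many sequences are processed.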

Public Member Functions

 SignatureSpecifications ()=default
 
 SignatureSpecifications (SignatureSpecifications &&)=default
 Default move constructor.
 
 SignatureSpecifications (SignatureSpecifications const &)=default
 Default copy constructor.
 
 SignatureSpecifications (std::string const &alphabet, size_t k)
 
 ~SignatureSpecifications ()=default
 Default destructor.
 
std::string const & alphabet () const
 
size_t char_index (char c) const
 Return the index of a char within the alphabet().
 
bool is_nucleic_acids () const
 Speedup and shortcut to test whether the alphabet() is "ACGT".
 
size_t k () const
 
std::vector< size_t > const & kmer_combined_reverse_complement_map () const
 Get a map from indices of kmer_list() and signature_counts() vectors to a smaller list which combines reverse complementary kmers for nucleic acid sequences.
 
std::vector< std::string > const & kmer_list () const
 Return the list of all possible k-mers for the given k and alphabet.
 
size_t kmer_list_size () const
 
std::vector< size_t > const & kmer_reverse_complement_indices () const
 For each kmer in kmer_list(), get the index of its reverse complement in that list.
 
std::vector< std::string > const & kmer_reverse_complement_list () const
 
size_t kmer_reverse_complement_list_size (bool with_palindromes=true) const
 
SignatureSpecifications & operator= (SignatureSpecifications &&)=default
 Default move assignment.
 
SignatureSpecifications & operator= (SignatureSpecifications const &)=default
 Default copy assignment.
 
UnknownCharBehavior unknown_char_behavior () const
 
SignatureSpecifications & unknown_char_behavior (UnknownCharBehavior value)
 

Public Types

enum  UnknownCharBehavior { kSkip, kThrow }
 List of policies to decide what to do when a char that is not part of the alphabet occurs while counting kmers.
 

Static Public Attributes

static const size_t InvalidCharIndex = std::numeric_limits<size_t>::max()
 Value that is used to indicate an invalid (non-alphabet) char when using char_index().
 

Constructor & Destructor Documentation

◆ SignatureSpecifications() [1/4]

◆ SignatureSpecifications() [2/4]

SignatureSpecifications ( std::string const &  alphabet,
size_t  k 
)

Definition at line 52 of file signature_specifications.cpp.

◆ ~SignatureSpecifications()

Default destructor.

◆ SignatureSpecifications() [3/4]

Default copy constructor.

◆ SignatureSpecifications() [4/4]

Default move constructor.

Member Function Documentation

◆ alphabet()

std::string const& alphabet ( ) const
inline

Definition at line 130 of file signature_specifications.hpp.

◆ char_index()

size_t char_index ( char  c) const
inline

Return the index of a char within the alphabet().

For chars that are not in the alphabet, InvalidCharIndex is returned as an indicator value.

Definition at line 163 of file signature_specifications.hpp.
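
A lookup of this kind is commonly backed by a precomputed table. The following is a minimal sketch under that assumption; kInvalidIndex, make_char_index_table() and lookup_char_index() are hypothetical names that mirror InvalidCharIndex and char_index() but are not the genesis implementation. The sketch is case-sensitive, which may differ from the library.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <limits>
#include <string>

// Hypothetical sentinel mirroring InvalidCharIndex.
constexpr std::size_t kInvalidIndex = std::numeric_limits<std::size_t>::max();

// Build a 256-entry table that maps each possible char value to its
// position in the alphabet, or to kInvalidIndex if it is not part of it.
std::array<std::size_t, 256> make_char_index_table( std::string const& alphabet )
{
    std::array<std::size_t, 256> table;
    table.fill( kInvalidIndex );
    for( std::size_t i = 0; i < alphabet.size(); ++i ) {
        auto const uc = static_cast<unsigned char>( alphabet[i] );
        // Precondition of this sketch: no char occurs twice in the alphabet.
        assert( table[uc] == kInvalidIndex );
        table[uc] = i;
    }
    return table;
}

// Constant-time lookup; the cast avoids negative indices for chars >= 0x80.
std::size_t lookup_char_index( std::array<std::size_t, 256> const& table, char c )
{
    return table[ static_cast<unsigned char>( c ) ];
}
```

Callers compare the result against the sentinel before using it as an index, for example to decide whether a character of a sequence can contribute to a k-mer.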

◆ is_nucleic_acids()

bool is_nucleic_acids ( ) const
inline

Speedup and shortcut to test whether the alphabet() is "ACGT".

Definition at line 152 of file signature_specifications.hpp.

◆ k()

size_t k ( ) const
inline

Definition at line 135 of file signature_specifications.hpp.

◆ kmer_combined_reverse_complement_map()

std::vector< size_t > const & kmer_combined_reverse_complement_map ( ) const

Get a map from indices of kmer_list() and signature_counts() vectors to a smaller list which combines reverse complementary kmers for nucleic acid sequences.

Definition at line 122 of file signature_specifications.cpp.
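
One way to build such a map, given the per-k-mer reverse complement indices described for kmer_reverse_complement_indices(), is sketched below. The function name combine_reverse_complements() and the choice to number the combined entries in order of first appearance are assumptions for illustration; the ordering used by genesis is not stated here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Given, for each k-mer index, the index of its reverse complement, assign
// each k-mer a slot in a smaller list so that a k-mer and its reverse
// complement share one slot. Slots are numbered in order of first appearance.
std::vector<std::size_t> combine_reverse_complements( std::vector<std::size_t> const& rc )
{
    std::vector<std::size_t> slot( rc.size() );
    std::size_t next_slot = 0;
    for( std::size_t i = 0; i < rc.size(); ++i ) {
        assert( rc[i] < rc.size() );
        if( rc[i] >= i ) {
            // First time this pair (or palindrome) is seen: open a new slot.
            slot[i] = next_slot++;
        } else {
            // The partner has a smaller index and was handled already.
            slot[i] = slot[ rc[i] ];
        }
    }
    return slot;
}
```

With such a map, a full count vector can be folded into the smaller one by adding counts[i] to combined[slot[i]] for every i.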

◆ kmer_list()

std::vector< std::string > const & kmer_list ( ) const

Return the list of all possible k-mers for the given k and alphabet.

Definition at line 80 of file signature_specifications.cpp.
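
As an illustration of what such a list contains, the sketch below enumerates all k-mers so that the entry at index i spells i as a number in base |alphabet|. The function enumerate_kmers() is a hypothetical helper, and the ordering is an assumption of this sketch rather than a documented property of kmer_list().

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// List all k-mers so that the k-mer at index i spells i in base |alphabet|,
// with the first character as the most significant digit.
std::vector<std::string> enumerate_kmers( std::string const& alphabet, std::size_t k )
{
    assert( !alphabet.empty() );
    std::size_t const base = alphabet.size();

    // Total number of k-mers is |alphabet|^k.
    std::size_t total = 1;
    for( std::size_t i = 0; i < k; ++i ) {
        total *= base;
    }

    std::vector<std::string> result;
    result.reserve( total );
    for( std::size_t index = 0; index < total; ++index ) {
        std::string kmer( k, ' ' );
        std::size_t rest = index;
        // Fill from the last position, peeling off the least significant digit.
        for( std::size_t pos = k; pos-- > 0; ) {
            kmer[pos] = alphabet[ rest % base ];
            rest /= base;
        }
        result.push_back( kmer );
    }
    return result;
}
```

For the alphabet "ACGT" and k = 2 this yields 16 entries, starting with "AA", "AC", "AG", "AT", "CA" and ending with "TT".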

◆ kmer_list_size()

size_t kmer_list_size ( ) const

Definition at line 117 of file signature_specifications.cpp.

◆ kmer_reverse_complement_indices()

std::vector< size_t > const & kmer_reverse_complement_indices ( ) const

For each kmer in kmer_list(), get the index of its reverse complement in that list.

Definition at line 174 of file signature_specifications.cpp.
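
The idea can be sketched without the library, assuming the alphabet "ACGT" and a k-mer list indexed as base-4 numbers (A=0, C=1, G=2, T=3, first character most significant). Both the indexing scheme and the helper name reverse_complement_indices() are assumptions of this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// For k-mers over "ACGT" indexed as base-4 numbers (A=0, C=1, G=2, T=3,
// first character most significant), return for each index the index of
// its reverse complement.
std::vector<std::size_t> reverse_complement_indices( std::size_t k )
{
    // Keeps 4^k within a 32-bit range.
    assert( k < 16 );
    std::size_t total = 1;
    for( std::size_t i = 0; i < k; ++i ) {
        total *= 4;
    }

    std::vector<std::size_t> result( total );
    for( std::size_t index = 0; index < total; ++index ) {
        std::size_t rest = index;
        std::size_t rc = 0;
        // Read digits from the last character to the first. Each complemented
        // digit (3 - d swaps A<->T and C<->G) is appended to rc, which
        // reverses the order.
        for( std::size_t pos = 0; pos < k; ++pos ) {
            rc = rc * 4 + ( 3 - rest % 4 );
            rest /= 4;
        }
        result[index] = rc;
    }
    return result;
}
```

Under this indexing and k = 2, "AC" (index 1) maps to "GT" (index 11), and the palindrome "AT" (index 3) maps to itself. Applying the mapping twice returns the original index.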

◆ kmer_reverse_complement_list()

std::vector< std::string > const & kmer_reverse_complement_list ( ) const

Definition at line 226 of file signature_specifications.cpp.

◆ kmer_reverse_complement_list_size()

size_t kmer_reverse_complement_list_size ( bool  with_palindromes = true) const

Definition at line 263 of file signature_specifications.cpp.
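
For orientation, the number of distinct entries that remain when every k-mer over "ACGT" is merged with its reverse complement can be computed directly. The sketch below assumes that each reverse complement pair contributes one entry and each palindromic k-mer contributes one entry; canonical_kmer_count() is a hypothetical helper. How with_palindromes = false changes the result is not stated here and is not covered.

```cpp
#include <cassert>
#include <cstddef>

// Number of distinct entries when each k-mer over "ACGT" is merged with its
// reverse complement, counting each palindromic k-mer as one entry.
std::size_t canonical_kmer_count( std::size_t k )
{
    // Keeps 4^k within a 32-bit range.
    assert( k < 16 );

    std::size_t total = 1; // 4^k
    for( std::size_t i = 0; i < k; ++i ) {
        total *= 4;
    }

    // Reverse complement palindromes exist only for even k: the first half
    // of the k-mer determines the second, giving 4^(k/2) of them.
    std::size_t palindromes = 0;
    if( k % 2 == 0 ) {
        palindromes = 1;
        for( std::size_t i = 0; i < k / 2; ++i ) {
            palindromes *= 4;
        }
    }

    // Palindromes stand alone; all other k-mers pair up.
    return palindromes + ( total - palindromes ) / 2;
}
```

For k = 2 there are 16 k-mers, of which "AT", "TA", "CG" and "GC" are palindromic, so the count is 4 + 12 / 2 = 10.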

◆ operator=() [1/2]

SignatureSpecifications& operator= ( SignatureSpecifications &&  )
default

Default move assignment.

◆ operator=() [2/2]

SignatureSpecifications& operator= ( SignatureSpecifications const &  )
default

Default copy assignment.

◆ unknown_char_behavior() [1/2]

UnknownCharBehavior unknown_char_behavior ( ) const
inline

Definition at line 140 of file signature_specifications.hpp.

◆ unknown_char_behavior() [2/2]

SignatureSpecifications& unknown_char_behavior ( UnknownCharBehavior  value)
inline

Definition at line 194 of file signature_specifications.hpp.

Member Enumeration Documentation

◆ UnknownCharBehavior

enum UnknownCharBehavior
strong

List of policies to decide what to do when a char that is not part of the alphabet occurs while counting kmers.

Enumerator
kSkip 

Simply ignore the char by skipping it.

kThrow 

Throw an exception.

Definition at line 76 of file signature_specifications.hpp.
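
The effect of the two policies can be sketched with a small stand-alone counter. UnknownPolicy and count_kmers() are hypothetical names, not the genesis API. The sketch takes one possible reading of kSkip, in which every k-mer window that would contain the unknown char is left out; the library may treat skipped chars differently.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical policy enum mirroring the two documented choices.
enum class UnknownPolicy { kSkip, kThrow };

// Count k-mers of `seq` over `alphabet`. Windows containing a char outside
// the alphabet are either left out (kSkip) or cause an exception (kThrow).
std::vector<std::size_t> count_kmers(
    std::string const& seq, std::string const& alphabet,
    std::size_t k, UnknownPolicy policy
) {
    assert( k > 0 && !alphabet.empty() );
    std::size_t const base = alphabet.size();
    std::size_t total = 1;
    for( std::size_t i = 0; i < k; ++i ) {
        total *= base;
    }
    std::vector<std::size_t> counts( total, 0 );

    std::size_t index = 0; // base-|alphabet| value of the current window
    std::size_t valid = 0; // number of consecutive valid chars seen so far
    for( char c : seq ) {
        auto const pos = alphabet.find( c );
        if( pos == std::string::npos ) {
            if( policy == UnknownPolicy::kThrow ) {
                throw std::invalid_argument( "char not in alphabet" );
            }
            // Skip: restart the window after the unknown char.
            index = 0;
            valid = 0;
            continue;
        }
        // Shift in the new digit and drop the oldest one.
        index = ( index * base + pos ) % total;
        if( ++valid >= k ) {
            ++counts[index];
        }
    }
    return counts;
}
```

With the sequence "ACNAC", the alphabet "ACGT" and k = 2, the skip policy counts "AC" twice and nothing else, while the throw policy raises std::invalid_argument at the "N".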

Member Data Documentation

◆ InvalidCharIndex

const size_t InvalidCharIndex = std::numeric_limits<size_t>::max()
static

Value that is used to indicate an invalid (non-alphabet) char when using char_index().

Definition at line 92 of file signature_specifications.hpp.


The documentation for this class was generated from the following files: