Lookup of Sequences of a reference genome.
The class stores Sequences in the order in which they are added, and also maintains a hash map for quickly finding a Sequence by its name/label, as well as a lookup of bases at given positions in the genome.
See also: SequenceDict
Definition at line 65 of file reference_genome.hpp.
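The storage scheme described above (insertion order plus a hash map keyed by label) can be illustrated with a minimal sketch. The names SimpleSequence and SimpleGenome are hypothetical stand-ins and not part of the library; rejecting duplicate labels and the exception types used are assumptions of this sketch, not documented behaviour of ReferenceGenome.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for a labelled sequence; not the library's Sequence class.
struct SimpleSequence {
    std::string label;
    std::string sites;
};

// Hypothetical sketch: sequences kept in insertion order, plus a label -> index map.
class SimpleGenome {
public:
    // Append a sequence and register its label.
    // Rejecting duplicate labels is an assumption of this sketch.
    void add(SimpleSequence seq) {
        if (lookup_.count(seq.label) > 0) {
            throw std::runtime_error("Duplicate label: " + seq.label);
        }
        lookup_[seq.label] = sequences_.size();
        sequences_.push_back(std::move(seq));
    }

    // Check whether a label is known, using the hash map.
    bool contains(std::string const& label) const {
        return lookup_.count(label) > 0;
    }

    // Look up a sequence by label; throws if the label is unknown.
    SimpleSequence const& get(std::string const& label) const {
        auto const it = lookup_.find(label);
        if (it == lookup_.end()) {
            throw std::runtime_error("Unknown label: " + label);
        }
        return sequences_[it->second];
    }

    // Access a sequence by its position in insertion order.
    SimpleSequence const& at(std::size_t index) const {
        return sequences_.at(index);
    }

    std::size_t size() const { return sequences_.size(); }

private:
    std::vector<SimpleSequence> sequences_;
    std::unordered_map<std::string, std::size_t> lookup_;
};
```

Storing indices rather than pointers in the map keeps the lookup valid when the vector reallocates on growth, which is one plausible reason to pair the two containers in this way.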
|
| ReferenceGenome () |
|
| ReferenceGenome (ReferenceGenome &&)=default |
|
| ReferenceGenome (ReferenceGenome const &)=delete |
|
| ~ReferenceGenome ()=default |
|
const_reference | add (Sequence &&seq, bool also_look_up_first_word=true) |
| Add a Sequence to the ReferenceGenome by moving it, and return a const_reference to it.
|
|
const_reference | add (Sequence const &seq, bool also_look_up_first_word=true) |
| Add a Sequence to the ReferenceGenome by copying it, and return a const_reference to it.
|
|
const_iterator | begin () const |
|
const_iterator | cbegin () const |
|
const_iterator | cend () const |
|
void | clear () |
| Remove all Sequences from the ReferenceGenome, leaving it with a size() of 0.
|
|
bool | contains (std::string const &label) const |
|
bool | empty () const |
|
const_iterator | end () const |
|
const_iterator | find (std::string const &label) const |
| Return an iterator to the Sequence with the given label, or an iterator to end() if no Sequence with that label is present.
|
|
const_reference | get (std::string const &label) const |
| Same as find(), but returns the sequence directly, or throws if not present.
|
|
char | get_base (const_iterator it, size_t position, bool to_upper=true) const |
| Get a particular base at the given sequence iterator and position.
|
|
char | get_base (std::string const &chromosome, size_t position, bool to_upper=true) const |
| Get a particular base at a given chromosome and position.
|
|
ReferenceGenome & | operator= (ReferenceGenome &&)=default |
|
ReferenceGenome & | operator= (ReferenceGenome const &)=delete |
|
size_t | size () const |
|
const_reference add (Sequence &&seq, bool also_look_up_first_word=true)
Add a Sequence to the ReferenceGenome by moving it, and return a const_reference to it.
If also_look_up_first_word is set (true by default), an additional lookup name is registered for the added sequence: besides its full name, the sequence can also be found by just the first word of its name, that is, everything up to the first tab or space character, if there is one. This is what typical fasta indexing tools also seem to do. The sequence itself is still stored with its original name; only the additional lookup is added for use with find() or get().
Definition at line 232 of file reference_genome.hpp.
const_reference add (Sequence const &seq, bool also_look_up_first_word=true)
Add a Sequence to the ReferenceGenome by copying it, and return a const_reference to it.
If also_look_up_first_word is set (true by default), an additional lookup name is registered for the added sequence: besides its full name, the sequence can also be found by just the first word of its name, that is, everything up to the first tab or space character, if there is one. This is what typical fasta indexing tools also seem to do. The sequence itself is still stored with its original name; only the additional lookup is added for use with find() or get().
Definition at line 222 of file reference_genome.hpp.
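The first-word rule used by both add() overloads can be sketched as a small helper. The function name first_word is hypothetical and not part of the library; it only illustrates cutting a label at the first space or tab, under the assumption that no other whitespace characters are considered.

```cpp
#include <string>

// Hypothetical helper, not part of the library:
// return the label up to (not including) the first space or tab.
inline std::string first_word(std::string const& label) {
    auto const pos = label.find_first_of(" \t");
    // If no space or tab is found, pos is npos and substr returns the whole label.
    return label.substr(0, pos);
}
```

With such a rule, a fasta header such as "chr1 Homo sapiens chromosome 1" would be reachable both under its full name and under "chr1".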
char get_base (std::string const &chromosome, size_t position, bool to_upper=true) const
inline
Get a particular base at a given chromosome and position.
Reference genomes are often used to look up a particular base, so this functionality is offered here directly. The function throws if the chromosome is not part of the genome, or if the position lies outside of the chromosome.
Important: The position uses 1-based indexing. This differs from a direct lookup in the sites of the sequence, which is 0-based, but is more in line with the usage in our population functions.
Definition at line 174 of file reference_genome.hpp.
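The 1-based convention and the bounds check can be illustrated with a small sketch that operates on a plain string of sites. The function name base_at is hypothetical, and the use of std::out_of_range is an assumption of this sketch; the text above only states that the function throws, not which exception type is used.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of the library:
// 1-based base lookup in a string of sites, with optional upper-casing.
inline char base_at(std::string const& sites, std::size_t position, bool to_upper = true) {
    // Valid positions are 1..sites.size(); 0 is rejected as well.
    if (position == 0 || position > sites.size()) {
        throw std::out_of_range(
            "Position " + std::to_string(position) + " is outside of the sequence"
        );
    }
    // Convert the 1-based position to the 0-based index of the string.
    char const base = sites[position - 1];
    return to_upper
        ? static_cast<char>(std::toupper(static_cast<unsigned char>(base)))
        : base;
}
```

For a sequence with sites "acgt", position 1 refers to the first base and position 4 to the last, whereas indexing the sites string directly would use 0 and 3.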