A library for working with phylogenetic data.
v0.25.0
SimplePileupReader Class Reference

#include <genesis/population/formats/simple_pileup_reader.hpp>

Detailed Description

Reader for line-by-line assessment of (m)pileup files.

This simple reader processes (m)pileup files line by line. That is, it does not take into consideration which read starts at which position, but instead gives a quick and simple tally of the bases of all reads that cover a given position. This makes it fast in cases where only per-position, but no per-read information is needed.
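To illustrate what such a per-position tally means, here is a minimal standalone sketch. It is not the genesis implementation; it assumes the usual samtools conventions for the read bases column ('.' and ',' for a reference match, '^' followed by one mapping quality character, '$' at a read end, '+n' and '-n' announcing an indel of n bases, '*' for a deleted base). The names BaseTally and tally_bases are hypothetical and used for illustration only.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical tally of one pileup position: counts per base, plus deletions.
struct BaseTally {
    std::size_t a = 0, c = 0, g = 0, t = 0, n = 0, del = 0;
};

// Count the bases of all reads covering a position, ignoring per-read information.
BaseTally tally_bases(std::string const& bases, char ref_base)
{
    BaseTally tally;
    for (std::size_t i = 0; i < bases.size(); ++i) {
        char ch = bases[i];
        if (ch == '^') {
            // Read start marker: the next character is a mapping quality, skip it.
            ++i;
            continue;
        }
        if (ch == '$') {
            // Read end marker carries no base.
            continue;
        }
        if (ch == '+' || ch == '-') {
            // Indel announcement: parse its length, then skip the indel sequence.
            std::size_t len = 0;
            std::size_t j = i + 1;
            while (j < bases.size() && std::isdigit(static_cast<unsigned char>(bases[j]))) {
                len = len * 10 + static_cast<std::size_t>(bases[j] - '0');
                ++j;
            }
            i = j + len - 1; // the loop increment then moves past the indel
            continue;
        }
        if (ch == '.' || ch == ',') {
            // Match to the reference on the forward or reverse strand.
            ch = ref_base;
        }
        switch (std::toupper(static_cast<unsigned char>(ch))) {
            case 'A': ++tally.a; break;
            case 'C': ++tally.c; break;
            case 'G': ++tally.g; break;
            case 'T': ++tally.t; break;
            case 'N': ++tally.n; break;
            case '*': ++tally.del; break;
            default: break; // other symbols are ignored in this sketch
        }
    }
    return tally;
}
```

Because strand and read boundaries are discarded, a single pass over the column is enough, which is the trade-off described above.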

For each processed line, a SimplePileupReader::Record is produced, which captures the basic information of the line, as well as a tally for each sample in the line, collected in SimplePileupReader::Sample. One such sample consists of two or more columns in the file. The number of columns per sample depends on the additional information contained in the file. As we have no way of deciding this automatically, these columns have to be activated beforehand, see with_quality_string() and with_ancestral_base().

More columns might be needed in the future, and potentially their ordering might need to be adapted. But for now, we only have these use cases.

Definition at line 68 of file simple_pileup_reader.hpp.

Public Member Functions

 SimplePileupReader ()=default
 
 SimplePileupReader (self_type &&)=default
 
 SimplePileupReader (self_type const &)=default
 
 ~SimplePileupReader ()=default
 
self_type & operator= (self_type &&)=default
 
self_type & operator= (self_type const &)=default
 
bool parse_line (utils::InputStream &input_stream, Record &record) const
 Read an (m)pileup line.
 
bool parse_line (utils::InputStream &input_stream, Record &record, std::vector< bool > const &sample_filter) const
 Read an (m)pileup line, but only the samples at which the sample_filter is true.
 
sequence::QualityEncoding quality_encoding () const
 
self_type & quality_encoding (sequence::QualityEncoding value)
 Set the type of encoding for the quality code string.
 
std::vector< Record > read (std::shared_ptr< utils::BaseInputSource > source) const
 Read an (m)pileup file line by line.
 
std::vector< Record > read (std::shared_ptr< utils::BaseInputSource > source, std::vector< bool > const &sample_filter) const
 Read an (m)pileup file line by line, but only the samples at which the sample_filter is true.
 
std::vector< Record > read (std::shared_ptr< utils::BaseInputSource > source, std::vector< size_t > const &sample_indices) const
 Read an (m)pileup file line by line, but only the samples at the given indices.
 
bool with_ancestral_base () const
 
self_type & with_ancestral_base (bool value)
 Set whether to expect the base of the ancestral allele as the last part of each sample in a record line.
 
bool with_quality_string () const
 
self_type & with_quality_string (bool value)
 Set whether to expect a phred-scaled, ASCII-encoded quality code string per sample.
 

Static Public Member Functions

static std::vector< bool > make_sample_filter (std::vector< size_t > const &indices)
 Helper function to create a sample filter from a list of sample indices.
 

Public Types

using self_type = SimplePileupReader
 

Classes

struct  Record
 Single line/record from a pileup file.
 
struct  Sample
 One sample in a pileup line/record.
 

Constructor & Destructor Documentation

◆ SimplePileupReader() [1/3]

SimplePileupReader ( )
default

◆ ~SimplePileupReader()

~SimplePileupReader ( )
default

◆ SimplePileupReader() [2/3]

SimplePileupReader ( self_type const &  )
default

◆ SimplePileupReader() [3/3]

SimplePileupReader ( self_type &&  )
default

Member Function Documentation

◆ make_sample_filter()

std::vector< bool > make_sample_filter ( std::vector< size_t > const &  indices)
static

Helper function to create a sample filter from a list of sample indices.

Definition at line 114 of file simple_pileup_reader.cpp.
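As an illustration of the idea, a filter of this kind can be built by sizing a boolean vector to the largest index and setting the listed positions. This is a standalone sketch under that assumption, not the library's code; the name make_sample_filter_sketch is hypothetical.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Turn a list of sample indices into a boolean filter that is true
// exactly at those indices and false everywhere else.
std::vector<bool> make_sample_filter_sketch(std::vector<std::size_t> const& indices)
{
    if (indices.empty()) {
        return {};
    }
    // The filter only needs to reach up to the largest requested index.
    std::size_t const max_index = *std::max_element(indices.begin(), indices.end());
    std::vector<bool> filter(max_index + 1, false);
    for (std::size_t const idx : indices) {
        filter[idx] = true;
    }
    return filter;
}
```

For the indices 0, 2 and 5, this yields a filter of length 6 with entries 0, 2 and 5 set to true.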

◆ operator=() [1/2]

self_type& operator= ( self_type &&  )
default

◆ operator=() [2/2]

self_type& operator= ( self_type const &  )
default

◆ parse_line() [1/2]

bool parse_line ( utils::InputStream &  input_stream,
SimplePileupReader::Record &  record 
) const

Read an (m)pileup line.

Definition at line 95 of file simple_pileup_reader.cpp.

◆ parse_line() [2/2]

bool parse_line ( utils::InputStream &  input_stream,
SimplePileupReader::Record &  record,
std::vector< bool > const &  sample_filter 
) const

Read an (m)pileup line, but only the samples at which the sample_filter is true.

This filter does not need to contain the same number of values as the record has samples. If it is shorter, all samples after its last index will be ignored. If it is longer, the remaining entries are not used as a filter.

Definition at line 102 of file simple_pileup_reader.cpp.
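The length rule described above can be sketched as follows. This is a standalone illustration of the stated semantics, assuming samples are simply represented by their names; select_samples is a hypothetical helper, not part of the reader.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Keep only the samples at which the filter is true.
// Only the overlap of both vectors is inspected: samples beyond the end of a
// shorter filter are dropped, and surplus entries of a longer filter are unused.
std::vector<std::string> select_samples(
    std::vector<std::string> const& samples,
    std::vector<bool> const& filter
) {
    std::vector<std::string> kept;
    std::size_t const overlap = std::min(samples.size(), filter.size());
    for (std::size_t i = 0; i < overlap; ++i) {
        if (filter[i]) {
            kept.push_back(samples[i]);
        }
    }
    return kept;
}
```

With four samples and the filter {true, false, true}, the fourth sample is ignored because the filter ends before it.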

◆ quality_encoding() [1/2]

sequence::QualityEncoding quality_encoding ( ) const
inline

Definition at line 249 of file simple_pileup_reader.hpp.

◆ quality_encoding() [2/2]

self_type& quality_encoding ( sequence::QualityEncoding  value)
inline

Set the type of encoding for the quality code string.

If with_quality_string() is set to true (default), this encoding is used to transform the ASCII-encoded string into actual phred-scaled scores. See sequence::quality_decode_to_phred_score() for details.

Definition at line 261 of file simple_pileup_reader.hpp.
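For the Sanger encoding, the transformation amounts to subtracting the ASCII offset 33 (the character '!') from each quality character. The following standalone sketch shows that one case only; it is not sequence::quality_decode_to_phred_score(), and decode_sanger_phred is a hypothetical name.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Decode a Sanger-encoded (offset 33) quality string into phred scores.
std::vector<unsigned char> decode_sanger_phred(std::string const& quality_codes)
{
    std::vector<unsigned char> scores;
    scores.reserve(quality_codes.size());
    for (char const ch : quality_codes) {
        // Valid codes are the printable ASCII characters '!' (33) to '~' (126).
        if (ch < '!' || ch > '~') {
            throw std::invalid_argument("Invalid quality code character");
        }
        scores.push_back(static_cast<unsigned char>(ch - '!'));
    }
    return scores;
}
```

Under this encoding, '!' decodes to a score of 0 and 'I' to a score of 40. Other encodings use a different offset, which is why the encoding has to be set to match the input file.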

◆ read() [1/3]

std::vector< SimplePileupReader::Record > read ( std::shared_ptr< utils::BaseInputSource >  source) const

Read an (m)pileup file line by line.

Definition at line 51 of file simple_pileup_reader.cpp.

◆ read() [2/3]

std::vector< SimplePileupReader::Record > read ( std::shared_ptr< utils::BaseInputSource >  source,
std::vector< bool > const &  sample_filter 
) const

Read an (m)pileup file line by line, but only the samples at which the sample_filter is true.

This filter does not need to contain the same number of values as the record has samples. If it is shorter, all samples after its last index will be ignored. If it is longer, the remaining entries are not used as a filter.

Definition at line 81 of file simple_pileup_reader.cpp.

◆ read() [3/3]

std::vector< SimplePileupReader::Record > read ( std::shared_ptr< utils::BaseInputSource >  source,
std::vector< size_t > const &  sample_indices 
) const

Read an (m)pileup file line by line, but only the samples at the given indices.

Definition at line 64 of file simple_pileup_reader.cpp.

◆ with_ancestral_base() [1/2]

bool with_ancestral_base ( ) const
inline

Definition at line 267 of file simple_pileup_reader.hpp.

◆ with_ancestral_base() [2/2]

self_type& with_ancestral_base ( bool  value)
inline

Set whether to expect the base of the ancestral allele as the last part of each sample in a record line.

This is a pileup extension used by Pool-HMM (Boitard et al. 2013) to denote the ancestral allele of each position directly within the pileup file. Set to true when this is present in the input.

A typical line from a pileup file looks like

2L  30  A   15  aaaAaaaAaAAaaAa PY\aVO^`ZaaV[_S A

which contains the three fixed columns, and then four columns for the sample, with the last one A being the ancestral allele for that sample.

Definition at line 287 of file simple_pileup_reader.hpp.
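The column layout of such a line can be sketched as below for the single-sample case. This is a standalone illustration that splits on whitespace; it is not how the reader parses its input, and the struct PileupLineSketch, the function split_single_sample_line, and its two flags are hypothetical stand-ins for the with_quality_string() and with_ancestral_base() settings.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical container for the columns of a single-sample pileup line.
struct PileupLineSketch {
    std::string chromosome;
    std::size_t position = 0;
    char        reference_base = 'N';
    std::size_t read_count = 0;
    std::string read_bases;
    std::string quality_codes;  // only filled if with_quality is true
    char        ancestral_base = 'N'; // only filled if with_ancestral is true
};

// Split one line into the three fixed columns and the per-sample columns.
// The caller has to state which optional columns are present.
PileupLineSketch split_single_sample_line(
    std::string const& line, bool with_quality, bool with_ancestral
) {
    std::istringstream in(line);
    PileupLineSketch rec;
    in >> rec.chromosome >> rec.position >> rec.reference_base;
    in >> rec.read_count >> rec.read_bases;
    if (with_quality) {
        in >> rec.quality_codes;
    }
    if (with_ancestral) {
        in >> rec.ancestral_base;
    }
    if (!in) {
        throw std::runtime_error("Malformed pileup line");
    }
    return rec;
}
```

Applied to the example line above with both flags set to true, the sketch yields chromosome 2L, position 30, reference base A, a read count of 15, and the ancestral base A. With the wrong flags, the columns would be misassigned, which is why the setting has to match the input.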

◆ with_quality_string() [1/2]

bool with_quality_string ( ) const
inline

Definition at line 223 of file simple_pileup_reader.hpp.

◆ with_quality_string() [2/2]

self_type& with_quality_string ( bool  value)
inline

Set whether to expect a phred-scaled, ASCII-encoded quality code string per sample.

A typical line from a pileup file looks like

seq1 272 T 24  ,.$.....,,.,.,...,,,.,..^+. <<<+;<<<<<<<<<<<=<;<;7<&

with the last field being the quality codes. This field is optional in the format, hence this setting. If true (default), the field is expected to be present; if false, it is expected to be absent. At the moment, there is no automatic detection of this.

See quality_encoding() for changing the encoding that is used in this column. Default is Sanger encoding. See genesis::sequence::QualityEncoding for details.

Definition at line 243 of file simple_pileup_reader.hpp.

Member Typedef Documentation

◆ self_type

using self_type = SimplePileupReader

Definition at line 149 of file simple_pileup_reader.hpp.


The documentation for this class was generated from the following files: