A library for working with phylogenetic and population genetic data.
v0.32.0
GenomeRegionReader Class Reference

#include <genesis/population/format/genome_region_reader.hpp>

Detailed Description

Generic reader for inputs that contain a genomic region or locus per line, in different formats.

The reader expects an input source, and tries to interpret each line as a position or region in a chromosome, offering a variety of formats:

  • "chr" for whole chromosomes,
  • "chr:position", "chr:start-end", "chr:start..end" for positions and regions,
  • tab- or space-delimited "chr position" or "chr start end", also for positions and regions.

This allows for maximum flexibility when reading such inputs. Note that this is more flexible than parse_genome_region(), which does not support the tab- and space-delimited forms. When parsing an individual string, such as one coming from a command line argument, we do not want to allow tabs or spaces, as this can get messy. Hence, this delimitation is currently only offered for files.
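The accepted line formats can be illustrated with a small standalone sketch. This is not the library's parser: RegionSketch, to_position(), parse_interval(), and parse_region_line() are hypothetical names introduced here, and the convention that start == end == 0 denotes a whole chromosome is an assumption of the sketch. It also ignores the zero_based() and end_exclusive() settings.

```cpp
#include <cstddef>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed region (not a genesis type).
// Assumption of this sketch: start == end == 0 means "whole chromosome".
struct RegionSketch {
    std::string chromosome;
    std::size_t start = 0;
    std::size_t end   = 0;
};

// Convert a string of digits to a number, rejecting anything else.
inline std::size_t to_position( std::string const& text )
{
    if( text.empty() || text.find_first_not_of( "0123456789" ) != std::string::npos ) {
        throw std::invalid_argument( "Invalid position: '" + text + "'" );
    }
    return std::stoull( text );
}

// Parse "start-end", "start..end", or a single "position" into the region.
inline void parse_interval( std::string const& text, RegionSketch& region )
{
    auto sep = text.find( ".." );
    std::size_t sep_len = 2;
    if( sep == std::string::npos ) {
        sep = text.find( '-' );
        sep_len = 1;
    }
    if( sep == std::string::npos ) {
        region.start = region.end = to_position( text );
    } else {
        region.start = to_position( text.substr( 0, sep ));
        region.end   = to_position( text.substr( sep + sep_len ));
    }
}

// Interpret one input line in any of the formats listed above.
inline RegionSketch parse_region_line( std::string const& line )
{
    // Split the line on tabs and spaces.
    std::istringstream stream( line );
    std::vector<std::string> fields;
    for( std::string field; stream >> field; ) {
        fields.push_back( field );
    }

    RegionSketch region;
    if( fields.size() == 1 ) {
        // "chr", "chr:position", "chr:start-end", or "chr:start..end"
        auto const colon = fields[0].find( ':' );
        region.chromosome = fields[0].substr( 0, colon );
        if( colon != std::string::npos ) {
            parse_interval( fields[0].substr( colon + 1 ), region );
        }
    } else if( fields.size() == 2 ) {
        // "chr position"
        region.chromosome = fields[0];
        region.start = region.end = to_position( fields[1] );
    } else if( fields.size() == 3 ) {
        // "chr start end"
        region.chromosome = fields[0];
        region.start = to_position( fields[1] );
        region.end   = to_position( fields[2] );
    } else {
        throw std::invalid_argument( "Invalid region line: '" + line + "'" );
    }
    if( region.chromosome.empty() || region.start > region.end ) {
        throw std::invalid_argument( "Invalid region line: '" + line + "'" );
    }
    return region;
}
```

In this sketch, the colon-separated and the whitespace-delimited forms end up in the same structure, which is the flexibility the description refers to. A single-string parser in the spirit of parse_genome_region() would correspond to only the one-field branch.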

Definition at line 69 of file genome_region_reader.hpp.

Public Member Functions

 GenomeRegionReader ()=default
 
 GenomeRegionReader (GenomeRegionReader &&)=default
 
 GenomeRegionReader (GenomeRegionReader const &)=default
 
 ~GenomeRegionReader ()=default
 
bool end_exclusive () const
 
GenomeRegionReader & end_exclusive (bool value)
 
GenomeRegionReader & operator= (GenomeRegionReader &&)=default
 
GenomeRegionReader & operator= (GenomeRegionReader const &)=default
 
GenomeLocusSet read_as_genome_locus_set (std::shared_ptr< utils::BaseInputSource > source) const
 Read an input source, and return its content as a GenomeLocusSet. More...
 
GenomeRegionList read_as_genome_region_list (std::shared_ptr< utils::BaseInputSource > source, bool merge=false) const
 Read an input source, and return its content as a GenomeRegionList. More...
 
void read_as_genome_region_list (std::shared_ptr< utils::BaseInputSource > source, GenomeRegionList &target, bool merge=false) const
 Read an input source, and add its content to an existing GenomeRegionList. More...
 
bool zero_based () const
 
GenomeRegionReader & zero_based (bool value)
 

Constructor & Destructor Documentation

◆ GenomeRegionReader() [1/3]

GenomeRegionReader ( )
default

◆ ~GenomeRegionReader()

~GenomeRegionReader ( )
default

◆ GenomeRegionReader() [2/3]

GenomeRegionReader ( GenomeRegionReader const &  )
default

◆ GenomeRegionReader() [3/3]

GenomeRegionReader ( GenomeRegionReader &&  )
default

Member Function Documentation

◆ end_exclusive() [1/2]

bool end_exclusive ( ) const
inline

Definition at line 141 of file genome_region_reader.hpp.

◆ end_exclusive() [2/2]

GenomeRegionReader& end_exclusive ( bool  value)
inline

Definition at line 146 of file genome_region_reader.hpp.

◆ operator=() [1/2]

GenomeRegionReader& operator= ( GenomeRegionReader &&  )
default

◆ operator=() [2/2]

GenomeRegionReader& operator= ( GenomeRegionReader const &  )
default

◆ read_as_genome_locus_set()

GenomeLocusSet read_as_genome_locus_set ( std::shared_ptr< utils::BaseInputSource > source) const

Read an input source, and return its content as a GenomeLocusSet.

This is the recommended way to read an input for testing whether genome coordinates are covered (filtered / to be considered) for downstream analyses.
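To illustrate why a set-like structure suits coverage queries, here is a minimal standalone sketch. It is not the implementation of GenomeLocusSet: LocusSetSketch, add(), and is_covered() are hypothetical names, and the sketch assumes closed intervals with one boolean flag per position.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical coverage set: one flag per position, per chromosome.
class LocusSetSketch {
public:
    // Mark the closed interval [start, end] on the chromosome as covered.
    void add( std::string const& chromosome, std::size_t start, std::size_t end )
    {
        auto& bits = covered_[chromosome];
        if( bits.size() <= end ) {
            bits.resize( end + 1, false );
        }
        for( std::size_t pos = start; pos <= end; ++pos ) {
            bits[pos] = true;
        }
    }

    // Test a single position: one hash lookup plus one vector access.
    bool is_covered( std::string const& chromosome, std::size_t position ) const
    {
        auto const it = covered_.find( chromosome );
        return it != covered_.end()
            && position < it->second.size()
            && it->second[position];
    }

private:
    std::unordered_map<std::string, std::vector<bool>> covered_;
};
```

Overlapping input regions need no special handling in such a structure, since setting a flag twice has no further effect. The price is memory proportional to the largest covered position on each chromosome.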

Definition at line 50 of file genome_region_reader.cpp.

◆ read_as_genome_region_list() [1/2]

GenomeRegionList read_as_genome_region_list ( std::shared_ptr< utils::BaseInputSource > source,
bool  merge = false 
) const

Read an input source, and return its content as a GenomeRegionList.

If merge is set, adjacent coordinates or overlapping regions of the input are merged into contiguous intervals. This is useful if the region list is used to determine coverage. For that use case, however, it is recommended to use read_as_genome_locus_set() instead, as it is faster for testing coverage of genomic coordinates. See also the overlap flag of GenomeRegionList::add( GenomeLocus const&, bool ) for details.
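The effect of merging can be sketched for the intervals of a single chromosome as follows. This is a standalone illustration, not the library's code: Interval and merge_intervals() are hypothetical names, intervals are assumed to be closed, and treating [1, 5] and [6, 9] as adjacent is an assumption about what "adjacent coordinates" means.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical closed interval [first, second] on one chromosome.
using Interval = std::pair<std::size_t, std::size_t>;

// Merge overlapping or directly adjacent intervals into contiguous ones.
inline std::vector<Interval> merge_intervals( std::vector<Interval> intervals )
{
    // Sort by start (then end), so that candidates for merging are neighbors.
    std::sort( intervals.begin(), intervals.end() );

    std::vector<Interval> merged;
    for( auto const& cur : intervals ) {
        // Join if the current interval starts inside or right after the last one.
        if( !merged.empty() && cur.first <= merged.back().second + 1 ) {
            merged.back().second = std::max( merged.back().second, cur.second );
        } else {
            merged.push_back( cur );
        }
    }
    return merged;
}
```

For example, the intervals [10, 20], [1, 5], [6, 9], [15, 30], and [40, 50] would collapse to [1, 30] and [40, 50] under these assumptions.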

Definition at line 60 of file genome_region_reader.cpp.

◆ read_as_genome_region_list() [2/2]

void read_as_genome_region_list ( std::shared_ptr< utils::BaseInputSource > source,
GenomeRegionList & target,
bool  merge = false 
) const

Read an input source, and add its content to an existing GenomeRegionList.

If merge is set, adjacent coordinates or overlapping regions of the input are merged into contiguous intervals. This is useful if the region list is used to determine coverage. For that use case, however, it is recommended to use read_as_genome_locus_set() instead, as it is faster for testing coverage of genomic coordinates. See also the overlap flag of GenomeRegionList::add( GenomeLocus const&, bool ) for details.

Definition at line 69 of file genome_region_reader.cpp.

◆ zero_based() [1/2]

bool zero_based ( ) const
inline

Definition at line 130 of file genome_region_reader.hpp.

◆ zero_based() [2/2]

GenomeRegionReader& zero_based ( bool  value)
inline

Definition at line 135 of file genome_region_reader.hpp.


The documentation for this class was generated from the following files: