noparama  v0.0.1
Nonparametric Bayesian models
membertrix Class Reference

#include <membertrix.h>

Public Member Functions

 membertrix ()
 
 membertrix (const membertrix &other)
 
membertrix * clone ()
 
 ~membertrix ()
 
cluster_id_t addCluster (cluster_t *cluster)
 
cluster_t * getCluster (cluster_id_t cluster_id)
 
data_id_t addData (data_t &data)
 
data_t * getDatum (data_id_t data_id)
 
np_error_t assign (cluster_id_t cluster_id, data_id_t data_id)
 
np_error_t retract (cluster_id_t cluster_id, data_id_t data_id, bool auto_remove=true)
 
np_error_t retract (data_id_t data_id, bool auto_remove=true)
 
np_error_t remove (cluster_id_t cluster_id)
 
bool assigned (data_id_t data_id) const
 
void getAssignments (cluster_id_t cluster_id, data_ids_t &data_ids) const
 
cluster_id_t getClusterId (data_id_t data_id) const
 
const clusters_t & getClusters () const
 
size_t getClusterCount () const
 
void relabel ()
 
void print (cluster_id_t cluster_id, std::ostream &os) const
 
void print (std::ostream &os) const
 
dataset_t * getData ()
 
dataset_t * getData (const cluster_id_t cluster_id) const
 
void getData (const cluster_id_t cluster_id, dataset_t &dataset) const
 
void getData (const data_ids_t data_ids, dataset_t &dataset) const
 
size_t count (cluster_id_t cluster_id) const
 
size_t count () const
 
bool empty (cluster_id_t cluster_id)
 
int cleanup ()
 
membertrix & operator= (membertrix other)
 

Protected Member Functions

bool exists (cluster_id_t cluster_id)
 

Friends

std::ostream & operator<< (std::ostream &os, const membertrix &m)
 
void swap (membertrix &first, membertrix &second)
 

Detailed Description

The membertrix data structure is a binary matrix optimized for storing membership information. The membership is asymmetric: a data item can be assigned to only one cluster, whereas a cluster can have multiple data points as members.

The data points and clusters are stored in separate vectors.

The structure is stored with clusters as columns, and data items as rows. Reasons:

Usage:
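
A minimal sketch of the add, assign, and retract workflow described in this reference. Because membertrix.h is not reproduced here, the block uses a small stand-in class, toy_membertrix, which is hypothetical and only mimics the documented call shapes (addCluster, addData, assign, retract, assigned, count). The stand-in types cluster_t and data_t and the bool return values are assumptions; the real library returns np_error_t and defines its own types.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for the library's types (assumptions, not the real definitions).
using cluster_id_t = std::size_t;
using data_id_t = std::size_t;
struct cluster_t { double mean; };
using data_t = std::vector<double>;

// Toy membership matrix: rows are data items, columns are clusters.
class toy_membertrix {
public:
    // Only the pointer is stored; the caller keeps ownership of the cluster.
    cluster_id_t addCluster(cluster_t *cluster) {
        clusters_.push_back(cluster);
        for (auto &row : member_) row.push_back(false);
        return clusters_.size() - 1;
    }
    // Only the address is stored; the caller must keep the data alive.
    data_id_t addData(data_t &data) {
        data_.push_back(&data);
        member_.emplace_back(clusters_.size(), false);
        return data_.size() - 1;
    }
    // Returns false on an unknown id or if the data item is already assigned.
    bool assign(cluster_id_t c, data_id_t d) {
        if (c >= clusters_.size() || d >= data_.size() || assigned(d)) return false;
        member_[d][c] = true;
        return true;
    }
    // Returns false on an unknown id or if the pair was never assigned.
    bool retract(cluster_id_t c, data_id_t d) {
        if (c >= clusters_.size() || d >= data_.size() || !member_[d][c]) return false;
        member_[d][c] = false;
        return true;
    }
    bool assigned(data_id_t d) const {
        for (bool bit : member_[d]) if (bit) return true;
        return false;
    }
    // Number of data items in cluster c (c must be a valid id).
    std::size_t count(cluster_id_t c) const {
        std::size_t n = 0;
        for (const auto &row : member_) if (row[c]) ++n;
        return n;
    }
private:
    std::vector<cluster_t *> clusters_;
    std::vector<data_t *> data_;
    std::vector<std::vector<bool>> member_;  // member_[data][cluster]
};

// Walk through the workflow and return the size of cluster a at the end.
std::size_t usage_example() {
    cluster_t ca{0.0}, cb{5.0};
    data_t x{0.1}, y{4.9};
    toy_membertrix m;
    cluster_id_t a = m.addCluster(&ca);
    cluster_id_t b = m.addCluster(&cb);
    data_id_t dx = m.addData(x);
    data_id_t dy = m.addData(y);
    m.assign(a, dx);
    m.assign(a, dy);
    m.retract(a, dy);  // move y out of cluster a ...
    m.assign(b, dy);   // ... and into cluster b
    return m.count(a);
}
```

The toy enforces the asymmetry from the description: assign() refuses a data item that already belongs to a cluster, so moving an item means retracting it first.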

References: [1] https://eigen.tuxfamily.org/dox/group__TopicStorageOrders.html

Constructor & Destructor Documentation

◆ membertrix() [1/2]

membertrix::membertrix ( )

The default constructor.

◆ membertrix() [2/2]

membertrix::membertrix ( const membertrix &  other)

The copy constructor. This is not a true copy constructor: copying a membership matrix is taken as a request to optimize the internal structures, which invalidates all cluster ids. If an exact clone is required you will need to add a clone() member function.

This constructor calls addData and addCluster to keep all internal data structures consistent and to reduce the matrix to its minimum size. The alternative would be all kinds of bookkeeping: swapping columns in the matrix, moving datasets from one cluster to the next, etc.
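
The effect of rebuilding through repeated add calls can be sketched with a toy model. The type toy_clusters and the function compact_copy below are hypothetical names used for illustration only; the sketch assumes a cluster id is simply a column position, which may not match the library's internals.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy model: each cluster is the list of data ids assigned to it,
// and a cluster id is its position in the outer vector.
using toy_clusters = std::vector<std::vector<std::size_t>>;

// Rebuild a compact copy by re-adding only the non-empty clusters, in order.
// Ids in the copy are consecutive, so they generally differ from the source ids.
toy_clusters compact_copy(const toy_clusters &source) {
    toy_clusters result;
    for (const auto &members : source) {
        if (!members.empty()) {
            result.push_back(members);  // new id = result.size() - 1
        }
    }
    return result;
}
```

In this toy, a source with clusters at ids 0 and 2 and an empty cluster at id 1 yields a copy in which the former cluster 2 has id 1, which is why ids held from before the copy cannot be reused on the copy.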

◆ ~membertrix()

membertrix::~membertrix ( )

The destructor.

Member Function Documentation

◆ addCluster()

cluster_id_t membertrix::addCluster ( cluster_t *  cluster)

Add a cluster to the membership matrix. The cluster is not physically stored, only a reference is kept. If the memory is deallocated, errors can be expected.

The returned index should be kept as a reference for use in the functions assign() and retract().

Parameters
[in] cluster: A cluster object
Returns
An index to the given cluster object

◆ addData()

data_id_t membertrix::addData ( data_t &  data)

Add a data point to the membership matrix. The data are not physically stored, only a reference is kept. If the memory is deallocated, errors can be expected.

The returned index should be kept as a reference for use in the functions assign() and retract().

Parameters
[in] data: A data object
Returns
An index to the given data object

◆ assign()

np_error_t membertrix::assign ( cluster_id_t  cluster_id,
data_id_t  data_id 
)

Assign a previously added data item (through addData) to a previously added cluster (through addCluster).

Parameters
[in] cluster_id: An index to a cluster object
[in] data_id: An index to a data point
Returns
An error code, for example if the data item does not exist

◆ assigned()

bool membertrix::assigned ( data_id_t  data_id) const

If the data item is assigned to any cluster this function will return true. In all other cases it returns false.

Parameters
[in] data_id: An index to a data point
Returns
Boolean representing any assignment

◆ cleanup()

int membertrix::cleanup ( )

Clean up internal data structures after data items have been assigned to clusters, removing all clusters that did not get any data assigned.

◆ clone()

membertrix * membertrix::clone ( )

◆ count() [1/2]

size_t membertrix::count ( cluster_id_t  cluster_id) const

Return count of data points within the given cluster.

Parameters
[in] cluster_id: An index to a particular cluster
Returns
Number of data points (should be the same as getData(cluster_id).size()).

◆ count() [2/2]

size_t membertrix::count ( ) const

Return total number of data points. This should be the same as summing count(cluster_id_t) over each cluster returned by getClusters().

Returns
The total number of data points

◆ empty()

bool membertrix::empty ( cluster_id_t  cluster_id)

Indicate whether a cluster is empty (no data items assigned to it) or non-empty (one or more data items assigned to it).

Returns
True if the cluster has zero data items, false otherwise

◆ exists()

bool membertrix::exists ( cluster_id_t  cluster_id)
protected

◆ getAssignments()

void membertrix::getAssignments ( cluster_id_t  cluster_id,
data_ids_t &  data_ids 
) const

Get all assignments to given cluster.

Parameters
[in] cluster_id: An index to a cluster object
[out] data_ids: Set of data ids

◆ getCluster()

cluster_t * membertrix::getCluster ( cluster_id_t  cluster_id)

Get cluster given cluster id.

Parameters
[in] cluster_id: An index to a cluster object
Returns
The cluster object itself

◆ getClusterCount()

size_t membertrix::getClusterCount ( ) const

Get number of clusters. Note that retract and assign adjust the number of clusters!

Returns
Total number of clusters

◆ getClusterId()

cluster_id_t membertrix::getClusterId ( data_id_t  data_id) const

Get the cluster id given a particular data id.

Parameters
[in] data_id: An index to a data point
Returns
An index to a cluster

◆ getClusters()

const clusters_t & membertrix::getClusters ( ) const

Get all clusters to iterate over them. The cluster set is const to protect the user from accidentally removing clusters in a for-loop in a way that destroys the user iterator.

This function can be used to adjust the parameters of the cluster_t objects. The function only reads cluster information and does not change the membertrix instance, hence it is const.

Returns
Set of clusters
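
How a const container can still allow cluster parameters to be adjusted can be sketched as follows. The names toy_cluster, toy_clusters_t, and shift_means are hypothetical, and the sketch assumes the container holds pointers to cluster objects (suggested by addCluster taking a cluster_t pointer); the library's actual clusters_t may be defined differently.

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-ins: the real cluster_t and clusters_t are defined by the library.
struct toy_cluster { double mean; };
using toy_clusters_t = std::map<int, toy_cluster *>;

// The map is const, so entries cannot be inserted or erased here,
// but the objects behind the pointers may still be modified.
void shift_means(const toy_clusters_t &clusters, double delta) {
    for (const auto &entry : clusters) {
        entry.second->mean += delta;
    }
}
```

Constness applies to the container and to the pointer values it stores, not to the objects pointed at, so a loop like this cannot invalidate its own iterator by removing a cluster.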

◆ getData() [1/4]

dataset_t * membertrix::getData ( )

Return all data points.

Returns
The entire dataset (data points do not need to be assigned)

◆ getData() [2/4]

dataset_t * membertrix::getData ( const cluster_id_t  cluster_id) const

Return all data points that are assigned to a particular cluster.

Parameters
[in] cluster_id: An index to a particular cluster
[out] dataset: A dataset (vector) of data points that have been assigned through assign()

◆ getData() [3/4]

void membertrix::getData ( const cluster_id_t  cluster_id,
dataset_t &  dataset 
) const

◆ getData() [4/4]

void membertrix::getData ( const data_ids_t  data_ids,
dataset_t &  dataset 
) const

Return a particular subset of data points. The points may or may not belong to a particular cluster, as long as they have been added through addData(); they do not have to be assigned to a cluster yet.

Parameters
[in] data_ids: A set of data point ids
Returns
The data points themselves

◆ getDatum()

data_t * membertrix::getDatum ( data_id_t  data_id)

Return data point with given index.

Parameters
[in] data_id: An index to a particular data point
Returns
A data point that has been set previously through addData

◆ operator=()

membertrix & membertrix::operator= ( membertrix  other)

The assignment operator is implemented by not passing by reference, but having the argument as a copy. Subsequently, only a swap operation needs to be called.

Parameters
[in] other: Another membertrix object
Returns
A copy of this membertrix object, optimized.
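
The pattern described here is commonly known as the copy-and-swap idiom. A minimal sketch with a hypothetical class, toy_matrix, that holds a single member field (the real class swaps its own fields, and its copy constructor additionally compacts the matrix):

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical class used only to illustrate the copy-and-swap idiom.
class toy_matrix {
public:
    explicit toy_matrix(std::vector<int> cells = {}) : cells_(std::move(cells)) {}

    // Swap the member fields; no element is copied.
    friend void swap(toy_matrix &first, toy_matrix &second) {
        using std::swap;
        swap(first.cells_, second.cells_);
    }

    // The argument is taken by value, so the copy constructor has already
    // produced a private copy by the time the body runs.
    toy_matrix &operator=(toy_matrix other) {
        swap(*this, other);  // take over the copy's contents
        return *this;        // 'other' now holds the old contents and is destroyed on return
    }

    const std::vector<int> &cells() const { return cells_; }

private:
    std::vector<int> cells_;
};
```

Because all copying happens in the by-value parameter, the assignment inherits whatever the copy constructor does; for membertrix that is the reason the assigned object is described as optimized.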

◆ print() [1/2]

void membertrix::print ( cluster_id_t  cluster_id,
std::ostream &  os 
) const

Print cluster to stream.

◆ print() [2/2]

void membertrix::print ( std::ostream &  os) const

Print entire membership matrix to stream.

◆ relabel()

void membertrix::relabel ( )

Aggressive restructuring of all data structures. This will relabel all cluster_id's to consecutive numbers. The assignments are still valid but with different cluster_id's.

This is called automatically on assignments!!
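
Relabeling to consecutive numbers can be sketched as below. The function relabel_consecutive is hypothetical; it assumes new ids are handed out in order of first appearance, which is an illustrative choice and not necessarily the order the library uses.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical helper: assignment[i] holds the (possibly sparse) cluster id of data item i.
// Ids are rewritten in place to 0, 1, 2, ... in order of first appearance,
// and the old-to-new mapping is returned.
std::map<int, int> relabel_consecutive(std::vector<int> &assignment) {
    std::map<int, int> old_to_new;
    for (int &id : assignment) {
        auto it = old_to_new.find(id);
        if (it == old_to_new.end()) {
            const int next = static_cast<int>(old_to_new.size());
            it = old_to_new.emplace(id, next).first;
        }
        id = it->second;
    }
    return old_to_new;
}
```

Data items that shared a cluster before still share one afterwards; only the numeric labels change, so any cluster id cached by the caller has to be looked up again.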

◆ remove()

np_error_t membertrix::remove ( cluster_id_t  cluster_id)

Remove a cluster (should be empty). This function should be called if auto_remove is set to false in the retract() functions. If auto_remove is set to true (default) there is no need to call remove().

Parameters
[in] cluster_id: Index to cluster to be removed
Returns
An error code, for example if the cluster is still non-empty

◆ retract() [1/2]

np_error_t membertrix::retract ( cluster_id_t  cluster_id,
data_id_t  data_id,
bool  auto_remove = true 
)

Retract a previously assigned data-cluster pair (through assign). If the cluster does not have any data points left, the cluster object is also deallocated, depending on the auto_remove setting.

Parameters
[in] cluster_id: An index to a cluster object
[in] data_id: An index to a data point
[in] auto_remove: Automatically deallocate cluster object if there is no data assigned anymore

◆ retract() [2/2]

np_error_t membertrix::retract ( data_id_t  data_id,
bool  auto_remove = true 
)

Retract a previously assigned data-cluster pair (through assign), where the search for the cluster is left to getClusterId(data_id). If the cluster does not have any data points left, the cluster object is also deallocated, depending on the auto_remove flag. This has the same effect as:

retract(getClusterId(data_id), data_id);

Parameters
[in] data_id: An index to a data point
[in] auto_remove: Automatically deallocate cluster object if there is no data assigned anymore
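
The relation between the two retract() overloads and the auto_remove flag can be sketched with a toy model. The class toy_assignments is hypothetical: it uses plain int ids, bool return values instead of np_error_t, and a map of sets instead of the binary matrix.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>

// Hypothetical toy model of the two retract() overloads; not the library's implementation.
class toy_assignments {
public:
    void assign(int cluster_id, int data_id) { members_[cluster_id].insert(data_id); }

    // Look up the cluster holding data_id; returns -1 if it is unassigned.
    int getClusterId(int data_id) const {
        for (const auto &entry : members_) {
            if (entry.second.count(data_id) > 0) return entry.first;
        }
        return -1;
    }

    // Remove the pair; drop the cluster once it is empty if auto_remove is set.
    bool retract(int cluster_id, int data_id, bool auto_remove = true) {
        auto it = members_.find(cluster_id);
        if (it == members_.end() || it->second.erase(data_id) == 0) return false;
        if (auto_remove && it->second.empty()) members_.erase(it);
        return true;
    }

    // Convenience overload: find the cluster first, then delegate.
    bool retract(int data_id, bool auto_remove = true) {
        return retract(getClusterId(data_id), data_id, auto_remove);
    }

    std::size_t clusterCount() const { return members_.size(); }

private:
    std::map<int, std::set<int>> members_;  // cluster id -> assigned data ids
};
```

With auto_remove set to false the emptied cluster stays registered, which corresponds to the case where remove() has to be called explicitly.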

Friends And Related Function Documentation

◆ operator<<

std::ostream& operator<< ( std::ostream &  os,
const membertrix &  m 
)
friend

Allow a membership to be printed to a standard stream using the << operator.

◆ swap

void swap ( membertrix &  first,
membertrix &  second 
)
friend

Swap all member fields of the two objects. This is a very lightweight implementation that swaps only the five member fields at the level of references; nothing is copied.

