pandahandler.tabulation

Univariate data tabulation.

Classes

Tabulation

Counts and associated metadata for a univariate data set.

Functions

tabulate(→ Tabulation)

Create a tabulation of data.

Module Contents

class pandahandler.tabulation.Tabulation

Counts and associated metadata for a univariate data set.

counts: pandas.Series

The table of counts.

name: str | None = None

A name for the data set being tabulated.

n_values: int

The number of values in the input series.

n_distinct: int

The number of distinct values in the input series (i.e. the number of rows in counts).

__attrs_post_init__() → None

Data validation.

Raises:

ValueError – If the index of counts, after removing nulls, is not monotonically increasing.
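
The check described above can be sketched in plain Python. This is an illustration only, not the library's implementation: the helper names `_is_null` and `check_index_sorted` are hypothetical, `None` and float NaN are assumed to be the only nulls, and "monotonic increasing" is read in the pandas sense (non-decreasing, so repeated labels pass).

```python
import math
from typing import Hashable, Iterable


def _is_null(value: object) -> bool:
    """Treat None and float NaN as null (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def check_index_sorted(index: Iterable[Hashable]) -> None:
    """Raise ValueError unless the non-null labels are non-decreasing."""
    labels = [v for v in index if not _is_null(v)]
    # Compare each label with its successor; any drop violates the ordering.
    for left, right in zip(labels, labels[1:]):
        if right < left:
            raise ValueError(
                f"index is not monotonic increasing: {left!r} precedes {right!r}"
            )


check_index_sorted([1, 2, 2, 5, None])  # passes: the null label is ignored
```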

select(keep: Iterable[Hashable]) → typing_extensions.Self

Derive a new tabulation that includes only a subset of the distinct values.

Parameters:

keep – The distinct values to include.

Raises:

KeyError – If any of the values in keep is not present in the index of self.counts.
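
A minimal sketch of this selection behavior, using a plain dict in place of the counts series. The function name `select_counts` is hypothetical, and the sketch covers only the counts themselves; how the real method updates the remaining fields is not stated here.

```python
from typing import Dict, Hashable, Iterable


def select_counts(
    counts: Dict[Hashable, int], keep: Iterable[Hashable]
) -> Dict[Hashable, int]:
    """Return the counts restricted to `keep`, failing on unknown values."""
    keep_set = set(keep)
    missing = [value for value in keep_set if value not in counts]
    if missing:
        raise KeyError(f"values not present in counts: {missing}")
    # Iterate over the original mapping so the existing order is preserved.
    return {value: n for value, n in counts.items() if value in keep_set}


counts = {"a": 3, "b": 1, "c": 2}
subset = select_counts(counts, ["c", "a"])
```

Here `subset` keeps the entries for "a" and "c" in their original order, while asking for a value such as "z" raises KeyError.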

property rates: pandas.Series

Generate the empirical multinomial probabilities.
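
Empirical multinomial probabilities are relative frequencies: each count divided by the total count. The sketch below shows that arithmetic with the standard library. It assumes the denominator is the sum of the counts, which may differ from the library's handling of NA values.

```python
from collections import Counter

data = ["a", "b", "a", "c", "a", "b"]

# Count each distinct value, then divide by the total number of observations.
counts = Counter(data)
total = sum(counts.values())
rates = {value: counts[value] / total for value in sorted(counts)}
```

The resulting rates sum to 1 (up to floating-point rounding), and "a", which accounts for three of the six observations, gets a rate of 0.5.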

pandahandler.tabulation.tabulate(data: Iterable[Hashable], name: str | None = None, dropna: bool = False) → Tabulation

Create a tabulation of data.

Parameters:
  • data – The data to tabulate.

  • name – A name for the data set being tabulated. Defaults to None, but inherits the name of the input data if it has a name attribute.

  • dropna – Whether to drop NA values before tabulating. Defaults to False.
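
The effect of the name and dropna parameters can be illustrated with a standard-library sketch. It is not the library's code: `count_values` and `resolve_name` are hypothetical helpers, `None` stands in for NA, and the rule that an explicit name takes precedence over the input's own name attribute is an assumption.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable, Optional


def count_values(data: Iterable[Hashable], dropna: bool = False) -> Dict[Hashable, int]:
    """Count distinct values, optionally discarding None first."""
    values = [v for v in data if not (dropna and v is None)]
    return dict(Counter(values))


def resolve_name(data: object, name: Optional[str] = None) -> Optional[str]:
    """Use the explicit name if given, else the input's `name` attribute, if any."""
    return name if name is not None else getattr(data, "name", None)


data = ["x", None, "y", "x", None]
with_na = count_values(data)                  # NA is counted as its own value
without_na = count_values(data, dropna=True)  # NA is removed before counting
```

With dropna left at False the None entries appear as a row of their own; with dropna=True only "x" and "y" are counted.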