pandahandler.tabulation

Univariate data tabulation.

Classes

Tabulation

Counts and associated metadata for a univariate data set.

Functions

tabulate(→ Tabulation)

Create a tabulation of data.

class pandahandler.tabulation.Tabulation

Counts and associated metadata for a univariate data set.

n_distinct: int

The number of distinct values in the input series (i.e. the number of rows in df).

__attrs_post_init__() → None

Data validation.

Raises: ValueError – If the index of counts is not monotonically increasing after null values are removed.
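
The check described above can be approximated with plain pandas. This is a minimal sketch, not the library's implementation: the helper name check_counts_index is hypothetical, and it assumes counts is a pandas.Series whose index holds the distinct values.

```python
import pandas as pd


def check_counts_index(counts: pd.Series) -> None:
    """Hypothetical stand-in for the validation step; not the library's code."""
    # Drop null labels from the index, then require the rest to be sorted.
    non_null_index = counts.index[counts.index.notna()]
    if not non_null_index.is_monotonic_increasing:
        raise ValueError("counts index must be monotonic increasing (ignoring nulls)")


# A sorted index with a trailing null label passes the check.
check_counts_index(pd.Series([2, 1, 1], index=[1.0, 2.0, float("nan")]))

# An unsorted index is rejected.
try:
    check_counts_index(pd.Series([1, 2], index=[2.0, 1.0]))
    rejected = False
except ValueError:
    rejected = True
```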

select(keep: Iterable[Hashable]) → typing_extensions.Self

Derive a new tabulation that includes only a subset of the distinct values.

Parameters: keep – The distinct values to include.
Raises: KeyError – If any value in keep is not present in the index of self.counts.
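
The subsetting behaviour can be illustrated on a bare counts series. The sketch below is an assumption-laden approximation: select_counts is a hypothetical helper, it returns a pandas.Series rather than a Tabulation, and preserving the original index order is a choice made here, not something the documentation states.

```python
from collections.abc import Hashable, Iterable

import pandas as pd


def select_counts(counts: pd.Series, keep: Iterable[Hashable]) -> pd.Series:
    """Hypothetical helper that subsets a counts series; not the library's code."""
    keep = list(keep)
    missing = [value for value in keep if value not in counts.index]
    if missing:
        raise KeyError(f"values not present in counts index: {missing}")
    # Filter with a boolean mask so the original (sorted) index order is kept.
    return counts[counts.index.isin(keep)]


counts = pd.Series([3, 1, 2], index=["a", "b", "c"])
subset = select_counts(counts, ["c", "a"])
```

Filtering with a mask rather than indexing by keep directly means the result stays sorted even when keep is not, which fits the monotonic-index requirement on counts.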

property rates: pandas.Series

The empirical multinomial probabilities.
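
As a rough illustration, empirical multinomial probabilities are usually taken to be each count divided by the total count. The sketch below assumes that definition and works on a plain pandas.Series; it does not call the library.

```python
import pandas as pd

# Counts for three distinct values (illustrative data only).
counts = pd.Series([3, 1, 4], index=["a", "b", "c"])

# Each rate is the share of observations that took that value.
rates = counts / counts.sum()
```

Under this definition the rates are non-negative and sum to one.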

pandahandler.tabulation.tabulate(data: Iterable[Hashable], name: str | None = None, dropna: bool = False) → Tabulation

Create a tabulation of data.

Parameters:

data – The data to tabulate.
name – A name for the data set being tabulated. If None (the default), the name attribute of the input data is used when it has one.
dropna – Whether to drop NA values before tabulating. Defaults to False.
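
The parameters above can be mimicked with pandas alone. This is a hedged sketch, not the library's code: tabulate_sketch is a hypothetical function that returns only a counts series (not a Tabulation), and sorting the result by value is an assumption based on the monotonic-index requirement.

```python
import pandas as pd


def tabulate_sketch(data, name=None, dropna=False) -> pd.Series:
    """Hypothetical approximation of a tabulation; returns plain counts only."""
    # Fall back to the input's own name when no explicit name is given.
    if name is None:
        name = getattr(data, "name", None)
    series = pd.Series(list(data))
    # Count each distinct value, then sort by value so the index is increasing.
    counts = series.value_counts(dropna=dropna).sort_index()
    counts.name = name
    return counts


data = pd.Series([2, 1, 2, None, 1, 2], name="score")
counts = tabulate_sketch(data)
counts_without_na = tabulate_sketch(data, dropna=True)
```

Note that pandas' own value_counts defaults to dropna=True, whereas the signature documented here defaults to False, so null values are counted unless the caller opts out.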