pandahandler.tabulation
=======================

.. py:module:: pandahandler.tabulation

.. autoapi-nested-parse::

   Univariate data tabulation.


Classes
-------

.. autoapisummary::

   pandahandler.tabulation.Tabulation


Functions
---------

.. autoapisummary::

   pandahandler.tabulation.tabulate


Module Contents
---------------

.. py:class:: Tabulation

   Counts and associated metadata for a univariate data set.


   .. py:attribute:: counts
      :type:  pandas.Series

      The table of counts.


   .. py:attribute:: name
      :type:  str | None
      :value: None


      A name for the data set being tabulated.


   .. py:attribute:: n_values
      :type:  int

      The number of values in the input series.


   .. py:attribute:: n_distinct
      :type:  int

      The number of distinct values in the input series (i.e. the number of entries in ``counts``).


   .. py:method:: __attrs_post_init__() -> None

      Data validation.

      :raises ValueError: If the index of ``counts`` is not monotonically increasing after nulls are removed.


   .. py:method:: select(keep: Iterable[Hashable]) -> typing_extensions.Self

      Derive a new tabulation that includes only a subset of the distinct values.

      :param keep: The distinct values to include.

      :raises KeyError: If any value in ``keep`` is not present in the index of ``self.counts``.


   .. py:property:: rates
      :type: pandas.Series


      Generate the empirical multinomial probabilities.
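
The phrase "empirical multinomial probabilities" in the ``rates`` property above can be read as each distinct value's count divided by the total count. The following is a minimal standard-library sketch of that idea, not the pandahandler implementation; the helper name ``empirical_rates`` is hypothetical, and the normalization by the total count is an assumption about what ``rates`` computes.

```python
from collections import Counter


def empirical_rates(values):
    """Return each distinct value's share of the total count (hypothetical helper)."""
    counts = Counter(values)
    total = sum(counts.values())
    # Sort by value so the result mirrors a table with an increasing index.
    return {value: counts[value] / total for value in sorted(counts)}


rates = empirical_rates(["a", "b", "a", "c", "a", "b"])
```

Under this reading the rates are non-negative and sum to 1, so they can be treated as an estimate of the probability of each distinct value.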


.. py:function:: tabulate(data: Iterable[Hashable], name: str | None = None, dropna: bool = False) -> Tabulation

   Create a tabulation of data.

   :param data: The data to tabulate.
   :param name: A name for the data set being tabulated. Defaults to None, in which case the name of the input data
                is used if it has a ``name`` attribute.
   :param dropna: Whether to drop NA values before tabulating. Defaults to False.
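
To illustrate the effect of the ``dropna`` flag described above, here is a rough standard-library sketch of counting with and without missing values. It is not how ``tabulate`` is implemented: the helper ``tabulate_sketch`` is hypothetical, it returns a plain ``Counter`` rather than a ``Tabulation``, and it assumes ``None`` is the only missing-value marker.

```python
from collections import Counter


def tabulate_sketch(data, dropna=False):
    """Count occurrences of each distinct value (hypothetical helper)."""
    values = list(data)
    if dropna:
        # Assumption: None is the only missing-value marker in this sketch.
        values = [v for v in values if v is not None]
    return Counter(values)


data = ["x", None, "y", "x", None]
with_na = tabulate_sketch(data)                  # missing values get their own count
without_na = tabulate_sketch(data, dropna=True)  # missing values are discarded first
```

With ``dropna`` left at its default, missing values appear in the table as their own category; with ``dropna=True`` they are removed before counting, so the total count shrinks accordingly.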