pandahandler.indexes
====================

.. py:module:: pandahandler.indexes

.. autoapi-nested-parse::

   Tools for working with pandas indexes.


Classes
-------

.. autoapisummary::

   pandahandler.indexes.Index


Functions
---------

.. autoapisummary::

   pandahandler.indexes.is_unnamed_range_index
   pandahandler.indexes.unset


Module Contents
---------------

.. py:function:: is_unnamed_range_index(index: pandas.Index) -> bool

   Return whether the index is trivial, i.e. equal to the default unnamed RangeIndex.
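
   As a rough illustration only (not necessarily how this function is implemented), a check of this kind can be
   sketched in plain pandas. The helper name ``looks_like_default_range_index`` is hypothetical, and the sketch
   assumes that "default" means an unnamed ``RangeIndex`` starting at 0 with step 1:

   .. code-block:: python

       import pandas as pd


       def looks_like_default_range_index(index: pd.Index) -> bool:
           """Hypothetical stand-in: True for an unnamed RangeIndex of the form 0, 1, 2, ..."""
           return (
               isinstance(index, pd.RangeIndex)
               and index.name is None
               and index.start == 0
               and index.step == 1
           )


       # A freshly constructed data frame carries the default index.
       assert looks_like_default_range_index(pd.DataFrame({"a": [1, 2]}).index)
       # Same values, but not a RangeIndex.
       assert not looks_like_default_range_index(pd.Index([0, 1]))
       # A RangeIndex with a name is not "unnamed".
       assert not looks_like_default_range_index(pd.RangeIndex(2, name="row"))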


.. py:function:: unset(df: pandas.DataFrame, require_names: bool = True) -> pandas.DataFrame

   Safely unset the index.

   Details:

   * This is more an "unset" than a "reset" in the sense that it makes the index as trivial as possible. If we
     could totally remove the index of the data frame, that's what this function would do, but the unnamed
     range index is the next closest thing.
   * This is "safe" in the sense that it will not drop any existing data encoded in the index. (We assume that an
     unnamed range index does not count as "data" in this context.) Existing index columns get converted to
     regular columns in the data frame.

   :param df: The data frame to unset the index on.
   :param require_names: Whether to raise an error if the existing index is unnamed:

                         * This setting is ignored whenever the index is an unnamed RangeIndex.
                         * With require_names=False, an unnamed index is typically converted to a new column called "index".
                         * Using require_names=True (default) forces users to declare how to handle an unnamed index. Either:

                           * Call reset_index(drop=True) directly to drop the index instead of calling this function.
                           * Set the name(s) of the index prior to calling this function.

   :raises ValueError: If the data frame column names overlap with the index names.
   :raises ValueError: If all of the following apply:
       
       * require_names is True (the default)
       * the index is unnamed
       * the index is not a trivial RangeIndex

   :returns: A copy of the input data frame with the index columns reset as regular columns. The new index is
             a simple RangeIndex.
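
   The behavior described above can be approximated in plain pandas. The following is a minimal sketch under stated
   assumptions, not the library's implementation: ``unset_sketch`` is a hypothetical name, and the error message is
   made up for illustration.

   .. code-block:: python

       import pandas as pd


       def unset_sketch(df: pd.DataFrame, require_names: bool = True) -> pd.DataFrame:
           """Hypothetical approximation of the documented behavior, using only pandas."""
           index = df.index
           if isinstance(index, pd.RangeIndex) and index.name is None:
               # An unnamed RangeIndex carries no data, so require_names is ignored.
               return df.reset_index(drop=True)
           if require_names and any(name is None for name in index.names):
               raise ValueError("Index is unnamed: name it first, or call reset_index(drop=True).")
           # pandas itself raises ValueError if an index name collides with a column name.
           return df.reset_index()


       df = pd.DataFrame({"b": [4, 5, 6]}, index=pd.Index([1, 3, 2], name="a"))
       flat = unset_sketch(df)
       assert list(flat.columns) == ["a", "b"]  # the index "a" is now a regular column
       assert isinstance(flat.index, pd.RangeIndex)

   Raising on an unnamed index by default makes the caller state explicitly whether those values should be kept
   (by naming the index) or discarded (by dropping it).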


.. py:class:: Index

   A functional wrapper around pandas indexes.

   An instance of this class can be used to simplify the following types of operations:

   * Coercing an input data frame to have a particular index.
   * Enforcing or ensuring various index properties such as monotonicity or uniqueness.
   * Unsetting and resetting indexes cleanly before and after operations that require columnar access to index columns.

   This class applies to both pandas.Index and MultiIndex objects, which reduces the amount of special-casing needed
   based on the number of columns in the index.

   .. rubric:: Example

   .. code-block:: python

       import pandas as pd

       from pandahandler.indexes import Index

       # Set an index with sorting
       df = pd.DataFrame({"a": [1, 3, 2], "b": [4, 5, 6]})
       index = Index(names=["a"], sort=True)
       df = index(df)
       assert df.index.tolist() == [1, 2, 3]

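       # Switch over to another column as the index; the old index "a" becomes a regular column
       df = Index(names=["b"], sort=True)(df)
       assert df.index.tolist() == [4, 5, 6]
       assert "a" in df.columns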


   .. py:attribute:: names
      :type:  pandas.core.indexes.frozen.FrozenList

      The names of the columns of the index.


   .. py:attribute:: allow_null
      :type:  bool

      Whether to allow null values in the index. Applicable only for single-column indexes, since pandas does not
      support null values in a MultiIndex.


   .. py:attribute:: sort
      :type:  bool

      Whether to sort the index.


   .. py:attribute:: require_unique
      :type:  bool

      Whether to require the index to be unique. If True, an error is raised when the index contains duplicate values.


   .. py:attribute:: dtypes
      :type:  Mapping[str, Any] | None
      :value: None


      The data types of the index columns.


   .. py:method:: validate(index: pandas.Index) -> None

      Assert that the provided index complies with this index specification.


   .. py:method:: __call__(df: pandas.DataFrame, coerce_dtypes: bool = False, filter_nulls: bool = False) -> pandas.DataFrame

      Set the index on the data frame.

      Any named columns of the current df index that are not part of the new index will be converted to new
      data frame columns.

      :param df: The data frame to set the index on.
      :param coerce_dtypes: Whether to coerce the types of the index columns to the specified dtypes.
      :param filter_nulls: Whether to filter out rows with null values in the index columns before setting the index.
                           This is useful when allow_null is False and you want to keep the rows with non-null values.

      :raises ValueError: If coerce_dtypes is True but dtypes is not specified.
      :raises DuplicateValueError: If the index has duplicate values and require_unique is True.
      :raises NullValueError: If the index has null values and allow_null is False.
      :raises DTypeError: If the index dtypes do not match the specified dtypes and coerce_dtypes is False.
      :raises ValueError: If the index names do not match the specified names.
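
      The steps described above can be approximated in plain pandas. This is a minimal sketch under stated
      assumptions, not the library's implementation: ``set_index_sketch`` and its keyword arguments are hypothetical
      stand-ins for the attributes of this class, the order of the steps is a guess, and plain ``ValueError`` is
      used in place of the more specific exceptions listed above.

      .. code-block:: python

          from typing import Any, Mapping, Optional, Sequence

          import pandas as pd


          def set_index_sketch(
              df: pd.DataFrame,
              names: Sequence[str],
              dtypes: Optional[Mapping[str, Any]] = None,
              sort: bool = False,
              require_unique: bool = True,
              allow_null: bool = False,
              filter_nulls: bool = False,
          ) -> pd.DataFrame:
              """Hypothetical approximation of the documented steps, using only pandas."""
              names = list(names)
              # Move named index levels back into columns so that they are not lost.
              if any(name is not None for name in df.index.names):
                  df = df.reset_index()
              if filter_nulls:
                  df = df.dropna(subset=names)
              if not allow_null and df[names].isna().any().any():
                  raise ValueError("Null values found in the index columns.")
              if dtypes is not None:
                  df = df.astype(dict(dtypes))
              df = df.set_index(names)
              if sort:
                  df = df.sort_index()
              if require_unique and not df.index.is_unique:
                  raise ValueError("Duplicate values found in the index.")
              return df


          raw = pd.DataFrame({"a": [3.0, None, 1.0], "b": [4, 5, 6]})
          out = set_index_sketch(raw, ["a"], dtypes={"a": "int64"}, sort=True, filter_nulls=True)
          assert out.index.tolist() == [1, 3]  # the null row was filtered, the rest sorted
          assert out["b"].tolist() == [6, 4]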


   .. py:method:: assert_equal_names(index: pandas.Index) -> None

      Assert that the names of the provided index match the names of this index.