pandahandler.schema
===================

.. py:module:: pandahandler.schema

.. autoapi-nested-parse::

   Tools to learn or coerce the schema of an open-ended input data frame.


Classes
-------

.. autoapisummary::

   pandahandler.schema.Schema


Functions
---------

.. autoapisummary::

   pandahandler.schema.categorize_non_numerics


Module Contents
---------------

.. py:function:: categorize_non_numerics(df: pandas.DataFrame) -> pandas.DataFrame

   Categorize columns that are neither categorical nor numeric.


.. py:class:: Schema

   Store and apply a data frame's schema information.

   The primary intended use case is open-world data exploration, where the schema of the
   input data is not known in advance. If you know the schema in advance, consider a more
   declarative approach such as pandera.

   Note that the ``categorical_encodings`` attribute holds information that is not
   traditionally considered part of a "schema". It is nonetheless needed to encode new data
   consistently with the training data when scoring in machine learning applications.


   .. py:attribute:: types_
      :type: pandas.Series

      The data types of the columns.


   .. py:attribute:: categorical_encodings
      :type: dict[Hashable, pandas.Index]

      The categories of the categorical columns.

      The keys are the column names (for columns of categorical type) and the values are index
      objects expressing the integer-category mappings that define each column's categorical
      encoding.


   .. py:method:: __post_init__() -> None

      Run consistency checks.


   .. py:method:: from_df(df: pandas.DataFrame) -> typing_extensions.Self
      :classmethod:

      Create a Schema object from a data frame.


   .. py:property:: categoricals
      :type: list[Hashable]

      Return the names of categorical columns.


   .. py:property:: numerics
      :type: list[Hashable]

      Return the names of numeric columns.


   .. py:property:: others
      :type: list[Hashable]

      Return the names of columns that are neither categorical nor numeric.


   .. py:method:: __call__(df: pandas.DataFrame) -> pandas.DataFrame

      Coerce the data frame to the schema.
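

The role of the categorical encodings can be illustrated with plain pandas. The sketch below
is not pandahandler's implementation: ``learn_encodings`` and ``apply_encodings`` are
hypothetical helpers, written only to show why recording each column's categories lets new
data be encoded with the same integer codes as the training data. It assumes that values
unseen during training should become missing.

.. code-block:: python

   import pandas as pd


   def learn_encodings(df: pd.DataFrame) -> dict:
       """Record the categories of every categorical column (hypothetical helper)."""
       return {
           name: df[name].cat.categories
           for name in df.columns
           if isinstance(df[name].dtype, pd.CategoricalDtype)
       }


   def apply_encodings(df: pd.DataFrame, encodings: dict) -> pd.DataFrame:
       """Re-encode columns of a copy of ``df`` with the recorded categories."""
       out = df.copy()
       for name, categories in encodings.items():
           # Values absent from ``categories`` become missing (integer code -1).
           out[name] = pd.Categorical(out[name], categories=categories)
       return out


   # "Training" frame: one categorical column and one numeric column.
   train = pd.DataFrame(
       {"color": pd.Categorical(["red", "blue", "red"]), "size": [1.0, 2.0, 3.0]}
   )
   encodings = learn_encodings(train)

   # New data arrives as plain strings and contains an unseen value ("green").
   new = pd.DataFrame({"color": ["blue", "green"], "size": [4.0, 5.0]})
   coerced = apply_encodings(new, encodings)

   # "blue" keeps the integer code it had in training; "green" is marked as missing.
   codes = coerced["color"].cat.codes.tolist()

Without the recorded categories, pandas would infer a fresh encoding from the new data alone,
so the same label could map to a different integer code than it did during training.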