pandahandler.schema
Tools to learn or coerce the schema of an open-ended input data frame.
Classes
Schema | Using and applying a data frame's schema information.
Functions
categorize_non_numerics(df) | Categorize columns that are neither categorical nor numeric.
Module Contents
- pandahandler.schema.categorize_non_numerics(df: pandas.DataFrame) → pandas.DataFrame
Categorize columns that are neither categorical nor numeric.
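As a rough illustration of what the one-line description suggests, the sketch below converts every column that is neither numeric nor already categorical to pandas' category dtype. The function name categorize_non_numerics_sketch is hypothetical and uses pandas only; it is an assumption about the behavior, not the pandahandler implementation, which may handle edge cases differently.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def categorize_non_numerics_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in; NOT the pandahandler implementation."""
    out = df.copy()
    for col in out.columns:
        dtype = out[col].dtype
        # Leave numeric and already-categorical columns untouched.
        if is_numeric_dtype(dtype) or isinstance(dtype, pd.CategoricalDtype):
            continue
        # Everything else (e.g. strings) becomes a categorical column.
        out[col] = out[col].astype("category")
    return out


raw = pd.DataFrame({"city": ["Oslo", "Lima", "Oslo"], "temp": [3.5, 19.0, 4.1]})
converted = categorize_non_numerics_sketch(raw)
print(converted.dtypes)
```

The sketch works on a copy, so the input frame keeps its original dtypes.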
- class pandahandler.schema.Schema
Using and applying a data frame’s schema information.
The primary intended use case is in open-world data exploration, where the schema of the input data is not known in advance. If you know the schema in advance, consider using a more declarative approach such as pandera.
Note that the categorical_encodings attribute holds information that traditional “schema” definitions do not capture, yet it is needed to encode new data consistently with the training data when scoring in machine learning applications.
- categorical_encodings: dict[Hashable, pandas.Index]
The categories of the categorical columns. The keys are the column names (for columns of categorical type) and the values are index objects expressing the integer-category mappings defining that column’s categorical encoding.
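To make the integer-category mapping concrete, here is a minimal pandas-only sketch, not tied to pandahandler's internals. It shows how a stored categories index fixes the integer codes, and how reusing that index on new data keeps the codes consistent. The variable names and sample values are made up for illustration.

```python
import pandas as pd

# "Training" column: pandas sorts the inferred categories.
train = pd.Series(["low", "high", "mid", "low"], dtype="category")
categories = train.cat.categories  # position in this index = integer code

# Encode new data with the SAME categories so codes match the training data.
new_values = ["mid", "low", "unseen"]
encoded = pd.Categorical(new_values, categories=categories)

print(list(categories))      # ['high', 'low', 'mid']
print(list(encoded.codes))   # [2, 1, -1]; -1 marks a value absent from the categories
```

Values that were never seen in the training data fall outside the mapping and are encoded as missing rather than being assigned a new code.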
- classmethod from_df(df: pandas.DataFrame) → typing_extensions.Self
Create a Schema object from a data frame.
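The following is a hedged sketch of how a from_df-style constructor could collect categorical encodings from a data frame. SchemaSketch is a hypothetical class written for illustration; it mirrors only the categorical_encodings attribute documented above and makes no claim about what else the real Schema class records.

```python
from dataclasses import dataclass
from typing import Hashable

import pandas as pd


@dataclass
class SchemaSketch:
    """Hypothetical illustration; NOT the pandahandler Schema class."""

    categorical_encodings: dict[Hashable, pd.Index]

    @classmethod
    def from_df(cls, df: pd.DataFrame) -> "SchemaSketch":
        # Record the categories index of every categorical column.
        encodings = {
            name: df[name].cat.categories
            for name in df.columns
            if isinstance(df[name].dtype, pd.CategoricalDtype)
        }
        return cls(categorical_encodings=encodings)


df = pd.DataFrame(
    {"size": pd.Categorical(["S", "L", "S"]), "price": [1.0, 2.5, 1.2]}
)
schema = SchemaSketch.from_df(df)
print(schema.categorical_encodings)
```

Only the categorical column contributes an entry; the numeric price column is skipped.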