pandahandler.frames.filtering.masktools

Defining data frame filters in terms of masks.

Functions

apply_mask(→ pandas.DataFrame)

Apply a mask to filter a data frame.

as_filter(...)

Convert a mask function to a filter function.

Module Contents

pandahandler.frames.filtering.masktools.apply_mask(df: pandas.DataFrame, *, mask: pandas.Series | pandahandler.frames.constants.DataframeToSeries, name: str = 'unnamed_mask') → pandas.DataFrame

Apply a mask to filter a data frame.

Parameters:
  • df – The data frame to filter.

  • mask – A boolean series with the same index as df, where True values indicate rows to keep, or a function that accepts df and returns such a series.

  • name – The name of the mask, for logging purposes only.
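
The behavior described by these parameters can be sketched with plain pandas. The function below, apply_mask_sketch, is a hypothetical stand-in written for illustration and is not the library's implementation; the index check and the log message format are assumptions.

```python
import logging
from typing import Callable, Union

import pandas as pd

logger = logging.getLogger(__name__)

# Assumed shape of the mask argument: a boolean series, or a function producing one.
MaskLike = Union[pd.Series, Callable[[pd.DataFrame], pd.Series]]


def apply_mask_sketch(
    df: pd.DataFrame, *, mask: MaskLike, name: str = "unnamed_mask"
) -> pd.DataFrame:
    """Hypothetical stand-in that mimics the documented behavior of apply_mask."""
    # A callable mask is evaluated against the data frame first.
    resolved = mask(df) if callable(mask) else mask
    # Assumption: a mask with a different index is treated as an error.
    if not resolved.index.equals(df.index):
        raise ValueError(f"Mask {name!r} does not share the data frame's index.")
    result = df.loc[resolved]  # keep rows where the mask is True
    # The name is used only in the log message.
    logger.info("%s: kept %d of %d rows", name, len(result), len(df))
    return result


df = pd.DataFrame({"a": [1, 2, 3]})
kept_from_series = apply_mask_sketch(df, mask=df["a"] > 1, name="a_gt_1")
kept_from_callable = apply_mask_sketch(df, mask=lambda d: d["a"] > 1, name="a_gt_1")
```

Both calls select the same rows; passing a function defers the mask computation until the data frame is available.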

pandahandler.frames.filtering.masktools.as_filter(mask_func: pandahandler.frames.constants.DataframeToSeries, **kwargs) → pandahandler.frames.constants.DataframeToDataframe

Convert a mask function to a filter function.

The returned filter function accepts a data frame and returns a data frame containing only the rows selected by the mask. It uses log_rowcount_change internally to log the change in row count.

Example

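import pandas as pd
from pandahandler.frames.filtering.masktools import as_filter

def my_mask(df: pd.DataFrame) -> pd.Series:
    # Keep rows where column "a" is greater than 1.
    return df["a"] > 1

my_filter = as_filter(my_mask)

# Now you can use the filter function and expect logging for rowcount changes:
some_data_frame = pd.DataFrame({"a": [1, 2, 3]})
filtered_df = my_filter(some_data_frame)

Parameters: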
  • mask_func – A function that accepts a data frame and returns a boolean series with the same index.

  • **kwargs – Additional keyword arguments to pass to log_rowcount_change.

Returns:

A function that accepts a data frame and returns a data frame, where the rows are filtered by the mask.
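
Because the returned filter maps a data frame to a data frame, several filters can be chained. The sketch below illustrates the idea with a hypothetical as_filter_sketch built on plain pandas and the standard logging module. It is not the library's code: the type aliases are assumed readings of DataframeToSeries and DataframeToDataframe, and the log call is a stand-in for log_rowcount_change, whose signature is not shown here.

```python
import logging
from typing import Callable

import pandas as pd

logger = logging.getLogger(__name__)

# Assumed meanings of the library's type aliases.
DataframeToSeries = Callable[[pd.DataFrame], pd.Series]
DataframeToDataframe = Callable[[pd.DataFrame], pd.DataFrame]


def as_filter_sketch(mask_func: DataframeToSeries) -> DataframeToDataframe:
    """Hypothetical stand-in: wrap a mask function into a filter function."""

    def filter_func(df: pd.DataFrame) -> pd.DataFrame:
        result = df.loc[mask_func(df)]  # keep rows where the mask is True
        # Stand-in for the row count logging done by the real library.
        label = getattr(mask_func, "__name__", "mask")
        logger.info("%s: %d -> %d rows", label, len(df), len(result))
        return result

    return filter_func


def a_gt_1(df: pd.DataFrame) -> pd.Series:
    return df["a"] > 1


def b_positive(df: pd.DataFrame) -> pd.Series:
    return df["b"] > 0


df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [10, 0, 30, 0]})

# Each filter takes and returns a data frame, so DataFrame.pipe can chain them.
result = df.pipe(as_filter_sketch(a_gt_1)).pipe(as_filter_sketch(b_positive))
```

Each mask function is evaluated against the output of the previous filter, so the mask always shares the index of the frame it filters.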