pandahandler.frames.filtering.masktools
Defining data frame filters in terms of masks.
Functions
apply_mask | Apply a mask to filter a data frame.
as_filter | Convert a mask function to a filter function.
Module Contents
- pandahandler.frames.filtering.masktools.apply_mask(df: pandas.DataFrame, *, mask: pandas.Series | pandahandler.frames.constants.DataframeToSeries, name: str = 'unnamed_mask') → pandas.DataFrame
Apply a mask to filter a data frame.
- Parameters:
df – The data frame to filter.
mask – A boolean series with the same index as df, or a function that accepts df and returns such a series. True values indicate rows to keep.
name – The name of the mask, for logging purposes only.
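The behaviour described by these parameters can be approximated in plain pandas. The following is a minimal sketch, not the library's implementation: apply_mask_sketch, MaskOrFunc, the index check, and the log message format are all assumptions made for illustration.

```python
import logging
from typing import Callable, Union

import pandas as pd

logger = logging.getLogger(__name__)

# Hypothetical alias mirroring the documented `mask` type.
MaskOrFunc = Union[pd.Series, Callable[[pd.DataFrame], pd.Series]]


def apply_mask_sketch(
    df: pd.DataFrame, *, mask: MaskOrFunc, name: str = "unnamed_mask"
) -> pd.DataFrame:
    """Hypothetical stand-in for apply_mask; not the library's code."""
    # A callable mask is resolved into a boolean series by calling it on df.
    resolved = mask(df) if callable(mask) else mask
    # Assumption: a mask whose index differs from df's index is rejected.
    if not resolved.index.equals(df.index):
        raise ValueError(f"Mask {name!r} does not share the data frame's index.")
    filtered = df.loc[resolved]
    # The name is used only in the log message.
    logger.info("%s: kept %d of %d rows", name, len(filtered), len(df))
    return filtered


df = pd.DataFrame({"a": [1, 2, 3]})
by_series = apply_mask_sketch(df, mask=df["a"] > 1, name="a_gt_1")
by_func = apply_mask_sketch(df, mask=lambda d: d["a"] > 1, name="a_gt_1")
```

Both calls keep the rows where a is 2 or 3, since the series and the function describe the same mask.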
- pandahandler.frames.filtering.masktools.as_filter(mask_func: pandahandler.frames.constants.DataframeToSeries, **kwargs) → pandahandler.frames.constants.DataframeToDataframe
Convert a mask function to a filter function.
The returned filter function will accept a data frame and return a data frame with the rows filtered by the mask, while using log_rowcount_change internally to log the change in row count.
Example
    import pandas as pd
    from pandahandler.frames.filtering.masktools import as_filter

    def my_mask(df: pd.DataFrame) -> pd.Series:
        return df["a"] > 1

    my_filter = as_filter(my_mask)

    # Now you can use the filter function and expect logging for rowcount changes:
    filtered_df = my_filter(some_data_frame)
- Parameters:
mask_func – A function that accepts a data frame and returns a boolean series with the same index.
**kwargs – Additional keyword arguments to pass to log_rowcount_change.
- Returns:
A function that accepts a data frame and returns a data frame, where the rows are filtered by the mask.
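One reason to turn masks into filters is that filters of this shape chain naturally with pandas' DataFrame.pipe. The sketch below illustrates the pattern in plain pandas under stated assumptions: as_filter_sketch is a hypothetical stand-in that logs with the standard logging module instead of log_rowcount_change, and it does not forward **kwargs.

```python
import logging
from typing import Callable

import pandas as pd

logger = logging.getLogger(__name__)

DataframeToSeries = Callable[[pd.DataFrame], pd.Series]
DataframeToDataframe = Callable[[pd.DataFrame], pd.DataFrame]


def as_filter_sketch(mask_func: DataframeToSeries) -> DataframeToDataframe:
    """Hypothetical stand-in for as_filter; not the library's code."""

    def filter_func(df: pd.DataFrame) -> pd.DataFrame:
        before = len(df)
        # Keep only the rows where the mask is True.
        result = df.loc[mask_func(df)]
        # Assumed log format; the real function delegates to log_rowcount_change.
        logger.info("%s: %d -> %d rows", mask_func.__name__, before, len(result))
        return result

    return filter_func


def a_gt_1(df: pd.DataFrame) -> pd.Series:
    return df["a"] > 1


def b_is_even(df: pd.DataFrame) -> pd.Series:
    return df["b"] % 2 == 0


df = pd.DataFrame({"a": [1, 2, 3, 4], "b": [10, 11, 12, 13]})
# Each filter sees the output of the previous one, so masks are
# always computed against the current index.
filtered = df.pipe(as_filter_sketch(a_gt_1)).pipe(as_filter_sketch(b_is_even))
```

Because each mask function is evaluated on the frame it receives, chained filters never need to realign a mask built from an earlier, larger frame.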