Preprocessing

Xplainable offers a preprocessing module that allows you to build reproducible preprocessing pipelines. The module is designed for rapidly developing and deploying pipelines in production environments, and for working well with ipywidgets.

The preprocessing module is built on the XPipeline class from xplainable and is used similarly to the scikit-learn Pipeline class. All transformers in the pipeline are expected to have a fit and transform method, along with an inverse_transform method.

To create custom transformers, you can inherit from the XBaseTransformer class. You can render these custom transformers in the embedded xplainable GUI, which allows you to build pipelines without writing any code. You can find documentation on how to embed them in the GUI in the advanced_concepts/custom_transformers section.
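As a rough illustration of the transformer contract described above, the sketch below defines a custom transformer with fit, transform and inverse_transform methods. So that it runs without xplainable installed, it inherits from a local stand-in class (`StandInBaseTransformer`, a hypothetical name) rather than from XBaseTransformer. The `MinMaxScale` transformer, its attributes, and the stand-in's behaviour of returning the input unchanged from the default inverse_transform are all assumptions made for illustration.

```python
import pandas as pd


class StandInBaseTransformer:
    """Hypothetical stand-in for XBaseTransformer (not the real class)."""

    def fit(self, *args, **kwargs):
        # Default: nothing to learn
        return self

    def transform(self, x):
        raise NotImplementedError

    def inverse_transform(self, x):
        # Assumed default: no inverse available, return the input unchanged
        return x

    def fit_transform(self, x):
        return self.fit(x).transform(x)


class MinMaxScale(StandInBaseTransformer):
    """Illustrative custom transformer: scales a numeric series to [0, 1]."""

    def __init__(self):
        self.min_ = None
        self.max_ = None

    def fit(self, ser: pd.Series):
        # Learn the range from the data passed to fit
        self.min_ = ser.min()
        self.max_ = ser.max()
        return self

    def transform(self, ser: pd.Series) -> pd.Series:
        return (ser - self.min_) / (self.max_ - self.min_)

    def inverse_transform(self, ser: pd.Series) -> pd.Series:
        return ser * (self.max_ - self.min_) + self.min_


ages = pd.Series([20, 30, 40])
scaler = MinMaxScale()
scaled = scaler.fit_transform(ages)
restored = scaler.inverse_transform(scaled)
```

In a real project the class would inherit from XBaseTransformer instead of the stand-in; the three method names are the part taken from the description above.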

Using the GUI

Xplainable offers a GUI for making preprocessing pipelines easy and reproducible. You can start the GUI with a few lines of code.

Example

import xplainable as xp
import pandas as pd
from sklearn.model_selection import train_test_split

# Load data
data = pd.read_csv('data.csv')
train, test = train_test_split(data, test_size=0.2, random_state=42)

# Instantiate the preprocessor object
pp = xp.Preprocessor()

# Open the GUI and build pipeline
pp.preprocess(train)

# Apply the pipeline on new data
test_transformed = pp.transform(test)

Using the Python API

You can develop preprocessing pipelines using the Python API with XPipeline. The following example shows how to build a pipeline.

Example

from xplainable.preprocessing import transformers as xtf
from xplainable.preprocessing.pipeline import XPipeline
from sklearn.model_selection import train_test_split
import pandas as pd

# Load data
data = pd.read_csv('data.csv')
train, test = train_test_split(data, test_size=0.2, random_state=42)

# Instantiate a pipeline
pipeline = XPipeline()

# Add stages for specific features
pipeline.add_stages([
    {"feature": "age", "transformer": xtf.Clip(lower=18, upper=99)},
    {"feature": "balance", "transformer": xtf.LogTransform()}
])

# Add stages that apply to multiple features
pipeline.add_stages([
    {"transformer": xtf.FillMissing({'job': 'mode', 'age': 'mean'})},
    {"transformer": xtf.DropCols(columns=['duration', 'campaign'])}
])

# Share a single transformer across multiple features.
# Note this can only be applied when no fit method is required.
upper_case = xtf.ChangeCase(case='upper')

pipeline.add_stages([
    {"feature": "job", "transformer": upper_case},
    {"feature": "month", "transformer": upper_case}
])

# Fit and transform the data
train_transformed = pipeline.fit_transform(train)

# Apply transformations on new data
test_transformed = pipeline.transform(test)

# Inverse transform (only applies to configured features)
test_inv_transformed = pipeline.inverse_transform(test_transformed)

XPipeline

Copyright Xplainable Pty Ltd, 2023

class xplainable.preprocessing.pipeline.XPipeline[source]

Bases: object

Pipeline builder for xplainable transformers.

Parameters:

stages (list) – list containing xplainable pipeline stages.

add_stages(stages: list) XPipeline[source]

Adds multiple stages to the pipeline.

Parameters:

stages (list) – list containing xplainable pipeline stages.

Returns:

self

Return type:

XPipeline

drop_stage(stage: int) XPipeline[source]

Drops a stage from the pipeline.

Parameters:

stage (int) – index of the stage to drop.

Returns:

self

Return type:

XPipeline

fit(x: DataFrame) XPipeline[source]

Sequentially iterates through pipeline stages and fits data.

Parameters:

x (pd.DataFrame) – A non-empty DataFrame to fit.

Returns:

The fitted pipeline.

Return type:

XPipeline

fit_transform(x: DataFrame, start: int = 0)[source]

Runs the fit method followed by the transform method.

Parameters:
  • x (pd.DataFrame) – A non-empty DataFrame to fit.

  • start (int) – index of the stage to start fitting from.

Returns:

The transformed dataframe.

Return type:

pd.DataFrame

get_blueprint()[source]

Returns a blueprint of the pipeline.

Returns:

A list containing the pipeline blueprint.

Return type:

list

inverse_transform(x: DataFrame)[source]

Iterates through pipeline stages applying inverse transformations.

Parameters:

x (pd.DataFrame) – A non-empty DataFrame to inverse transform.

Returns:

The inverse transformed dataframe.

Return type:

pd.DataFrame

transform(x: DataFrame)[source]

Iterates through pipeline stages applying transformations.

Parameters:

x (pd.DataFrame) – A non-empty DataFrame to transform.

Returns:

The transformed dataframe.

Return type:

pd.DataFrame
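The sequential behaviour described for fit and transform can be pictured with the following minimal sketch. It is not XPipeline's implementation: `run_stages`, `ToyUpper` and `ToyDropCols` are hypothetical names, and the only thing borrowed from the documentation is the stage format, a dict with a "transformer" key and an optional "feature" key.

```python
import pandas as pd


class ToyUpper:
    """Toy series-level transformer (hypothetical, for illustration)."""

    def fit(self, ser):
        return self

    def transform(self, ser):
        return ser.str.upper()


class ToyDropCols:
    """Toy dataframe-level transformer (hypothetical, for illustration)."""

    def __init__(self, columns):
        self.columns = columns

    def fit(self, df):
        return self

    def transform(self, df):
        return df.drop(columns=self.columns)


def run_stages(df, stages):
    """Fit and apply each stage in order, feeding results forward."""
    df = df.copy()
    for stage in stages:
        transformer = stage["transformer"]
        feature = stage.get("feature")
        if feature is None:
            # No feature given: the transformer sees the whole dataframe
            df = transformer.fit(df).transform(df)
        else:
            # Feature given: the transformer sees a single column
            df[feature] = transformer.fit(df[feature]).transform(df[feature])
    return df


frame = pd.DataFrame({"job": ["admin", "nurse"], "duration": [5, 9]})
result = run_stages(frame, [
    {"feature": "job", "transformer": ToyUpper()},
    {"transformer": ToyDropCols(columns=["duration"])},
])
```

Because each stage receives the output of the previous one, stage order matters: a later stage cannot reference a column that an earlier stage has dropped.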

transform_generator(x)[source]

transform generator

Base Transformer

class xplainable.preprocessing.transformers.base.XBaseTransformer[source]

Bases: object

Base class for all transformers.

This base class is used as a template for all xplainable transformers. It contains the basic methods that all transformers should have, and is used to enforce a consistent API across all transformers.

The __call__ method allows transformers to be called inside the xplainable GUI in Jupyter; you do not need to call it yourself.

fit(*args, **kwargs)[source]

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)[source]

Fits and transforms data on a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pandas.Series

inverse_transform(x: Series | DataFrame)[source]

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()[source]

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the gui to catch the errors and display them.
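A decorator of this kind could look roughly like the sketch below. This is not the library's raise_errors code: `raise_detailed_errors` is a hypothetical name, and attaching the method name to the message is an assumed form of "detailed error".

```python
import functools


def raise_detailed_errors(func):
    """Hypothetical decorator: re-raises errors with the method name attached."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as exc:
            # Keep the original exception type, add context to the message
            raise type(exc)(f"{func.__name__} failed: {exc}") from exc

    return wrapper


@raise_detailed_errors
def transform(values):
    return [v.upper() for v in values]


try:
    transform([1, 2])  # integers have no .upper(), so this raises
    message = None
except AttributeError as err:
    message = str(err)
```

Re-raising with `type(exc)(...)` only works for exception classes that accept a single message argument, so a production version would likely need a fallback for other exception types.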

transform(x: Series | DataFrame)[source]

Placeholder for transformation operation. Intended to be overridden.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

Categorical Transformers

class xplainable.preprocessing.transformers.categorical.ChangeCase(case='lower')[source]

Bases: XBaseTransformer

Changes the case of a string.

Parameters:

case (str) – ‘upper’ or ‘lower’

fit(*args, **kwargs)

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)

Fits and transforms data on a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pandas.Series

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the gui to catch the errors and display them.

supported_types = ['categorical']
transform(ser: Series) Series[source]

Changes the case of a string.

Parameters:

ser (pd.Series) – The series to transform.

Returns:

The transformed series.

Return type:

pd.Series

class xplainable.preprocessing.transformers.categorical.Condense(pct=0.8, categories=[])[source]

Bases: XBaseTransformer

Condenses a feature into categories that make up x pct of observations.

Parameters:

pct (float) – The minimum pct of observations the categories should cover.

fit(ser: Series) Condense[source]

Determines the categories that make up x pct of observations.

Parameters:

ser (pandas.Series) – The series to analyse.

Raises:

TypeError – If the series is not of type string.

Returns:

The fitted transformer.

Return type:

Condense

fit_transform(x: Series | DataFrame)

Fits and transforms data on a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pandas.Series

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the gui to catch the errors and display them.

supported_types = ['categorical']
transform(ser: Series) Series[source]

Condenses a feature into categories that make up x pct of observations.

Parameters:

ser (pd.Series) – The series to transform.

Raises:

ValueError – If the series is not of type string.

Returns:

The transformed series.

Return type:

pd.Series
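The idea behind Condense can be sketched in plain pandas as follows. This is an illustration under assumptions, not the transformer's code: the 'other' label for the remaining categories, the exact cut-off rule, and the pct value of 0.75 (chosen to keep the arithmetic clear) are all hypothetical.

```python
import pandas as pd

jobs = pd.Series(["admin"] * 5 + ["nurse"] * 3 + ["pilot", "chef"])

# "Fit": find the most frequent categories whose cumulative share reaches pct
pct = 0.75
shares = jobs.value_counts(normalize=True)   # sorted, most frequent first
covered_before = shares.cumsum() - shares    # share covered by earlier categories
keep = shares.index[covered_before < pct].tolist()

# "Transform": everything outside the kept categories becomes 'other'
condensed = jobs.where(jobs.isin(keep), "other")
```

Here 'admin' (50%) and 'nurse' (30%) together cover at least 75% of the rows, so the two rare categories are folded into 'other'.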

class xplainable.preprocessing.transformers.categorical.DetectCategories(max_categories=10, category_list=[])[source]

Bases: XBaseTransformer

Auto-detects categories from a string column.

Parameters:

max_categories (int) – The maximum number of categories to extract.

fit(ser: Series) DetectCategories[source]

Identifies the top categories from a text series.

Parameters:

ser (pandas.Series) – The series to analyse.

Returns:

The fitted transformer.

Return type:

DetectCategories

fit_transform(x: Series | DataFrame)

Fits and transforms data on a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pandas.Series

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the gui to catch the errors and display them.

supported_types = ['categorical']
transform(ser: Series) Series[source]

Detects categories from a string column.

Parameters:

ser (pd.Series) – The series to transform.

Raises:

TypeError – If the series is not of type string.

Returns:

The transformed series.

Return type:

pd.Series

class xplainable.preprocessing.transformers.categorical.FillMissingCategorical(fill_with='missing')[source]

Bases: XBaseTransformer

Fills missing values with a specified value.

Parameters:

fill_with (str) – Text to fill with.

fit(*args, **kwargs)

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)

Fits and transforms data on a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pandas.Series

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the gui to catch the errors and display them.

supported_types = ['categorical']
transform(ser: Series) Series[source]

Fills missing values with a specified value.

Parameters:

ser (pd.Series) – The series to transform.

Returns:

The transformed series.

Return type:

pd.Series

class xplainable.preprocessing.transformers.categorical.MapCategories(category_values={})[source]

Bases: XBaseTransformer

Maps all categories of a string column to new values.

fit(*args, **kwargs)

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)

Fits and transforms data on a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pandas.Series

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the gui to catch the errors and display them.

supported_types = ['categorical']
transform(ser: Series) Series[source]

Maps all categories of a string column to new values.

Parameters:

ser (pd.Series) – The series to transform.

Returns:

The transformed series.

Return type:

pd.Series
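A category mapping of this kind can be pictured with pandas' `Series.map`. The sketch below is an assumption about the effect, not the transformer's implementation, and the mapping values are made up for illustration.

```python
import pandas as pd

months = pd.Series(["jan", "feb", "jan", "mar"])

# Hypothetical mapping of every category to a new value
category_values = {"jan": "Q1-start", "feb": "Q1-mid", "mar": "Q1-end"}
mapped = months.map(category_values)
```

With `Series.map`, any category missing from the dict becomes NaN, which is one reason to map all categories rather than a subset.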

class xplainable.preprocessing.transformers.categorical.MergeCategories(merge_from=[], merge_to='')[source]

Bases: XBaseTransformer

Merges specified categories in a series into one category.

Parameters:
  • merge_from (list) – List of categories to merge from.

  • merge_to (str) – The category to merge to.

fit(*args, **kwargs)

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)

Fits and transforms data on a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pandas.Series

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the gui to catch the errors and display them.

supported_types = ['categorical']
transform(ser: Series) Series[source]

Merges specified categories in a series into one category.

Parameters:

ser (pd.Series) – The series to transform.

Returns:

The transformed series.

Return type:

pd.Series
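The effect of merging categories can be sketched in plain pandas. The variable names mirror the merge_from and merge_to parameters, but the code is an illustrative equivalent rather than the transformer's own logic, and the category names are invented.

```python
import pandas as pd

jobs = pd.Series(["retired", "student", "admin", "unemployed"])

merge_from = ["retired", "student", "unemployed"]
merge_to = "not_working"

# Values listed in merge_from collapse into merge_to; others are untouched
merged = jobs.mask(jobs.isin(merge_from), merge_to)
```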

class xplainable.preprocessing.transformers.categorical.ReplaceCategory(target=None, replace_with='')[source]

Bases: XBaseTransformer

Replaces a category in a series with specified value.

Parameters:
  • target – The target value to replace.

  • replace_with – The value to insert in place.

fit(*args, **kwargs)

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)

Fits and transforms data on a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pandas.Series

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the gui to catch the errors and display them.

supported_types = ['categorical']
transform(ser: Series) Series[source]

Replaces a category in a series with specified value.

Parameters:

ser (pd.Series) – The series to transform.

Returns:

The transformed series.

Return type:

pd.Series

class xplainable.preprocessing.transformers.categorical.ReplaceWith(target=None, replace_with=None)[source]

Bases: XBaseTransformer

Replaces a specified value in a series.

Parameters:
  • target – The value to replace.

  • replace_with – The value to insert in place.

fit(*args, **kwargs)

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)

Fits and transforms data on a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pandas.Series

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the gui to catch the errors and display them.

supported_types = ['categorical']
transform(ser: Series) Series[source]

Replaces a specified value in a series.

Parameters:

ser (pd.Series) – The series to transform.

Returns:

The transformed series.

Return type:

pd.Series

class xplainable.preprocessing.transformers.categorical.TextContains(selector=None, value=None)[source]

Bases: XBaseTransformer

Flags series values that contain, start with, or end with a value.

Parameters:
  • selector (str) – The type of search to make.

  • value (str) – The value to search.

fit(*args, **kwargs)

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)

Fits and transforms data on a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pandas.Series

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the gui to catch the errors and display them.

supported_types = ['categorical']
transform(ser: Series) Series[source]

Flags series values that contain, start with, or end with a value.

Parameters:

ser (pd.Series) – The series to transform.

Returns:

The transformed series.

Return type:

pd.Series
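The three kinds of search mentioned above (contains, starts with, ends with) correspond to standard pandas string methods. The sketch below shows those methods directly; the accepted selector strings for TextContains are not listed in this reference, so none are assumed here.

```python
import pandas as pd

emails = pd.Series(["ann@corp.com", "bob@mail.org", "corp-admin@mail.org"])

# Three kinds of search; regex=False treats the value as a literal string
contains = emails.str.contains("corp", regex=False)
starts = emails.str.startswith("corp")
ends = emails.str.endswith(".com")
```

Each result is a boolean series that flags the matching rows.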

class xplainable.preprocessing.transformers.categorical.TextRemove(numbers=False, characters=False, uppercase=False, lowercase=False, special=False, whitespace=False, stopwords=False, text=None, custom_regex=None)[source]

Bases: XBaseTransformer

Removes specified values from a str type series.

This transformer cannot be inverse_transformed and does not require fitting.

Parameters:
  • numbers (bool, optional) – Removes numbers from string.

  • characters (bool, optional) – Removes characters from string.

  • uppercase (bool, optional) – Removes uppercase characters from string.

  • lowercase (bool, optional) – Removes lowercase characters from string.

  • special (bool, optional) – Removes special characters from string.

  • whitespace (bool, optional) – Removes whitespace from string.

  • stopwords (bool, optional) – Removes stopwords from string.

  • text (str, optional) – Removes specific text match from string.

  • custom_regex (str, optional) – Removes matching regex text from string.

fit(*args, **kwargs)

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)

Fits and transforms data on a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pandas.Series

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the gui to catch the errors and display them.

supported_types = ['categorical']
transform(ser: Series) Series[source]

Removes specified values from a str type series.

Parameters:

ser (pd.Series) – The series to transform.

Returns:

The transformed series.

Return type:

pd.Series
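Removals like these are typically regex substitutions. The sketch below shows plain-pandas equivalents for three of the options; the patterns are assumptions (in particular, "special" is taken to mean anything that is not a letter, digit or whitespace) and may differ from what TextRemove uses.

```python
import pandas as pd

codes = pd.Series(["AB-12 cd", "x9 Y!"])

# Assumed patterns for the numbers, whitespace and special options
no_numbers = codes.str.replace(r"\d", "", regex=True)
no_whitespace = codes.str.replace(r"\s", "", regex=True)
no_special = codes.str.replace(r"[^A-Za-z0-9\s]", "", regex=True)
```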

class xplainable.preprocessing.transformers.categorical.TextSlice(start=None, end=None, action='keep')[source]

Bases: XBaseTransformer

Selects slice from categorical column string.

Parameters:
  • start (int) – Starting character.

  • end (int) – Ending character.

  • action (str) – Whether to 'keep' or 'drop' the selected slice.

fit(*args, **kwargs)

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)

Fits and transforms data on a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pandas.Series

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the gui to catch the errors and display them.

supported_types = ['categorical']
transform(ser: Series) Series[source]

Selects slice from categorical column string.

Parameters:

ser (pd.Series) – The series to transform.

Returns:

The transformed series.

Return type:

pd.Series
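The keep and drop actions can be illustrated with pandas string slicing. This sketch assumes zero-based positions with an exclusive end, as in Python slicing; the reference above does not state which convention TextSlice follows.

```python
import pandas as pd

codes = pd.Series(["AU-2023-001", "NZ-2024-017"])
start, end = 3, 7

# action='keep': retain only the selected slice
kept = codes.str.slice(start, end)

# action='drop': remove the selected slice and join what remains
dropped = codes.str.slice(0, start) + codes.str.slice(end)
```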

class xplainable.preprocessing.transformers.categorical.TextTrim(selector=None, n=0, action='keep')[source]

Bases: XBaseTransformer

Drops or keeps first/last n characters of a categorical column.

Parameters:
  • selector (str) – Whether to select the 'first' or 'last' characters.

  • n (int) – Number of characters to identify.

  • action (str) – Whether to 'keep' or 'drop' the identified characters.

fit(*args, **kwargs)

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)

Fits and transforms data on a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pandas.Series

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the gui to catch the errors and display them.

supported_types = ['categorical']
transform(ser: Series) Series[source]

Drops or keeps first/last n characters of a categorical column.

Parameters:

ser (pd.Series) – The series to transform.

Returns:

The transformed series.

Return type:

pd.Series
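The four selector and action combinations map naturally onto string slicing. The sketch below shows plain-pandas equivalents as an illustration of the expected results, not as the transformer's implementation.

```python
import pandas as pd

codes = pd.Series(["AU-2023", "NZ-2024"])
n = 2

first_kept = codes.str[:n]      # selector='first', action='keep'
first_dropped = codes.str[n:]   # selector='first', action='drop'
last_kept = codes.str[-n:]      # selector='last', action='keep'
last_dropped = codes.str[:-n]   # selector='last', action='drop'
```

Note that the negative-index forms behave differently when n is 0, so a robust version would need to handle that case separately.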

Numeric Transformers

class xplainable.preprocessing.transformers.numeric.Clip(lower=None, upper=None)[source]

Bases: XBaseTransformer

Clips numeric values to a specified range.

Parameters:
  • lower (float) – The lower threshold value.

  • upper (float) – The upper threshold value.

fit(*args, **kwargs)

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)

Fits and transforms data on a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pandas.Series

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the gui to catch the errors and display them.

supported_types = ['numeric']
transform(ser: Series) Series[source]

Clips numeric values to a specified range.

Parameters:

ser (pd.Series) – The series to transform.

Returns:

The transformed series.

Return type:

pd.Series
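As a rough illustration of what clipping to a range means, the sketch below clamps plain Python numbers. It is not the library's code: `clip_values` is a hypothetical helper, a list stands in for a pd.Series, and treating a bound of None as "not applied" is an assumption based on the constructor defaults. In pandas, `Series.clip(lower=..., upper=...)` has the same effect.

```python
def clip_values(values, lower=None, upper=None):
    """Clamp each value into [lower, upper]; a bound of None is not applied."""
    clipped = []
    for v in values:
        if lower is not None and v < lower:
            v = lower
        if upper is not None and v > upper:
            v = upper
        clipped.append(v)
    return clipped


ages = [12, 18, 45, 130]
clipped_ages = clip_values(ages, lower=18, upper=99)  # [18, 18, 45, 99]
```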

class xplainable.preprocessing.transformers.numeric.FillMissingNumeric(fill_with='mean', fill_value=None)[source]

Bases: XBaseTransformer

Fills missing values with a specified strategy.

Parameters:

fill_with (str) – The strategy [‘mean’, ‘median’, ‘mode’].

fit(ser: Series) FillMissingNumeric[source]

Calculates the fill value from a series.

Parameters:

ser (pandas.Series) – The series to analyse.

Returns:

The fitted transformer.

Return type:

FillMissingNumeric

fit_transform(x: Series | DataFrame)

Fits and transforms a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pd.Series | pd.DataFrame

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the GUI to catch the errors and display them.

supported_types = ['numeric']
transform(ser: Series) Series[source]

Fills missing values using the specified strategy.

Parameters:

ser (pd.Series) – The series to transform.

Returns:

The transformed series.

Return type:

pd.Series
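The fit/transform split matters for this transformer: the fill value is learned once from the data passed to fit and then reused on new data. The sketch below shows that pattern with the standard library only. `FillMissingNumericSketch` is a hypothetical stand-in, None represents a missing value, and the `fill_value` constructor argument of the real class is deliberately left out because its behaviour is not described here.

```python
import statistics


class FillMissingNumericSketch:
    """Hypothetical stand-in: learns a fill value in fit, applies it in transform."""

    _strategies = {
        "mean": statistics.mean,
        "median": statistics.median,
        "mode": statistics.mode,
    }

    def __init__(self, fill_with="mean"):
        self.fill_with = fill_with
        self.fill_value = None

    def fit(self, values):
        # Only observed (non-missing) values contribute to the statistic.
        observed = [v for v in values if v is not None]
        self.fill_value = self._strategies[self.fill_with](observed)
        return self

    def transform(self, values):
        return [self.fill_value if v is None else v for v in values]


train = [1.0, None, 3.0, 8.0]
test = [None, 5.0]

filler = FillMissingNumericSketch(fill_with="median").fit(train)
filled_test = filler.transform(test)  # uses the median learned from `train`
```

Fitting on the training split and only transforming the test split avoids leaking test-set statistics into the fill value.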

class xplainable.preprocessing.transformers.numeric.LogTransform[source]

Bases: XBaseTransformer

Log transforms a given numeric series.

fit(*args, **kwargs)

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)

Fits and transforms a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pd.Series | pd.DataFrame

inverse_transform(ser: Series) Series[source]

Reverses the log transformation on a series.

Parameters:

ser (pd.Series) – The series to inverse transform.

Returns:

The inverse transformed series.

Return type:

pd.Series

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the GUI to catch the errors and display them.

supported_types = ['numeric']
transform(ser: Series) Series[source]

Log transforms a given numeric series.

Parameters:

ser (pd.Series) – The series to transform.

Returns:

The transformed series.

Return type:

pd.Series
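LogTransform is one of the transformers here that documents its own inverse_transform. The sketch below shows the round trip, assuming a natural logarithm; the base used by the library is not stated in this reference, so treat that as an assumption. `log_transform` and `inverse_log_transform` are hypothetical helpers operating on plain lists.

```python
import math


def log_transform(values):
    """Apply the natural log to each value (values must be positive)."""
    return [math.log(v) for v in values]


def inverse_log_transform(values):
    """Undo log_transform by exponentiating each value."""
    return [math.exp(v) for v in values]


incomes = [1.0, 10.0, 1000.0]
logged = log_transform(incomes)
restored = inverse_log_transform(logged)  # approximately equal to `incomes`
```

Zero or negative inputs raise a ValueError in this sketch, so such values would need handling (for example with Clip) earlier in a pipeline.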

class xplainable.preprocessing.transformers.numeric.MinMaxScale(min_value=None, max_value=None)[source]

Bases: XBaseTransformer

Scales a numeric series between 0 and 1.

fit(ser: Series) MinMaxScale[source]

Extracts the min and max value from a series.

Parameters:

ser (pandas.Series) – The series to analyse.

Returns:

The fitted transformer.

Return type:

MinMaxScale

fit_transform(x: Series | DataFrame)

Fits and transforms a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pd.Series | pd.DataFrame

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the GUI to catch the errors and display them.

supported_types = ['numeric']
transform(ser: Series) Series[source]

Scale a numeric series between 0 and 1.

Parameters:

ser (pd.Series) – The series to transform.

Returns:

The transformed series.

Return type:

pd.Series
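Min-max scaling maps each value to (value - min) / (max - min), with min and max taken from the fitted series. The sketch below is a minimal stand-in under that formula, not the library's implementation: `MinMaxScaleSketch` is a hypothetical class, and returning 0.0 for a constant series is a choice made here to avoid dividing by zero.

```python
class MinMaxScaleSketch:
    """Hypothetical stand-in: stores min and max in fit, rescales in transform."""

    def __init__(self):
        self.min_value = None
        self.max_value = None

    def fit(self, values):
        self.min_value = min(values)
        self.max_value = max(values)
        return self

    def transform(self, values):
        span = self.max_value - self.min_value
        if span == 0:
            # Constant training data: no range to scale by.
            return [0.0 for _ in values]
        return [(v - self.min_value) / span for v in values]


train = [10, 20, 30]
test = [15, 40]

scaler = MinMaxScaleSketch().fit(train)
scaled_train = scaler.transform(train)  # [0.0, 0.5, 1.0]
scaled_test = scaler.transform(test)    # [0.25, 1.5]
```

Values outside the fitted range land outside [0, 1] in this sketch; whether the library clips them is not stated in this reference.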

Mixed Transformers

class xplainable.preprocessing.transformers.mixed.SetDType(to_type=None)[source]

Bases: XBaseTransformer

Changes the data type of a specified column.

fit(*args, **kwargs)

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)

Fits and transforms a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pd.Series | pd.DataFrame

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the GUI to catch the errors and display them.

supported_types = ['numeric', 'categorical']
transform(ser: Series) Series[source]

Changes the data type of a specified column.

Parameters:

ser (pd.Series) – The series to transform.

Returns:

The transformed series.

Return type:

pd.Series

class xplainable.preprocessing.transformers.mixed.Shift(step=0)[source]

Bases: XBaseTransformer

Shifts a series up or down n steps.

Parameters:

step (int) – The number of steps to shift.

fit(*args, **kwargs)

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)

Fits and transforms a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pd.Series | pd.DataFrame

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the GUI to catch the errors and display them.

supported_types = ['categorical', 'numeric']
transform(ser: Series) Series[source]

Shifts a series up or down n steps.

Parameters:

ser (pd.Series) – The series to transform.

Returns:

The transformed series.

Return type:

pd.Series
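To make "up or down n steps" concrete, the sketch below shifts a plain list, assuming the same convention as pandas `Series.shift`: a positive step moves values towards later positions, a negative step towards earlier ones, and vacated positions become missing (None here). `shift_values` is a hypothetical helper, and the direction convention is an assumption rather than something this reference states.

```python
def shift_values(values, step=0):
    """Shift values by `step` positions, filling vacated slots with None."""
    n = len(values)
    if step == 0:
        return list(values)
    if abs(step) >= n:
        return [None] * n
    if step > 0:
        # Positive step: values move towards later positions.
        return [None] * step + list(values[:n - step])
    # Negative step: values move towards earlier positions.
    return list(values[-step:]) + [None] * (-step)


readings = [1, 2, 3, 4]
lagged = shift_values(readings, step=1)    # [None, 1, 2, 3]
leading = shift_values(readings, step=-1)  # [2, 3, 4, None]
```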

Dataset Transformers

class xplainable.preprocessing.transformers.dataset.ChangeCases(columns=[], case='lower')[source]

Bases: XBaseTransformer

Changes the case of all specified categorical columns.

Parameters:
  • columns (list) – To apply the case change to.

  • case (str) – ‘upper’ or ‘lower’.

fit(*args, **kwargs)

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)

Fits and transforms a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pd.Series | pd.DataFrame

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the GUI to catch the errors and display them.

supported_types = ['dataset']
transform(df: DataFrame) DataFrame[source]

Changes the case of all specified categorical columns.

Parameters:

df (pd.DataFrame) – The dataset to transform.

Returns:

The transformed dataset.

Return type:

pd.DataFrame
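Dataset transformers operate on several columns at once. The sketch below shows the idea for case changes, with a list of dicts standing in for a pd.DataFrame. `change_cases` is a hypothetical helper, and skipping non-string cells is an assumption made here. In pandas, `Series.str.lower()` and `Series.str.upper()` perform the per-column conversion.

```python
def change_cases(rows, columns, case="lower"):
    """Return copies of the rows with the chosen string columns re-cased."""
    if case not in ("lower", "upper"):
        raise ValueError("case must be 'lower' or 'upper'")
    convert = str.lower if case == "lower" else str.upper

    out = []
    for row in rows:
        new_row = dict(row)
        for column in columns:
            # Leave missing or non-string cells untouched.
            if isinstance(new_row[column], str):
                new_row[column] = convert(new_row[column])
        out.append(new_row)
    return out


rows = [{"city": "Perth", "state": "wa", "age": 30}]
upper_rows = change_cases(rows, columns=["city", "state"], case="upper")
```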

class xplainable.preprocessing.transformers.dataset.ChangeNames(col_names={})[source]

Bases: XBaseTransformer

Changes the names of columns in a dataset.

Parameters:

col_names (dict) – Dictionary of old and new column names.

fit(*args, **kwargs)

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)

Fits and transforms a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pd.Series | pd.DataFrame

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the GUI to catch the errors and display them.

supported_types = ['dataset']
transform(df: DataFrame) DataFrame[source]

Changes the names of columns in a dataset.

Parameters:

df (pd.DataFrame) – The dataset to transform.

Returns:

The transformed dataset.

Return type:

pd.DataFrame

class xplainable.preprocessing.transformers.dataset.DateTimeExtract(target=None, year=False, month=False, day=False, weekday=False, day_name=False, hour=False, minute=False, second=False, drop=False)[source]

Bases: XBaseTransformer

Extracts datetime components from a datetime column.

Parameters:
  • target (str) – The datetime column to extract from.

  • year (bool) – Extracts year if True.

  • month (bool) – Extracts month if True.

  • day (bool) – Extracts day if True.

  • weekday (bool) – Extracts weekday if True.

  • day_name (bool) – Extracts day name if True.

  • hour (bool) – Extracts hour if True.

  • minute (bool) – Extracts minute if True.

  • second (bool) – Extracts second if True.

  • drop (bool) – Drops original datetime column if True.

fit(*args, **kwargs)

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)

Fits and transforms a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pd.Series | pd.DataFrame

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the GUI to catch the errors and display them.

supported_types = ['dataset']
transform(df: DataFrame) DataFrame[source]

Extracts datetime components from a datetime column.

Parameters:

df (pd.DataFrame) – The dataset to transform.

Returns:

The transformed dataset.

Return type:

pd.DataFrame
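The boolean flags select which components become new columns. The sketch below illustrates that with the standard `datetime` module and a list of dicts in place of a pd.DataFrame. `extract_datetime_parts` is a hypothetical helper covering only a few of the flags, and the `<target>_<part>` naming of the new columns is an assumption; this reference does not say how the library names them.

```python
from datetime import datetime


def extract_datetime_parts(rows, target, year=False, month=False,
                           weekday=False, drop=False):
    """Add the requested datetime components of `target` as new columns."""
    out = []
    for row in rows:
        new_row = dict(row)
        stamp = new_row[target]
        if year:
            new_row[f"{target}_year"] = stamp.year
        if month:
            new_row[f"{target}_month"] = stamp.month
        if weekday:
            # datetime.weekday() counts from Monday == 0.
            new_row[f"{target}_weekday"] = stamp.weekday()
        if drop:
            del new_row[target]
        out.append(new_row)
    return out


rows = [{"id": 1, "signup": datetime(2023, 3, 15, 9, 30)}]
extracted = extract_datetime_parts(
    rows, target="signup", year=True, month=True, weekday=True, drop=True
)
```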

class xplainable.preprocessing.transformers.dataset.DropCols(columns=None)[source]

Bases: XBaseTransformer

Drops specified columns from a dataset.

Parameters:

columns (list) – The columns to be dropped.

fit(*args, **kwargs)

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)

Fits and transforms a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pd.Series | pd.DataFrame

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the GUI to catch the errors and display them.

supported_types = ['dataset']
transform(df: DataFrame) DataFrame[source]

Drops specified columns from a dataset.

Parameters:

df (pd.DataFrame) – The dataset to transform.

Returns:

The transformed dataset.

Return type:

pd.DataFrame

class xplainable.preprocessing.transformers.dataset.DropNaNs(subset=None)[source]

Bases: XBaseTransformer

Drops rows containing NaN values from a dataset.

Parameters:

subset (list, optional) – A subset of columns to apply the transformer to.

fit(*args, **kwargs)

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)

Fits and transforms a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pd.Series | pd.DataFrame

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the GUI to catch the errors and display them.

supported_types = ['dataset']
transform(df: DataFrame) DataFrame[source]

Drops rows containing NaN values from a dataset.

Parameters:

df (pd.DataFrame) – The dataset to transform.

Returns:

The transformed dataset.

Return type:

pd.DataFrame
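The effect of the subset argument is easiest to see side by side. In the sketch below a list of dicts stands in for a pd.DataFrame, and `drop_nan_rows` and `is_missing` are hypothetical helpers; treating both None and float NaN as missing is an assumption made for the illustration. The pandas counterpart is `DataFrame.dropna(subset=...)`.

```python
import math


def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def drop_nan_rows(rows, subset=None):
    """Keep rows with no missing value in `subset` (all columns if None)."""
    kept = []
    for row in rows:
        columns = subset if subset is not None else row.keys()
        if not any(is_missing(row[c]) for c in columns):
            kept.append(row)
    return kept


rows = [
    {"age": 30, "city": "Perth"},
    {"age": float("nan"), "city": "Sydney"},
    {"age": 41, "city": None},
]

checked_all = drop_nan_rows(rows)                  # only the first row survives
checked_age = drop_nan_rows(rows, subset=["age"])  # first and third rows survive
```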

class xplainable.preprocessing.transformers.dataset.FillMissing(fill_with={}, fill_values={})[source]

Bases: XBaseTransformer

Fills missing values of all columns with a specified value/strategy.

fit(df: DataFrame) FillMissing[source]

Calculates the fill_value for all columns in the dataset.

The fill values are based on a specified strategy for each column.

Parameters:

df (pd.DataFrame) – The dataset to fit.

Returns:

The fitted transformer.

Return type:

FillMissing

fit_transform(x: Series | DataFrame)

Fits and transforms a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pd.Series | pd.DataFrame

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the GUI to catch the errors and display them.

supported_types = ['dataset']
transform(df: DataFrame) DataFrame[source]

Fills missing values of all columns with a specified value/strategy.

Parameters:

df (pd.DataFrame) – The dataset to transform.

Returns:

The transformed dataset.

Return type:

pd.DataFrame

class xplainable.preprocessing.transformers.dataset.GroupbyShift(columns=None, step=0, as_new=True, col_names=[], group_by=None, order_by=None, descending=None)[source]

Bases: XBaseTransformer

Shifts a series up or down n steps within specified group.

Parameters:
  • columns (str) – The target feature to shift.

  • step (int) – The number of steps to shift.

  • as_new (bool) – Creates new column if True.

  • group_by (str) – The column to group by.

  • order_by (str) – The column to order by.

  • descending (bool) – Orders the value descending if True.

fit(*args, **kwargs)

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)

Fits and transforms a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pd.Series | pd.DataFrame

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the GUI to catch the errors and display them.

supported_types = ['dataset']
transform(df: DataFrame) DataFrame[source]

Shifts a series up or down n steps within specified group.

Parameters:

df (pd.DataFrame) – The dataset to transform.

Returns:

The transformed dataset.

Return type:

pd.DataFrame
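A grouped shift is typically used to build lag features, such as "the previous value for the same customer". The sketch below shows the mechanics on a list of dicts: rows are grouped, ordered within each group, and the value from `step` positions earlier is written to a new column. `groupby_shift` and its `new_name` argument are hypothetical, and the shift direction follows the pandas `shift` convention as an assumption.

```python
from collections import defaultdict


def groupby_shift(rows, column, step, group_by, order_by, new_name):
    """Shift `column` by `step` rows within each group, ordered by `order_by`."""
    # Collect the row positions that belong to each group.
    groups = defaultdict(list)
    for index, row in enumerate(rows):
        groups[row[group_by]].append(index)

    out = [dict(row) for row in rows]
    for indices in groups.values():
        ordered = sorted(indices, key=lambda i: rows[i][order_by])
        for position, index in enumerate(ordered):
            source = position - step
            if 0 <= source < len(ordered):
                out[index][new_name] = rows[ordered[source]][column]
            else:
                # No row that far back (or forward) in this group.
                out[index][new_name] = None
    return out


rows = [
    {"customer": "a", "day": 1, "spend": 10},
    {"customer": "b", "day": 1, "spend": 7},
    {"customer": "a", "day": 2, "spend": 15},
    {"customer": "b", "day": 2, "spend": 9},
]
shifted = groupby_shift(rows, "spend", 1, "customer", "day", "spend_prev")
```

Because the shift never crosses a group boundary, the first row of each customer has no previous value and receives None.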

class xplainable.preprocessing.transformers.dataset.GroupedSignalSmoothing(target=None, group_by=None, order_by=None, descending=None)[source]

Bases: XBaseTransformer

Smooths signal data within specified group.

Parameters:
  • target (str) – The target feature to smooth.

  • as_new (bool) – Creates new column if True.

  • group_by (str) – The column to group by.

  • order_by (str) – The column to order by.

  • descending (bool) – Orders the value descending if True.

fit(*args, **kwargs)

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)

Fits and transforms a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pd.Series | pd.DataFrame

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the GUI to catch the errors and display them.

supported_types = ['dataset']
transform(df: DataFrame) DataFrame[source]

Smooths signal data within specified group.

Parameters:

df (pd.DataFrame) – The dataset to transform.

Returns:

The transformed dataset.

Return type:

pd.DataFrame

class xplainable.preprocessing.transformers.dataset.Operation(columns=[], operation=None, alias: str | None = None, drop: bool = False)[source]

Bases: XBaseTransformer

Applies an operation to multiple columns (in order) to create a new feature.

Parameters:
  • columns (list) – Names of the columns to apply the operation to.

  • alias (str) – Name of newly created column.

  • drop (bool) – Drops original columns if True.

fit(*args, **kwargs)

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)

Fits and transforms a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pd.Series | pd.DataFrame

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the GUI to catch the errors and display them.

supported_types = ['dataset']
transform(df: DataFrame) DataFrame[source]

Applies an operation to multiple columns (in order) to create a new feature.

Parameters:

df (pd.DataFrame) – The dataset to transform.

Returns:

The transformed dataset.

Return type:

pd.DataFrame
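The phrase "in order" matters for non-commutative operations such as subtraction or division. The sketch below folds a binary function across the listed columns from left to right. It is an illustration only: `apply_operation` is a hypothetical helper, and passing a Python callable as the operation is an assumption, since this reference does not say what values the `operation` argument accepts.

```python
import operator
from functools import reduce


def apply_operation(rows, columns, operation, alias, drop=False):
    """Fold `operation` over `columns` left to right and store it as `alias`."""
    out = []
    for row in rows:
        new_row = dict(row)
        # reduce applies the operation pairwise: ((c1 op c2) op c3) ...
        new_row[alias] = reduce(operation, (row[c] for c in columns))
        if drop:
            for c in columns:
                del new_row[c]
        out.append(new_row)
    return out


rows = [{"revenue": 100, "cost": 60, "tax": 10}]
with_profit = apply_operation(
    rows, ["revenue", "cost", "tax"], operator.sub, alias="profit"
)
only_profit = apply_operation(
    rows, ["revenue", "cost", "tax"], operator.sub, alias="profit", drop=True
)
```

Reordering the column list changes the result for subtraction, which is why the order is part of the configuration.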

class xplainable.preprocessing.transformers.dataset.OrderBy(order_by=None, ascending=True)[source]

Bases: XBaseTransformer

Orders the dataset by the values of a given series.

Parameters:
  • order_by (str) – The series to order by.

  • ascending (bool) – Orders in ascending order if True.

fit(*args, **kwargs)

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)

Fits and transforms a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pd.Series | pd.DataFrame

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the GUI to catch the errors and display them.

supported_types = ['dataset']
transform(df: DataFrame) DataFrame[source]

Orders the dataset by the values of a given series.

Parameters:

df (pd.DataFrame) – The dataset to transform.

Returns:

The transformed dataset.

Return type:

pd.DataFrame

class xplainable.preprocessing.transformers.dataset.RollingOperation(groupby=None, orderby=None, direction=None, columns=[], window=None, operation=None, drop: bool = False)[source]

Bases: XBaseTransformer

Applies an operation to multiple columns (in order) to create a new feature.

Parameters:
  • columns (list) – Names of the columns to apply the operation to.

  • alias (str) – Name of newly created column.

  • drop (bool) – Drops original columns if True.

fit(*args, **kwargs)

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)

Fits and transforms a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pd.Series | pd.DataFrame

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the GUI to catch the errors and display them.

supported_types = ['dataset']
transform(df: DataFrame) DataFrame[source]

Applies an operation to multiple columns (in order) to create a new feature.

Parameters:

df (pd.DataFrame) – The dataset to transform.

Returns:

The transformed dataset.

Return type:

pd.DataFrame

class xplainable.preprocessing.transformers.dataset.SetDTypes(types={})[source]

Bases: XBaseTransformer

Sets the data type of all columns in the dataset.

Parameters:

types (dict) – Dictionary of column names and data types.

fit(*args, **kwargs)

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)

Fits and transforms a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pd.Series | pd.DataFrame

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator is used to wrap the transformer methods and raise any errors that occur during processing. This is done to allow the GUI to catch the errors and display them.

supported_types = ['dataset']
transform(df: DataFrame) DataFrame[source]

Sets the data type of all columns in the dataset.

Parameters:

df (pd.DataFrame) – The dataset to transform.

Returns:

The transformed dataset.

Return type:

pd.DataFrame

class xplainable.preprocessing.transformers.dataset.TextSplit(target=None, separator=None, max_splits=0)

Bases: XBaseTransformer

Splits a string column into multiple columns on a specified separator.

Parameters:
  • target (str) – The column to split.

  • separator (str) – The separator to split on.

  • max_splits (int) – The maximum number of splits to make.
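
The following sketch shows the kind of operation these three parameters describe, using pandas directly instead of xplainable. It assumes the transformer behaves like Series.str.split with expand=True. The output column names (name_0, name_1) are a hypothetical naming scheme, since this section does not say how the new columns are named.

```python
import pandas as pd

df = pd.DataFrame({"name": ["Ada Lovelace", "Alan Mathison Turing"]})

# Stand-ins for the transformer parameters.
target, separator, max_splits = "name", " ", 1

# expand=True returns one column per piece; n caps the number of splits.
parts = df[target].str.split(separator, n=max_splits, expand=True)

# Hypothetical naming scheme for the new columns.
parts.columns = [f"{target}_{i}" for i in parts.columns]

result = df.join(parts)
print(result)
```

In pandas, n=0 and n=-1 both mean "no limit". Whether the default max_splits=0 carries the same meaning in TextSplit is not stated here, so treat that as an open question.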

fit(*args, **kwargs)

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)

Fits and transforms a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pd.Series | pd.DataFrame

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator wraps the transformer methods and raises any errors that occur during processing, which allows the GUI to catch and display them.

supported_types = ['dataset']

transform(df: DataFrame) → DataFrame

Splits a string column into multiple columns on a specified separator.

Parameters:

df (pd.DataFrame) – The dataset to transform.

Returns:

The transformed dataset.

Return type:

pd.DataFrame

class xplainable.preprocessing.transformers.dataset.TextTrimMulti(column='', selector=None, n=0, action='keep', drop_col=False, alias='')

Bases: XBaseTransformer

Drops or keeps the first/last n characters of a categorical column.

Parameters:
  • selector (str) – Which end of the string to select: 'first' or 'last'.

  • n (int) – Number of characters to identify.

  • action (str) – Whether to 'keep' or 'drop' the identified characters.
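
The interaction between selector, n, and action is easiest to see on a small example. The sketch below is a plain Python and pandas stand-in, not the transformer's source: trim_text is a hypothetical helper, and the column, drop_col, and alias arguments from the signature are left out because this section does not document them.

```python
import pandas as pd


def trim_text(value: str, selector: str, n: int, action: str) -> str:
    """Keep or drop the first/last n characters of a string (illustrative)."""
    if selector not in ("first", "last") or action not in ("keep", "drop"):
        raise ValueError("selector must be first/last and action keep/drop")
    if n < 0:
        raise ValueError("n must be non-negative")
    # Position separating the identified characters from the rest.
    if selector == "first":
        cut = min(n, len(value))
    else:
        cut = max(len(value) - n, 0)
    head, tail = value[:cut], value[cut:]
    identified, rest = (head, tail) if selector == "first" else (tail, head)
    return identified if action == "keep" else rest


df = pd.DataFrame({"code": ["AB-1234", "CD-5678"]})

# Keep the first two characters, keep the last four, drop the first three.
df["prefix"] = df["code"].map(lambda v: trim_text(v, "first", 2, "keep"))
df["number"] = df["code"].map(lambda v: trim_text(v, "last", 4, "keep"))
df["no_prefix"] = df["code"].map(lambda v: trim_text(v, "first", 3, "drop"))
print(df)
```

Computing an explicit cut position avoids the slicing pitfall where value[-0:] returns the whole string instead of an empty one when n is 0.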

fit(*args, **kwargs)

No fit is required for this transformer.

This is a default fit method in case no fit is needed. This method is used to allow the transformer to be used in a pipeline, and is intended to be overridden by transformers that require fitting.

Decorators:

raise_errors (decorator): Raises detailed errors.

fit_transform(x: Series | DataFrame)

Fits and transforms a series or dataframe.

Parameters:

x (pd.Series | pd.DataFrame) – Series or df to fit & transform.

Returns:

The transformed series or df.

Return type:

pd.Series | pd.DataFrame

inverse_transform(x: Series | DataFrame)

No inverse transform is available for this transformer.

This is a default inverse method in case no inverse transform is available.

The input parameter is either a pd.Series or a pd.DataFrame, depending on the transformer. Documentation for each individual transformer should specify which type of input is expected in this method when it is being overridden.

Parameters:

x (pd.Series | pd.DataFrame) – To be specified by transformer.

Decorators:

raise_errors (decorator): Raises detailed errors.

raise_errors()

Decorator to raise detailed errors in transformer functions.

This decorator wraps the transformer methods and raises any errors that occur during processing, which allows the GUI to catch and display them.

supported_types = ['dataset']

transform(df: DataFrame) → DataFrame

Drops or keeps the first/last n characters of a categorical column.

Parameters:

df (pd.DataFrame) – The dataset to transform.

Returns:

The transformed dataset.

Return type:

pd.DataFrame