Hooks are a mechanism to add extra behaviour to Kedro’s main execution in an easy and consistent manner. Some examples might include:
- Adding a log statement after the data catalog is loaded
- Adding data validation to the inputs before a node runs and to the outputs after a node has run. This makes it possible to integrate with other tools such as Great Expectations
- Adding machine learning metrics tracking, e.g. using MLflow, throughout a pipeline run
A Hook consists of a Hook specification and a Hook implementation. To add Hooks to your project, you must:
1. Create or modify the file <your_project>/src/<package_name>/hooks.py to define a Hook implementation for an existing Kedro-defined Hook specification
2. Register your Hook implementation in the <your_project>/src/<package_name>/settings.py file under the HOOKS key
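As an illustration of the first step, here is a minimal sketch of a validation Hook. It assumes that the before_node_run specification passes node and inputs arguments. The class name DataValidationHooks and the rule "no input may be None" are hypothetical, and the ImportError fallback exists only so that the sketch also runs where Kedro is not installed.

```python
# <your_project>/src/<package_name>/hooks.py (sketch, not an official example)
from typing import Any, Dict

try:
    from kedro.framework.hooks import hook_impl
except ImportError:
    # Fallback so the sketch runs without Kedro installed: a no-op marker.
    def hook_impl(func):
        return func


class DataValidationHooks:
    """Hypothetical Hook class that rejects node inputs whose value is None."""

    @hook_impl
    def before_node_run(self, node: Any, inputs: Dict[str, Any]) -> None:
        # Collect the names of all inputs whose loaded value is None.
        missing = sorted(name for name, value in inputs.items() if value is None)
        if missing:
            raise ValueError(f"Node {node.name!r} received empty inputs: {missing}")
```

For the second step, an instance such as DataValidationHooks() would be added to HOOKS in settings.py.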
Kedro defines Hook specifications for particular execution points where users can inject additional behaviour. Currently, the following Hook specifications are provided in kedro.framework.hooks:
The naming convention for non-error Hooks is <before/after>_<noun>_<past_participle>, in which:
- <before/after> and <past_participle> refer to when the Hook is executed, e.g. before <something> was run or after <something> was created.
- <noun> refers to the relevant component in the Kedro execution timeline for which this Hook adds extra behaviour, e.g. the catalog in after_catalog_created.
The naming convention for error Hooks is on_<noun>_error, in which:
- <noun> refers to the relevant component in the Kedro execution timeline that throws the error.
kedro.framework.hooks lists the full specifications for which you can inject additional behaviours by providing an implementation.
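The two naming patterns can also be written as regular expressions. The helper below, parse_hook_name, is a hypothetical illustration and not part of Kedro. It assumes that each part of a Hook name consists of lowercase letters only.

```python
import re
from typing import Dict, Optional

# <before/after>_<noun>_<past_participle>, e.g. after_catalog_created
_NON_ERROR = re.compile(
    r"^(?P<timing>before|after)_(?P<noun>[a-z]+)_(?P<participle>[a-z]+)$"
)
# on_<noun>_error
_ERROR = re.compile(r"^on_(?P<noun>[a-z]+)_error$")


def parse_hook_name(name: str) -> Optional[Dict[str, str]]:
    """Split a Hook name into its parts, or return None if it fits neither pattern."""
    match = _NON_ERROR.match(name) or _ERROR.match(name)
    return match.groupdict() if match else None


print(parse_hook_name("after_catalog_created"))
# {'timing': 'after', 'noun': 'catalog', 'participle': 'created'}
```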
You should provide an implementation for the specification that describes the point at which you want to inject additional behaviour. The Hook implementation should have the same name as the specification. The Hook must provide a concrete implementation with a subset of the corresponding specification’s parameters (you do not need to use them all).
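To make the "subset of parameters" rule concrete, the following toy dispatcher passes each implementation only the arguments that its signature names. This is a sketch of the idea, not Kedro's actual machinery; call_hook, FullHooks and MinimalHooks are hypothetical names.

```python
import inspect
from typing import Any, Callable


def call_hook(impl: Callable[..., Any], **available: Any) -> Any:
    """Call impl with only the keyword arguments named in its signature."""
    wanted = inspect.signature(impl).parameters
    selected = {name: value for name, value in available.items() if name in wanted}
    return impl(**selected)


class FullHooks:
    # Uses several of the arguments that the caller makes available.
    def after_catalog_created(self, catalog, conf_catalog, save_version):
        return (catalog, conf_catalog, save_version)


class MinimalHooks:
    # Uses only one argument; the others are never passed to it.
    def after_catalog_created(self, catalog):
        return catalog
```

Both classes can be called with the same set of available arguments, and each receives only what it asked for.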
To declare a Hook implementation, use the @hook_impl decorator.
For example, the full signature of the after_catalog_created Hook specification is:
```python
@hook_spec
def after_catalog_created(
    self,
    catalog: DataCatalog,
    conf_catalog: Dict[str, Any],
    conf_creds: Dict[str, Any],
    save_version: str,
    load_versions: Dict[str, str],
) -> None:
    pass
```
However, if you just want to use this Hook to list the contents of a data catalog after it is created, your Hook implementation can be as simple as:
```python
# <your_project>/src/<package_name>/hooks.py
import logging

from kedro.framework.hooks import hook_impl
from kedro.io import DataCatalog


class DataCatalogHooks:
    @property
    def _logger(self):
        return logging.getLogger(self.__class__.__name__)

    @hook_impl
    def after_catalog_created(self, catalog: DataCatalog) -> None:
        # Log the names of the datasets registered in the catalog.
        self._logger.info(catalog.list())
```
The name of a module that contains Hook implementations is arbitrary and is not restricted to hooks.py.
We recommend that you group related Hook implementations under a namespace, preferably a class, within a hooks.py file that you create in your project.
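As a sketch of this grouping, the class below keeps two related implementations and their shared state together. It assumes that the before_node_run and after_node_run specifications both pass a node argument with a name attribute. NodeTimerHooks is a hypothetical name, and the ImportError fallback exists only so that the sketch runs without Kedro installed.

```python
import logging
import time
from typing import Any, Dict

try:
    from kedro.framework.hooks import hook_impl
except ImportError:
    # Fallback so the sketch runs without Kedro installed: a no-op marker.
    def hook_impl(func):
        return func


class NodeTimerHooks:
    """Hypothetical class grouping two related Hook implementations."""

    def __init__(self) -> None:
        self._started: Dict[str, float] = {}
        self.durations: Dict[str, float] = {}

    @property
    def _logger(self) -> logging.Logger:
        return logging.getLogger(self.__class__.__name__)

    @hook_impl
    def before_node_run(self, node: Any) -> None:
        # Remember when this node started.
        self._started[node.name] = time.perf_counter()

    @hook_impl
    def after_node_run(self, node: Any) -> None:
        # Compute and log the elapsed time for this node.
        elapsed = time.perf_counter() - self._started.pop(node.name)
        self.durations[node.name] = elapsed
        self._logger.info("Node %s ran in %.3f s", node.name, elapsed)
```

Keeping both methods in one class lets them share the start times through the instance rather than through module-level variables.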
Registering your Hook implementations with Kedro
Hook implementations should be registered with Kedro using the <your_project>/src/<package_name>/settings.py file under the HOOKS key.
You can register more than one implementation for the same specification. They will be called in LIFO (last-in, first-out) order.
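The LIFO call order can be pictured with a toy registry. This is only an illustration of last-in, first-out dispatch, not the Hook manager that Kedro uses; ToyHookRegistry and its methods are hypothetical.

```python
from typing import Callable, List


class ToyHookRegistry:
    """Toy stand-in for a Hook manager; illustrates LIFO call order only."""

    def __init__(self) -> None:
        self._impls: List[Callable[[], str]] = []

    def register(self, impl: Callable[[], str]) -> None:
        self._impls.append(impl)

    def call_all(self) -> List[str]:
        # The implementation registered last is called first.
        return [impl() for impl in reversed(self._impls)]


registry = ToyHookRegistry()
registry.register(lambda: "first registered")
registry.register(lambda: "second registered")
call_order = registry.call_all()
print(call_order)  # ['second registered', 'first registered']
```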
The following example sets up a Hook so that the after_catalog_created implementation is called every time after a data catalog is created.
```python
# <your_project>/src/<package_name>/settings.py
from <package_name>.hooks import ProjectHooks, DataCatalogHooks

HOOKS = (ProjectHooks(), DataCatalogHooks())
```
Kedro also has auto-discovery enabled by default. This means that any installed plugins that declare a Hooks entry-point will be registered. To learn more about how to enable this for your custom plugin, see our plugin development guide.
Auto-discovered Hooks will run first, followed by the ones specified in settings.py.
Disable auto-registered plugins’ Hooks
Auto-registered plugins’ Hooks can be disabled via settings.py as follows:
```python
# <your_project>/src/<package_name>/settings.py
DISABLE_HOOKS_FOR_PLUGINS = ("<plugin_name>",)
```
<plugin_name> is the name of an installed plugin for which the auto-registered Hooks must be disabled.