Packaging a project

In this section, you will learn how to build your project documentation, as well as how to bundle your project into a Python package for handover.

Add documentation to your project

While Kedro documentation can be found by running kedro docs from the command line, project-specific documentation can be generated by running kedro build-docs in the project’s root directory.

This will create documentation based on the code structure of your project. Documentation will also include the docstrings defined in the project code. The resulting HTML files can be found in docs/build/html/.

kedro build-docs uses the Sphinx framework to build your project documentation, so if you want to customise it, please refer to docs/source/conf.py and the corresponding section of the Sphinx documentation.
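
As an illustration of the kind of customisation this refers to, the sketch below shows a few standard Sphinx settings that could appear in docs/source/conf.py. The values are hypothetical and are not taken from the Kedro project template. If your generated conf.py already defines extensions, extend that list rather than replacing it.

```python
# Hypothetical fragment of docs/source/conf.py (illustrative values only).

# Sphinx extensions to load. Both ship with Sphinx itself.
extensions = [
    "sphinx.ext.autodoc",   # pull documentation from docstrings
    "sphinx.ext.napoleon",  # parse Google and NumPy style docstrings
]

# HTML theme used for the pages written to docs/build/html/.
html_theme = "alabaster"  # a theme bundled with Sphinx
```

After editing the file, re-run kedro build-docs to regenerate the HTML output.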

Package your project

You can package your project by running kedro package from the command line. This creates one .egg file and one .whl file in the src/dist/ folder of your project. Both are Python packaging formats for binary distribution. For further information about packaging for Python, refer to the official Python packaging documentation.

After packaging your project, you can move the .egg and .whl files to your execution environment and install the wheel using pip install <path/to/your/wheel/file>. For example, if you name your project kedro-spaceflights and your package kedro_spaceflights, this pip installation allows you to run your Kedro project with python -m kedro_spaceflights.run. There is also an executable, kedro-spaceflights, located in the bin directory of your Python installation.
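
The installation step can be scripted. The following is a minimal sketch, assuming the src/dist/ layout described above. The helper names find_wheels and pip_install_command are hypothetical and are not part of Kedro.

```python
import sys
from pathlib import Path
from typing import List


def find_wheels(project_root: str) -> List[Path]:
    """Return the .whl files under <project_root>/src/dist/, sorted by name."""
    dist_dir = Path(project_root) / "src" / "dist"
    # glob() yields nothing if the folder does not exist yet.
    return sorted(dist_dir.glob("*.whl"))


def pip_install_command(wheel: Path) -> List[str]:
    """Build a `pip install <wheel>` command for the current interpreter."""
    return [sys.executable, "-m", "pip", "install", str(wheel)]
```

To run the installation, pass the resulting list to subprocess.run(..., check=True) from within the target environment.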

Please note that this packaging method only contains the Python source code of your Kedro pipeline, not any of the conf/, data/ and logs/ directories. To successfully run the packaged project, you still need to be inside a directory that contains these sub-directories. This allows you to distribute the same source code but run it with different configuration, data and logging locations in different environments.

Note: The data/ folder is optional if your pipelines don’t load or save any local data.
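
A quick pre-flight check can confirm that the run directory has the expected layout before the packaged project is started. This is a minimal sketch. The function name missing_run_dirs and its needs_data flag are hypothetical and are not part of Kedro.

```python
from pathlib import Path
from typing import List


def missing_run_dirs(run_dir: str, needs_data: bool = True) -> List[str]:
    """Return the names of expected sub-directories absent from run_dir.

    conf/ and logs/ are always expected. data/ is only expected when the
    pipeline loads or saves local data (needs_data=True).
    """
    required = ["conf", "logs"]
    if needs_data:
        required.append("data")
    base = Path(run_dir)
    return [name for name in required if not (base / name).is_dir()]
```

An empty list means the directory contains everything the check looks for. A non-empty list names the folders to create or copy over first.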

Manage project dependencies

Accounting for the versions of all Python packages that your project relies on makes your Kedro project easier to reproduce. Use the kedro build-reqs CLI command to pin package versions. It takes a requirements.in file (or requirements.txt if the first one does not exist), resolves all package versions using pip-compile, and freezes them by writing the pinned versions back into requirements.txt. This significantly reduces the chance of dependency issues caused by downstream changes, because kedro install will always install the same package versions.
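
To verify that a requirements file is fully pinned, a small check along the following lines can help. It is a simplified sketch that treats a requirement as pinned only when it uses the == operator, and it does not handle every requirement syntax (for example, URLs containing a # fragment). The function name unpinned_requirements is hypothetical.

```python
from typing import List


def unpinned_requirements(requirements_text: str) -> List[str]:
    """Return requirement lines that do not pin an exact version with '=='."""
    unpinned = []
    for raw_line in requirements_text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        # Skip blank lines and pip options such as "-r other.txt".
        if not line or line.startswith("-"):
            continue
        if "==" not in line:
            unpinned.append(line)
    return unpinned
```

Reading requirements.txt and passing its text to this function returns an empty list when every entry is pinned.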

Extend your project

  • You can also check out Kedro-Docker, an officially supported Kedro plugin for packaging and shipping Kedro projects within Docker containers.
  • We also support converting your Kedro project into an Airflow project with the Kedro-Airflow plugin.

What is next?

You have now successfully built a project along with its documentation and packaged it using one of the standard Python distribution formats. You may choose to open-source your project and make it available to a wider community of users and contributors. For further steps, we advise you to consult this GitHub guide, PyPI help, a Read the Docs tutorial, and a guide on Open Source Licenses & Standards.