You can use Kedro starters to customise the boilerplate code provided by kedro new. This is useful when you need to adapt to different use cases, such as:
- To add initial configuration, initialisation code and an example pipeline for PySpark
- To add a docker-compose setup to launch Kedro next to a monitoring stack
- To add deployment scripts and CI/CD setup for your targeted infrastructure
A Kedro starter is a Cookiecutter template that contains the boilerplate code for a Kedro project. Each starter encodes best practices and provides utilities to bootstrap a new Kedro project for a particular use case. For example, we have created a PySpark starter, which contains initial configuration and initialisation code for PySpark according to our recommended Kedro best practices.
How to use Kedro starters
To create a Kedro project using a starter, pass the --starter flag to kedro new as follows:
kedro new --starter=<path-to-starter>
<path-to-starter> can be a local directory or a VCS repository, as long as Cookiecutter supports it.
To create a project using the PySpark starter:
kedro new --starter=https://github.com/quantumblack/kedro-starter-pyspark.git
If no starter is provided to kedro new, the default Kedro template will be used, as documented in “Creating a new project”.
We provide aliases for common starters maintained by the Kedro team so that users don’t have to specify the full path. For example, to create a project using the PySpark starter:
kedro new --starter=pyspark
To list all the aliases we support:
kedro starter list
List of official starters
The Kedro team maintains the following starters:
By default, Kedro uses the latest version available in the repository. To use a specific version of a starter, pass a --checkout argument to the command as follows:
kedro new --starter=pyspark --checkout=0.1.0
The --checkout value points to a branch, tag or commit in the starter repository.
Under the hood, the value is passed to the --checkout flag in Cookiecutter.
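As a rough illustration of how the --starter and --checkout flags combine, the sketch below assembles the command line in Python. The helper name kedro_new_command is hypothetical and not part of Kedro; it only builds the argument list and does not run anything.

```python
import shlex


def kedro_new_command(starter=None, checkout=None):
    """Build the argument list for `kedro new` (hypothetical helper, not a Kedro API)."""
    args = ["kedro", "new"]
    if starter is not None:
        # The starter may be an alias, a local directory or a VCS URL.
        args.append(f"--starter={starter}")
    if checkout is not None:
        if starter is None:
            # Assumption: a checkout only makes sense for a starter repository.
            raise ValueError("checkout requires a starter")
        # The checkout may be a branch, tag or commit in the starter repository.
        args.append(f"--checkout={checkout}")
    return args


# Print the command as it would be typed in a shell.
print(shlex.join(kedro_new_command("pyspark", "0.1.0")))
```

The resulting list could be handed to subprocess.run if you want to script project creation, assuming Kedro is installed in the active environment.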
Use a starter in interactive mode
By default, when you create a new project using a starter, kedro new launches in interactive mode. You will be prompted to provide the following variables:
- project_name: A human-readable name for your new project
- repo_name: A name for the directory that holds your project repository
- python_package: A Python package name for your project package (see Python package naming conventions)
This mode assumes that the starter doesn’t require any additional configuration variables.
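To make the relationship between the three variables concrete, here is a minimal sketch that derives a repo_name and python_package from a project_name and checks the package name. The derivation and the validation pattern are assumptions based on common Python naming conventions (lowercase letters, digits and underscores); they are not Kedro's exact rules, and both function names are hypothetical.

```python
import re


def suggest_names(project_name):
    """Suggest repo_name and python_package values (illustrative, not Kedro's logic)."""
    # Keep only lowercase alphanumeric words from the human-readable name.
    words = re.findall(r"[a-z0-9]+", project_name.lower())
    return {
        "project_name": project_name,
        "repo_name": "-".join(words),       # directory-friendly form
        "python_package": "_".join(words),  # importable form
    }


def is_valid_package_name(name):
    """Check a name against an assumed convention: lowercase, digits, underscores."""
    return re.fullmatch(r"[a-z_][a-z0-9_]*", name) is not None


names = suggest_names("My PySpark Project")
print(names["repo_name"], names["python_package"])
print(is_valid_package_name(names["python_package"]))
```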
Use a starter with a configuration file
Kedro also allows you to specify a configuration file to create a project. Use the --config flag alongside the starter as follows:
kedro new --config=my_kedro_pyspark_project.yml --starter=pyspark
This option is useful when the starter requires more configuration variables than the interactive mode prompts for.
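As a sketch of what such a configuration file might contain, the snippet below writes a small YAML file using only the standard library. The three prompt variables come from the interactive mode described above; the output_dir key is an assumption about what kedro new expects in a config file, and a given starter may require further keys. The helper write_starter_config is hypothetical.

```python
import json
import tempfile
from pathlib import Path


def write_starter_config(path, values):
    """Write a flat key/value mapping as YAML (hypothetical helper)."""
    # JSON string literals are valid YAML double-quoted scalars, so
    # json.dumps gives safe quoting without a YAML dependency.
    lines = [f"{key}: {json.dumps(str(value))}" for key, value in values.items()]
    path = Path(path)
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return path


config = {
    "output_dir": ".",  # assumption: where the new project directory is created
    "project_name": "My PySpark Project",
    "repo_name": "my-pyspark-project",
    "python_package": "my_pyspark_project",
}

# Written to a temporary directory here so the example leaves no files behind.
with tempfile.TemporaryDirectory() as tmp:
    config_path = write_starter_config(Path(tmp) / "my_kedro_pyspark_project.yml", config)
    print(config_path.read_text(encoding="utf-8"))
```

In practice you would write the file next to where you run the command and then pass its path to --config together with --starter.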