Kedro starters

Kedro starters are used to create projects that contain code to run as-is, or to adapt and extend. They provide pre-defined example code and configuration that can be reused, for example:

  • As example code for a typical Kedro project
  • To add a docker-compose setup to launch Kedro next to a monitoring stack
  • To add deployment scripts and CI/CD setup for your targeted infrastructure

A Kedro starter is a Cookiecutter template that contains the boilerplate code for a Kedro project. You can create your own starters for reuse within a project or team, as described in the documentation about how to create a Kedro starter.

How to use Kedro starters

To create a Kedro project using a starter, apply the --starter flag to kedro new as follows:

kedro new --starter=<path-to-starter>
Note: <path-to-starter> can be a local directory or a VCS repository, as long as Cookiecutter supports it.

To create a project using the PySpark starter:

kedro new --starter=https://github.com/quantumblacklabs/kedro-starter-pyspark.git

If no starter is provided to kedro new, the default Kedro template will be used, as documented in “Creating a new project”.

Starter aliases

We provide aliases for common starters maintained by the Kedro team so that users don’t have to specify the full path. For example, to create a project using the PySpark starter:

kedro new --starter=pyspark

To list all the aliases we support:

kedro starter list

List of official starters

The Kedro team maintains the following starters to bootstrap new Kedro projects:

Each starter project encodes our recommended Kedro best practices.

Starter versioning

By default, Kedro uses the latest version available in the repository. To use a specific version of a starter, pass the --checkout argument to the command as follows:

kedro new --starter=pyspark --checkout=0.1.0

The --checkout value points to a branch, tag or commit in the starter repository.

Under the hood, the value is passed to Cookiecutter's --checkout flag.

Use a starter in interactive mode

By default, when you create a new project using a starter, kedro new runs interactively and prompts you for the following variables:

  • project_name - A human-readable name for your new project
  • repo_name - A name for the directory that holds your project repository
  • python_package - A Python package name for your project package (see Python package naming conventions)

This mode assumes that the starter doesn’t require any additional configuration variables.

Use a starter with a configuration file

Kedro also allows you to specify a configuration file to create a project. Use the --config flag alongside the starter as follows:

kedro new --config=my_kedro_pyspark_project.yml --starter=pyspark

This option is useful when the starter requires more configuration variables than the interactive mode prompts for.
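As a minimal sketch, the configuration file named in the command above could be written as follows. The three variable names are the ones listed for interactive mode, and all values are placeholders. The output_dir key is an assumption about what kedro new expects in a configuration file, and a starter may need further keys of its own, so check both against the documentation for your Kedro version and starter.

```shell
# Write a sample configuration file for use with `kedro new --config`.
# All values are placeholders; replace them with your own.
cat > my_kedro_pyspark_project.yml <<'EOF'
# Assumption: directory in which the project is created; verify for your Kedro version.
output_dir: .
# The variables that interactive mode would otherwise prompt for.
project_name: My Kedro PySpark Project
repo_name: my-kedro-pyspark-project
python_package: my_kedro_pyspark_project
EOF
```

Keeping this file under version control makes project creation repeatable, since the same answers are supplied on every run instead of being typed at the prompts.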