Introduction

What is Kedro?

Kedro is an open-source development workflow tool that helps you structure reproducible, scalable, deployable, robust and versioned data pipelines. It applies to a wide range of projects: from single-user projects running in a local environment, where you want an organised way of working, to enterprise-level team projects, where your team needs to work in a structured way and reduce the effort required to take machine learning models into production. For the source code, take a look at the Kedro repository on GitHub.

Learning about Kedro

In the next few chapters, you will learn how to install and set up Kedro to build your own production-ready data pipelines.

Once you are set up, to get a feel for Kedro, we suggest working through our examples, including an entry-level “Hello World” and a more detailed Spaceflights tutorial. You will get hands-on experience and learn the basics of Kedro.

Advanced users looking for in-depth information should consult the User Guide.

You can also check the resources section for answers to frequently asked questions, and the API reference documentation for further information.

Assumptions

We have designed the documentation in general, and the tutorial in particular, to help beginners get started creating their own Kedro projects in Python. If you have only elementary knowledge of Python, you might find the Kedro learning curve more challenging. However, we have simplified the tutorial by providing all the Python functions required to create your data pipelines.

Note: There are many excellent online resources for learning Python, but choose those that cover Python 3, as Kedro is built for Python 3.5+. There are many curated lists of online resources, such as: