Spaceflights tutorial FAQs

Note

If you can’t find the answer you need here, ask the Kedro community for help!

How do I resolve these common errors?

Dataset errors

DatasetError: Failed while loading data from data set

You’re testing whether Kedro can load the raw test data, and you see the following error:

DatasetError: Failed while loading data from data set
CSVDataset(filepath=...).
[Errno 2] No such file or directory: '.../companies.csv'

or a similar error for the shuttles or reviews data.

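Are the three sample data files stored in the data/01_raw folder of your project? The error means that Kedro could not find a file at the filepath given in the catalog.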
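A quick way to answer that question is to check for the files from Python. The following is a minimal sketch, not part of Kedro: the helper name missing_raw_files is hypothetical, and the folder name data/01_raw and the file names companies.csv, reviews.csv and shuttles.xlsx are assumptions based on the spaceflights starter layout, so adjust them to match your catalog.

```python
from pathlib import Path

# Assumed file names from the spaceflights starter; change them if your
# catalog points at different files.
EXPECTED_FILES = ("companies.csv", "reviews.csv", "shuttles.xlsx")


def missing_raw_files(project_root, raw_dir="data/01_raw", names=EXPECTED_FILES):
    """Return the expected raw data files that are not present on disk."""
    folder = Path(project_root) / raw_dir
    return [name for name in names if not (folder / name).is_file()]


# Run from the project root: an empty list means all files were found.
print("Missing raw files:", missing_raw_files("."))
```

If the list is not empty, copy the missing files into the folder or correct the filepath entries in the catalog.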

DatasetNotFoundError: Dataset not found in the catalog

You see an error such as the following:

DatasetNotFoundError: Dataset 'companies' not found in the catalog

Has something changed in your catalog.yml from the version generated by the spaceflights starter? Take a look at the data specification to ensure it is valid.

Call exit() within the IPython session and restart kedro ipython (or type %reload_kedro into the IPython console to reload Kedro into the session without restarting). Then try again.
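To see which dataset names the catalog file actually declares, a rough line-based scan can help. This is only a sketch: the helper names top_level_entries and missing_entries are hypothetical, the scan is not a real YAML parser (it simply treats every unindented "name:" line as an entry), and the path conf/base/catalog.yml and the expected names are assumptions based on the default project layout.

```python
import re
from pathlib import Path

# Matches an unindented, non-comment line of the form "name:".
TOP_LEVEL_KEY = re.compile(r"^([^\s#][^:]*):")


def top_level_entries(catalog_text):
    """Return the names of top-level keys found by a simple line scan."""
    names = []
    for line in catalog_text.splitlines():
        match = TOP_LEVEL_KEY.match(line)
        if match:
            names.append(match.group(1).strip())
    return names


def missing_entries(catalog_text, expected):
    """Return the expected dataset names that the scan did not find."""
    found = set(top_level_entries(catalog_text))
    return [name for name in expected if name not in found]


# Assumed default location of the catalog, relative to the project root.
catalog_path = Path("conf/base/catalog.yml")
if catalog_path.is_file():
    text = catalog_path.read_text()
    print("Missing entries:", missing_entries(text, ["companies", "reviews", "shuttles"]))
```

A name reported as missing here is a likely cause of the DatasetNotFoundError. Check its spelling and indentation in the catalog file.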

DatasetError: An exception occurred when parsing config for Dataset

Do you see an error such as the following?

DatasetError: An exception occurred when parsing config for Dataset
'data_processing.preprocessed_companies':
Object 'ParquetDataset' cannot be loaded from 'kedro_datasets.pandas'. Please see the
documentation on how to install relevant dependencies for kedro_datasets.pandas.ParquetDataset:
https://kedro.readthedocs.io/en/stable/kedro_project_setup/dependencies.html

The Kedro Data Catalog is missing dependencies needed to parse the data. Check that you have added all the project dependencies to requirements.txt, and then call pip install -r requirements.txt to install them.
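Before reinstalling, you can confirm which packages the active environment lacks. The sketch below uses only the standard library. The helper name missing_modules is hypothetical, and the module list is an assumption: pandas needs a Parquet engine such as pyarrow to read Parquet files, but your project may rely on a different engine.

```python
from importlib.util import find_spec


def missing_modules(module_names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in module_names if find_spec(name) is None]


# Assumed modules for a pandas Parquet dataset; adjust to your requirements.
REQUIRED = ["kedro_datasets", "pandas", "pyarrow"]

print("Modules not installed:", missing_modules(REQUIRED))
```

Any name printed here should appear in requirements.txt, directly or through a package extra, before you run pip install -r requirements.txt again.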

Pipeline run

To run a pipeline successfully, all of its required input datasets must already exist. Otherwise, you may get an error similar to this:

kedro run --pipeline=data_science

2019-10-04 12:36:12,158 - kedro.io.data_catalog - INFO - Loading data from `model_input_table` (CSVDataset)...
2019-10-04 12:36:12,158 - kedro.runner.sequential_runner - WARNING - There are 3 nodes that have not run.
You can resume the pipeline run with the following command:
kedro run
Traceback (most recent call last):
  ...
  File "pandas/_libs/parsers.pyx", line 382, in pandas._libs.parsers.TextReader.__cinit__
  File "pandas/_libs/parsers.pyx", line 689, in pandas._libs.parsers.TextReader._setup_parser_source
FileNotFoundError: [Errno 2] File b'data/03_primary/model_input_table.csv' does not exist: b'data/03_primary/model_input_table.csv'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  ...
    raise DatasetError(message) from exc
kedro.io.core.DatasetError: Failed while loading data from data set CSVDataset(filepath=data/03_primary/model_input_table.csv, save_args={'index': False}).
[Errno 2] File b'data/03_primary/model_input_table.csv' does not exist: b'data/03_primary/model_input_table.csv'
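In the log above, the data_science pipeline fails because model_input_table.csv has not been created yet. The sketch below decides which commands to run based on whether that file exists. It rests on assumptions: the helper name commands_to_run is hypothetical, and it assumes that a pipeline named data_processing produces model_input_table, as in the spaceflights starter. The function only builds the command lists and does not execute them.

```python
from pathlib import Path


def commands_to_run(project_root, required_input="data/03_primary/model_input_table.csv"):
    """Return the kedro commands needed so that data_science finds its input."""
    commands = []
    if not (Path(project_root) / required_input).is_file():
        # Assumption: the data_processing pipeline creates the missing file.
        commands.append(["kedro", "run", "--pipeline=data_processing"])
    commands.append(["kedro", "run", "--pipeline=data_science"])
    return commands


# Run from the project root to list the commands in order.
for command in commands_to_run("."):
    print(" ".join(command))
```

Alternatively, running kedro run without the --pipeline option runs the project's default pipeline, which in the starter is assumed to include both stages.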