kedro.contrib.io

Description

This module contains functionality which we might consider moving into the kedro.io module (e.g. additional AbstractDataSets and extensions/alternative DataCatalogs).

Data catalog wrapper

kedro.contrib.io.catalog_with_default.DataCatalogWithDefault([…]) A DataCatalog with a default DataSet implementation for any data set which is not registered in the catalog.
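The fallback behaviour described above can be illustrated with a minimal, self-contained sketch. It does not use the Kedro API: `InMemoryStore` and `CatalogWithDefaultSketch` are hypothetical names invented for this illustration, and the real class's constructor arguments may differ.

```python
from typing import Any, Callable, Dict


class InMemoryStore:
    """Hypothetical stand-in for a data set: keeps one object in memory."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._data: Any = None

    def save(self, data: Any) -> None:
        self._data = data

    def load(self) -> Any:
        return self._data


class CatalogWithDefaultSketch:
    """Hypothetical catalog that builds a default data set for unknown names."""

    def __init__(
        self, data_sets: Dict[str, Any], default: Callable[[str], Any]
    ) -> None:
        self._data_sets = dict(data_sets)
        self._default = default

    def _get(self, name: str) -> Any:
        # On first access to an unregistered name, create a data set with the
        # default factory and remember it for later calls.
        if name not in self._data_sets:
            self._data_sets[name] = self._default(name)
        return self._data_sets[name]

    def save(self, name: str, data: Any) -> None:
        self._get(name).save(data)

    def load(self, name: str) -> Any:
        return self._get(name).load()


# "unregistered" was never added to the catalog, so the default is used.
catalog = CatalogWithDefaultSketch(data_sets={}, default=InMemoryStore)
catalog.save("unregistered", [1, 2, 3])
loaded = catalog.load("unregistered")
```

The default is passed as a factory taking the data set name, so each unregistered name gets its own independent data set instance.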

DataSets

kedro.contrib.io.azure.CSVBlobDataSet(…[, …]) CSVBlobDataSet loads and saves csv files in Microsoft’s Azure blob storage.
kedro.contrib.io.azure.JSONBlobDataSet(…) JSONBlobDataSet loads and saves JSON (line-delimited) files in Microsoft’s Azure blob storage.
kedro.contrib.io.bioinformatics.BioSequenceLocalDataSet(…) BioSequenceLocalDataSet loads and saves data to a sequence file.
kedro.contrib.io.cached.CachedDataSet(dataset) CachedDataSet is a dataset wrapper which caches saved data in memory, so that the user avoids I/O operations with slow storage media.
kedro.contrib.io.feather.FeatherLocalDataSet(…) FeatherLocalDataSet loads and saves data to a local feather file.
kedro.contrib.io.matplotlib.MatplotlibLocalWriter(…) MatplotlibLocalWriter saves matplotlib objects to a local image file.
kedro.contrib.io.matplotlib.MatplotlibS3Writer(…) MatplotlibS3Writer saves matplotlib objects to an image file in S3.
kedro.contrib.io.parquet.ParquetS3DataSet(…) ParquetS3DataSet loads and saves data to a file in S3.
kedro.contrib.io.pyspark.SparkDataSet(filepath) SparkDataSet loads and saves Spark data frames.
kedro.contrib.io.pyspark.SparkHiveDataSet(…) SparkHiveDataSet loads and saves Spark data frames stored on Hive.
kedro.contrib.io.pyspark.SparkJDBCDataSet(…) SparkJDBCDataSet loads data from a database table accessible via a JDBC URL and connection properties, and saves the content of a PySpark DataFrame to an external database table via JDBC.
kedro.contrib.io.yaml_local.YAMLLocalDataSet(…) YAMLLocalDataSet loads and saves data to a local YAML file using PyYAML.
kedro.contrib.io.gcs.CSVGCSDataSet(filepath) CSVGCSDataSet loads and saves data to a file in GCS (Google Cloud Storage).
kedro.contrib.io.gcs.JSONGCSDataSet(filepath) JSONGCSDataSet loads and saves data to a file in GCS (Google Cloud Storage).
kedro.contrib.io.gcs.ParquetGCSDataSet(filepath) ParquetGCSDataSet loads and saves data to a Parquet file in GCS (Google Cloud Storage).
kedro.contrib.io.networkx.NetworkXLocalDataSet(…) NetworkXLocalDataSet loads and saves graphs to a local JSON file format using NetworkX.
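Of the data sets listed above, CachedDataSet is the only wrapper, so a short sketch of the caching idea may help. This is a plain-Python illustration under stated assumptions, not the Kedro implementation: `SlowStore` and `CachingWrapperSketch` are hypothetical names, and the wrapped object is only assumed to expose `save` and `load`.

```python
from typing import Any


class SlowStore:
    """Hypothetical slow data set that counts how often it is read."""

    def __init__(self) -> None:
        self._data: Any = None
        self.load_calls = 0

    def save(self, data: Any) -> None:
        self._data = data

    def load(self) -> Any:
        self.load_calls += 1
        return self._data


# Sentinel that distinguishes "nothing cached yet" from a cached None.
_MISSING = object()


class CachingWrapperSketch:
    """Hypothetical wrapper that keeps the last saved or loaded data in memory."""

    def __init__(self, dataset: Any) -> None:
        self._dataset = dataset
        self._cache: Any = _MISSING

    def save(self, data: Any) -> None:
        # Write through to the wrapped data set, then remember the data.
        self._dataset.save(data)
        self._cache = data

    def load(self) -> Any:
        # Only touch the wrapped data set when nothing is cached yet.
        if self._cache is _MISSING:
            self._cache = self._dataset.load()
        return self._cache


store = SlowStore()
cached = CachingWrapperSketch(store)
cached.save({"rows": 3})
first = cached.load()
second = cached.load()
```

In this sketch the save call populates the cache, so neither load reaches the slow store. A wrapper that has not seen a save reads from the store once and serves later loads from memory.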

DataSet Transformers

kedro.contrib.io.transformers.ProfileTimeTransformer A transformer that logs the runtime of data set load and save calls.
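As a rough illustration of what timing load and save calls can look like, here is a self-contained sketch. It does not subclass any Kedro base class: `TimingTransformerSketch`, its method signatures, and the `timings` list are assumptions made for this example only.

```python
import logging
import time
from typing import Any, Callable, List, Tuple

logger = logging.getLogger("profile_time_sketch")


class TimingTransformerSketch:
    """Hypothetical transformer that times the wrapped load and save callables."""

    def __init__(self) -> None:
        # Each entry is (operation, data set name, elapsed seconds).
        self.timings: List[Tuple[str, str, float]] = []

    def _record(self, operation: str, name: str, started: float) -> None:
        elapsed = time.perf_counter() - started
        self.timings.append((operation, name, elapsed))
        logger.info("%s of '%s' took %.6f seconds", operation, name, elapsed)

    def load(self, name: str, load: Callable[[], Any]) -> Any:
        started = time.perf_counter()
        data = load()
        self._record("load", name, started)
        return data

    def save(self, name: str, save: Callable[[Any], None], data: Any) -> None:
        started = time.perf_counter()
        save(data)
        self._record("save", name, started)


# A dictionary stands in for real storage in this example.
storage = {}
transformer = TimingTransformerSketch()
transformer.save("numbers", lambda data: storage.update(numbers=data), [1, 2, 3])
result = transformer.load("numbers", lambda: storage["numbers"])
```

The transformer receives the underlying operation as a callable, so it can measure the call without knowing anything about the storage behind it.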