This is the user-facing entry point for the skeletal DataToolkitCore package and the set of useful transformers and plugins provided by DataToolkitCommon, along with a few trivial niceties.
This aims to provide a foundation for robust data management.
For now, this set of packages is at roughly the beta stage of development. No major changes to the core functionality or structure are anticipated, but small expansions to the data-CLI functionality and to the set of transformers and plugins provided by DataToolkitCommon are expected prior to the 1.0 release, and larger changes may occur if there is good reason for them.
[[iris]]
uuid = "3f3d7714-22aa-4555-a950-78f43b74b81c"
description = "Fisher's famous Iris flower measurements"
[[iris.storage]]
driver = "web"
checksum = "k12:cfb9a6a302f58e5a9b0c815bb7e8efb4"
url = "https://raw.githubusercontent.com/scikit-learn/scikit-learn/1.0/sklearn/datasets/data/iris.csv"
[[iris.loader]]
driver = "csv"
args.header = ["sepal_length", "sepal_width", "petal_length", "petal_width", "species_class"]
args.skipto = 2
- DataDeps.jl: Downloading files on demand. Essentially implements the web storage driver along with some of the machinery.
- DataSets.jl: An alternate take on declarative data representation. Focused on filling a gap with JuliaHub’s cloud compute offering; less versatile overall.
- RemoteFiles.jl: Automatically re-downloading files on a schedule. Equivalent to the web storage driver when using the lifetime parameter of the store plugin.
- Announcement thread on Discourse
- JuliaCon 2023 slides and recording