Ingestion

a_git_a edited this page Dec 18, 2023 · 13 revisions

A typical Ingestion job:

  • Extracts data from various sources (HTTP APIs, databases, CSV files, etc.).
  • Does NOT transform the data, apart from formatting the payload so that the target accepts it (e.g. JSON serialization).
  • Loads the data into your preferred ingestion target (database, cloud storage).
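The three steps above can be sketched as a single job step. This is a minimal illustration, not a production job: the inline CSV text, the "people" table name, and the RecordingJobInput class are hypothetical stand-ins (the recorder only mimics the send_object_for_ingestion call shown on this page so the sketch can run without a configured ingestion target), and the run(job_input) entry point is assumed to follow the usual shape of a VDK Python step.

```python
import csv
import io

# Hypothetical source data; a real job would read a file, query a database, or call an API.
CSV_TEXT = "id,name\n1,alice\n2,bob\n"


class RecordingJobInput:
    """Hypothetical stand-in for job_input: records payloads instead of ingesting them."""

    def __init__(self):
        self.sent = []

    def send_object_for_ingestion(self, payload, destination_table):
        self.sent.append((destination_table, payload))


def run(job_input):
    # Extract: read rows from the source as dictionaries keyed by column name.
    reader = csv.DictReader(io.StringIO(CSV_TEXT))
    for row in reader:
        # No transformation: each row is passed on unchanged.
        # Load: send the row to the destination table (table name is an assumption).
        job_input.send_object_for_ingestion(dict(row), "people")


# Local dry run with the recorder in place of the real job_input.
recorder = RecordingJobInput()
run(recorder)
```

Values read from CSV stay as strings here; converting them would be a transformation, which belongs in a later step rather than in ingestion.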

Ingesting data

As usual, it is a one-liner:

Example: send any JSON-able Python object for ingestion

# The payload in this example is a dictionary whose values are JSON-serializable.
job_input.send_object_for_ingestion({'some number': 4098, 'some text': "hi!"}, "name_of_table_that_receives_the_data")
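Since the payload has to be JSON-able, it can help to format values such as timestamps before sending. The sketch below is one possible approach, not part of VDK itself: to_json_able and is_json_able are hypothetical helpers, the "created" field is invented for illustration, and the run(job_input) entry point is assumed to follow the usual shape of a VDK Python step.

```python
import json
from datetime import date, datetime


def is_json_able(payload):
    """Return True if the standard json module can serialize the payload."""
    try:
        json.dumps(payload)
        return True
    except (TypeError, ValueError):
        return False


def to_json_able(record):
    """Hypothetical helper: convert date/datetime values to ISO 8601 strings."""
    return {
        key: value.isoformat() if isinstance(value, (date, datetime)) else value
        for key, value in record.items()
    }


def run(job_input):
    # "created" is an invented field; a raw datetime is not JSON-serializable.
    record = {"some number": 4098, "some text": "hi!", "created": datetime(2023, 12, 18)}
    payload = to_json_able(record)
    if not is_json_able(payload):
        raise ValueError("payload still contains values that are not JSON-able")
    job_input.send_object_for_ingestion(payload, "name_of_table_that_receives_the_data")
```

This stays within the "formatting only" rule: the values are re-encoded for the target, not changed in meaning.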

For real-life production scenarios, see the examples below.

Ingestion examples:

Ingesting data from REST API into Database
Ingesting data from DB into Database
Ingesting local CSV file into Database
Incremental ingestion using Job Properties
Ingesting data from an authenticated REST API using Secrets

Jupyter Ingestion Tutorial

VDK Ingestion Tutorial with Jupyter Notebooks

Videos

▶️ Data Ingestion Intro
▶️ Incremental Ingestion

All VDK Examples

All VDK examples can be found here

➡️ Next section: Transformation
