
req_snowflake

NOTE: THIS DRIVER/CONNECTOR IS NOT OFFICIALLY AFFILIATED WITH SNOWFLAKE, NOR DOES IT HAVE OFFICIAL SUPPORT FROM THEM.

An Elixir driver for Snowflake, the cloud data platform.

It supports both pure-Elixir JSON decoding and decoding Arrow files via snowflake_arrow_elixir, a Rust-backed library that decodes the Arrow streaming format into Elixir terms.

Usage

Mix.install([
  {:req_snowflake, github: "joshuataylor/req_snowflake"}
])

# With plain string query
Req.new()
|> ReqSnowflake.attach(
  username: "rosebud",
  password: "hunter2",
  account_name: "foobar",
  region: "us-east-1",
  warehouse: "compute_wh", # optional
  role: "myrole", # optional
  database: "mydb", # optional
  schema: "myschema" # optional,
  session_parameters: %{} # Passing in session parameters from
  parallel_downloads: 10 # optional, but recommended. Defaults to 5 (what the other connectors default to).
)
|> Req.post!(snowflake_query: "select L_ORDERKEY, L_PARTKEY from snowflake_sample_data.tpch_sf1.lineitem limit 2")
|> Map.get(:body)
#=>
# %ReqSnowflake.Result{
#   columns: ["L_ORDERKEY", "L_PARTKEY"],
#   total_rows: 2,
#   rows: [[3_000_001, 14406], [3_000_002, 34422]],
#   success: true
# }

# With query parameters for inserting
Req.new()
|> ReqSnowflake.attach(
  username: "rosebud",
  password: "hunter2",
  account_name: "foobar",
  region: "us-east-1",
  warehouse: "compute_wh", # optional
  role: "myrole", # optional
  database: "mydb", # optional
  schema: "myschema" # optional
)
|> Req.post!(
  query: "INSERT INTO \"foo\".\"bar\".\"baz\" (\"hello\") VALUES (?)",
  bindings: %{"1" => %{type: "TEXT", value: "xxx"}}
)
#=>
# %ReqSnowflake.Result{
#   columns: ["number of rows inserted"],
#   total_rows: 1,
#   rows: [[1]],
#   success: true
# }
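
The bindings map uses 1-based positional string keys mapping to %{type: ..., value: ...} maps. As a rough sketch only (the table and column names below are placeholders, not part of the library), binding two parameters looks like this:

# Sketch: "foo"."bar"."baz" and its columns are placeholders.
Req.new()
|> ReqSnowflake.attach(
  username: "rosebud",
  password: "hunter2",
  account_name: "foobar",
  region: "us-east-1"
)
|> Req.post!(
  snowflake_query: "INSERT INTO \"foo\".\"bar\".\"baz\" (\"hello\", \"world\") VALUES (?, ?)",
  bindings: %{
    "1" => %{type: "TEXT", value: "xxx"},
    "2" => %{type: "TEXT", value: "yyy"}
  }
)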

What this is

It uses the Snowflake REST API to communicate with Snowflake, pinning an earlier client version so that results are returned as JSON (with support for Arrow when using snowflake_arrow). The REST API is used by the Python, Golang, NodeJS and other connectors to send requests to Snowflake, so it is stable and changes are communicated.

This library does not use the Snowflake SQL API, which is limited in its implementation and features.

Right now the library doesn't support MFA, so you'll need to either use private key auth or connect using a username & password. Private key auth is highly recommended, as it makes credential rotation easier.

One major note when using Ecto: you will need to enable Snowflake's QUOTED_IDENTIFIERS_IGNORE_CASE setting, documented here: https://docs.snowflake.com/en/sql-reference/identifiers-syntax.html#third-party-tools-and-case-insensitive-identifier-resolution

Note that this can be set at the account level or, if needed, at the session level, as shown below.
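
For example, a minimal sketch of setting it for a single session via session_parameters (the connection details are placeholders):

# Sketch: enable QUOTED_IDENTIFIERS_IGNORE_CASE for this session only.
Req.new()
|> ReqSnowflake.attach(
  username: "rosebud",
  password: "hunter2",
  account_name: "foobar",
  region: "us-east-1",
  session_parameters: %{QUOTED_IDENTIFIERS_IGNORE_CASE: true}
)
|> Req.post!(snowflake_query: "select 1")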

Features

  • Running queries and inserts.
  • JSON row decoding, using Flow for much faster parallel decoding. Using jiffy is also recommended; benchmarking shows a 2x speedup due to the size of the JSON files Snowflake returns.
  • Arrow row decoding, using snowflake_arrow (see the sketch after this list).
  • Passing async queries and getting back a query ID, or polling for results.
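
As a sketch of the Arrow path (the snowflake_arrow dependency source below is an assumption, so check that project for the canonical one; the connection details are placeholders):

Mix.install([
  {:req_snowflake, github: "joshuataylor/req_snowflake"},
  # Assumed dependency source for the Arrow decoder.
  {:snowflake_arrow, github: "joshuataylor/snowflake_arrow"}
])

Req.new()
|> ReqSnowflake.attach(
  username: "rosebud",
  password: "hunter2",
  account_name: "foobar",
  region: "us-east-1",
  arrow: true # decode result sets from Arrow instead of JSON
)
|> Req.post!(snowflake_query: "select 1")
|> Map.get(:body)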

Options

There are a lot of options you can pass in, including Snowflake session parameters, as documented below; a sketch combining several of them follows the list.

  • snowflake_query string required

    Your Snowflake query, e.g. select L_ORDERKEY, L_PARTKEY from snowflake_sample_data.tpch_sf1.lineitem limit 2.

  • username string required

    Your Snowflake username.

  • password string required

    Your Snowflake password.

  • account_name string required

    Your account name; this is found before the region name in the URL. For https://abc1234.us-east-1.snowflakecomputing.com this would be abc1234.

  • region string required

    Your Snowflake region; this is found after your account name in the URL (us-east-1 in the example above).

  • arrow (boolean) optional

    Whether or not to use Arrow. You must have snowflake_arrow included in your project for this to work.

  • cache_token optional

    Caches the login token between queries for up to 10 minutes, the standard login token lifetime for Snowflake. If you change a connection parameter (other than the query), you will be logged in again.

  • warehouse string optional

    The warehouse to use. If none is provided, the user's default warehouse is used.

  • role string optional

    The role to use. If none is provided, the user's default role is used.

  • database string optional

    The database to use. If none is provided, the user's default database is used.

  • schema string optional

    The schema to use. If none is provided, the user's default schema is used.

  • application_name string optional

    The application name to pass. By default, no application name is sent.

  • bindings map optional

    Query bind parameters, as a map of 1-based positional string keys to %{type: ..., value: ...} maps (see the insert example above).

  • session_parameters map optional

    You can pass any session parameters from https://docs.snowflake.com/en/sql-reference/parameters.html.

    Example: session_parameters: %{ROWS_PER_RESULTSET: 50} will limit each result set to 50 rows.

  • parallel_downloads integer optional

    How many result chunks to download from S3 in parallel. Defaults to 5, matching the other connectors.

  • async boolean optional

    Runs the query in async mode, returning the query ID.

  • async_poll boolean optional

    Runs the query in async mode, then polls for the result every 5000ms (unless overridden by async_poll_interval).

  • async_poll_interval integer optional

    The polling interval in milliseconds when running in async mode. Defaults to 5000.

  • async_poll_timeout integer optional

    How many times to poll for the result before giving up.

  • download_chunks boolean optional

    Whether to download the result chunks or just return the base64-encoded data.

  • table boolean optional

    If true, results are returned with the expectation that you will process them with the table library. Defaults to false. kino_db sets this to true, as it uses Table by default.

    If you are dealing with a lot of data, using Table is highly recommended: you can stream chunks out as you need them, saving memory, CPU and bandwidth.

  • cache_results boolean optional

    Whether to cache downloaded chunks locally, so that re-iterating over the same dataset does not redownload the data. Defaults to false, but kino_db enables it so that paging does not redownload chunks.

  • json_library module optional

    When decoding JSON, jiffy has been shown to be about 2x faster than Jason and to use less memory for larger JSON blobs, as Snowflake can return large JSON files. Examples: json_library: Jason or json_library: :jiffy (:jiffy is an atom because it's an Erlang library). Defaults to JSON.
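
To illustrate how several of these options fit together, here is a minimal sketch of an async query that polls for its result, decoding with jiffy (connection details are placeholders; as in the usage example above, the options are passed to attach):

Req.new()
|> ReqSnowflake.attach(
  username: "rosebud",
  password: "hunter2",
  account_name: "foobar",
  region: "us-east-1",
  json_library: :jiffy, # use jiffy for faster decoding of large JSON responses
  parallel_downloads: 10, # download up to 10 result chunks at once
  async_poll: true, # run async, then poll for the result
  async_poll_interval: 2_000, # poll every 2 seconds instead of the 5000ms default
  async_poll_timeout: 30 # give up after 30 polls
)
|> Req.post!(snowflake_query: "select * from snowflake_sample_data.tpch_sf1.lineitem limit 100")
|> Map.get(:body)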

Table Integration

Using table makes it easy to manipulate and stream over the data. If you download all the chunks at once over millions of rows, you can end up using a lot of memory.

With table, you can enumerate over the data and only the relevant chunks will be downloaded and decoded.

To get the first 200 rows, you could do the following:

Req.new()
|> ReqSnowflake.attach(
  # your details here!
)
|> Req.post!(
  snowflake_query: "select * from foo.bar.baz"
)
|> Map.get(:body)
|> Table.to_rows()
|> Enum.slice(0, 200)

Which results in the following:

[
%{
  "ROW_NUMBER" => 688156,
  "SF_ARRAY" => nil,
  "SF_BOOLEAN" => false,
  "SF_DATE" => ~D[2024-04-19],
  "SF_DECIMAL_38_2" => 2146.89,
  "SF_FLOAT" => nil,
  "SF_FLOAT_TWO_PRECISION" => nil,
  "SF_INTEGER" => 8562093803,
  "SF_OBJECT" => nil,
  "SF_TIMESTAMP" => ~N[2025-04-20 07:31:58.064000],
  "SF_TIMESTAMP_LTZ" => nil,
  "SF_TIMESTAMP_NTZ" => nil,
  "SF_VARCHAR" => "ylPPdH0leSF3f5lK9kEC",
  "SF_VARIANT_JSON" => "{\n  \"key_eV6kL4tVYXLFrEyyDNol\": true\n}"
},
...
]
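
Because only the chunks you touch are downloaded and decoded, you can also stream over the rows and stop early. A minimal sketch, reusing the setup above (the column name is taken from the example output):

Req.new()
|> ReqSnowflake.attach(
  # your details here!
)
|> Req.post!(snowflake_query: "select * from foo.bar.baz")
|> Map.get(:body)
|> Table.to_rows()
|> Stream.map(fn row -> row["SF_VARCHAR"] end)
|> Enum.take(100)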

kino_db

kino_db integration is supported out of the box, allowing you to connect to an account, run queries and slice the data. Thanks to slicing, only the chunks that are needed are downloaded, so results appear in the table very quickly.

Short term roadmap

  • Add this to db_connection_snowflake as a generic Snowflake library for db_connection.
  • Add this to ecto_snowflake as a Snowflake library for Ecto.

Medium term roadmap

  • Add support for MFA
  • Add support for private key auth
  • Add support for telemetry

Thanks

I just want to thank the open source community, especially db_connection/ecto/ecto_sql/postgrex for being amazing and for letting me copy most of the decoding code from them. I also want to thank @wojtekmach for the awesome work on req, and for the help converting this library to req.

Thanks to @lewisvrobinson for the logo.

License

Copyright (C) 2022 Josh Taylor

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
