Releases: spiceai/spiceai
v0.13.2-alpha
Spice v0.13.2-alpha (June 3, 2024)
The v0.13.2-alpha release is focused on quality and stability with improvements to federated query push-down, telemetry, and query history.
Highlights
- Filesystem Data Connector: Adds the Filesystem Data Connector for directly using files as data sources.
- Federated Query Push-Down: Improved stability and schema compatibility for federated queries.
- Enhanced Telemetry: Runtime Metrics now include last update time for accelerated datasets, count of refresh errors, and new metrics for query duration and failures.
- Query History: Enabled query history logging for Arrow Flight queries in addition to HTTP queries.
Contributors
What's Changed
- Update ROADMAP.md May 27, 2024 by @lukekim in #1535
- update helm chart version and use v0.13.1-alpha by @y-f-u in #1536
- version correction in v0.13.1 release note by @y-f-u in #1538
- update version to v0.14.0-alpha by @y-f-u in #1539
- Update `spice_cloud` - connect to cloud api by @ewgenius in #1523
- Update spice_cloud extension params, and remove logging by @ewgenius in #1541
- Update MSRV to 1.78 and remove unused Rust Version parameter in CI by @phillipleblanc in #1540
- Improve `llm` UX in `spicepod.yaml` by @Jeadie in #1545
- Store local runtime metrics in Timestamp with nanoseconds precision and UTC time by @ewgenius in #1548
- Object store metadata Table provider by @Jeadie in #1518
- Remove clickhouse password requirement by @Sevenannn in #1547
- pretty print loaded rows number by @y-f-u in #1553
- Fix UNION ALL federated push down by @phillipleblanc in #1550
- Update mistral, fix bugs and improve local file DX by @Jeadie in #1552
- Cast `runtime.metrics` schema, if remote (spiceai) data connector provided by @ewgenius in #1554
- Use proper MySQL dialect during federation push-down by @phillipleblanc in #1555
- parallel load dataset when starting up by @y-f-u in #1551
- fix linter warning on Scanf return value by @y-f-u in #1556
- Update spice cloud connect api endpoint by @ewgenius in #1557
- Create new HTTP endpoint to create embeddings. by @Jeadie in #1558
- Query History support for Flight API by @sgrebnov in #1549
- Don't cache queries for runtime tables by @sgrebnov in #1561
- Fix schema incompatibility on federated push-down queries by @phillipleblanc in #1560
- move 'embeddings' to top-level concept in spicepod.yaml by @Jeadie in #1564
- `object_store` table provider for UTF8 data formats by @Jeadie in #1562
- Improve connectivity for JDBC clients, like Tableau by @sgrebnov in #1563
- Enable datasets from local filesystem by @Jeadie in #1584
- Adds benchmarking tests for Spice by @phillipleblanc in #1577
- Push down correct timestamp expr to SQLite, add binary type mapping by @mach-kernel in #1566
- Add `query_duration_seconds` and `query_failures` metrics by @sgrebnov in #1575
- Use `/app` as a default workdir in spiceai docker image by @ewgenius in #1586
- Add support for both file:// and file:/ by @Jeadie in #1587
- put load_datasets as the latest step along with start_servers by @y-f-u in #1559
- Embedding columns (from embedding providers) are now run inside datafusion plans. by @Jeadie in #1576
- Support BinaryArray in DuckDB accelerations by @phillipleblanc in #1595
- Add cache header to Flight API and Spice REPL indicator by @sgrebnov in #1591
- Add accelerated datasets refresh metrics by @sgrebnov in #1589
- update the error when starting spice sql with no runtime to be actionable by @digadeesh in #1597
- add odbc integration test by @y-f-u in #1590
- Fix bug in instantiating `EmbeddingConnector` by @Jeadie in #1592
- readme change to reflect new cli output by @y-f-u in #1602
- Update version v0.13.2 by @ewgenius in #1604
- Roadmap changes Jun 3, 2024 by @lukekim in #1609
Full Changelog: v0.13.1-alpha...v0.13.2
v0.13.1-alpha
Spice v0.13.1-alpha (May 27, 2024)
The v0.13.1-alpha release of Spice is a minor update focused on stability, quality, and operability. Query result caching provides protection against bursts of queries, and schema support for datasets has been added for logical grouping. An issue where Refresh SQL predicates were not pushed down to underlying data sources has been resolved, along with improved Acceleration Refresh logging.
Highlights in v0.13.1-alpha
- Results Caching: Introduced query results caching to handle bursts of requests and support caching of non-accelerated results, such as refresh data returned on zero results. Results caching is enabled by default with a `1s` item time-to-live (TTL).
- Query History Logging: Recent queries are now logged in the new `spice.runtime.query_history` dataset with a default retention of 24 hours. Query history is initially enabled for HTTP queries only (not Arrow Flight queries).
- Dataset Schemas: Added support for dataset schemas, allowing logical grouping of datasets by separating the schema name from the table name with a `.`. E.g.

      datasets:
        - from: mysql:app1.identities
          name: app.users
        - from: postgres:app2.purchases
          name: app.purchases

  In this example, queries against `app.users` will be federated to `app1.identities`, and `app.purchases` will be federated to `app2.purchases`.
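The results caching behavior described above (serve a stored result while it is younger than the item TTL, otherwise run the query again) can be sketched in a few lines. This is an illustrative model only, not the Spice runtime's implementation: the class name `TtlResultsCache`, the `get_or_compute` method, and the injectable `clock` are all hypothetical, and only the 1 second default TTL comes from the release notes.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TtlResultsCache:
    """Hypothetical in-memory results cache keyed by the SQL text."""

    def __init__(self, ttl_seconds: float = 1.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.ttl_seconds = ttl_seconds  # 1s mirrors the default item TTL
        self._clock = clock             # injectable so tests can control time
        self._items: Dict[str, Tuple[float, Any]] = {}

    def get_or_compute(self, sql: str,
                       run_query: Callable[[str], Any]) -> Tuple[Any, bool]:
        """Return (result, was_cache_hit) for the given query."""
        now = self._clock()
        entry = self._items.get(sql)
        if entry is not None and now - entry[0] < self.ttl_seconds:
            return entry[1], True       # still fresh: skip the data source
        result = run_query(sql)         # expired or missing: execute the query
        self._items[sql] = (now, result)
        return result, False
```

With a short TTL like this, a burst of identical queries arriving within the same second reaches the data source once, while results are never more than about one TTL stale.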
Contributors
@y-f-u
@Jeadie
@sgrebnov
@ewgenius
@phillipleblanc
@lukekim
@gloomweaver
@Sevenannn
New in this release
- Add more type support on mysql connector by @y-f-u in #1449
- Add in-memory caching support for Arrow Flight queries by @sgrebnov in #1450
- Fix the table reference to use the full table reference, not just the table by @phillipleblanc in #1460
- Make `file_format` parameter required for S3/FTP/SFTP connector by @ewgenius in #1455
- Add more verbose logging when acceleration refresh update is finished by @y-f-u in #1453
- Fix snowflake dataset path when using federation query by @y-f-u in #1474
- Update cargo to use spiceai datafusion fork by @y-f-u in #1475
- Enable in-memory results caching by default by @sgrebnov in #1473
- Add basic integration test for MySQL federation by @phillipleblanc in #1477
- Update results_cache config names per final spec by @sgrebnov in #1487
- Add DuckDB quickstart to E2E tests by @lukekim in #1461
- Add X-Cache header for http queries by @sgrebnov in #1472
- Add telemetry for in-memory caching by @sgrebnov in #1456
- Pin Git dependencies to a specific commit hash by @phillipleblanc in #1490
- Detect `file_format` from dataset path by @ewgenius in #1489
- Add `file_format` to helm chart sample dataset by @ewgenius in #1493
- Improve duckdb data connector error messages by @Sevenannn in #1486
- Add `file_format` prompt for s3 and ftp datasets in Dataset Configure CLI if no extension detected by @ewgenius in #1494
- Add llms to the spicepod definition and use throughout by @Jeadie in #1447
- Fix duckdb acceleration converting null into default values. by @y-f-u in #1500
- Separate runtime Dataset from spicepod Dataset by @phillipleblanc in #1503
- Duckdb e2e test OSX support by @y-f-u in #1505
- Use TableReference for dataset name by @phillipleblanc in #1506
- Tweak Results Cache naming and output by @lukekim in #1509
- Fix refresh_sql not properly passing down filters by @phillipleblanc in #1510
- Allow datasets to specify a schema by @phillipleblanc in #1507
- Cache invalidation for accelerated tables by @sgrebnov in #1498
- Improve spark data connector error messages by @Sevenannn in #1497
- Parse postgres table schema from prepare statement to support empty tables by @ewgenius in #1445
- Improve clarity of README and add FAQ by @lukekim in #1512
- Use binary data transfer for ftp by @gloomweaver in #1517
- Add support for time64 for SQL insertion statement by @y-f-u in #1519
- Add Spice Extensions PoC by @ewgenius in #1476
- Add results cache metrics, pod and quantile filters to Grafana dashboard by @sgrebnov in #1513
- Add unit tests for results caching utils by @sgrebnov in #1514
- Add E2E tests for results caching by @sgrebnov in #1515
- Pass table_reference full string into spark_session table so it can query across schemas or catalogs by @y-f-u in #1521
- Trace on debug level for tables in `runtime` schema by @ewgenius in #1524
- Update SparkSessionBuilder::remote and update spark fork hash by @Sevenannn in #1495
- Fix federation push-down for datasets with schemas by @phillipleblanc in #1526
- Store history of queries in 'spice.runtime.query_history' by @Jeadie in #1501
- Disable cache for system queries by @sgrebnov in #1528
- Register runtime tables with runtime schema by @phillipleblanc in #1532
- Fix acknowledgments workflow to include all cargo features by @Jeadie in #1531
Full Changelog: v0.13.0-alpha...v0.13.1-alpha
v0.13.0-alpha
Spice v0.13-alpha (May 20, 2024)
The v0.13.0-alpha release significantly improves federated query performance and efficiency with Query Push-Down. Query push-down allows SQL queries to be directly executed by underlying data sources, such as joining tables using the same data connector. Query push-down is supported for all SQL-based and Arrow Flight data connectors. Additionally, runtime metrics, including query duration, are now collected and accessible in the `spice.runtime.metrics` table. This release also includes a new FTP/SFTP data connector and improved CSV support for the S3 data connector.
Highlights
- Federated Query Push-Down (#1394): All SQL and Arrow Flight data connectors support federated query push-down.
- Runtime Metrics (#1361): Runtime metric collection can be enabled using the `--metrics` flag and accessed through the `spice.runtime.metrics` table.
- FTP & SFTP data connector (#1355) (#1399): Added support for using FTP and SFTP as data sources.
- Improved CSV support (#1411) (#1414): S3/FTP/SFTP data connectors support CSV files with expanded CSV options.
Contributors
What's Changed
- Remove milestones from Enhancement template by @lukekim in #1373
- Update version.txt and Cargo.toml to 0.13.0-alpha by @sgrebnov in #1375
- Helm chart for Spice v0.12.2-alpha by @sgrebnov in #1374
- Add `release` cargo feature to docker builds by @ewgenius in #1377
- FTP connector by @gloomweaver in #1355
- Provide ability to specify timeout for s3 data connector by @gloomweaver in #1378
- clickhouse-rs use tag instead of branch by @gloomweaver in #1313
- Store runtime metrics in `spice.runtime.metrics` table by @ewgenius in #1361
- Update bug_report.md to include the kind/bug label by @digadeesh in #1381
- Remove redundant [refresh] in log by @lukekim in #1384
- Implement federation for DuckDB Data Connector (POC) by @phillipleblanc in #1380
- Update wording for spice cloud connection by @ewgenius in #1386
- fix dataset refreshing status by @y-f-u in #1387
- clickhouse friendly error by @y-f-u in #1388
- Initial work for NQL crate and API by @Jeadie in #1366
- Fully implement federation for all SqlTable-based Data Connectors by @phillipleblanc in #1394
- use df logical plan to query latest timestamp when refreshing incrementally by @y-f-u in #1393
- Refactor datafusion.write_data to use table reference by @ewgenius in #1402
- Add federation to FlightTable based DataConnectors by @phillipleblanc in #1401
- SFTP Data Connector by @gloomweaver in #1399
- Use GPT3.5 for NSQL task by @Jeadie in #1400
- Update ROADMAP May 16, 2024 by @lukekim in #1405
- Add ftp/sftp connector to readme by @gloomweaver in #1404
- Add FlightSQL federation provider by @phillipleblanc in #1403
- Refactor runtime metrics to use localhost accelerated table by @ewgenius in #1395
- Use JSON response in OpenAI, text -> SQL model by @Jeadie in #1407
- support more common csv options by @y-f-u in #1411
- add a TLS error message in data connector and implement it for clickhouse by @y-f-u in #1413
- Add CSV to s3 data formats by @gloomweaver in #1414
- fix up dependencies now 0.5.0 disappeared by @Jeadie in #1417
- Add NSQL to FlightRepl by @Jeadie in #1409
- Update Cargo.lock by @phillipleblanc in #1418
- Enable spice.ai replication for `runtime.metrics` table by @ewgenius in #1408
- Restructure the runtime struct to make it easier to test by @phillipleblanc in #1420
- Make it easier to construct an App programatically by @phillipleblanc in #1421
- Add an integration test for federation by @phillipleblanc in #1426
- wait 2 seconds for the status to turn ready in refreshing status test by @y-f-u in #1419
- Add functional tests for federation push-down by @phillipleblanc in #1428
- Enable push-down federation by default by @phillipleblanc in #1429
- Add guides and examples about error handling by @ewgenius in #1427
- Add LRU cache support for http-based queries by @sgrebnov in #1410
- Update README.md - Remove bigquery from tablet of connectors by @digadeesh in #1434
- Update acknowledgements by @github-actions in #1433
- CLI wording and logs change reflected on readme by @y-f-u in #1435
- Add databricks_use_ssl parameter by @Sevenannn in #1406
- Update helm version and use v0.13.0-alpha by @Jeadie in #1436
- Don't include feature 'llms/candles' by default by @Jeadie in #1437
- Correctly map NullBuilder for Null arrow types by @phillipleblanc in #1438
- Propagate object store error by @gloomweaver in #1415
Full Changelog: v0.12.2-alpha...v0.13.0-alpha
v0.12.2-alpha
Spice v0.12.2-alpha (May 13, 2024)
The v0.12.2-alpha release introduces data streaming and key-pair authentication for the Snowflake data connector, enables general `append` mode data refreshes for time-series data, improves connectivity error messages, adds nested folders support for the S3 data connector, and exposes nodeSelector and affinity keys in the Helm chart for better Kubernetes management.
Highlights
- Improved Connectivity Error Messages: Error messages provide clearer, actionable guidance for misconfigured settings or unreachable data connectors.
- Snowflake Data Connector Improvements: Enables data streaming by default and adds support for key-pair authentication in addition to passwords.
- API for Refresh SQL Updates: Update dataset Refresh SQL via API.
- Append Data Refresh: Append mode data refreshes for time-series data are now supported for all data connectors. Specify a dataset `time_column` with `refresh_mode: append` to only fetch data more recent than the latest local data.
- Docker Image Update: The `spiceai/spiceai:latest` Docker image now includes the ODBC data connector. For a smaller footprint, use `spiceai/spiceai:latest-slim`.
- Helm Chart Improvements: `nodeSelector` and `affinity` keys are now supported in the Helm chart for improved Kubernetes deployment management.
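The append data refresh highlight above boils down to a watermark comparison on the dataset's time column: find the newest timestamp already held locally, then take only source rows that are strictly newer. The sketch below models that idea on plain Python lists. The function names and the row-as-dict representation are assumptions made for illustration; the runtime itself works on query plans against the data source, not on in-memory lists.

```python
from datetime import datetime
from typing import Any, Dict, List, Optional

Row = Dict[str, Any]


def latest_timestamp(local_rows: List[Row], time_column: str) -> Optional[datetime]:
    """Newest value of the time column in the local (accelerated) data, if any."""
    return max((row[time_column] for row in local_rows), default=None)


def append_refresh(local_rows: List[Row], source_rows: List[Row],
                   time_column: str) -> List[Row]:
    """Append only source rows that are more recent than the latest local row."""
    watermark = latest_timestamp(local_rows, time_column)
    new_rows = [
        row for row in source_rows
        if watermark is None or row[time_column] > watermark
    ]
    return local_rows + new_rows
```

When the local data is empty there is no watermark, so the first refresh loads everything; later refreshes fetch only the tail.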
Breaking Changes
- API to trigger accelerated dataset refreshes has changed from `POST /v1/datasets/:name/refresh` to `POST /v1/datasets/:name/acceleration/refresh` to be consistent with the `Spicepod.yaml` structure.
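For clients that still hold the old refresh path, the change above is a mechanical rewrite of the URL path. The helper below is a hypothetical migration aid, not part of any Spice SDK; only the two path shapes are taken from the release notes.

```python
def migrate_refresh_path(path: str) -> str:
    """Rewrite /v1/datasets/:name/refresh to /v1/datasets/:name/acceleration/refresh.

    Paths that do not match the old four-segment shape are returned unchanged.
    """
    parts = path.strip("/").split("/")
    is_old_shape = (
        len(parts) == 4
        and parts[0] == "v1"
        and parts[1] == "datasets"
        and parts[3] == "refresh"
    )
    if is_old_shape:
        dataset_name = parts[2]
        return f"/v1/datasets/{dataset_name}/acceleration/refresh"
    return path
```

Matching on path segments rather than on a string suffix keeps the rewrite correct even for a dataset that happens to be named `acceleration`.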
Contributors
What's Changed
- Fix list type support in spark connect by @y-f-u in #1341
- Add nested folder support in S3 Parquet connector by @y-f-u in #1342
- Improves S3 connector using DataFusion ListingTable table provider by @y-f-u in #1326
- Update ROADMAP May 6, 2024 by @lukekim in #1315
- List flightsql and snowflake as supported connectors in README.md by @sgrebnov in #1317
- Helm chart for v0.12.1-alpha by @ewgenius in #1323
- Read sqlite_file param and use it as path by @Sevenannn in #1309
- Compile spiced with `release` feature in docker image by @ewgenius in #1324
- Add support for Snowflake key-pair authentication by @sgrebnov in #1314
- Wrap postgres errors in common DataConnectorError by @ewgenius in #1327
- Fix TPCH tests runner by @sgrebnov in #1330
- Spice CLI support for Snowflake key-pair auth by @sgrebnov in #1325
- sql_provider_datafusion: Support TimestampMicrosecond, Date32, Date64 by @mach-kernel in #1329
- Resolve dangling reference for SQLite by @Sevenannn in #1312
- Select columns from Spark Dataframe according to projected_schema by @Sevenannn in #1336
- Expose nodeselector and affinity keys in Helm chart by @mach-kernel in #1338
- Use streaming for Snowflake queries by @sgrebnov in #1337
- Publish ODBC images by @mach-kernel in #1271
- Include Postgres acceleration engine to types support tests by @sgrebnov in #1343
- Refactor dataconnector providers getters to return common `DataConnectorResult` and `DataConnectorError` by @ewgenius in #1339
- s3 csv support to validate the listing table extensibility by @y-f-u in #1344
- Move model code into separate, feature-flagged crate by @Jeadie in #1335
- Initial setup for federated queries by @phillipleblanc in #1350
- Refactor dbconnection errors, and catch invalid postgres table name case by @ewgenius in #1353
- Rename default datafusion catalog to "spice", add internal "spice.runtime" schema by @ewgenius in #1359
- Add API to set Refresh SQL for accelerated table by @sgrebnov in #1356
- Set next version to v0.12.2 by @phillipleblanc in #1367
- Upgrade to DataFusion 38 by @phillipleblanc in #1368
- Incremental append based on time column by @y-f-u in #1360
- Update README.md to include correct output when running show tables from quickstart by @digadeesh in #1371
Full Changelog: v0.12.1-alpha...v0.12.2-alpha
v0.12.1-alpha
Spice v0.12.1-alpha (May 6, 2024)
The v0.12.1-alpha release introduces a new Snowflake data connector, support for UUID and TimestampTZ types in the PostgreSQL connector, and improved error messages across all data connectors. The Clickhouse data connector enables data streaming by default. The public SQL interface now restricts DML and DDL queries. Additionally, accelerated tables now fully support NULL values, and issues with schema conversion in these tables have been resolved.
Highlights
- Snowflake Data Connector: Initial support for Snowflake as a data source.
- Clickhouse Data Streaming: Enables data streaming by default, eliminating in-memory result collection.
- Read-only SQL Interface: Disables DML (INSERT/UPDATE/DELETE) and DDL (CREATE/ALTER TABLE) queries for improved data source security.
- Error Message Improvements: Improved the error messages for commonly encountered issues with data connectors.
- Accelerated Tables: Supports NULL values across all data types and fixes schema conversion errors for consistent type handling.
Contributors
What's Changed
- Add schema types check for query result by @sgrebnov in #1212
- helm chart for v0.12.0-alpha by @y-f-u in #1235
- Update acknowledgements by @github-actions in #1232
- Bump spiceai version to v0.12.1-alpha by @ewgenius in #1239
- Update ROADMAP.md - remove v0.12.0-alpha by @ewgenius in #1241
- Raise errors in InsertBuilder by @Jeadie in #1242
- Update endgame template by @ewgenius in #1240
- Add E2E tests for acceleration engines types support by @sgrebnov in #1218
- Stream blocks to arrow by @gloomweaver in #1203
- Update enhancement.md to include a checklist item have a release notes entry for each enhancement. by @digadeesh in #1245
- arrow_sql_gen data column conversion by @Sevenannn in #1230
- Implement the Localhost Data Connector & fix DoPut by @phillipleblanc in #1266
- Update postgres parameter check by @Sevenannn in #1244
- Record batch casting to fix SQLite data type issues by @y-f-u in #1261
- typo fix on Decimal in postgres arrow_sql_gen by @y-f-u in #1277
- Move verify_schema to arrow_tools by @phillipleblanc in #1284
- Support UUID and TimestampTZ type for Postgres as Data Connector by @ahirner & @y-f-u in #1276
- Fix linter warnings by @ewgenius in #1286
- Add Snowflake data connector by @sgrebnov in #1278
- Add Snowflake login support (username and password) by @sgrebnov in #1272
- convert timestamp properly in sql gen by @y-f-u in #1291
- Add if not exists clause to create statement on when creating a table using duckdb acceleration. by @digadeesh in #1290
- Disable DML & DDL queries in the public SQL interface by @phillipleblanc in #1294
- Refactor duckdb to properly set access_mode for connection by @ewgenius in #1285
- do not insert batch for sqlite and postgres if no records in the record batch by @y-f-u in #1293
- Postgres - add custom error message for invalid error table by @ewgenius in #1295
- SQLite/Accelerators handle null values by @gloomweaver in #1298
- Add command to attach to running process by @gloomweaver in #1297
- Use the `GITHUB_TOKEN` environment variable in the installation script, if available, to avoid rate limiting in CI workflows by @ewgenius in #1302
- Fix unsupported SSL mode options for PostgreSQL connection string by @ewgenius in #1300
- Add CLI cmd `spice login spark` by @phillipleblanc in #1303
- Check only the latest published release to avoid installing pre-release versions by @ewgenius in #1301
- Postgres data connector - handle invalid host/port and username/password errors by @ewgenius in #1292
- Fix the panic on bad clickhouse connection by @phillipleblanc in #1306
- Improve Snowflake Data Connector by @sgrebnov in #1296
Full Changelog: v0.12.0-alpha...v0.12.1-alpha
v0.12-alpha
Spice v0.12-alpha (Apr 29, 2024)
The v0.12-alpha release introduces Clickhouse and Apache Spark data connectors, adds support for limiting refresh data periods for temporal datasets, and includes upgraded Spice Client SDKs compatible with Spice OSS.
Highlights
- Clickhouse data connector: Use Clickhouse as a data source with the `clickhouse:` scheme.
- Apache Spark Connect data connector: Use Apache Spark Connect connections as a data source using the `spark:` scheme.
- Refresh data window: Limit accelerated dataset data refreshes to a specified window, configured as a duration from now, for faster and more efficient refreshes.
- ODBC data connector: Use ODBC connections as a data source using the `odbc:` scheme. The ODBC data connector is currently optional and not included in default builds. It can be conditionally compiled using the `odbc` cargo feature when building from source.
- Spice Client SDK Support: The official Spice SDKs have been upgraded with support for Spice OSS.
Breaking Changes
- Refresh interval: The `refresh_interval` acceleration setting has been changed to `refresh_check_interval` to make it clearer that it is the check interval rather than the data interval.
Contributors
- @phillipleblanc
- @Jeadie
- @ewgenius
- @sgrebnov
- @y-f-u
- @lukekim
- @digadeesh
- @gloomweaver
- @edmondop
- @mach-kernel
New Contributors
- Thanks to @mach-kernel who made their first contribution in #1204 by adding the ODBC data connector!
What's Changed
- Update helm version by @Jeadie in #1167
- Handle and trace errors in secret stores by @ewgenius in #1149
- bump the release versions to 0.12.0 by @y-f-u in #1171
- Don't fail acknowledgments flow if no changes detected by @ewgenius in #1170
- Allow Spice CLI to control runtime installation on Windows by @sgrebnov in #1173
- Allow `SELECT count(*)` for Sqlite Data Accelerator by @sgrebnov in #1166
- add refresh_period param in acceleration by @y-f-u in #1180
- Properly support Spark Connect filter pushdown by @phillipleblanc in #1186
- Avoid rate-limiting on arduino/setup-protoc@v3 by @phillipleblanc in #1189
- Clickhouse DataConnector base implementation by @gloomweaver in #1168
- rename refresh_interval to refresh_check_interval by @y-f-u in #1190
- Fix timestamp & add support for Decimal to Databricks/Spark by @phillipleblanc in #1194
- Convert temporal column and refresh period to datafusion expr by @y-f-u in #1187
- Hot reload accelerated table on dataset update by @ewgenius in #1195
- Upgrade DataFusion to 37.1 & DuckDB to 10.2 by @phillipleblanc in #1200
- Update version.txt for 0.11.2 release by @digadeesh in #1199
- Clickhouse E2E by @gloomweaver in #1193
- Clickhouse: fix darwin ci pipeline by @gloomweaver in #1201
- Add table_type to `show tables` in Spice SQL & update next version to `v0.12.0-alpha` by @phillipleblanc in #1206
- print WARN if time_column does not exists in federated schema by @y-f-u in #1207
- Add FallbackOnZeroResultsScanExec for executing an input ExecutionPlan and optionally falling back to a TableProvider.scan() if the input has zero results by @phillipleblanc in #1196
- Clickhouse refactor connection code and set secure option by @gloomweaver in #1198
- E2E: reusable Spice installation by @sgrebnov in #1205
- Clickhouse block_to_arrow unit test by @gloomweaver in #1202
- rename refresh_period to refresh_data_period by @y-f-u in #1210
- Refactor E2E tests: dataset verification and PostgreSQL installation by @sgrebnov in #1211
- Add BI dashboard acceleration video to README.md by @lukekim in #1219
- Improve clarity and consistency of output messages by @lukekim in #1214
- Update ROADMAP Apr 29, 2024 by @lukekim in #1220
- Stand-alone Spark Connect: Isolate Spark Connect from Databricks Connect to make it reusable by @edmondop in #1213
- Optimize build time in dev mode by @gloomweaver in #1215
- Feature: Support ODBC reads using unixodbc by @mach-kernel in #1204
- Use non-fork deltalake by @phillipleblanc in #1223
- Support Date32 & Date64 in arrow_sql_gen by @Jeadie in #1217
- Update REPL output to be consistent with the latest Spice version by @sgrebnov in #1231
- rename refresh_data_period to refresh_data_window by @y-f-u in #1233
- Update README.md to include ODBC, Spark Connect, and Clickhouse data connectors in support data connector matrix. by @digadeesh in #1234
Full Changelog: v0.11.1-alpha...v0.12.0-alpha
0.11.1-alpha
Spice v0.11.1-alpha (Apr 22, 2024)
The v0.11.1-alpha release introduces retention policies for accelerated datasets, native Windows installation support, and integration of catalog and schema settings for the Databricks Spark connector. Several bugs have also been fixed for improved stability.
Highlights
- Retention Policies for Accelerated Datasets: Automatic eviction of data from accelerated time-series datasets when a specified temporal column exceeds the retention period, optimizing resource utilization.
- Windows Installation Support: Native Windows installation support, including upgrades.
- Databricks Spark Connect Catalog and Schema Settings: Improved translation between DataFusion and Spark, providing better Spark Catalog support.
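The retention policy highlight above amounts to a cutoff check on the temporal column: rows whose timestamp falls before `now - retention period` are evicted. The sketch below shows that check on an in-memory list. The function name, the row-as-dict shape, and the explicit `now` argument are assumptions for illustration, not the runtime's retention handler.

```python
from datetime import datetime, timedelta
from typing import Any, Dict, List, Tuple

Row = Dict[str, Any]


def evict_expired(rows: List[Row], time_column: str,
                  retention_period: timedelta,
                  now: datetime) -> Tuple[List[Row], int]:
    """Drop rows older than the retention period; return (kept_rows, evicted_count)."""
    cutoff = now - retention_period
    kept = [row for row in rows if row[time_column] >= cutoff]
    return kept, len(rows) - len(kept)
```

Passing `now` in explicitly, instead of reading the system clock inside the function, keeps the check deterministic and easy to test.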
Contributors
New in this release
- PowerShell script to install Spice on Windows by @sgrebnov in #1128
- Support catalog and schema in Databricks Spark Connect by @phillipleblanc in #1137
- Retention handlers by @y-f-u in #1096
What's Changed
- Update CONTRIBUTING with new dependencies by @phillipleblanc in #1121
- Fix the Helm tag by @phillipleblanc in #1122
- Upgrade Spice version to 0.11.1 by @sgrebnov in #1123
- Remove 0.11 from roadmap by @ewgenius in #1124
- Include `refresh_sql` and manual refresh to e2e tests by @sgrebnov in #1125
- Respect executables file extension on Windows by @sgrebnov in #1130
- Use quoted strings when performing federated SQL queries by @phillipleblanc in #1129
- Make Windows artifact names consistent with other platforms by @sgrebnov in #1132
- Make Windows installation less verbose by @sgrebnov in #1138
- Document Windows installation and add test by @sgrebnov in #1134
- Use transaction for DuckDB Table Writer by @Sevenannn in #1135
- Update Windows installation script url by @sgrebnov in #1143
- Update roadmap Apr 18, 2024 by @lukekim in #1142
- Test connection when new connection pool created by @ewgenius in #1126
- Enable clippy::clone_on_ref_ptr by @phillipleblanc in #1146
- Allow only alphanumeric dataset names when using `spice dataset configure` by @ewgenius in #1140
- Extend PR check to build with no default features, and each individual feature by @phillipleblanc in #1156
- Bump rustls from 0.21.10 to 0.21.11 by @dependabot in #1150
- Serde rule for ISO8601 time format by @y-f-u in #1151
- Add static linking for vcruntime dependencies on Windows by @sgrebnov in #1152
- Use clearer retention param key - retention_check_enabled instead by @y-f-u in #1158
- `spice upgrade` on Windows by @sgrebnov in #1155
Full Changelog: v0.11.0-alpha...v0.11.1-alpha
Spice.ai v0.11.0-alpha
The Spice v0.11.0-alpha release significantly improves the Databricks data connector with Databricks Connect (Spark Connect) support, adds the DuckDB data connector, and adds the AWS Secrets Manager secret store. In addition, enhanced control over accelerated dataset refreshes, improved SSL security for MySQL and PostgreSQL connections, and overall stability improvements have been added.
Highlights in v0.11.0-alpha
- DuckDB data connector: Use DuckDB databases or connections as a data source.
- AWS Secrets Manager Secret Store: Use AWS Secrets Manager as a secret store.
- Custom Refresh SQL: Specify a custom SQL query for dataset refresh using `refresh_sql`.
- Dataset Refresh API: Trigger a dataset refresh using the new CLI command `spice refresh` or via API.
- Expanded SSL support for Postgres: SSL mode now supports `disable`, `require`, `prefer`, `verify-ca`, and `verify-full` options, with the default mode changed to `require`. Added the `pg_sslrootcert` parameter for setting a custom root certificate; the `pg_insecure` parameter is no longer supported because it is redundant.
- Databricks Connect: Choose between using Spark Connect or Delta Lake when using the Databricks data connector for improved performance.
- Internal architecture refactor: The internal architecture of `spiced` was refactored to simplify the creation of data components and to improve alignment with DataFusion concepts.
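The Postgres SSL changes listed above can be read as a small validation rule: accept one of five modes, default to `require`, and reject the removed `pg_insecure` parameter. The sketch below encodes that rule for a plain parameter mapping. It is not the connector's code, and the `pg_sslmode` key is an assumed parameter name used here for illustration; the release notes name only `pg_sslrootcert` and `pg_insecure`.

```python
from typing import Mapping

SUPPORTED_SSL_MODES = ("disable", "require", "prefer", "verify-ca", "verify-full")
DEFAULT_SSL_MODE = "require"  # default mode as of v0.11.0-alpha


def resolve_ssl_mode(params: Mapping[str, str]) -> str:
    """Return the SSL mode to use, or raise ValueError for invalid settings."""
    if "pg_insecure" in params:
        raise ValueError("pg_insecure is no longer supported; set an SSL mode instead")
    # "pg_sslmode" is an assumed key name for this sketch.
    mode = params.get("pg_sslmode", DEFAULT_SSL_MODE)
    if mode not in SUPPORTED_SSL_MODES:
        raise ValueError(
            f"unsupported SSL mode {mode!r}; expected one of {SUPPORTED_SSL_MODES}"
        )
    return mode
```

Failing fast on `pg_insecure` surfaces the removal at configuration time rather than leaving a stale setting silently ignored.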
New Contributors
- @edmondop made their first contribution in github.com/spiceai/spiceai/pull/1110!
Contributors
- @phillipleblanc
- @Jeadie
- @ewgenius
- @sgrebnov
- @y-f-u
- @lukekim
- @digadeesh
- @Sevenannn
- @gloomweaver
- @ahirner
New in this release
- Fixes MySQL `NULL` values by @gloomweaver in #1067
- Fixes PostgreSQL `NULL` values for `NUMERIC` by @gloomweaver in #1068
- Adds Custom Refresh SQL support by @lukekim and @phillipleblanc in #1073
- Adds DuckDB data connector by @Sevenannn in #1085
- Adds AWS Secrets Manager secret store by @sgrebnov in #1063, #1064
- Adds Dataset refresh API by @sgrebnov in #1075, #1078, #1083
- Adds `spice refresh` CLI command for dataset refresh by @sgrebnov in #1112
- Adds `TEXT` and `DECIMAL` types support and properly handling `NULL` for MySQL by @gloomweaver in #1067
- Adds MySQL `DATE` and `TINYINT` types support for MySQL by @ewgenius in #1065
- Adds `ssl_rootcert_path` parameter for MySql data connector by @ewgenius in #1079
- Adds `LargeUtf8` support and explicitly passing the schema to data accelerator `SqlTable` by @phillipleblanc in #1077
- Adds Ability to configure data retention for accelerated datasets by @y-f-u in #1086
- Adds Custom SSL certificates for PostgreSQL data connector by @ewgenius in #1081
- Adds Conditional compile for Dremio by @ahirner in #1100
- Adds Ability for Databricks connector to use spark-connect-rs as the mechanism to execute queries against the Databricks by @edmondop in #1110
- Adds Ability to choose between Spark Connect and Delta Lake implementation for Databricks by @phillipleblanc in https://github.com/spiceai/spiceai/pull/1115/files
- Updates Databricks login parameters by @phillipleblanc in #1113
- Updates Architecture to simplify data components development by @phillipleblanc in #1040
- Updates Improved readability of GitHub Actions test job names by @lukekim in #1071
- Updates Upgrade Arrow, DataFusion, Tonic dependencies by @phillipleblanc in #1097
- Updates Handling non-string spicepod params by @ewgenius in #1098
- Updates Optional features compile: duckdb, databricks by @ahirner in #1100
- Updates Helm version to 0.1.3 by @Jeadie in #1120
- Removes `pg_insecure` parameter support from Postgres by @ewgenius in #1081
Full Changelog: v0.10.2-alpha...v0.11.0-alpha
Spice.ai v0.10.2-alpha
The v0.10.2-alpha release adds the MySQL data connector and makes external data connections more robust on initialization.
Highlights in v0.10.2-alpha
- MySQL data connector: Connect to any MySQL server, including SSL support.
- Data connections verified at initialization: Verify endpoints and authorization for external data connections (e.g. databricks, spice.ai) at initialization.
New Contributors
- @rthomas made their first contribution in #1022
- @ahirner made their first contribution in #1029
- @gloomweaver made their first contribution in #1004
Contributors
New in this release
- Adds MySQL data connector by @gloomweaver in #1004
- Fixes `show tables;` parsing in the Spice SQL repl.
- Adds data connector verification at initialization
- Fixes Ensures unit and doc tests compile and run by @rthomas in #1022
- Improves Helm chart + Grafana dashboard by @phillipleblanc in #1030
- Fixes Makes data connectors optional features by @ahirner in #1029
- Fixes Fixes SpiceAI E2E for external contributors in Github actions by @ewgenius in #1023
- Fixes remove hardcoded `lookback_size` (& improve SpiceAI's ModelSource) by @Jeadie in #1016
Full Changelog: v0.10.1-alpha...v0.10.2-alpha
Spice.ai v0.10.1-alpha
The v0.10.1-alpha release focuses on stability, bug fixes, and usability by improving error messages when using SQLite data accelerators, improving the PostgreSQL support, and adding a basic Helm chart.
Highlights in v0.10.1-alpha
- Improved PostgreSQL support for Data Connectors: TLS is now supported with PostgreSQL Data Connectors and there are improved VARCHAR and BPCHAR conversions through Spice.
- Improved Error messages: Simplified error messages from Spice when propagating errors from Data Connectors and Accelerator Engines.
- Spice Pods Command: The `spice pods` command can give you quick statistics about models, dependencies, and datasets that are loaded by the Spice runtime.
Contributors
New in this release
- Adds Basic Helm Chart for spiceai (#1002)
- Adds Support for `spice login` in environments with no browser. (#994)
- Adds TLS support in Postgres connector. (#970)
- Fixes Improve Postgres VARCHAR and BPCHAR conversion. (#993)
- Fixes `spice pods` Returns incorrect counts. (#998)
- Fixes Return friendly error messages for unsupported types in sqlite. (#982)
- Fixes Pass Tonic errors when receiving errors from dependencies. (#995)