Roadmap 2023 #9448

Open
8 of 9 tasks
BohuTANG opened this issue Jan 3, 2023 · 9 comments
Labels
roadmap-track Roadmap track issues

Comments

@BohuTANG
Member

BohuTANG commented Jan 3, 2023

After a full year of research and development in 2022, the functionality and stability of Databend were significantly enhanced, and several users began using it in production. Databend has helped them greatly reduce costs and operational complexity.

This is the Databend roadmap for 2023 (discussion).

See also:

Main tasks

v1.3

v1.2 (Prepare for release on May 15th)

v1.1 (Prepare for release on April 5th)

v1.0 (Prepare for release on March 5th)

Features

| Task | Status | Comments |
| --- | --- | --- |
| Update #9261 | DONE | needs optimization (release in v1.0) |
| Privileges | DONE | |
| Alter table | DONE | high-priority (release in v1.0) |
| Window function #6342 | DONE | |
| Lambda function and high-order functions | DONE | |
| Materialized view Aggregating index | DONE | |
| Support SET_VAR hints #8833 | DONE | |
| Parquet reader | DONE | |
| DataFrame | DONE | |
| Data Sharing (community version) | DONE | |
| Concurrent query enhance | IN PROGRESS | |
| Distributed COPY #8594 | DONE | |
| Support Decimal data type #2931 | DONE | high-priority (release in v1.0) |
| Add Column-Level dynamic data masking support | PLAN | |

Improvements

| Task | Status | Comments |
| --- | --- | --- |
| New expression #9411 | DONE | |
| Error message | PLAN | |

Planner

| Task | Status | Comments |
| --- | --- | --- |
| Scalar expression normalization | DONE | |
| Column constraint framework | DONE | |
| Functional dependency framework #7438 | DONE | |
| Join reorder | DONE | |
| CBO | DONE | high-priority (release in v1.0) |
| Support TPC-DS | DONE | |
| Support optimization tracing | PLAN | Easy to debug/study. |

Cache

| Task | Status | Comments |
| --- | --- | --- |
| Unified cache layer | DONE | |
| Meta data cache | DONE | |
| Index data cache | DONE | |
| Block data cache | DONE | high-priority (release in v1.0) |

Data Storage

| Task | Status | Comments |
| --- | --- | --- |
| Fuse engine re-clustering | DONE | high-priority (release in v1.1) |
| Fuse engine orphan data cleanup | DONE | high-priority (release in v1.0) |

Distributed Query Execution

| Task | Status | Comments |
| --- | --- | --- |
| Visualized profiling | IN PROGRESS | |
| Aggregation spilling | DONE | high-priority (release in v1.1) |

Resource Quota

| Task | Status | Comments |
| --- | --- | --- |
| Session-level quota control (CPU/Memory) | DONE | |

Schema-Less Search

| Task | Status | Comments |
| --- | --- | --- |
| JSON indexing | DONE | high-priority |
| Fulltext index #3915 | IN PROGRESS | high-priority |
| Array functions #7931 | DONE | high-priority |
| Faiss index #9699 | PLAN | |

LakeHouse

| Task | Status | Comments |
| --- | --- | --- |
| Apache Hive | DONE | |
| Apache Iceberg | DONE | |
| Delta Lake | PLAN | |
| Querying external storage (Parquet) | DONE | |

Integrations

| Task | Status | Comments |
| --- | --- | --- |
| Dbt integration | DONE | |
| Airbyte integration | DONE | |
| Datadog Vector integrate with Rust-driver | DONE | |
| Datax integrate with Java-driver | DONE | |
| CDC with Flink | DONE | |
| CDC with Kafka | DONE | |

Meta

| Task | Status | Comments |
| --- | --- | --- |
| Jepsen test | DONE | |
| Store membership in raft | DONE | |
| Nonblocking snapshot building | DONE | |
| Snapshot file format impl | DONE | |
| Upgrade on-disk store format | DONE | |

Testing

| Task | Status | Comments |
| --- | --- | --- |
| SQLlogic Test | DONE | Supports more test cases |
| SQLancer Test | DONE | Supports more types and more cases |
| Fuzzer Test | IN PROGRESS | |

Releases

@BohuTANG BohuTANG pinned this issue Jan 3, 2023
@flaneur2020
Member

Any plan to improve concurrency capabilities, so developers can depend on Databend to build data exploration platforms (like Google Analytics?) on the web?

@flaneur2020
Member

Any plan to tune metasrv's memory usage? I got an OOM last week. IMHO it could store most of the data on disk?

@BohuTANG
Member Author

BohuTANG commented Jan 3, 2023

Any plan to improve concurrency capabilities, so developers can depend on Databend to build data exploration platforms (like Google Analytics?) on the web?

Added: Concurrent query enhance

@BohuTANG
Member Author

BohuTANG commented Jan 3, 2023

Any plan to tune metasrv's memory usage? I got an OOM last week. IMHO it could store most of the data on disk?

@drmingdrmer will fill in the Meta section; I think he will cover it.

@yufan022
Contributor

yufan022 commented Jan 12, 2023

Any plan to support the decimal data type? This is essential if we want to use Databend in finance-related fields. Will we see it in the first half of the year?

@BohuTANG
Member Author

Any plan to support the decimal data type? This is essential if we want to use Databend in finance-related fields. Will we see it in the first half of the year?

Added to the main task, thanks.

@flaneur2020
Member

Will fault tolerance for query processing be planned in 2023?

For example, I have some spot instances; the cluster could handle a shut-down instance gracefully without affecting the running queries.

@BohuTANG
Member Author

BohuTANG commented Feb 8, 2023

Will fault tolerance for query processing be planned in 2023?

We will do it, but it is hard, so the priority is low.

For example, I have some spot instances; the cluster could handle a shut-down instance gracefully without affecting the running queries.

Please file an issue for that.

@thatcort

thatcort commented Aug 7, 2023

Is there a plan for when the vector index feature will be added? It is part of #10689 but doesn't seem to have an associated ticket.
