[ SDK PERFORMANCE] The execution of the show tables command takes a long time. #4295

Open
wxp0329 opened this issue Mar 27, 2023 · 2 comments

Comments

@wxp0329

wxp0329 commented Mar 27, 2023

image

As the figure above shows, CarbonShowTablesCommand fetches metadata from the metastore separately for each table. With 180,000 tables, running the show tables command in the spark-sql shell currently takes about 1 hour, which needs to be optimized.
When the filter function is not invoked, the same show tables command returns all 180,000 tables in about 12 seconds, as shown in the following figure.
image

@kevinjmh
Member

Maybe we can try to get tables' info in one batch instead of one by one

  def getTable(db: String, table: String): CatalogTable

  def getTablesByName(db: String, tables: Seq[String]): Seq[CatalogTable]
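To make the batching idea concrete, here is a minimal sketch of the two access patterns. It is an illustration under stated assumptions, not the CarbonData implementation: `CatalogTable` and `Catalog` are stand-in types that mirror only the two signatures quoted above, `CountingCatalog` is a hypothetical in-memory catalog that counts round trips, and `keep` is a hypothetical placeholder for whatever per-table filter show tables applies.

```scala
// Stand-in types so the sketch compiles without Spark on the classpath.
// They mirror only the two signatures quoted above.
final case class CatalogTable(db: String, name: String, provider: Option[String])

trait Catalog {
  def getTable(db: String, table: String): CatalogTable
  def getTablesByName(db: String, tables: Seq[String]): Seq[CatalogTable]
}

// Hypothetical in-memory catalog; `calls` counts simulated metastore round trips.
final class CountingCatalog(store: Map[String, CatalogTable]) extends Catalog {
  var calls: Int = 0

  def getTable(db: String, table: String): CatalogTable = {
    calls += 1
    store(table)
  }

  def getTablesByName(db: String, tables: Seq[String]): Seq[CatalogTable] = {
    calls += 1
    // Assumption: unknown names are skipped and input order is preserved.
    tables.flatMap(store.get)
  }
}

object BatchedLookup {
  // Hypothetical predicate standing in for the real per-table filter.
  def keep(t: CatalogTable): Boolean =
    t.provider.exists(_.equalsIgnoreCase("carbondata"))

  // One round trip per table: the pattern described in the issue.
  def filterOneByOne(c: Catalog, db: String, names: Seq[String]): Seq[String] =
    names.filter(n => keep(c.getTable(db, n)))

  // One round trip per batch of `batchSize` table names.
  def filterBatched(c: Catalog, db: String, names: Seq[String],
                    batchSize: Int = 1000): Seq[String] =
    names.grouped(batchSize).flatMap { batch =>
      c.getTablesByName(db, batch).filter(keep).map(_.name)
    }.toList
}
```

With 180,000 tables and a batch size of 1000, the batched variant would make 180 catalog calls instead of 180,000. The batch size here is an arbitrary choice for illustration; a real metastore may limit how many tables one request can return.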

@wxp0329
Author

wxp0329 commented Mar 30, 2023

Maybe we can try to get tables' info in one batch instead of one by one

  def getTable(db: String, table: String): CatalogTable

  def getTablesByName(db: String, tables: Seq[String]): Seq[CatalogTable]

Hi, when will this problem be fixed, and in which version?
