[Feature] Unified catalog support kudu #45590

Merged

Conversation

predator4ann
Contributor

@predator4ann predator4ann commented May 14, 2024

Add Kudu data source support to the unified catalog.

Example:

CREATE EXTERNAL CATALOG unified
PROPERTIES
(
    "type" = "unified",
    "unified.metastore.type" = "hive",
    "hive.metastore.uris" = "thrift://localhost:9083",
    "kudu.master" = "localhost:7051",
    "kudu.schema-emulation.enabled" = "true",
    "kudu.schema-emulation.prefix" = "impala::"
);
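
To make the two schema-emulation properties concrete, here is a minimal Java sketch of how a raw Kudu table name might be mapped to a database and table pair. It is not the connector's code. The class `SchemaEmulationSketch`, the holder `QualifiedName`, and the method `resolve` are hypothetical names. The sketch also assumes that emulated tables are named `<prefix><database>.<table>` (for example `impala::sales.orders`) and that tables fall into a `default` database when emulation is disabled. Only the property keys come from the example above.

```java
import java.util.Map;
import java.util.Optional;

public class SchemaEmulationSketch {

    /** Hypothetical value holder for a (database, table) pair. */
    static final class QualifiedName {
        final String database;
        final String table;

        QualifiedName(String database, String table) {
            this.database = database;
            this.table = table;
        }
    }

    /**
     * Maps a raw Kudu table name to a database/table pair.
     * Assumption: emulated names look like "<prefix><database>.<table>".
     */
    static Optional<QualifiedName> resolve(String rawKuduName, Map<String, String> props) {
        boolean enabled = Boolean.parseBoolean(
                props.getOrDefault("kudu.schema-emulation.enabled", "false"));
        if (!enabled) {
            // Assumption: without emulation every table lands in one "default" database.
            return Optional.of(new QualifiedName("default", rawKuduName));
        }
        String prefix = props.getOrDefault("kudu.schema-emulation.prefix", "");
        if (!rawKuduName.startsWith(prefix)) {
            return Optional.empty(); // not managed under this prefix
        }
        String rest = rawKuduName.substring(prefix.length());
        int dot = rest.indexOf('.');
        if (dot <= 0 || dot == rest.length() - 1) {
            return Optional.empty(); // no "<database>.<table>" structure
        }
        return Optional.of(new QualifiedName(rest.substring(0, dot), rest.substring(dot + 1)));
    }

    public static void main(String[] args) {
        Map<String, String> props = Map.of(
                "kudu.schema-emulation.enabled", "true",
                "kudu.schema-emulation.prefix", "impala::");
        resolve("impala::sales.orders", props)
                .ifPresent(n -> System.out.println(n.database + " / " + n.table));
    }
}
```

Returning an empty `Optional` for names that do not match the prefix keeps unrelated Kudu tables out of the emulated namespace in this sketch. Whether the real connector hides or exposes such tables is not stated in this PR.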

Fixes #45591

What type of PR is this:

  • BugFix
  • Feature
  • Enhancement
  • Refactor
  • UT
  • Doc
  • Tool

Does this PR entail a change in behavior?

  • Yes, this PR will result in a change in behavior.
  • No, this PR will not result in a change in behavior.

If yes, please specify the type of change:

  • Interface/UI changes: syntax, type conversion, expression evaluation, display information
  • Parameter changes: default values, similar parameters but with different default values
  • Policy changes: use new policy to replace old one, functionality automatically enabled
  • Feature removed
  • Miscellaneous: upgrade & downgrade compatibility, etc.

Checklist:

  • I have added test cases for my bug fix or my new feature
  • This pr needs user documentation (for new or modified features or behaviors)
    • I have added documentation for my new feature or new function
  • This is a backport pr

Bugfix cherry-pick branch check:

  • I have checked the version labels which the pr will be auto-backported to the target branch
    • 3.3
    • 3.2
    • 3.1
    • 3.0
    • 2.5

Signed-off-by: predator4ann <yunlong.sun@hotmail.com>
@predator4ann predator4ann requested review from a team as code owners May 14, 2024 07:52
@github-actions github-actions bot added the documentation Improvements or additions to documentation label May 14, 2024
}
return properties;
}

@Override
public List<String> getPartitionColumnNames() {
return partColNames;

The riskiest bug in this code is:
The KuduTable constructor does not initialize the properties, masterAddresses, catalogName, databaseName, tableName, and partColNames fields, which could lead to a NullPointerException when these fields are accessed.

You can modify the code like this:

public KuduTable() {
    super(TableType.KUDU);
    this.properties = new HashMap<>();
    this.masterAddresses = "";
    this.catalogName = "";
    this.databaseName = "";
    this.tableName = "";
    this.partColNames = new ArrayList<>();
}

HiveMetaClient metaClient = HiveMetaClient.createHiveMetaClient(this.hdfsEnvironment, properties);
- hiveMetastore = Optional.of(new HiveMetastore(metaClient, catalogName, MetastoreType.HMS));
+ hiveMetastore = Optional.of(new HiveMetastore(metaClient, catalogName, metastoreType));
// TODO caching hiveMetastore support
}
return new KuduMetadata(catalogName, hdfsEnvironment, kuduMaster, schemaEmulationEnabled, schemaEmulationPrefix,

The most risky bug in this code is:
A security issue due to changing visibility of configuration keys from private to public.

You can modify the code like this:

@@ -38,12 +38,14 @@

public class KuduConnector implements Connector {
    private static final String HIVE = "hive";
+   private static final String GLUE = "glue";
    private static final String KUDU = "kudu";
-   private static final Set<String> SUPPORTED_METASTORE_TYPE = Sets.newHashSet(HIVE, KUDU);
-   private static final String KUDU_MASTER = "kudu.master";
-   private static final String KUDU_CATALOG_TYPE = "kudu.catalog.type";
-   private static final String KUDU_SCHEMA_EMULATION_ENABLED = "kudu.schema-emulation.enabled";
-   private static final String KUDU_SCHEMA_EMULATION_PREFIX = "kudu.schema-emulation.prefix";
+   private static final Set<String> SUPPORTED_METASTORE_TYPE = Sets.newHashSet(HIVE, GLUE, KUDU);
+   private static final String KUDU_MASTER = "kudu.master";
+   private static final String KUDU_CATALOG_TYPE = "kudu.catalog.type";
+   private static final String KUDU_SCHEMA_EMULATION_ENABLED = "kudu.schema-emulation.enabled";
+   private static final String KUDU_SCHEMA_EMULATION_PREFIX = "kudu.schema-emulation.prefix";
+   private static final String DEFAULT_KUDU_MASTER = "localhost:7051";
    private final String catalogName;
    private final String kuduMaster;
    private final String catalogType;

sonarcloud bot commented May 14, 2024

[FE Incremental Coverage Report]

pass : 13 / 13 (100.00%)

file detail

| path | covered_line | new_line | coverage | not_covered_line_detail |
|---|---|---|---|---|
| 🔵 com/starrocks/connector/unified/UnifiedConnector.java | 1 | 1 | 100.00% | [] |
| 🔵 com/starrocks/connector/kudu/KuduConnector.java | 5 | 5 | 100.00% | [] |
| 🔵 com/starrocks/catalog/KuduTable.java | 5 | 5 | 100.00% | [] |
| 🔵 com/starrocks/connector/unified/UnifiedMetadata.java | 2 | 2 | 100.00% | [] |

[BE Incremental Coverage Report]

pass : 0 / 0 (0%)

@DorianZheng DorianZheng enabled auto-merge (squash) May 15, 2024 07:09
@DorianZheng DorianZheng merged commit f9cd67d into StarRocks:main May 20, 2024
68 checks passed
@miomiocat
Contributor

@mergify backport branch-3.2

Contributor

mergify bot commented May 20, 2024

backport branch-3.2

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request May 20, 2024
Signed-off-by: predator4ann <yunlong.sun@hotmail.com>
(cherry picked from commit f9cd67d)

# Conflicts:
#	docs/en/data_source/catalog/kudu_catalog.md
#	docs/zh/data_source/catalog/kudu_catalog.md
#	fe/fe-core/src/main/java/com/starrocks/catalog/KuduTable.java
#	fe/fe-core/src/main/java/com/starrocks/connector/kudu/KuduConnector.java
#	fe/fe-core/src/main/java/com/starrocks/connector/unified/UnifiedConnector.java
@miomiocat
Contributor

@mergify backport branch-3.3

Contributor

mergify bot commented May 20, 2024

backport branch-3.3

✅ Backports have been created

mergify bot pushed a commit that referenced this pull request May 20, 2024
Signed-off-by: predator4ann <yunlong.sun@hotmail.com>
(cherry picked from commit f9cd67d)
miomiocat pushed a commit that referenced this pull request May 20, 2024
Co-authored-by: predator4ann <yunlong.sun@hotmail.com>
@eshishki

I'm using the unified catalog with Glue, where there is no Kudu support, and I now get an error when trying to access the unified catalog:

Null hive.metastore.uris, please check your property's key and value of catalog or resource.
at com.starrocks.common.util.Util.validateMetastoreUris(Util.java:425) ~[starrocks-fe.jar:?]
at com.starrocks.connector.kudu.KuduConnector.getMetadata(KuduConnector.java:98) ~[starrocks-fe.jar:?]
at com.starrocks.connector.unified.UnifiedConnector.lambda$getMetadata$0(UnifiedConnector.java:76) ~[starrocks-fe.jar:?]
at com.google.common.collect.RegularImmutableMap.forEach(RegularImmutableMap.java:297) ~[spark-dpp-1.0.0.jar:?]
at com.starrocks.connector.unified.UnifiedConnector.getMetadata(UnifiedConnector.java:76) ~[starrocks-fe.jar:?]
at com.starrocks.connector.CatalogConnector.getMetadata(CatalogConnector.java:43) ~[starrocks-fe.jar:?]

@predator4ann
Contributor Author

predator4ann commented May 30, 2024

Apologies for the confusion. There is no need to check hive.metastore.uris when kudu.catalog.type is set to glue. Pull request #46436 has been submitted to address this.
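
The fix described above can be sketched roughly as follows. This is an illustration and not the code from #46436. The class `MetastoreValidationSketch` and the method `validate` are hypothetical, and the fallback value `kudu` for `kudu.catalog.type` is an assumption. The idea is simply that the `hive.metastore.uris` check runs only when the catalog type is `hive`, so a Glue-backed unified catalog no longer fails on a missing URI.

```java
import java.util.HashMap;
import java.util.Map;

public class MetastoreValidationSketch {

    /**
     * Validates catalog properties. The metastore URI is required only for
     * the "hive" catalog type; "glue" and "kudu" skip the check.
     */
    static void validate(Map<String, String> props) {
        // Assumption: "kudu" is used when kudu.catalog.type is not set.
        String catalogType = props.getOrDefault("kudu.catalog.type", "kudu");
        if ("hive".equalsIgnoreCase(catalogType)) {
            String uris = props.get("hive.metastore.uris");
            if (uris == null || uris.trim().isEmpty()) {
                throw new IllegalArgumentException(
                        "hive.metastore.uris must be set when kudu.catalog.type is hive");
            }
        }
    }

    public static void main(String[] args) {
        Map<String, String> glue = new HashMap<>();
        glue.put("kudu.catalog.type", "glue");
        validate(glue); // passes: no metastore URI is needed for Glue

        Map<String, String> hive = new HashMap<>();
        hive.put("kudu.catalog.type", "hive");
        try {
            validate(hive); // fails: the URI is missing
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```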

Labels
3.3-merged documentation Improvements or additions to documentation version:3.4
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Feature] Unified catalog support Kudu
5 participants