Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

#306 - [DRAFT] Introduce Injection of PerTableConfig params into TargetClients by the Factory #307

Draft
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

lmccay
Copy link
Contributor

@lmccay lmccay commented Jan 7, 2024

As described in #306 , this PR introduces injection of config params into TargetClient implementation classes. It removes the need for the api module and the init() in the TargetClient interface. It moves interfaces that were added for #304 back to the core module and will perhaps remove them again since this abstraction may no longer be needed. This is an open question.

Brief change log

  • added injection code to the TableFormatClientFactory class
  • added required setters to each TargetClient implementation
  • removed PerTableConfig from the init() args
  • moved interfaces related to PerTableConfig back to core
  • changed TestTableFormatClientFactory to align with and exercise the injection path

Verify this pull request

Ran existing tests and modified TestTableFormatClientFactory to align with and cover this path

@lmccay lmccay marked this pull request as draft January 7, 2024 07:27
@lmccay lmccay changed the title #306 - Introduce Injection of PerTableConfig params into TargetClients by the Factory #306 - [DRAFT] Introduce Injection of PerTableConfig params into TargetClients by the Factory Jan 7, 2024
Copy link
Contributor

@the-other-tim-brown the-other-tim-brown left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm worried this is adding too much indirection and complexity for users and new developers. Making the fields mutable within the clients can also lead to unexpected behavior. Previously these values would not be updated after initially set.

I will put some time into thinking of what alternative would look like this week.

@@ -82,8 +84,18 @@ public IcebergClient() {}
IcebergPartitionSpecSync partitionSpecSync,
IcebergDataFileUpdatesSync dataFileUpdatesExtractor,
IcebergTableManager tableManager) {
this.tableName = perTableConfig.getTableName();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why move this out of the _init method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@the-other-tim-brown - not sure I am answering the right question but... because PerTableConfig was removed from the init() which calls _init() and the ctor's also call _init so they should handle the PerTableConfig specifics. This is assuming that the ctor's are mostly (if not entirely) used by tests and they would be setting these things there rather than via the factory.

Maybe you see something wrong with what I did there that I am still missing?

@lmccay
Copy link
Contributor Author

lmccay commented Jan 17, 2024

I'm worried this is adding too much indirection and complexity for users and new developers. Making the fields mutable within the clients can also lead to unexpected behavior. Previously these values would not be updated after initially set.

I will put some time into thinking of what alternative would look like this week.

Interesting. Can you elaborate on both the indirection and the complexity? Seems less complex to me that they just have to add a setter for any config value that they require and it will be set for them.

Would adding an annotation to call out the intent of the setter more explicitly help with the indirection?
Seems to me that the indirection has always been provided by the factory which is what it is for. The fact that the factory is calling setters after initialization for you is really not anymore than calling the builders like it was before. Is it just the fact that the client isn't in the business of scraping the PerTableConfig itself anymore for the factory path?

Mutability being a problem - is there some longer running process where setters can be called arbitrarily that I am not aware of? If the setters are only called by the factory, I don't see how it would get messed up but again, maybe there are programmatic usecases other than a sync run where this might happen. I guess we could always not allow a setter to be called on a non-null member - effectively making it final (that feels icky though).

Anyway, interested to hear what your alternative will look like!

@the-other-tim-brown
Copy link
Contributor

Regarding complexity, is using reflection more complex than not using it? I think it is. It also makes it challenging to find callers of the methods which is how I typically navigate projects. Adding some annotation as you recommended would help here.

I also consider the above to be indirection.

On the question of whether someone will call public methods, I think you should consider any public method callable and users will not always understand the library deeply so you must approach the problem with this in mind. Users will use the library directly and not just use some shaded jar and CLI. This is what the demo does and what we are currently doing while running this production.

@lmccay
Copy link
Contributor Author

lmccay commented Jan 17, 2024

Regarding complexity, is using reflection more complex than not using it? I think it is. It also makes it challenging to find callers of the methods which is how I typically navigate projects. Adding some annotation as you recommended would help here.

Ahhh - thought you meant for the client developer. Certainly reflection is a bit more complex.

I also consider the above to be indirection.

okay.

On the question of whether someone will call public methods, I think you should consider any public method callable and users will not always understand the library deeply so you must approach the problem with this in mind. Users will use the library directly and not just use some shaded jar and CLI. This is what the demo does and what we are currently doing while running this production.

Sure. But any reasonable interpretation of the library would be that you instantiate the clients, in this case via the factory. Once you have them put together then sure you can call setters again but it is intentionally done and not likely in a multi-threaded scenario or the like. Once the sync starts there is no reasonable expectation that you would call setters during the sync.

The only sort of things that I can think of to make these immutable by end user code would add more indirection and complexity. :)

The other conversation on PTC refactoring possibly to a key-value type approach may make this unnecessary though.

I did learn how to cast an object to it actual self rather than the interface in this though which is cool!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants