Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GH-2174: Register Retry Topic's components earlier #2176

Closed
wants to merge 4 commits into from

Conversation

tomazfernandes
Copy link
Contributor

Fixes #2174

Please let me know if there's a more adequate way of solving this, or if there's anything to be changed. Thanks.


Non-blocking delayed retries feature's beans were being registered lazily, when the first @KafkaListener annotated bean had a configuration associated with it in the KafkaListenerAnnotationBeanPostProcessor.

With that, if any bean autowired one of the feature's components before such bootstrapping it would throw a NoSuchBeanDefinitionException.

Now the components are registered in the RetryTopicRegistryPostProcessor before singletons are instantiated, so autowiring the feature's components in beans works as expected.

Adds a few adjustments to related classes and integration and unit tests.

@garyrussell
Copy link
Contributor

Thanks @tomazfernandes . I don't think I can get to review this before Monday's release.

It's a pretty big change - I wonder if it needs to be back-ported or whether the work around in the discussion is sufficient for the current releases?

Copy link
Member

@artembilan artembilan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with, @garyrussell , that this is too much.
We may only consider this for the current 3.0 version. Back-port is too risky.

Thanks

this.clock = Clock.systemUTC();
}

@SuppressWarnings("deprecated")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this suppression here?
Doesn't look like you use some other deprecated API...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought my IDE was complaining, I'll remove it, thanks.

* @param clock the clock instance
*/
public void setClock(Clock clock) {
this.clock = clock;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assert.notNull() ?


@SuppressWarnings("deprecated")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DITTO

*/
@SpringJUnitConfig
@DirtiesContext
@EmbeddedKafka
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we don't need these top-level annotation since all your tests are inside those nested classes.
However I wonder what is the reason of those nested classes at all...
Why those tests cannot be in top-level class (or classes) ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to test whether it works with only one configuration in the registry or another, or none, so I had to have a separate context for each test.

Maybe there's a better way to solve this?

@tomazfernandes
Copy link
Contributor Author

tomazfernandes commented Mar 17, 2022

Thanks @tomazfernandes . I don't think I can get to review this before Monday's release.

No worries, I don't know why I had in mind that the release was on Tuesday.

It's a pretty big change - I wonder if it needs to be back-ported or whether the work around in the discussion is sufficient for the current releases?

I agree in terms of functionality the workaround should work. But I worry about the codebases of 3.0 and 2.x.x drifting apart and making it harder to add new features and fixes to 2.x.x. WDYT?

Maybe I can break this in two PRs, one with the functionality changes per se for 3.0, and another with some constructor changes other small changes I had to make to be back ported to 2.x.x. to minimize this. Even so, it might be confusing.

About the functionality itself, I did create enough unit and integration tests to be confident it'll work properly. The only thing I think I might be missing is if there's any difference in behavior between the testing runtime and the actual runtime that might make any of the scenarios not work.

@tomazfernandes
Copy link
Contributor Author

What if we add this to 2.9.0-M1 and release it shortly, leaving the Single Container feature for a 2.9.0-M2 release?

@garyrussell
Copy link
Contributor

I am ok with releasing a 2.9, but would prefer if we can sync it with Boot 2.7.

I don't think we can do M1 on Monday for Boot's M3; the next window is Boot's RC1 in a month.

That means we would either have to go straight to RC1 on April 18, or we could do an interim M1 when this feature is ready.

I am not sure how receptive the Boot team will be to get an RC out of the blue so 2.9 may end up being a Bootless version that users have to manually override.

@tomazfernandes
Copy link
Contributor Author

I see, I hadn't noticed the Mx versions are also synced to Boot. Do you think having a bootless version would be a good solution?

If you're not worried with the codebases drifting apart, we can merge this only to 3.0.0, or break the PR as I mentioned.

I was planning on adding the DestinationTopicResolver solution as a feature in the docs with some API polishing, but I think it might be messy if it might depend on a workaround, so maybe it'd also be a 3.0.0 or 2.9.x exclusive feature, or at least hold it until this fix is released in those versions.

We might also hold this PR and consider merging it in a later moment if that's better, no worries, I'm ok with whatever works for you.

Thanks

@garyrussell
Copy link
Contributor

After further thought; I think a stand-alone 2.9 (if we decide to do one) would be preferable, rather than forcing Boot 2.7 users to go to 2.9, given that it will get little milestone exposure before an RC will be due.

You can see the schedule on the calendar; we generally release Monday before Boot's Thursday releases.

https://calendar.spring.io/

A decoupled 2.9 would give us a bit more flexibility to add more feature content, and we could release it early Summer instead of late Spring.

@tomazfernandes
Copy link
Contributor Author

After further thought; I think a stand-alone 2.9 (if we decide to do one) would be preferable, rather than forcing Boot 2.7 users to go to 2.9, given that it will get little milestone exposure before an RC will be due.

Sure, we can do that if that's better.

You can see the schedule on the calendar; we generally release Monday before Boot's Thursday releases.

Then I probably mixed up with Boot's 😄

https://calendar.spring.io/

That's useful, thanks.

A decoupled 2.9 would give us a bit more flexibility to add more feature content, and we could release it early Summer instead of late Spring.

Ok, let's go this way then. I still have a couple of things to submit for this PR, I should commit it later today or tomorrow.

Thanks for looking into this.

@tomazfernandes
Copy link
Contributor Author

I've made a few adjustments and addressed the review comments.

I'm still not sure I've made the best choices for this PR - for example, I might have left the current bootstrapping as a fallback to the new one - it might help reducing risk and making the PR smaller. But it still wouldn't be risk free, so I'm not sure if it'd be any better considering we'd have more logic to deal with in the end.

Please let me know if you have any suggestions or any further concerns.

Thanks.

Resolves spring-projects#2174

Retry Topic feature's beans were being registered lazily, when the first @KafkaListener annotated bean had a configuration associated with it in the KafkaListenerAnnotationBeanPostProcessor.

With that, if any bean autowired one of the feature's components before such bootstrapping it would throw a NoSuchBeanDefinitionException.

Now the components are registered in the RetryTopicRegistryPostProcessor before singletons are instantiated, so autowiring of the feature's components in beans works as expected.

Adds a few adjustments to related classes and integration and unit tests.
@tomazfernandes
Copy link
Contributor Author

I've made a few adjustments to this.

  • Reestablished the legacy bootstrapping as a fallback
  • Rolled back non-essential changes to existing components
  • Added an IT to assure that users can disable the new bootstrapping method by adding a no-ops RetryTopicRegistryPostProcessor subclass bean
  • Added a WARN log to let users know if early bootstrapping didn't work and the fallback kicked in - likely if that happens a lot someone will reach out and let us know

I think with this changes this PR will be less risky and easier to review.

Please let me know if you have any further concerns.

Thanks.

Allow disabling new bootstrapping method
Warn if early bootstrapping is not successful
@tomazfernandes
Copy link
Contributor Author

Just removed an unnecessary annotation in the last couple force-pushes. Thanks.

@tomazfernandes
Copy link
Contributor Author

I've deleted my last comment, it was actually a configuration error on my part, sorry about that 😄

@@ -524,6 +524,7 @@ private void bootstrapRetryTopicIfNecessary() {
BeanDefinitionRegistry registry = (BeanDefinitionRegistry) this.beanFactory;
if (!registry.containsBeanDefinition(RetryTopicInternalBeanNames
.RETRY_TOPIC_BOOTSTRAPPER)) {
this.logger.warn(() -> "Could not bootstrap RetryTopic early. Bootstrapping lazily.");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this warn?
Looks like this one is going to happen in 99% application in the field.

What is wrong with just silently registering that bean if it does not present yet?
We do something like this conditional registration all the time and no warnings.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this is why I got the impression that you didn't get the context and wrote the comment below. Actually it's the opposite - I expect all of the applications to bootstrap the feature via the RegistryPostProcessor, and only if for some unexpected reason the RPP is not triggered properly this logic should be executed.

Since that's an unexpected path, I'd rather know that this is happening so we can properly fix it, and when we get enough confidence we should be able to remove this logic.


private void bootstrapRetryTopic(BeanDefinitionRegistry registry) {
BeanDefinition beanDef = getOrRegisterBeanDefinition(registry);
this.applicationContext.getBean(RETRY_TOPIC_BOOTSTRAPPER_BEAN_NAME, RetryTopicBootstrapper.class)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we cannot call getBean() in the postProcessBeanDefinitionRegistry(). This is the same "Inversion of Control impedance" I mentioned in other place.
That's too early to deal with beans.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I think I can find a way around this, but it might require more changes than I'd like to.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mean changes to existing components such as the RetryTopicBootstrapper.

@Override
public void setApplicationContext(ApplicationContext applicationContext) {
this.applicationContext = applicationContext;
this.clock = applicationContext.getBean(RetryTopicInternalBeanNames.INTERNAL_BACKOFF_CLOCK_BEAN_NAME, Clock.class);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You cannot call getBean() from the setApplicationContext() - too early, may cause some bean initialization problem...

private void bootstrapRetryTopic(BeanDefinitionRegistry registry) {
BeanDefinition beanDef = getOrRegisterBeanDefinition(registry);
this.applicationContext.getBean(RETRY_TOPIC_BOOTSTRAPPER_BEAN_NAME, RetryTopicBootstrapper.class)
.bootstrapRetryTopic();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This better to be done via SmartInitializingSingleton impl on that RetryTopicBootstrapper. So, we don't need to have this getBean() over here and no early BeanFactory access.
Said that it feels like we don't need this RetryTopicRegistryPostProcessor at all...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I think it's about the premise. If we want to keep the component's bootstrapping optional while also enabling their autowiring to the end user, I think the SmartInitializingSingleton hook is too late for that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we really need to have them to be "autowireable" in the end?
Are they not infrastructure beans to support something else which is already those @RetryableTopic or so?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. All sounds good.

Here is my take on this: let's make those beans registered unconditionally if there is a requirements to be able to inject them somewhere in the end-user code. Even if these beans are going to be dangling for nothing, doesn't look like they are heavy, active or store some context to exhaust memory or CPU.

We may revise a conditional feature in the future.

Now that I think of it, we have two kinds of components - infrastructure components used at assembly time, and runtime components that are used at, well, runtime.

I think the runtime components are the ones users might be interested in. For example in #2172, the DestinationTopicContainer is injected for the user to figure out what is the first retry-topic name for a given main topic and reprocess DLT messages.

I think it might also be useful for users to have access to the KafkaConsumerBackOffManager so they can know which topic partitions are currently backed off and their due time, for example. But that one might be trickier because it has a more complex instantiation, but that's something fixable, and probably we won't need it now.

So maybe we can register early only DestinationTopicContainer and maybe KafkaBackOffManager and let the infrastructure components be instantiated where they currently are, when KLABPP detects the feature.

How does this sound? Would KafkaBootstrapConfiguration be the right place for that?

Your scan for @RetryableTopic doesn't look bad. Only the problem here that we need opposite injection from these infrastructure beans into end-user stuff. This is kinda anti-pattern and hard to implement and support eventually.

I'm not sure I understand what opposite injection is. You mean restricting users' access to the infrastrucure beans via injection?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Sounds like a plan.

I mean that you are going to register a bean dynamically, but at the same time it is asked for end-user autowiring.
That's my concern where we may cause circular dependency.
That's where it might not be conditional. At least for now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. Sounds like a plan.

Great, I'll do that then. I'll also take a closer look if there are any other beans that might be interesting for users.

I mean that you are going to register a bean dynamically, but at the same time it is asked for end-user autowiring. That's my concern where we may cause circular dependency. That's where it might not be conditional. At least for now.

Oh, I see, thanks for the explanation.

Thanks a lot for looking into this!

@tomazfernandes
Copy link
Contributor Author

Hi @artembilan, thanks again for looking into this.

Before I try to address your specific concerns, let me give you some more context - hopefully not as much as in the other PR 😄 - on why this came up. Or maybe you got the context and I didn't get your solution, but at least we can be sure we're in the same page.

Currently, at application bootstrapping, no RetryTopic components' beans are automatically registered, such as in the KafkaBootstrapConfiguration class, or boot's auto configuration. I did it that way so that, if the feature is not being used, the components are simply not registered. You might have noticed this is a rather bean-heavy feature.

So these are registered by the RetryTopicBootstrapper only if KafkaListenerAnnotationBeanPostProcessor process a bean that has and topic associated with a RetryTopicConfiguration or @RetryableTopic annotation.

The problem with this is that when users try to autowire any of the components, such as the DestinationTopicContainer as requested in #2172, depending on the order things happen, the component is not registered yet and we get a NoSuchBeanDefinitionException.

So, maybe my premise is wrong, and we should just register the BeanDefinitions whether or not the feature is being used - that would certainly be an easier path.

If we stick with the current premise, though, we need an earlier point in application bootstrapping to make this check and register the components. This is how I got to this solution using this RegistryPostProcessor, since it's executed before singletons are instantiated and its dependencies autowired.

Makes sense overall? I'll look into the specific concerns now.

Thanks

@artembilan
Copy link
Member

OK. All sounds good.

Here is my take on this: let's make those beans registered unconditionally if there is a requirements to be able to inject them somewhere in the end-user code.
Even if these beans are going to be dangling for nothing, doesn't look like they are heavy, active or store some context to exhaust memory or CPU.

We may revise a conditional feature in the future.

Your scan for @RetryableTopic doesn't look bad. Only the problem here that we need opposite injection from these infrastructure beans into end-user stuff.
This is kinda anti-pattern and hard to implement and support eventually.

@garyrussell
Copy link
Contributor

A decoupled 2.9 would give us a bit more flexibility to add more feature content,

I had forgotten about this discussion; I will change the milestone dates.

@tomazfernandes
Copy link
Contributor Author

I'm wondering, since we're adding quite a few improvements that we both want to reduce risk, but also get it in the hands of the users quickly since these are actual issues they're facing, how about we schedule a 2.10.0 after 2.9.0?

This way we could release 2.9.0 early, after the milestone exposure we think it's necessary, but don't have to wait until all features are done, or postpone everything else to 3.0.0.

@garyrussell
Copy link
Contributor

I'd rather not have 5 active branches; although 2.7.x officially goes out of OSS support in May, realistically, we will probably have a release or two after that.

I do not anticipate much of a delay for 2.9; we can release it in June; I just don't want to cram things into a new release in a couple of weeks by tying it to the Boot release train.

@tomazfernandes
Copy link
Contributor Author

I see, makes sense. I think June is a good timeline. After that, we can work with the 2.9.x versions for small changes and leave bigger ones to 3.0.0.

Copy link
Member

@artembilan artembilan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we need to revise this after we done with the #2204 .

Perhaps all of these "earlier" concerns will go away by themself when we have one single central configuration for infrastructure beans.

@tomazfernandes
Copy link
Contributor Author

Closed in favor of #2227

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Retry Topic's components can't be autowired before feature bootstrapping
3 participants