Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Binary backwards-incompatibily in 1.33.0 #7552

Closed
njhill opened this issue Oct 24, 2020 · 16 comments · Fixed by #7564
Closed

Binary backwards-incompatibily in 1.33.0 #7552

njhill opened this issue Oct 24, 2020 · 16 comments · Fixed by #7564
Assignees
Labels

Comments

@njhill
Copy link
Contributor

njhill commented Oct 24, 2020

My library contains code like:

NettyChannelBuilder ncb = NettyChannelBuilder.forTarget(...);
ncb.setMaxInboundMessageSize(...);
ncb.eventLoopGroup(...);
// ...
ManagedChannel channel = ncb.build();

which breaks when moving from pre-1.33.0 to 1.33.0, unless recompiled (and then the new binary would not be compatible with versions < 1.33.0). This is because a class (AbstractManagedChannelImplBuilder) was removed from the direct superclass heriarchy:

Changing the direct superclass or the set of direct superinterfaces of a class type will not break compatibility with pre-existing binaries, provided that the total set of superclasses or superinterfaces, respectively, of the class type loses no members.

I know that technically netty transport is still considered experimental (per #1784), but (a) it's the primary transport and (b) it's been that way for more than 4 years.

Was this intentional/known or is it a case of "it's experimental, tough!"?

It should be easy to avoid I think by for example just adding an empty abstract class AbstractManagedChannelImplBuilder extends ForwardingChannelBuilder... maybe in a 1.33.1? :)

@sanjaypujare
Copy link
Contributor

@njhill what errors do you get when you try to use the existing libraries with 1.33.0? AFAICS no members of any super class/interface are lost and no user code should be referencing AbstractManagedChannelImplBuilder.

@njhill
Copy link
Contributor Author

njhill commented Oct 25, 2020

@sanjaypujare see the example above, it is only referencing NettyChannelBuilder. And see my quote from the spec - it's clear that removal of AbstractManagedChannelImplBuilder is not binary backwards compatible even if the methods are still present. You can insert a superclass of NettyChannelBuilder above or below AbstractManagedChannelImplBuilder but it still needs to be present as a superclass.

The error I get is:

java.lang.NoSuchMethodError: 'io.grpc.internal.AbstractManagedChannelImplBuilder io.grpc.netty.NettyChannelBuilder.maxInboundMessageSize(int)'

@sanjaypujare
Copy link
Contributor

Okay, I was able to reproduce it. The code generated with the previous grpc version (1.32.1 in my case) for the io.grpc.netty.NettyChannelBuilder.maxInboundMessageSize(int) call was:

       9: invokevirtual #20                 // Method io/grpc/netty/NettyChannelBuilder.maxInboundMessageSize:(I)Lio/grpc/internal/AbstractManagedChannelImplBuilder;

which references AbstractManagedChannelImplBuilder as the return type. In the pre-1.33 versions that's the class where maxInboundMessageSize was defined as

T maxInboundMessageSize(int max)

This was moved to NettyChannelBuilder in 1.33.

@sergiitk @ejona86 looks like this issue needs to be resolved?

@ejona86
Copy link
Member

ejona86 commented Oct 26, 2020

The goal of the change was to remove the internal class from the hierarchy. We didn't consider the difference in return types. If a break is the only way to remove the internal class from the hierarchy, we may have to do that. Although we should also consider if the new forwarding parent class exposes us to this same problem in the future.

(a) it's the primary transport and (b) it's been that way for more than 4 years.

While it is the primary transport, the primary way to use the transport is via ManagedChannelBuilder, not NettyChannelBuilder directly. That said, yes, there's a lot of users of this API.

A strategy to allow a graceful migration would be to have FowardingChannelBuilder extend AbstractManagedChannelImplBuilder. AbstractManagedChannelImplBuilder itself could just have every method throw or call super. It should only implement methods it previously implemented. That would cause recompilations to use the FowardingChannelBuilder return type instead of AbstractManagedChannelImplBuilder with compatibility methods generated by javac for the older ABI. After 6 months or a year we would remove AbstractManagedChannelImplBuilder.

The question that remains for me is whether we should expose FowardingChannelBuilder as a parent class of NettyChannelBuilder. We could just re-define all the methods. That does make maintenance harder as we are more likely to forget to add support for new methods. But it would avoid this sort of problem in the future and it would avoid needing to expose the internal class to FowardingChannelBuilder's hierarchy.

@sergiitk
Copy link
Member

@njhill In Here's more context on the reasoning behind removal the internal class from the hierarchy: #7211.

@njhill
Copy link
Contributor Author

njhill commented Oct 26, 2020

A strategy to allow a graceful migration would be to have FowardingChannelBuilder extend AbstractManagedChannelImplBuilder. AbstractManagedChannelImplBuilder itself could just have every method throw or call super. It should only implement methods it previously implemented.

@ejona86 right, that's what I was alluding to above. I think it should not even be necessary for such a placeholder AbstractManagedChannelImplBuilder to explicitly impl any methods... only the constructors calling super.

@ejona86
Copy link
Member

ejona86 commented Oct 26, 2020

I think it should not even be necessary for such a placeholder AbstractManagedChannelImplBuilder to explicitly impl any methods... only the constructors calling super.

I don't think so, since I think Javac only duplicates methods based on what it knows is shadowed. My testing confirms that behavior; we need to re-define the methods.

@ejona86
Copy link
Member

ejona86 commented Oct 26, 2020

@sergiitk, it seems 25 methods from ForwardingChannelBuilder aren't shadowed by NettyChannelBuilder. That's a lot. That means we'll want to continue using ForwardingChannelBuilder.

It looks like we can change the generics of ForwardingChannelBuilder from <T extends ForwardingChannelBuilder<T>> to <T extends ManagedChannelBuilder<T>> to avoid ForwardingChannelBuilder being defined as the return type of the overridden methods. That avoids us repeating this problem in the future. Although it will be an ABI breakage for existing users of the class.

Unfortunately, we can't change to the weaker generics in ForwardingChannelBuilder and have it extend AbstractManagedChannelImplBuilder. That makes it unclear how to proceed.

@sergiitk
Copy link
Member

sergiitk commented Oct 26, 2020

On top of that, other Server and Channel Builders are affected, see v.1.33.0 release notes:

  • netty: The class io.grpc.netty.NettyServerBuilder is no longer a subclass of the internal class io.grpc.internal.AbstractServerImplBuilder
  • netty: The class io.grpc.netty.NettyChannelBuilder is no longer a subclass of the internal class io.grpc.internal.AbstractManagedChannelImplBuilder
  • okhttp: The class io.grpc.okhttp.OkhttpChannelBuilder is no longer a subclass of the internal class io.grpc.internal.AbstractManagedChannelImplBuilder
  • core: The class io.grpc.inprocess.InProcessChannelBuilder is no longer a subclass of the internal class io.grpc.internal.AbstractManagedChannelImplBuilder
  • cronet: The class io.grpc.cronet.CronetChannelBuilder is no longer a subclass of the internal class io.grpc.internal.AbstractManagedChannelImplBuilder

EDIT: all -> other

@ejona86
Copy link
Member

ejona86 commented Oct 27, 2020

For the short-term, we're going to copy ForwardingChannelBuilder (and server) to the old AbstractManagedChannelImplBuider name to restore the previous API. We'll release a 1.33.1.

That will then give us time to figure out what we want to do. I am expecting we'll need to break at least one ABI, and make use of the ExperimentalAPI allowance to do so, but we need to figure out which and how we want users to react.

@njhill
Copy link
Contributor Author

njhill commented Oct 27, 2020

Thanks @ejona86

I think it should not even be necessary for such a placeholder AbstractManagedChannelImplBuilder to explicitly impl any methods... only the constructors calling super.

I don't think so, since I think Javac only duplicates methods based on what it knows is shadowed. My testing confirms that behavior; we need to re-define the methods.

Are you sure? This appears to work for me:

public abstract class AbstractManagedChannelImplBuilder<T extends ManagedChannelBuilder<T>>
        extends ManagedChannelBuilder<T> {}

public abstract class ForwardingChannelBuilder<T extends AbstractManagedChannelImplBuilder<T>>
    extends AbstractManagedChannelImplBuilder<T> {
    // ... contents of class unchanged
}

Note generic param of ForwardingChannelBuilder does still need to be changed (unfortunately)

@ejona86
Copy link
Member

ejona86 commented Oct 27, 2020

@njhill, in your example ForwardingChannelBuilder<T extends AbstractManagedChannelImplBuilder<T>> is redefining all the methods. So it is necessary. That signature is very bad though, as it increases the amount of code that will reference the internal class.

I was suggesting having ForwardingChannelBuilder<T extends ForwardingChannelBuilder<T>>, so that new compilations would stop referencing the internal class. For the old methods to show up with that approach, AbstractManagedChannelImplBuilder would need to redefine all the methods.

@njhill
Copy link
Contributor Author

njhill commented Oct 27, 2020

@ejona86 I see what you mean, that makes sense. AbstractManagedChannelImplBuilder could still extend ForwardingChannelBuilder though, right? (just overriding the relevant methods)

and might you consider changing ForwardingChannelBuilder<T extends ForwardingChannelBuilder<T>> to ForwardingChannelBuilder<T extends ManagedChannelBuilder<T>> as part of this same 1.33.1 update (as you had previously suggested above)? Since it's a brand new class so there should not be much risk of breakage (and given that 1.33.0 already has breakage)?

@ejona86
Copy link
Member

ejona86 commented Oct 28, 2020

AbstractManagedChannelImplBuilder could still extend ForwardingChannelBuilder though, right? (just overriding the relevant methods)

Yes, it could. But that is of dubious value. New invocations of javac will still use the AbstractManagedChannelImplBuilder-returning methods.

and might you consider changing ForwardingChannelBuilder<T extends ForwardingChannelBuilder> to ForwardingChannelBuilder<T extends ManagedChannelBuilder> as part of this same 1.33.1 update (as you had previously suggested above)? Since it's a brand new class so there should not be much risk of breakage (and given that 1.33.0 already has breakage)?

No. It isn't a new class. It's actually pretty old. We can do that with ForwardingServerBuilder, but it is probably better to just remove/hide the class in the short-term.

ejona86 added a commit to ejona86/grpc-java that referenced this issue Nov 18, 2020
This reduces ABI issues caused by returning the more precise
XdsServerBuilder in the API. See
grpc#7552. ForwardingServerBuilder
isn't yet public, so just copy it; it'll be easy to clean up.
ejona86 added a commit to ejona86/grpc-java that referenced this issue Nov 18, 2020
This reduces ABI issues caused by returning the more precise
XdsServerBuilder in the API. See
grpc#7552. ForwardingServerBuilder
isn't yet public, so just copy it; it'll be easy to clean up.

Since ForwardingServerBuilder is package-private, the JavaDoc isn't
particularly pretty. But I think that's the best we can do at the
moment.
ejona86 added a commit to ejona86/grpc-java that referenced this issue Nov 18, 2020
This reduces ABI issues caused by returning the more precise
XdsServerBuilder in the API. See
grpc#7552.
ejona86 added a commit that referenced this issue Nov 18, 2020
This reduces ABI issues caused by returning the more precise
XdsServerBuilder in the API. See
#7552.
@ejona86
Copy link
Member

ejona86 commented Jan 9, 2021

Dunno if it is a good idea, but .class rewriting may be a good enough hack to ease migration. Even if it works, I think we'd remove it after a few releases; we don't want this long-term.

Option 1. Change the generics to the ideal and generate methods with the old method signature. It is possible to do that via .class constant rewriting, but I also found a tool that makes it easier. c8140e6

Option 2. Let javac generate both method signatures like today, but rewrite the generics in the class file, When javac compiles other code, it sees the rewritten generics. e51c509

Seems to work for Java. Dunno if it breaks things like Kotlin, but I mostly expect not.

dfawley pushed a commit to dfawley/grpc-java that referenced this issue Jan 15, 2021
This reduces ABI issues caused by returning the more precise
XdsServerBuilder in the API. See
grpc#7552.
@sergiitk
Copy link
Member

For posterity, here's some details on reproducing, fixing, and understanding this issue: https://gist.github.com/sergiitk/39583f20906df1813f5e170317a35dc4

ejona86 added a commit to ejona86/grpc-java that referenced this issue Jan 22, 2021
This provides us a path forward with grpc#7211 (hiding
AbstractManagedChannelImplBuilder and AbstractServerImplBuilder) while
providing users a migration path to manage the ABI breakage (grpc#7552). We
do a .class hack so that recompiling avoids the internal class reference
yet the old methods are still available.

Leaving the classes as-is causes javac to compile two versions of each
method, one returning the public class (e.g. ServerBuilder) and one
returning the internal class (e.g., AbstractServerImplBuilder). However,
we rewrite the signature that is used at compile time so that new
compilations will not reference internal-returning methods.

This is intended to be temporary, just to give a migration path. Once we
have given users some time to recompile we will remove this rewriting
and change the generics to use public classes.
ejona86 added a commit that referenced this issue Jan 26, 2021
This provides us a path forward with #7211 (hiding
AbstractManagedChannelImplBuilder and AbstractServerImplBuilder) while
providing users a migration path to manage the ABI breakage (#7552). We
do a .class hack so that recompiling avoids the internal class reference
yet the old methods are still available.

Leaving the classes as-is causes javac to compile two versions of each
method, one returning the public class (e.g. ServerBuilder) and one
returning the internal class (e.g., AbstractServerImplBuilder). However,
we rewrite the signature that is used at compile time so that new
compilations will not reference internal-returning methods.

This is intended to be temporary, just to give a migration path. Once we
have given users some time to recompile we will remove this rewriting
and change the generics to use public classes.
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jun 3, 2021
@grpc grpc unlocked this conversation Oct 3, 2023
@grpc grpc locked and limited conversation to collaborators Oct 3, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants