Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AVRO-3947: [Java] Support subclasses in custom LogicalType Conversions #2766

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

tmoschou
Copy link

What is the purpose of the change

Currently GenericData.resolveUnion(schema, datum) will throw an AvroRuntimeException if datum is a subclass (or any instance) of the target type of an applicable logical type Conversion. For instance

  • "ip-address" logical type string converter to/from java.net.InetAddress (with Inet4Address, Inet6Address as subclasses)
  • "json" logical type string conversion to/from Jackson's JsonNode (with ObjectNode, ArrayNode, etc as subclasses)

This PR fixes AVRO-3947.

Verifying this change

A unit test was added to validate that GenericData.resolveUnion(schema, datum) behaves as intended.

Documentation

  • Does this pull request introduce a new feature? no
  • If yes, how is the feature documented? not applicable

@github-actions github-actions bot added the Java Pull Requests for Java binding label Feb 26, 2024
@tmoschou tmoschou force-pushed the bug/AVRO-3947-logical-type-conversions-subclasses branch 3 times, most recently from f3297d0 to 6a5a909 Compare February 26, 2024 05:31
@tmoschou tmoschou marked this pull request as draft February 26, 2024 06:15
@tmoschou tmoschou force-pushed the bug/AVRO-3947-logical-type-conversions-subclasses branch from 6a5a909 to 884b0f3 Compare February 26, 2024 07:20
@tmoschou tmoschou force-pushed the bug/AVRO-3947-logical-type-conversions-subclasses branch from 884b0f3 to 748f77f Compare February 26, 2024 07:26
@tmoschou
Copy link
Author

tmoschou commented Feb 26, 2024

Okay, I have also extended the fix to handling other code like GenericDatumWriter by fixing GenericData.getConversionByClass(datumClass, logicalType) which also failed when writing datums with type Class<? extends T> rather than Class<T>. We didn't notice it before as we are using our own custom writer. The only other usage I can see potentially being a problem is GenericData.getConversionByClass(class) whose usage as far as I can tell is only used for schema inference in Reflect/Protobuf - In that case I think it makes sense to keep the old behaviour, although some might have knowledge that values of a specific logical type is always of a particular subclass, although they can always use @AvroSchema for Reflect data. E.g.

class MyRecord {
  String foo;
  
  @AvroSchema("{\"type\":\"string\",\"logical-type\":\"json\"")
  ObjectNode metadata;
}

where conversion is defined for JsonNode (JsonNode.class.isAssignableFrom(ObjectNode.class))

@tmoschou tmoschou marked this pull request as ready for review February 26, 2024 08:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Java Pull Requests for Java binding
Projects
None yet
1 participant