Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

More compact code for equals methods #99

Open
AndreasVoellmy opened this issue Jun 10, 2021 · 3 comments
Open

More compact code for equals methods #99

AndreasVoellmy opened this issue Jun 10, 2021 · 3 comments

Comments

@AndreasVoellmy
Copy link

AndreasVoellmy commented Jun 10, 2021

Consider a class with a few cases, like this:

@Data
 public abstract class C1
{
  public interface Cases <R>
  {
    R m1 (Integer p1)
    ;
    R m2 (Integer p1, String p2)
    ;
    R m3 (Integer p1, String p2, Object p3)
    ;
  }
  public abstract <R> R match (Cases<R> cases)
  ;
  @Override
  public abstract String toString ()
  ;
  @Override
  public abstract boolean equals (Object other)
  ;
  @Override
  public abstract int hashCode ()
  ;
}

This causes derive4j to generate an equals method in every subclass M1, M2, M3. Each one looks like this:

@Override
    public boolean equals(Object other) {
      return (other instanceof C1) && ((C1) other).match(C1s.cases((p1) -> this.p1.equals(p1),
          (p1, p2) -> false,
          (p1, p2, p3) -> false));
    }

More generally, for N cases, there will be N equals methods, each with N lines most of which are just mapping to false. When N gets moderately large, this generates a lot of code. For example, with N = 100, we get 10k lines of code just for these simple equals methods.

How about generating more compact code, something like this for each equals:

@Override
    public boolean equals(Object other) {
      return (other instanceof M1) && p1.equals(((M1) other).p1);
    }

That would cause the number of lines of code to scale linearly, rather than quadratically, since each equals just implements equals on its attributes.

@jbgi
Copy link
Member

jbgi commented Jun 11, 2021

The main constraint is that everything that is generated must be type-safe. C1 is an open hierarchy so there is no static guarentees that the subclasses of C1 generated by derive4j are the only subclasses of C1. Hence everything has to go through the contract of the C1 class, that is the match method. It must be possible to sanely compare a instance of C1 generated by derive4j with an instance of C1 that was manually defined.

Given that constraint there is two code reduction strategies we could implement:

  1. use caseOf() (or cases()) and otherwise :
class m1 extends C1 {
    @Override
    public boolean equals(Object other) {
      return (other instanceof C1) && caseOf((C1) other).m1(p1 -> this.p1.equals(p1)).otherwise(false);
    }
}
  1. first generate a default visitor, then use it to only implement one case for every subclasses:
interface DefaultCases<R> extends Cases {
   R defaultValue();
   default R m1 (Integer p1) { retrun defaultValue(); }
   default R m2 (Integer p1, String p2) { retrun defaultValue(); }
   default R m3 (Integer p1, String p2, Object p3) { retrun defaultValue(); }
}
class DefaultCasesToFalse extends DefaultCases<Boolean> {
  Boolean defaultValue() { return false; };
}
class m1 extends C1 {
   @Override
   public boolean equals(Object other) {
     return (other instanceof C1) && ((C1) other).match(new DefaultCasesToFalse() {
        @override
        R m1 (Integer p1) { return C1.this.p1.equals(p1); }
     });
   }
}

The second strategy is move involve (actually comes with a whole new feature: "default visitor") and only works with case class defined via a visitor interface (ie. not available when match method takes multiple function arguments). But its execution should be slightly more efficient.

My feeling is that since equals should be rarely a bottleneck, first strategy is fine (obviously only when there is more than 2 cases). Except for now equals generation does not depends on caseOf being generated. So this strategy could be used only when caseOf is generated (which it is by default). The second strategy could be used if a "default visitor" feature is implemented first (if there is demand for it on its own) but otherwise probably not worth it.

@AndreasVoellmy
Copy link
Author

Hi @jbgi!

I just noticed I had a mistake in the final code snippet in my post above. I edited it to fix it. What I meant was to check if other is an instance of M1. Since M1 is generated with private static final modifiers, the issue that C1 is an open hierarchy does not affect the soundness of M1's equals method, right?

@jbgi
Copy link
Member

jbgi commented Jun 17, 2021

The issue is that all sound instances of C1 should be comparable, be it generated by derive4j or manually crafted.
Ie.

  boolean testEqualsWithManuallyDefined() {
     return C1s.m1(1).equals(new C1() {
        public <R> R match (Cases<R> cases) { return cases.m1(1); };
        public boolean equals(Object other) {
          return (other instanceof C1) && caseOf((C1) other).m1(p1 -> p1.equals(1)).otherwise(false);
        }
     })
  };

should return true. But if we use your proposed definition of M1.equals then it would return false.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants