Wednesday, February 10, 2016

Cleaner Responsibilities - Get rid of equals, compareTo and toString

Have you ever looked at the javadoc of the Object-class in Java? Probably. You tend to end up there every now and then when digging your way down the inheritance tree. One thing you might have noticed is that it has quite a few methods that every class must inherit. The favorite methods to implement yourself rather than stick with the original ones are probably .toString(), .equals() and .hashCode() (why you should always implement both of the latter is described well by Per-Åke Minborg in this post).

But these methods are apparently not enough. Many people mix in additional interfaces from the standard libraries like Comparable and Serializable. But is that really wise? Why do everyone want to implement these methods on their own so badly? Well, implementing your own .equals() and .hashCode() methods will probably make sense if you are planning on storing them in something like a HashMap and want to control hash collisions, but what about compareTo() and toString()?

In this article I will present an approach to software design that we use on the Speedment open source project where methods that operate on objects are implemented as functional references stored in variables rather than overriding Javas built in methods. There are several advantages to this. Your POJOs will be shorter and more concise, common operations can be reused without inheritance and you can switch between different configurations in a flexible matter.

Original Code

Let us begin by looking at the following example. We have a typical Java class named Person. In our application we want to print out every person from a Set in the order of their firstname followed by lastname (in case two persons share the same firstname).

Person.java
public class Person implements Comparable<Person> {
    
    private final String firstname;
    private final String lastname;
    
    public Person(String firstname, String lastname) {
        this.firstname = firstname;
        this.lastname  = lastname;
    }

    public String getFirstname() {
        return firstname;
    }

    public String getLastname() {
        return lastname;
    }
    
    @Override
    public int hashCode() {
        int hash = 7;
        hash = 83 * hash + Objects.hashCode(this.firstname);
        hash = 83 * hash + Objects.hashCode(this.lastname);
        return hash;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (obj == null) return false;
        if (getClass() != obj.getClass()) return false;
        final Person other = (Person) obj;
        if (!Objects.equals(this.firstname, other.firstname)) {
            return false;
        }
        return Objects.equals(this.lastname, other.lastname);
    }

    @Override
    public int compareTo(Person that) {
        if (this == that) return 0;
        else if (that == null) return 1;

        int comparison = this.firstname.compareTo(that.firstname);
        if (comparison != 0) return comparison;

        comparison = this.lastname.compareTo(that.lastname);
        return comparison;
    }

    @Override
    public String toString() {
        return firstname + " " + lastname;
    }
}
Main.java
public class Main {
    public static void main(String... args) {
        final Set people = new HashSet<>();
        
        people.add(new Person("Adam", "Johnsson"));
        people.add(new Person("Adam", "Samuelsson"));
        people.add(new Person("Ben", "Carlsson"));
        people.add(new Person("Ben", "Carlsson"));
        people.add(new Person("Cecilia", "Adams"));
        
        people.stream()
            .sorted()
            .forEachOrdered(System.out::println);
    }
}
Output
run:
Adam Johnsson
Adam Samuelsson
Ben Carlsson
Cecilia Adams
BUILD SUCCESSFUL (total time: 0 seconds)

Person implements several methods here to control the output of the stream. The hashCode() and equals() method make sure that duplicate persons can't be added to the set. The compareTo() method is used by the sorted action to produce the desired order. The overridden toString()-method is finally controlling how each Person should be printed when System.out.println() is called. Do you recognize this structure? You can find it in almost every java project out there.

Alternative Code

Instead of putting all functionality into the Person class, we can try and keep it as clean as possible and use functional references to handle these decorations. We remove all the boilerplate with equals, hashCode, compareTo and toString and instead we introduce two static variables, COMPARATOR and TO_STRING.

Person.java
public class Person {
    
    private final String firstname;
    private final String lastname;
    
    public Person(String firstname, String lastname) {
        this.firstname = firstname;
        this.lastname  = lastname;
    }

    public String getFirstname() {
        return firstname;
    }

    public String getLastname() {
        return lastname;
    }
    
    public final static Comparator<Person> COMPARATOR =
        Comparator.comparing(Person::getFirstname)
            .thenComparing(Person::getLastname);
    
    public final static Function<Person, String> TO_STRING =
        p -> p.getFirstname() + " " + p.getLastname();
}
Main.java
public class Main {
    public static void main(String... args) {
        final Set people = new TreeSet<>(Person.COMPARATOR);
        
        people.add(new Person("Adam", "Johnsson"));
        people.add(new Person("Adam", "Samuelsson"));
        people.add(new Person("Ben", "Carlsson"));
        people.add(new Person("Ben", "Carlsson"));
        people.add(new Person("Cecilia", "Adams"));
        
        people.stream()
            .map(Person.TO_STRING)
            .forEachOrdered(System.out::println);
    }
}
Output
run:
Adam Johnsson
Adam Samuelsson
Ben Carlsson
Cecilia Adams
BUILD SUCCESSFUL (total time: 0 seconds)

The nice thing with this approach is that we can now replace the order and the formatting of the print without changing our Person class. This will make the code more maintainable and easier to reuse, not to say faster to write.

8 comments:

  1. "Instead of putting all the functionality into the Person class"... It's put in static variables in the Person class... so still in the Person class.

    These static functions crash on null references and you've simply punted on equality/hashcode. They're also far less discoverable, don't play with existing knowledge or libraries (e.g. debugging, utilities that call ToString, etc).
    They're actually *longer* when used in streams as well
    .map(Person.TO_STRING) vs
    .map(p -> p.toString())

    How is this good?

    ReplyDelete
    Replies
    1. ""Instead of putting all the functionality into the Person class"... It's put in static variables in the Person class... so still in the Person class."
      No, I would not consider static constants as part of any class. For the convenience of the programmer you can organize your constants inside the class declaration of a java-file, but since they are not included in any instance of that class and doesn't follow the rules of inheritance, it is quite different than a member of that class. This is apparent when you refactor code and want to move functionality. Removing a public function from a class can be quite difficult. Removing a public constant is easier and can often be performed automatically by the IDE.

      "These static functions crash on null references": If your system uses null references you will have to implement these in a different matter. Personally I don't like null and try to avoid it in any project I work on. I didn't want to include the guards in the example since it might confuse some readers, but if you put requireNonNull()-statements around the initializations in the constructor you can protect yourself against null-pointers in an earlier phase. Solving it open executing equals or hashCode is not very good since the original bug (that allowed a field to be set to null in the first place) is probably already lost from the stack trace.

      "you've simply punted on equality/hashcode": Of course, in a larger system you would need to test this thoroughly to make sure the hash and equals methods are working properly.

      "They're also far less discoverable": Yes, that is true, but also less prune to misinterpretations. When I see a class called "Person" in a third party library and I see that it is declared as "Comparable", I might asume that it is ordered by firstname then lastname and just start using it in my system. A problem here might be that the programmer that wrote that class choose to order it by lastname first, then firstname. The error might not be revealed until way later since the element appear to be ordered correctly. That is the primary reason why you shouldn't implement the compareTo()-method in every object.

      "don't play with existing knowledge or libraries (e.g. debugging, utilities that call ToString, etc)": In my opinion, ToString us used way to much today. Some classes (like HashMap and ArrayList) have very good toString implementations that can certainly help you when debugging. In the case with person on the other hand, I would not like the toString to be overridden. Why? Because it is an invite to bad code. Someone less experienced or too lazy working on the project might see the toString and think "this is perfect, I can get a string consisting of a persons first- and lastname in one call!" and start using it as a concatenation method. This is, as you probably know, not the intention of the toString, but at the same time you see this kind of usage everywhere. Removing the toString all together makes sure nobody misuses it. Most debugging tools worth the name can visualize the members of an object using reflection anyway, which also eliminates the risk that it is the toString-method that is at fault.

      "They're actually *longer* when used in streams as well": If you are going to count characters (which is totally fine to do ;) ) I would suggest statis importing the TO_STRING so that you get:
      .map(TO_STRING)
      which to me atleast appears to be shorter than:
      .map(p->p.toString())
      even with the spaces omitted.

      How is this good? I hope I have answered that question for you, otherwise I would be happy to continue this discussion as it has totally made my day.

      Delete
    2. "That is the primary reason why you shouldn't implement the compareTo()-method in every object."
      To that I agree but for different reasons.

      "I see that it is declared as "Comparable", I might asume that it is ordered by firstname then lastname"
      But what have your static comparator do to help solve the issue?
      If it is called COMPARATOR it solves nothing.
      I guess you will want to call it like COMPARE_FIRST_NAME_ALPHABETICAL.
      But you could illustrate that better in the post.

      "Because it is an invite to bad code."
      Bull shit.
      "but at the same time you see this kind of usage everywhere."
      Seen once or twice from fellow students but never in production code.
      Also ever heard of code review?

      "Removing the toString all together makes sure nobody misuses it."
      Hey let's remove ability to log so nobody misuses it.
      And putting anything to database so nobody breaks it.
      Also what stops anyone misusing TO_STRING?

      "Most debugging tools worth the name can visualize the members of an object using reflection anyway"
      Yeah! Let's connect debugger to production, what a splendid idea.
      Especially to diagnose this weird bug that client just reported and happened a week ago.

      Sorry but you have not convinced me that is a good idea.
      I don't see how those help having clear responsibilities in code.

      Delete
  2. Have to agree with Orion. This seems like very questionable advice. At the very least the example you include shows a Person class without overriding equals and hashcode then being put into a Set. It doesn't need to be tested. It's just broken.

    A better solution to this problem is to use one of the many libraries that do this for you or use a language that solves the problem such as Scala/Kotlin/Clojure/etc.

    ReplyDelete
    Replies
    1. Had the keys of the object been mutable, the yes, the class would have been broken. Since the entire object is an immutable it is totally fine to add it to a set without overriding the hashCode-method. The difference here is that the Object.hashCode method determines the hash based on identity comparison and not equality comparison. There is nothing in the constract of HashSet that says which comparison method should be used as long as the same object will always return the same hash.

      Delete
    2. Yes but if you are retrieving data from database or remote service you will have equal data stored in different objects.
      If you really don't want to write those methods just use lombok.

      Also if you don't want to store compareTo in class, you can implement it in the code which is comparing so it is non-issue.

      Delete
  3. Good use of functional concepts, specially the toString map, but why not use lombok https://projectlombok.org ?

    ReplyDelete
  4. We have almost established better grounds out here and hopefully for the future would also remain pretty much conscious as well. need help programming assignment

    ReplyDelete