Halfspace

February 11, 2021

Little Annoyances, Big Decisions

A story

I think software engineers, myself included, are influenced heavily by annoyances, and those annoyances drive major decisions in our industry far more than we'd like to admit.

I worked with someone, a much more senior engineer than me at the time, who strongly disliked Java and strongly preferred C++. I'll call this person Bjarne to remind you that they love C++ (Bjarne Stroustrup created C++).

Years ago Bjarne got into a dispute with another engineer over the design of a new system, which I'll call the Accounting Gateway.[1] I'll call this other engineer James (after James Gosling, who created Java), because he wanted to write the new system in Java, while Bjarne wanted C++. This discussion was possible because the Accounting Gateway communicated via a REST API with other systems, most written in Java.

Well, Bjarne lost the battle and the Accounting Gateway was written in Java. It had some problems—for example, it had poor performance, and James went a bit overboard on the fancy Java features—but after a few years the system had matured, and these issues were largely fixed. James left the company, but Bjarne was still around. A new team took over the Accounting Gateway, and actively maintained and improved it.

Later, Bjarne started a new project. Its feature set overlapped largely with the Accounting Gateway, so I'll call it the Auditing Gateway. In fact, the Accounting Gateway was designed in such a way that its internal components could be largely reused for the Auditing Gateway, albeit with some minor refactoring.

But Bjarne decided the Auditing Gateway should be written in C++. This is the point I want to scrutinize. Reuse seems like the obvious choice to me, and modulo exposing the internal components of the Accounting Gateway as microservices themselves (which would take a lot of work), reuse means Java. Reuse would also leverage years of engineering, lessons learned, and production-hardened code, and it would require less effort to build. Quick and safe reuse is the primary benefit of good software design. Why waste that?

I suspect that Bjarne's opinion of Java was a significant factor. But when I asked Bjarne, I never got more concrete answers than, "I don't like Java," and falsehoods like, "C++ is much more performant than Java." What happened in the end? Amazon bought the company and decided to rewrite everything anyway to fit in with Amazon's systems.

How to hate a language less

I wouldn't say I love Java, but my usage of it has eclipsed that of every other language over the last four years. I had used a lot of Java in the mid-2000s as a student, but since then much had changed, especially in how Google does Java.

I was already familiar with the concept of many of these practices, such as heavily favoring immutable collection types, and the analogues of Java streams in other languages. Dependency injection [2] was new to me, but that's too big of a topic for this newsletter.

Instead, I want to show some incredibly mundane things about Java (and my particular workflow) that have drastically improved my feelings about working with Java. That is, they're solutions to stupid annoyances, so dumb that most people don't bother to think about them, but without which, I would be incentivized to make poorer decisions that build up to hating everything, and maybe giving up on Java entirely like Bjarne.

I know because I actually started "hating everything" (almost) about the Java I worked on, but it was a vague cloud of contempt. After repeatedly hitting roadblocks, reflecting on the causes, and coming up with better ways, I reduced the contempt to the point that now (looking at other people's C++ and Python code), I prefer Java. Though I like to think that, through the same process in any other language, I would develop similar ways to cope.

In the end, this showcases how to invest in your tools to alleviate pain points. It's a force multiplier, and hopefully you won't find yourself throwing your hard work away in the hopes that a different language will save you. Chances are good it won't.

AutoValue

People who hate Java often complain about the excessive amount of boilerplate code. I largely agree. Everything is in a class, and if you write a class to hold some data, you have to think about equals and hashCode. If the two don't agree, then collections of your objects (like Set or Map) start to behave strangely and stuff breaks.
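As a minimal sketch of that failure mode (BrokenAnimal, ConsistentAnimal, and EqualsHashCodeDemo are hypothetical names made up for this illustration), the first class below overrides equals but hands every instance a different hashCode, so a HashSet cannot find an equal object. The second class derives both methods from the same field.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

/** Hypothetical data holder whose equals and hashCode disagree. */
final class BrokenAnimal {
  private static int nextId = 0;
  private final int id = nextId++; // unique per instance, used only by hashCode
  private final String name;

  BrokenAnimal(String name) {
    this.name = name;
  }

  @Override
  public boolean equals(Object o) {
    return o instanceof BrokenAnimal && ((BrokenAnimal) o).name.equals(name);
  }

  @Override
  public int hashCode() {
    return id; // bug: objects that are equal get different hash codes
  }
}

/** Same data, but equals and hashCode are derived from the same field. */
final class ConsistentAnimal {
  private final String name;

  ConsistentAnimal(String name) {
    this.name = name;
  }

  @Override
  public boolean equals(Object o) {
    return o instanceof ConsistentAnimal && ((ConsistentAnimal) o).name.equals(name);
  }

  @Override
  public int hashCode() {
    return Objects.hash(name);
  }
}

class EqualsHashCodeDemo {
  static boolean brokenSetFindsEqualObject() {
    Set<BrokenAnimal> set = new HashSet<>();
    set.add(new BrokenAnimal("Babe"));
    return set.contains(new BrokenAnimal("Babe"));
  }

  static boolean consistentSetFindsEqualObject() {
    Set<ConsistentAnimal> set = new HashSet<>();
    set.add(new ConsistentAnimal("Babe"));
    return set.contains(new ConsistentAnimal("Babe"));
  }

  public static void main(String[] args) {
    System.out.println(brokenSetFindsEqualObject());     // false
    System.out.println(consistentSetFindsEqualObject()); // true
  }
}
```

The broken class compiles without complaint, which is what makes this kind of bug easy to ship.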

Python gets around this with dataclass (or namedtuple, its backwater cousin), though I don't know many people besides me who use dataclasses heavily. I didn't use dataclasses myself until I discovered the Java library AutoValue and realized how much better it made Java.

AutoValue is basically an annotation processor (a code generator that runs at compile time) that auto-generates data classes (with equals and hashCode) from a succinct(er) syntax, using some trickery with class inheritance to make the succincter part you write correct Java code. An AutoValue looks like this:

@AutoValue
abstract class Animal {
  abstract String name();
  abstract int numberOfLegs();

  static Animal create(String name, int numberOfLegs) {
    return new AutoValue_Animal(name, numberOfLegs);
  }
}
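To make the "trickery with class inheritance" concrete, here is a hand-written sketch of roughly what the generated subclass does. HandWrittenAnimal is a hypothetical stand-in for AutoValue_Animal; the code AutoValue actually generates differs in its details (for example, in how it computes the hash), so treat this as an approximation rather than the library's output.

```java
import java.util.Objects;

/** The part you write by hand: abstract accessors plus a static factory. */
abstract class Animal {
  abstract String name();
  abstract int numberOfLegs();

  static Animal create(String name, int numberOfLegs) {
    // With AutoValue this would be `new AutoValue_Animal(name, numberOfLegs)`.
    return new HandWrittenAnimal(name, numberOfLegs);
  }
}

/** Hypothetical stand-in for the subclass AutoValue would generate. */
final class HandWrittenAnimal extends Animal {
  private final String name;
  private final int numberOfLegs;

  HandWrittenAnimal(String name, int numberOfLegs) {
    this.name = Objects.requireNonNull(name, "name");
    this.numberOfLegs = numberOfLegs;
  }

  @Override
  String name() {
    return name;
  }

  @Override
  int numberOfLegs() {
    return numberOfLegs;
  }

  @Override
  public boolean equals(Object o) {
    if (o == this) {
      return true;
    }
    if (!(o instanceof Animal)) {
      return false;
    }
    Animal that = (Animal) o;
    return name.equals(that.name()) && numberOfLegs == that.numberOfLegs();
  }

  @Override
  public int hashCode() {
    // Built from the same fields as equals, so the two always agree.
    return Objects.hash(name, numberOfLegs);
  }

  @Override
  public String toString() {
    return "Animal{name=" + name + ", numberOfLegs=" + numberOfLegs + "}";
  }
}
```

Client code only ever sees the abstract Animal type, which is why the short class you write is still valid Java on its own.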

And it includes a bunch of variants that allow you to have AutoValues with builder syntax, to use factories, and to sensibly handle types like Optional.

A typical AutoValue I write these days looks a bit more complex, like

import com.google.auto.value.AutoValue;

/** javadoc */
@AutoValue
public abstract class Animal {

  public abstract String synthesizedField();

  /** Static factory method for {@link Factory.Builder}. */
  public static Factory.Builder builder() {
    return new AutoValue_Animal_Factory.Builder();
  }

  /** Factory for {@link Animal} */
  @AutoValue
  public abstract static class Factory {

    public abstract String name();

    public abstract int numberOfLegs();

    private Animal build() {
      // this would normally be lots of complex logic
      String synthesizedField = String.format("%s, %s", name(), numberOfLegs());
      return new AutoValue_Animal(synthesizedField);
    }

    /** Builder for {@link Factory}. */
    @AutoValue.Builder
    public abstract static class Builder {
      public abstract Factory autoBuild();

      public abstract Builder setName(String name);

      public abstract Builder setNumberOfLegs(int numberOfLegs);

      public final Animal build() {
        return autoBuild().build();
      }
    }
  }
}

Still looks verbose? It is. But to avoid the tedium of typing this out every time (I make dozens of these), I created a snippet (here's the definition of the snippet). I just type the following and hit <C-j>, and then fill in the "synthesized" parts manually.

autovaluefactory Animal String name int numberOfLegs

With a similar one for a "plain" AutoValue.

The benefit of the Factory is to hide the setup logic internally, and the benefit of the builder is to allow one to add and remove new builder pieces (or set defaults for some fields) without method overloads or forcing all clients to update a complicated argument list, as would be required with create(String, int, ...).
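The point about defaults can be sketched with a hand-written builder. Everything here is hypothetical: Pet, its domesticated field, and the default values are invented for illustration, and with AutoValue the builder implementation would be generated rather than typed out.

```java
/** Hypothetical value class with a hand-written builder. */
final class Pet {
  private final String name;
  private final int numberOfLegs;
  private final boolean domesticated;

  private Pet(Builder builder) {
    this.name = builder.name;
    this.numberOfLegs = builder.numberOfLegs;
    this.domesticated = builder.domesticated;
  }

  String name() {
    return name;
  }

  int numberOfLegs() {
    return numberOfLegs;
  }

  boolean domesticated() {
    return domesticated;
  }

  static Builder builder() {
    return new Builder();
  }

  static final class Builder {
    private String name;                  // required, no default
    private int numberOfLegs = 4;         // default
    private boolean domesticated = true;  // imagine this field was added later

    Builder setName(String name) {
      this.name = name;
      return this;
    }

    Builder setNumberOfLegs(int numberOfLegs) {
      this.numberOfLegs = numberOfLegs;
      return this;
    }

    Builder setDomesticated(boolean domesticated) {
      this.domesticated = domesticated;
      return this;
    }

    Pet build() {
      if (name == null) {
        throw new IllegalStateException("Missing required property: name");
      }
      return new Pet(this);
    }
  }
}
```

A caller written before the domesticated field existed, such as `Pet.builder().setName("Babe").build()`, keeps compiling and picks up the default. With a positional create(String, int, boolean), every call site would need to change.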

An additional benefit is less test-code boilerplate. I often have AutoValues with factories that have ten or more fields, so in a unit test I can set up the default data in a helper, like so:

class AnimalTest {

  private Animal.Factory.Builder defaultBuilder() {
    return Animal.builder()
       .setName("Babe")
       .setNumberOfLegs(4);
  }

  @Test
  public void myTest() {
    Animal actual = defaultBuilder().setNumberOfLegs(3).build();
    assertThat(actual.synthesizedField()).isEqualTo("Babe, 3");
  }
}

Each test only sets the subset of fields that matter for that test's assertion. Before I started using this style of AutoValue, I resisted changing code, in part because the amount of work required to update the tests far exceeded the amount of work to update the business logic. It also had the side effect of me wanting to write fewer tests, because I knew it incurred such a large maintenance burden. AutoValues made writing (and reading!) tests much easier. It's not an obvious benefit at first, but a very meaningful one.

(Functional) Interfaces

Generally, it's a good idea to design software in terms of interfaces and guarantees provided by those interfaces. This allows implementation details to be changed while reducing the likelihood of breaking client code.

Java embraces this ideology, and I particularly like it for faking in tests. Say the factory for the Animal class above requires an instance of another class AnimalKingdomHierarchy, which also has complicated setup. Your tests must then instantiate an AnimalKingdomHierarchy every time they create an Animal. Many people get around this by using mocks,[3] which introduce a whole host of other issues; for example, poorly set up mocks throw null-pointer exceptions at runtime. And mock setup is often as cumbersome as setting up real objects.

However if AnimalKingdomHierarchy is an interface, then you can fake the interface, meaning create an implementation that has real code backing it, but is otherwise a simplified version of the real instance. For AnimalKingdomHierarchy it might look like this

public interface AnimalKingdomHierarchy {
  public Order getOrder(int numberOfLegs);
}

And in the test you could fake it like

class AnimalTest {

  private Animal.Factory.Builder defaultBuilder() { ... }

  @Test
  public void myTest() {
    AnimalKingdomHierarchy hierarchy = new AnimalKingdomHierarchy() {
      @Override
      public Order getOrder(int numberOfLegs) {
        return numberOfLegs == 8 ? Order.create("Araneae") : Order.create("unknown");
      }
    };
    Animal actual = 
      defaultBuilder()
        .setName("Babe")
        .setNumberOfLegs(8)
        .setAnimalKingdomHierarchy(hierarchy)
        .build();
    assertThat(actual.synthesizedField()).isEqualTo("Spider Pig");
  }
}

Most casual Java users are surprised when they realize that you can create an ad hoc implementation of an interface in-line. It seems people are turned off from interfaces because they think that every implementation needs to be in its own class, and dealing with lots of files is another kind of annoying boilerplate.

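Well, even those six lines of in-line instantiation are annoying, since only one of them is actual logic. For single-method interfaces, Java makes it even simpler with lambda expressions, paired with the annotation @FunctionalInterface. First, you annotate your interface (which must have exactly one abstract method). The annotation is optional, since any interface with a single abstract method qualifies, but with it the compiler rejects the interface if someone later adds a second abstract method.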

@FunctionalInterface
public interface AnimalKingdomHierarchy {
  public Order getOrder(int numberOfLegs);
}

Then, wherever an AnimalKingdomHierarchy is expected, the compiler converts any lambda expression or method reference whose signature matches getOrder into an instance of it. The test code could be rewritten as

class AnimalTest {

  private Animal.Factory.Builder defaultBuilder() { ... }

  @Test
  public void myTest() {
    // This can be inlined in the builder, but I made it a variable here to
    // emphasize that it compiles to this type.
    AnimalKingdomHierarchy hierarchy = 
      legs -> legs == 8 ? Order.create("Araneae") : Order.create("unknown");

    Animal actual = 
      defaultBuilder()
        .setName("Babe")
        .setNumberOfLegs(8)
        .setAnimalKingdomHierarchy(hierarchy)
        .build();
    assertThat(actual.synthesizedField()).isEqualTo("Spider Pig");
  }
}
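The test above depends on classes that aren't shown in full, so here is a self-contained sketch of the same mechanics. LegClassifier and LegClassifierDemo are hypothetical names, and the String return type stands in for Order. It builds one interface three ways: as an anonymous class, as a lambda, and as a method reference.

```java
/** Hypothetical single-method interface, standing in for AnimalKingdomHierarchy. */
@FunctionalInterface
interface LegClassifier {
  String classify(int numberOfLegs);
}

class LegClassifierDemo {
  /** Client code depends only on the interface, not on how it was created. */
  static String describe(LegClassifier classifier, int legs) {
    return legs + " legs: " + classifier.classify(legs);
  }

  /** An ordinary static method whose signature happens to match classify. */
  static String byParity(int legs) {
    return legs % 2 == 0 ? "even" : "odd";
  }

  /** The long form: an anonymous class created in-line. */
  static LegClassifier anonymousClass() {
    return new LegClassifier() {
      @Override
      public String classify(int numberOfLegs) {
        return numberOfLegs == 8 ? "Araneae" : "unknown";
      }
    };
  }

  /** The same logic as a lambda expression. */
  static LegClassifier lambda() {
    return legs -> legs == 8 ? "Araneae" : "unknown";
  }

  /** A method reference is converted to the interface in the same way. */
  static LegClassifier methodReference() {
    return LegClassifierDemo::byParity;
  }

  public static void main(String[] args) {
    System.out.println(describe(anonymousClass(), 8));  // 8 legs: Araneae
    System.out.println(describe(lambda(), 4));          // 4 legs: unknown
    System.out.println(describe(methodReference(), 3)); // 3 legs: odd
  }
}
```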

This also works seamlessly with dependency injection. The process of changing which instance of an interface is injected requires only changing the line that @Binds the interface type to the implementation type.

Because of these conveniences, I've embraced functional interfaces.

Suppose there is a calculation of ownership of an Animal. It starts out simple: an if statement using only other fields of Animal. A hypothetical programmer might add it as a method to Animal, and client code would call animal.owner(). Instead, I would have the client code depend on a functional interface AnimalOwnershipFn, whose method has the type signature Animal -> Owner.

Invariably, I expect ownership to get more complex. It may start to depend on other types, like, say, an OwnershipHistory and a PoundLog. With the object-oriented style, these might be added as new fields on Animal, which at best requires plumbing and likely updating tests for Animal. At worst it requires extra computation, because what subset of an entire Pound's log "belongs" on a specific animal instance? Surely, not the entire pound log...

With FunctionalInterface—which might be called the "functional style", though that term has baggage I don't mean to include here—you make a dedicated interface, pass the interface to clients, and the implementation details become less jarring when they change. Clients are forced to only depend on the bit of logic they actually need, and not the entire package of data in Animal if they don't need it. And it allows you to keep data classes as only holding data, rather than mixing in business policy.

@AutoValue
abstract class RabiesCheckImpl implements RabiesCheckFn {
  ...

  public abstract AnimalOwnershipFn ownershipFn();
  public abstract HasRabiesFn hasRabiesFn();
  public abstract VaccineSchedulerFn vaccineSchedulerFn();

  public void doStuff(Animal animal) {
    Owner owner = ownershipFn().getOwner(animal);
    if (hasRabiesFn().hasRabies(animal)) {
      AppointmentRecord record = vaccineSchedulerFn().scheduleVaccine(owner);
    }
    ...
  }
}

This is the distinction that has made my work much easier: anything that involves business logic goes into a functional interface, or, if not possible, a multi-method interface. Then clients only depend on the interfaces, even if one class implements multiple interfaces. And when implementations get reorganized, or pieces get reused, the interfaces need not change.
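Here is a self-contained sketch of that idea under simplifying assumptions: Owner is reduced to a String, OwnershipHistory to a Map from animal name to latest owner, and VaccineReminder and OwnershipDemo are hypothetical names invented for the example. The client never changes when the ownership logic grows a new dependency.

```java
import java.util.Map;

/** Plain data holder; it carries no ownership policy. */
final class Animal {
  private final String name;
  private final String tagOwner;

  Animal(String name, String tagOwner) {
    this.name = name;
    this.tagOwner = tagOwner;
  }

  String name() {
    return name;
  }

  String tagOwner() {
    return tagOwner;
  }
}

/** Business logic lives behind a single-method interface. */
@FunctionalInterface
interface AnimalOwnershipFn {
  String getOwner(Animal animal);
}

/** Hypothetical client: it depends on the interface only. */
final class VaccineReminder {
  private final AnimalOwnershipFn ownershipFn;

  VaccineReminder(AnimalOwnershipFn ownershipFn) {
    this.ownershipFn = ownershipFn;
  }

  String reminderFor(Animal animal) {
    return "Remind " + ownershipFn.getOwner(animal) + " to vaccinate " + animal.name();
  }
}

class OwnershipDemo {
  /** First version: ownership is read straight off the animal's tag. */
  static AnimalOwnershipFn fromTag() {
    return animal -> animal.tagOwner() == null ? "unknown" : animal.tagOwner();
  }

  /** Later version: consult a history first, then fall back to the tag. */
  static AnimalOwnershipFn fromHistory(Map<String, String> latestOwnerByAnimalName) {
    AnimalOwnershipFn fallback = fromTag();
    return animal -> {
      String recorded = latestOwnerByAnimalName.get(animal.name());
      return recorded != null ? recorded : fallback.getOwner(animal);
    };
  }
}
```

Swapping `OwnershipDemo.fromTag()` for `OwnershipDemo.fromHistory(history)` at construction time changes the policy without touching Animal or VaccineReminder.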

And that's the sort of idea that can transfer to another language, whereas FunctionalInterface is just how Java makes it palatable.


  1. This was the name of the first project I was ever paid to work on, and I always think of the grandiose name and laugh, because it was plain, boring, accounting logic.

  2. I linked to the Guice library, which is by far the most widespread dependency injection framework for Java, and ubiquitous at Google. However, I prefer Dagger because it checks the dependency graph structure at compile time instead of runtime. I happily use both.

  3. I actually hate mocks, and I insist they are only ever used for things that cannot be reasonably faked.
