Little Annoyances, Big Decisions

                            February 11, 2021

A story
I think software engineers,
myself included,
are influenced heavily by annoyances, 
and those annoyances drive major decisions in our industry
far more than we'd like to admit.
I worked with someone,
a much more senior engineer than me at the time,
who strongly dislikes Java
and strongly prefers C++.
I'll call this person Bjarne 
to remind you that they love C++ (Bjarne Stroustrup created C++).
Years ago Bjarne got into a dispute
with another engineer
over the design of a new system,
which I'll call the Accounting Gateway.[¹]
I'll call this other engineer James 
(after James Gosling, who created Java),
because he wanted to write the new system in Java,
while Bjarne wanted C++.
This discussion was possible
because the Accounting Gateway communicated
via a REST API with other systems, 
most written in Java.
Well, Bjarne lost the battle 
and the Accounting Gateway was written in Java.
It had some problems—for example,
it had poor performance,
and James went a bit overboard on the fancy Java features—but
after a few years the system had matured,
and these issues were largely fixed.
James left the company,
but Bjarne was still around.
A new team took over the Accounting Gateway,
and actively maintained and improved it.
Later, Bjarne started a new project.
Its feature set overlapped largely with the Accounting Gateway,
so I'll call it the Auditing Gateway.
In fact, the Accounting Gateway was designed
in such a way that its internal components could be largely reused
for the Auditing Gateway,
albeit with some minor refactoring.
But Bjarne decided the Auditing Gateway
should be written in C++.
This is the point I want to scrutinize.
Reuse seems like the obvious choice to me,
and modulo exposing the internal components of the Accounting Gateway as microservices themselves
(which would take a lot of work), reuse means Java.
Reuse would also leverage years of engineering,
lessons learned,
and production-hardened code,
and would require less effort to build.
Quick and safe reuse
is the primary benefit of good software design.
Why waste that?
I suspect that Bjarne's opinion of Java
was a significant factor.
But when I asked Bjarne,
I never got more concrete answers than,
"I don't like Java," and falsehoods like,
"C++ is much more performant than Java."
What happened in the end?
Amazon bought the company 
and decided to rewrite everything anyway 
to fit in with Amazon's systems.
How to hate a language less
I wouldn't say I love Java,
but my usage of it eclipses all other languages
in the last four years.
I had used a lot of Java
in the mid-2000s as a student,
but since then much had changed,
especially in how Google does Java.
I was already familiar with
the concepts behind many of these practices,
such as heavily favoring immutable collection types,
and the analogues of Java streams in other languages.
Dependency injection [²]
was new to me,
but that's too big of a topic for this newsletter.
Instead, I want to show some incredibly mundane
things about Java (and my particular workflow)
that have drastically improved my feelings 
about working with Java.
That is, they're solutions to stupid annoyances,
so dumb that most people don't bother to think about them,
but without which, I would be incentivized
to make poorer decisions that build up to hating everything,
and maybe giving up on Java entirely like Bjarne.
I know because I actually started "hating everything" (almost)
about the Java I worked on,
but it was a vague cloud of contempt.
After repeatedly hitting roadblocks,
reflecting on the causes,
and coming up with better ways,
I reduced the contempt to the point that now
(looking at other people's C++ and Python code),
I prefer Java.
Though I like to think that,
through the same process in any other language,
I would develop similar ways to cope.
In the end, this showcases how to
invest in your tools to alleviate pain points.
It's a force multiplier, 
and hopefully you won't find yourself 
throwing your hard work away
in the hopes that a different language will save you.
Chances are good it won't.
AutoValue
People who hate Java
often complain about the excessive amount of boilerplate code.
I largely agree.
Everything is in a class,
and if you write a class to hold some data,
then you have to think about equals and hashCode,
and if they don't agree
then collections of your objects (like Set or Map)
start to behave strangely and stuff breaks.
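To make that failure mode concrete, here is a minimal sketch. The class names (SloppyAnimal, CarefulAnimal, EqualsHashCodeDemo) are made up for illustration. SloppyAnimal overrides equals but forgets hashCode, so a HashSet will usually fail to find an "equal" copy; CarefulAnimal defines both over the same fields.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

/** Hypothetical data class: equals is value-based, hashCode is still identity-based. */
final class SloppyAnimal {
  final String name;
  final int numberOfLegs;

  SloppyAnimal(String name, int numberOfLegs) {
    this.name = name;
    this.numberOfLegs = numberOfLegs;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof SloppyAnimal)) {
      return false;
    }
    SloppyAnimal other = (SloppyAnimal) o;
    return name.equals(other.name) && numberOfLegs == other.numberOfLegs;
  }
  // hashCode() is deliberately not overridden: this is the bug.
}

/** The same data with equals and hashCode computed from the same fields. */
final class CarefulAnimal {
  final String name;
  final int numberOfLegs;

  CarefulAnimal(String name, int numberOfLegs) {
    this.name = name;
    this.numberOfLegs = numberOfLegs;
  }

  @Override
  public boolean equals(Object o) {
    if (!(o instanceof CarefulAnimal)) {
      return false;
    }
    CarefulAnimal other = (CarefulAnimal) o;
    return name.equals(other.name) && numberOfLegs == other.numberOfLegs;
  }

  @Override
  public int hashCode() {
    return Objects.hash(name, numberOfLegs);
  }
}

final class EqualsHashCodeDemo {
  /** The copy is equal by equals(), yet the set usually cannot find it. */
  static boolean sloppySetFindsCopy() {
    Set<SloppyAnimal> animals = new HashSet<>();
    animals.add(new SloppyAnimal("Babe", 4));
    return animals.contains(new SloppyAnimal("Babe", 4));
  }

  /** Equal objects hash alike, so the lookup succeeds. */
  static boolean carefulSetFindsCopy() {
    Set<CarefulAnimal> animals = new HashSet<>();
    animals.add(new CarefulAnimal("Babe", 4));
    return animals.contains(new CarefulAnimal("Babe", 4));
  }

  public static void main(String[] args) {
    // Almost always false: the two instances inherit different identity hash codes.
    System.out.println(sloppySetFindsCopy());
    // Always true: equal fields produce equal hash codes.
    System.out.println(carefulSetFindsCopy());
  }
}
```

Writing the careful version by hand for every data class is exactly the boilerplate people complain about.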
Python gets around this with dataclass 
(or namedtuple, its backwater cousin),
though I don't know many other people besides me
who use dataclasses heavily.
I didn't use dataclasses myself
until I discovered the Java library
AutoValue,
and realized how much better it made Java.
AutoValue is an annotation processor (basically a compile-time code generator) that auto-generates
dataclasses (with equals and hashCode)
from a succinct(er) syntax, 
using some trickery with class inheritance
to make the succincter part you write correct Java code.
An AutoValue looks like this
@AutoValue
abstract class Animal {
  abstract String name();
  abstract int numberOfLegs();

  static Animal create(String name, int numberOfLegs) {
    return new AutoValue_Animal(name, numberOfLegs);
  }
}

And it includes a bunch of variants that allow you to
have AutoValues with builder syntax,
to use factories, and to sensibly handle types like Optional.
A typical AutoValue I write these days looks a bit more complex, like
import com.google.auto.value.AutoValue;

/** javadoc */
@AutoValue
public abstract class Animal {

  public abstract String synthesizedField();

  /** Static factory method for {@link Factory.Builder}. */
  public static Factory.Builder builder() {
    return new AutoValue_Animal_Factory.Builder();
  }

  /** Factory for {@link Animal} */
  @AutoValue
  public abstract static class Factory {

    public abstract String name();

    public abstract int numberOfLegs();

    private Animal build() {
      // this would normally be lots of complex logic
      String synthesizedField = String.format("%s, %s", name(), numberOfLegs());
      return new AutoValue_Animal(synthesizedField);
    }

    /** Builder for {@link Factory}. */
    @AutoValue.Builder
    public abstract static class Builder {
      public abstract Factory autoBuild();

      public abstract Builder setName(String name);

      public abstract Builder setNumberOfLegs(int numberOfLegs);

      public final Animal build() {
        return autoBuild().build();
      }
    }
  }
}

Still looks verbose? It is, 
but to avoid the tedium of typing this out every time
(I make dozens of these)
I created a snippet 
and so I just type the following
and hit <C-j>, 
and then fill in the "synthesized" parts manually.
Here's the definition of the snippet.
autovaluefactory Animal String name int numberOfLegs

I have a similar snippet for a "plain" AutoValue.
The benefit of the Factory is to hide the setup logic
internally, and the benefit of the builder is to allow
one to add and remove new builder pieces (or set defaults for some fields)
without method overloads or forcing all clients to update a complicated argument list,
as would be required with create(String, int, ...).
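The point about defaults can be sketched without AutoValue at all. Below is a hand-rolled builder, not what AutoValue generates; the class Pet, its fields, and the default values are all hypothetical. The idea is to pretend that `vaccinated` was added long after the first call sites were written.

```java
/** Hypothetical value class with a hand-written builder (no AutoValue involved). */
final class Pet {
  private final String name;
  private final int numberOfLegs;
  private final boolean vaccinated;

  private Pet(Builder builder) {
    this.name = builder.name;
    this.numberOfLegs = builder.numberOfLegs;
    this.vaccinated = builder.vaccinated;
  }

  static Builder builder() {
    return new Builder();
  }

  String name() {
    return name;
  }

  int numberOfLegs() {
    return numberOfLegs;
  }

  boolean vaccinated() {
    return vaccinated;
  }

  static final class Builder {
    private String name;                // required, no default
    private int numberOfLegs = 4;       // default chosen for illustration
    private boolean vaccinated = false; // pretend this field was added later

    Builder setName(String name) {
      this.name = name;
      return this;
    }

    Builder setNumberOfLegs(int numberOfLegs) {
      this.numberOfLegs = numberOfLegs;
      return this;
    }

    Builder setVaccinated(boolean vaccinated) {
      this.vaccinated = vaccinated;
      return this;
    }

    Pet build() {
      if (name == null) {
        throw new IllegalStateException("name is required");
      }
      return new Pet(this);
    }
  }
}

final class BuilderDefaultsDemo {
  public static void main(String[] args) {
    // Written before `vaccinated` existed; still compiles and picks up the default.
    Pet babe = Pet.builder().setName("Babe").build();
    // A newer call site opts in to the new field.
    Pet rex = Pet.builder().setName("Rex").setVaccinated(true).build();
    System.out.println(babe.numberOfLegs() + " " + babe.vaccinated());
    System.out.println(rex.numberOfLegs() + " " + rex.vaccinated());
  }
}
```

With a positional create(String, int, boolean), adding the third parameter would have broken every existing caller, or required an extra overload.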
An additional benefit of this
is reduced test-code boilerplate.
I often have AutoValues with factories that have ten or more fields,
and so in a unit test I can set up the default data in a helper, like so
class AnimalTest {

  private Animal.Factory.Builder defaultBuilder() {
    return Animal.builder()
       .setName("Babe")
       .setNumberOfLegs(4);
  }

  @Test
  public void myTest() {
    Animal actual = defaultBuilder().setNumberOfLegs(3).build();
    assertThat(actual.synthesizedField()).isEqualTo("blah");
  }
}

Each test only sets the subset of fields 
that matter for that test's assertion.
Before I started using this style of AutoValue,
I resisted changing code, 
in part because the amount of work 
required to update the tests
far exceeded the amount of work to update the business logic.
It also had the side effect of me wanting to write fewer tests,
because I knew it incurred such a large maintenance burden.
AutoValues made writing (and reading!) tests much easier.
It's not an obvious benefit at first,
but a very meaningful one.
(Functional) Interfaces
Generally, it's a good idea to design software
in terms of interfaces and guarantees provided by those interfaces.
This allows implementation details to be changed
while reducing the likelihood of breaking client code.
Java embraces this ideology,
and I particularly like it for faking in tests.
Say the factory for the Animal class above
requires an instance of another class AnimalKingdomHierarchy,
which also has complicated setup.
Then your tests must instantiate
the AnimalKingdomHierarchy every time you create an Animal.
Many people get around this by using mocks,[³]
which introduces a whole host of other issues
like, for example, the fact that poorly-setup mocks
throw null-pointer errors at runtime.
And mock setup is often as cumbersome as setting up real objects.
However, if AnimalKingdomHierarchy is an interface,
then you can fake the interface, 
meaning create an implementation that has real code backing it, 
but is otherwise a simplified version of the real instance.
For AnimalKingdomHierarchy it might look like this
public interface AnimalKingdomHierarchy {
  public Order getOrder(int numberOfLegs);
}

And in the test you could fake it like
class AnimalTest {

  private Animal.Factory.Builder defaultBuilder() { ... }

  @Test
  public void myTest() {
    AnimalKingdomHierarchy hierarchy = new AnimalKingdomHierarchy() {
      @Override
      public Order getOrder(int numberOfLegs) {
        return numberOfLegs == 8 ? Order.create("Araneae") : Order.create("unknown");
      }
    };
    Animal actual = 
      defaultBuilder()
        .setName("Babe")
        .setNumberOfLegs(8)
        .setAnimalKingdomHierarchy(hierarchy)
        .build();
    assertThat(actual.synthesizedField()).isEqualTo("Spider Pig");
  }
}

Most casual Java users are surprised when they realize
that you can create 
an ad hoc implementation 
of an interface in-line.
It seems people are turned off 
from interfaces because they think
that every implementation
needs to be in its own class,
and dealing with lots of files 
is another kind of annoying boilerplate.
Well, even those six lines of in-line instantiation are annoying,
since only one line of it is actual logic.
For single-method interfaces, 
Java makes it even simpler 
with the annotation @FunctionalInterface,
paired with anonymous function support.
First, you annotate your interface (which must have exactly one abstract method).
@FunctionalInterface
public interface AnimalKingdomHierarchy {
  public Order getOrder(int numberOfLegs);
}

Then, the compiler automatically converts any lambda or method reference
whose signature matches that of getOrder
to an instance of AnimalKingdomHierarchy wherever one is expected.
The test code could be rewritten as
class AnimalTest {

  private Animal.Factory.Builder defaultBuilder() { ... }

  @Test
  public void myTest() {
    // This can be inlined in the builder, but I made it a variable here to
    // emphasize that it compiles to this type.
    AnimalKingdomHierarchy hierarchy = 
      legs -> legs == 8 ? Order.create("Araneae") : Order.create("unknown");

    Animal actual = 
      defaultBuilder()
        .setName("Babe")
        .setNumberOfLegs(8)
        .setAnimalKingdomHierarchy(hierarchy)
        .build();
    assertThat(actual.synthesizedField()).isEqualTo("Spider Pig");
  }
}
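Lambdas are not the only thing that converts this way: a method reference with a matching signature works too. Here is a small self-contained sketch. The Order class is a hypothetical stand-in (the real one isn't shown in this post), and byLegCount and classify are names invented for the example. As far as the language is concerned, the @FunctionalInterface annotation is optional; it only asks the compiler to reject the interface if it ever stops having exactly one abstract method.

```java
/** Hypothetical stand-in for the Order type used in the examples above. */
final class Order {
  private final String name;

  private Order(String name) {
    this.name = name;
  }

  static Order create(String name) {
    return new Order(name);
  }

  String name() {
    return name;
  }
}

@FunctionalInterface
interface AnimalKingdomHierarchy {
  Order getOrder(int numberOfLegs);
}

final class FunctionalInterfaceDemo {
  /** An ordinary static method whose signature happens to match getOrder. */
  static Order byLegCount(int numberOfLegs) {
    return numberOfLegs == 8 ? Order.create("Araneae") : Order.create("unknown");
  }

  /** Client code that only knows about the interface. */
  static String classify(AnimalKingdomHierarchy hierarchy, int numberOfLegs) {
    return hierarchy.getOrder(numberOfLegs).name();
  }

  public static void main(String[] args) {
    // A lambda is converted to an AnimalKingdomHierarchy at the call site.
    System.out.println(classify(legs -> Order.create("unknown"), 4));
    // So is a method reference.
    System.out.println(classify(FunctionalInterfaceDemo::byLegCount, 8));
  }
}
```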

This also works seamlessly with dependency injection.
The process of changing 
which instance of an interface is injected
requires only changing the line that @Binds
the interface type to the implementation type.
Because of these conveniences,
I've embraced functional interfaces.
Suppose there is a calculation 
of ownership of an Animal.
It starts out simple: 
an if statement using only other fields of Animal.
A hypothetical programmer might add it as a method to Animal,
and client code would call animal.owner().
Instead, I would have the client code
depend on a functional interface AnimalOwnershipFn,
whose method has the type signature Animal -> Owner.
Invariably, I expect ownership to get more complex.
It may start to depend on other types,
like, say, an OwnershipHistory and a PoundLog.
With the object-oriented style,
these might be added as new fields on Animal,
which at best requires plumbing 
and likely updating tests for Animal.
At worst it requires extra computation,
because what subset of an entire Pound's log 
"belongs" on a specific animal instance?
Surely, not the entire pound log...
With FunctionalInterface—which 
might be called the "functional style",
though that term has baggage 
I don't mean to include here—you make 
a dedicated interface,
pass the interface to clients,
and the implementation details become less jarring
when they change.
Clients are forced to only depend on
the bit of logic they actually need,
and not the entire package of data in Animal 
if they don't need it.
And it allows you to keep data classes as only holding data,
rather than mixing in business policy.
@AutoValue
abstract class RabiesCheckImpl implements RabiesCheckFn {
  ...

  public abstract AnimalOwnershipFn ownershipFn();
  public abstract HasRabiesFn hasRabiesFn();
  public abstract VaccineSchedulerFn vaccineSchedulerFn();

  public void doStuff(Animal animal) {
    Owner owner = ownershipFn().getOwner(animal);
    if (hasRabiesFn().hasRabies(animal)) {
      AppointmentRecord record = vaccineSchedulerFn().scheduleVaccine(owner);
    }
    ...
  }
}
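To see why the ownership rule can grow without disturbing clients, here is a stripped-down sketch of that scenario. Everything in it is an assumption made for illustration: the minimal Animal and Owner classes, the hasCollar field, the OwnerGreeter client, and the map of adoption records standing in for a PoundLog.

```java
import java.util.Map;

// Hypothetical minimal types: the text names Animal and Owner but never defines them.
final class Owner {
  final String name;

  Owner(String name) {
    this.name = name;
  }
}

final class Animal {
  final String name;
  final boolean hasCollar;

  Animal(String name, boolean hasCollar) {
    this.name = name;
    this.hasCollar = hasCollar;
  }
}

/** Single-method interface with the shape Animal -> Owner. */
@FunctionalInterface
interface AnimalOwnershipFn {
  Owner getOwner(Animal animal);
}

/** Client code: it sees only the interface, never the rule behind it. */
final class OwnerGreeter {
  private final AnimalOwnershipFn ownershipFn;

  OwnerGreeter(AnimalOwnershipFn ownershipFn) {
    this.ownershipFn = ownershipFn;
  }

  String greet(Animal animal) {
    return "Hello, " + ownershipFn.getOwner(animal).name;
  }
}

final class OwnershipDemo {
  /** Version 1: a simple rule that uses only fields of Animal. */
  static AnimalOwnershipFn simpleRule() {
    return animal -> new Owner(animal.hasCollar ? "registered owner" : "the pound");
  }

  /** Version 2: also consults adoption records (a stand-in for a PoundLog). */
  static AnimalOwnershipFn poundLogRule(Map<String, String> adoptionsByAnimalName) {
    return animal -> {
      String adopter = adoptionsByAnimalName.get(animal.name);
      return adopter != null ? new Owner(adopter) : simpleRule().getOwner(animal);
    };
  }

  public static void main(String[] args) {
    Animal babe = new Animal("Babe", false);
    // prints "Hello, the pound"
    System.out.println(new OwnerGreeter(simpleRule()).greet(babe));
    // prints "Hello, Farmer Hoggett"
    System.out.println(
        new OwnerGreeter(poundLogRule(Map.of("Babe", "Farmer Hoggett"))).greet(babe));
  }
}
```

Neither Animal nor OwnerGreeter changed between the two versions; only the implementation handed to the client did.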

This is the distinction that has made my work much easier:
anything that involves business logic
goes into a functional interface,
or, if not possible, a multi-method interface.
Then clients only depend on the interfaces,
even if one class implements multiple interfaces.
And when implementations get reorganized,
or pieces get reused,
the interfaces need not change.
And that's the sort of idea that can transfer
to another language,
whereas FunctionalInterface
is just how Java makes it palatable.

[¹] This was the name of the first project I was ever paid to work on,
  and I always think of the grandiose name and laugh, because it was plain,
  boring, accounting logic.

[²] I linked to the Guice library, which is by far the most widespread
  dependency injection framework for Java, and ubiquitous at Google. However, I
  prefer Dagger because it checks the dependency graph
  structure at compile time instead of runtime. I happily use both.

[³] I actually hate mocks, and I insist they are only ever used for
  things that cannot be reasonably faked.
